Natural language support in BigQuery

I want to create an application that retrieves data from BigQuery based on a user message. I saw that Gemini in BigQuery does exactly what I want: I enter text, and it generates SQL, queries the data, and generates graphs. However, I would like to use these features outside of BigQuery Studio, for example from Python. Is there any way to do this? Or is there another feature in BigQuery that lets you query data using natural language?

Accepted solution

Gemini's natural language to SQL capabilities within BigQuery are currently exclusive to BigQuery Studio. There's no official public API for using them directly from Python applications.

While google-cloud-aiplatform and google-cloud-language are valuable for other AI/NLP tasks, they don't directly support natural language to SQL translation.

However, if you want natural language to SQL translation outside of BigQuery Studio, such as in a Python application, there are a few approaches to consider:

  1. BigQuery API with Custom Implementation:

    • You can use the BigQuery API to execute SQL queries from your Python application. The natural language to SQL conversion itself is something you would have to build: a Natural Language Processing (NLP) model translates the user's question into a SQL statement, and your application then runs that statement. This might involve training a custom model or using pre-trained models from libraries like Hugging Face's transformers.
  2. Vertex AI and LLMs:

    • You might consider using Vertex AI, which provides tools to train and deploy machine learning models, including models that can handle language tasks. A pre-trained large language model (LLM) that understands both natural language and SQL syntax could act as the bridge between user input and database queries, especially if the prompt includes the schema of the tables being queried.
  3. Using BigQuery ML for Predictions:

    • If you also want predictive analytics or machine learning on your BigQuery data, BigQuery ML lets you create and run machine learning models directly within BigQuery using SQL statements. This won't translate natural language to SQL, but it can add advanced data processing capabilities to your application.
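
To make approaches 1 and 2 more concrete, here is a minimal sketch of how the pieces could fit together in Python. It is only an illustration under some assumptions, not an official pattern: generate is a hypothetical callable standing in for whichever LLM you end up using (it takes a prompt string and returns the model's text), and the helper names and the schema dictionary format are made up for this example. The only real library calls are bigquery.Client(), client.query() and .result() from google-cloud-bigquery, which needs to be installed and authenticated before run_query will work.

```python
import re
from typing import Callable, Dict, List


def build_prompt(question: str, schema: Dict[str, List[str]]) -> str:
    """Describe the available tables to the model and ask for one SELECT statement."""
    lines = [
        "You write BigQuery GoogleSQL. Return a single SELECT statement only.",
        "Tables:",
    ]
    for table, columns in schema.items():
        lines.append(f"- {table}({', '.join(columns)})")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


def extract_sql(model_output: str) -> str:
    """Remove an optional Markdown code fence, language tag and trailing semicolon."""
    text = model_output.strip().strip("`").strip()
    text = re.sub(r"^sql\s+", "", text, flags=re.IGNORECASE)
    return text.strip().rstrip(";").strip()


def is_read_only(sql: str) -> bool:
    """Conservative check: one statement, starts with SELECT/WITH, no DML/DDL keywords.

    This can reject valid queries (for example a string literal containing
    'delete'); it is a safety net, not a SQL parser.
    """
    if ";" in sql:
        return False
    if not re.match(r"\s*(select|with)\b", sql, re.IGNORECASE):
        return False
    forbidden = r"\b(insert|update|delete|merge|drop|create|alter|truncate|grant|revoke)\b"
    return re.search(forbidden, sql, re.IGNORECASE) is None


def question_to_sql(
    question: str,
    schema: Dict[str, List[str]],
    generate: Callable[[str], str],  # hypothetical: your LLM call goes here
) -> str:
    """Turn a user question into SQL and refuse anything that is not read-only."""
    sql = extract_sql(generate(build_prompt(question, schema)))
    if not is_read_only(sql):
        raise ValueError(f"Refusing to run non-SELECT SQL: {sql!r}")
    return sql


def run_query(sql: str) -> List[dict]:
    """Execute the SQL with the official BigQuery client and return rows as dicts."""
    # Imported lazily so the helpers above can be tried without the package installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses your default project and credentials
    return [dict(row.items()) for row in client.query(sql).result()]
```

You would call question_to_sql with the user's message, a dictionary such as {"shop.orders": ["order_id", "amount"]} (a made-up table), and your own function that sends the prompt to the model, then pass the result to run_query. Keeping the model call behind a plain callable means you can swap between a Vertex AI model, a Hugging Face model, or a stub for testing without touching the rest. Running the query under a service account that only has read access is a sensible extra safeguard on top of the keyword check.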

Replies

Thank you for the response and clarification.

I've started exploring the use of LLMs to generate SQL.