Metadata is essential for various data initiatives, from deploying NLP solutions and chat-with-your-data applications to enabling semantic search and establishing a business glossary. However, manually documenting every data attribute within a platform is time-consuming and error-prone.
At Google Cloud Next, we announced BigQuery Data Insights and Automated Metadata Generation, two powerful capabilities that simplify this process using Gen AI.
In this blog, you'll learn how to use these features to automatically generate metadata, uncover hidden insights, and enhance the way your organization interacts with data.
We'll also dive into real-world use cases that demonstrate how these capabilities can drive value across your organization.
To create, manage, and retrieve data insights, ask your administrator to grant you the following IAM roles:
To get read-only access to the generated insights, make sure your account has the following IAM role:
Imagine you're a business analyst, eager to explore data to answer key business questions or to build impactful data products. You’ve finally gained access to the data warehouse, but now you’re faced with countless tables, unsure of where to even start. You might be asking yourself: Am I missing valuable insights? Could there be hidden patterns that offer a competitive edge?
This is where BigQuery Data Insights comes in. It proactively surfaces meaningful insights from your data, revealing both the expected patterns and the hidden gems you might not have considered.
Let’s see how BigQuery Data Insights makes that possible.
Step 1: Open BigQuery Studio
In the Google Cloud Console, navigate to BigQuery. Choose the dataset and table you'd like to analyze, then click Generate insights for free.
Figure 1 : BQ studio - Insights
Step 2: Select a region
Choose the region where your insights and metadata should be stored and processed. Click Generate to start the analysis.
Figure 2 : Insights region
Step 3: Review the insights
After a few minutes, BigQuery presents a curated list of insights tailored to your data.
Figure 3 : Generated insights
Step 4: Expand and explore
You can review all the generated insights and expand the ones that catch your interest. For each expanded insight, BigQuery presents the corresponding SQL query.
Let’s say you’re curious about customer lifetime value, and this particular insight grabs your attention: “Find the customer with the highest total revenue generated, and then calculate the percentage of their total revenue that comes from each product they purchased.”
Simply expand that insight to reveal the full SQL query. Yes, it’s a pretty complex query you’d normally have to write yourself. You’re welcome 😉.
Figure 4 : Insight and SQL statement
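To make the logic of that insight concrete, here is a minimal Python sketch of the same two-step calculation. It is not the SQL that BigQuery generates; the `ORDERS` rows, the column meanings, and the function name are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical order lines: (customer_id, product, revenue).
# These rows are made up for illustration; they are not from a real dataset.
ORDERS = [
    ("c1", "laptop", 1200.0),
    ("c1", "mouse", 50.0),
    ("c2", "laptop", 900.0),
    ("c1", "laptop", 750.0),
    ("c2", "monitor", 300.0),
]


def top_customer_product_share(orders):
    """Return (customer_id, {product: percent of that customer's revenue})."""
    revenue_by_customer = defaultdict(float)
    revenue_by_product = defaultdict(lambda: defaultdict(float))
    for customer_id, product, revenue in orders:
        revenue_by_customer[customer_id] += revenue
        revenue_by_product[customer_id][product] += revenue

    # Step 1: find the customer with the highest total revenue.
    top = max(revenue_by_customer, key=revenue_by_customer.get)
    total = revenue_by_customer[top]

    # Step 2: compute each product's share of that customer's revenue, in percent.
    shares = {
        product: round(100.0 * revenue / total, 2)
        for product, revenue in revenue_by_product[top].items()
    }
    return top, shares


if __name__ == "__main__":
    print(top_customer_product_share(ORDERS))
```

In SQL, the same shape typically appears as one aggregation that picks the top customer, followed by a second aggregation per product that divides by that customer's total.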
Step 5: Run
When you click the Copy to query button, your screen will split, and a new query window will open with the SQL statement. All you need to do is hit the Run button to generate those amazing insights!
Figure 5 : Run SQL statement and ask a follow-up
Step 6: Ask follow-up questions
To uncover more insights, click Ask a follow-up. This will open Data Canvas, where you can explore your data conversationally using natural language. Thanks to Gemini's guidance via the Canvas assistant, you won’t have to start from scratch. We provide helpful example questions to inspire you, or you can ask anything that comes to mind.
Figure 6 : Ask a follow-up in Data Canvas
Beyond the powerful insights generation, BigQuery has also introduced another significant capability: automated metadata generation.
This feature intelligently analyzes your data, regardless of whether column names are descriptive or generic, and creates comprehensive descriptions at both the column and table levels.
Imagine the immense time saved compared to manually documenting and maintaining every attribute in your data platform!
Here's how it works: Immediately after you click the Generate insights for free button (Figure 1 : BQ studio - Insights), data insights are generated, including descriptions of the columns and the table.
Note: This feature is currently in Private Preview. To request access, please sign up here.
To view the detailed metadata for each column, click View column descriptions.
Figure 7 : Table and columns description
The following window will display the generated description, allowing you to check its accuracy. If you want to make any edits or add more information, you can do so here. When you're finished reviewing and updating, click Save the details to finalize and save your modifications.
Figure 8 : Review and update table and columns description
After saving, navigate back to the Details tab to view the updated table description.
Figure 9 : Updated table description
To view the column descriptions, navigate to the Schema tab.
Figure 10 : Updated column descriptions
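If you would rather keep reviewed descriptions in version control and apply them with SQL, one option is to turn them into BigQuery DDL statements. The sketch below is an assumption about how you might do that, not part of the feature itself: the helper names, the table `my_project.sales.orders`, and the column names are hypothetical. It relies only on the `ALTER TABLE ... SET OPTIONS (description = ...)` and `ALTER COLUMN ... SET OPTIONS (description = ...)` statements that BigQuery supports.

```python
def sql_string(value: str) -> str:
    """Quote a Python string as a double-quoted GoogleSQL string literal."""
    escaped = (
        value.replace("\\", "\\\\")  # escape backslashes first
        .replace('"', '\\"')         # then double quotes
        .replace("\n", "\\n")        # keep the literal on one line
    )
    return f'"{escaped}"'


def description_ddl(table: str, table_description: str, column_descriptions: dict) -> list:
    """Build DDL statements that set the table and column descriptions."""
    statements = [
        f"ALTER TABLE `{table}` SET OPTIONS (description = {sql_string(table_description)});"
    ]
    for column, text in column_descriptions.items():
        statements.append(
            f"ALTER TABLE `{table}` ALTER COLUMN `{column}` "
            f"SET OPTIONS (description = {sql_string(text)});"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical table and descriptions, for illustration only.
    for statement in description_ddl(
        "my_project.sales.orders",
        "One row per order line.",
        {"cust_id": "Unique customer identifier", "amt": "Order line revenue in USD"},
    ):
        print(statement)
```

You could paste the printed statements into a BigQuery Studio query tab and run them after reviewing the text.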
This powerful feature unlocks a range of capabilities across data discovery and understanding:
Searching and discovering data assets is a critical need for many organizations. Traditional keyword-based search can fall short, making it difficult to find relevant data. With automated metadata generation and Gemini’s capabilities built into BigQuery, users can now benefit from semantic search, where the engine understands the meaning behind your query, not just the exact keywords.
Figure 11 illustrates how the generated metadata enables semantic search within Dataplex.
Figure 11 : Semantic search in Dataplex
Another key challenge for organizations is ensuring and maintaining a shared understanding of business terms across the enterprise. Building a business glossary manually is time-consuming and complex. With automated metadata generation, this process becomes simpler and more powerful.
Although automatic synchronization with a business glossary isn't currently available, the generated attribute definitions still offer substantial benefits and time savings.
Thanks to the generated descriptions of tables and columns, large language models (LLMs) can better understand your data. They can identify relevant fields, understand relationships between entities, and significantly improve the translation from natural language to SQL.
You can also leverage the available APIs to build your own applications, enabling users to chat with your data and gain insights through conversational interfaces.
Figure 12 below shows how the generated metadata enables natural language to SQL in BigQuery Studio.
Figure 12 : Natural language to SQL
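If you build your own natural-language-to-SQL application, the descriptions are useful mainly as context in the prompt. The following is a minimal sketch of that idea under stated assumptions: the prompt wording, the function name, and the example table and columns are all hypothetical, and this is not how BigQuery Studio or Gemini constructs its prompts internally.

```python
def build_nl2sql_prompt(question, table, table_description, columns):
    """Assemble a text prompt that gives an LLM the table and column descriptions.

    columns is a list of (name, data_type, description) tuples.
    """
    lines = [
        "You translate questions into BigQuery SQL.",
        f"Table: {table}",
        f"Table description: {table_description}",
        "Columns:",
    ]
    for name, data_type, description in columns:
        # Generic names such as "amt" only become meaningful through the description.
        lines.append(f"- {name} ({data_type}): {description or 'no description'}")
    lines.append(f"Question: {question}")
    lines.append("SQL:")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical schema, for illustration only.
    print(
        build_nl2sql_prompt(
            "Which customers generated the most revenue?",
            "sales.orders",
            "One row per order line.",
            [
                ("c_id", "STRING", "Unique customer identifier"),
                ("amt", "NUMERIC", "Order line revenue in USD"),
            ],
        )
    )
```

The resulting string can be sent to whichever model API you use; with the descriptions included, the model has a way to map "customers" and "revenue" onto `c_id` and `amt` even though the column names are generic.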
BigQuery’s Gen AI-powered metadata and insights generation dramatically reduces the time and effort needed to understand and act on your data. Whether you’re a data analyst, engineer, or business user, these capabilities help you go from raw tables to real insights, without writing complex SQL or starting from scratch. Try it out now!
Additional resources :