We recently launched a bi-monthly BigQuery newsletter to give users a roundup of the latest announcements, innovations, code samples and learning resources. We’re always building something new, often based on your feedback, so how about a single place for you to find out what we’ve built to (maybe) make your life easier?
In 2024, we launched over 150 features designed to streamline diverse analytics workloads, simplify data-to-AI workflows, and enhance productivity through Gemini agents. Our goal is to provide you with learning resources, product updates, and more to help you make the most of BigQuery. Check out this month’s featured content.
Google named a Leader in the 2024 Gartner® Magic Quadrant™ for Cloud Database Management Systems
Google was positioned furthest in vision among all vendors evaluated, “with the largest growth and highest year-over-year growth among the leading providers in recent years,” according to Gartner. Download the complimentary report.
Google Cloud named a Leader in the 2024 Gartner® Magic Quadrant™ for Data Integration Tools
This recognition reflects our commitment to continuous customer innovation in areas such as unified data-to-AI governance, flexible and accessible data engineering experiences, and AI-powered data integration capabilities. Download the complimentary report.
Use Gemini models in BigQuery
BigQuery ML now supports creating remote models based on the Vertex AI gemini-2.0-flash-exp model. You can perform generative AI tasks such as audio transcription and document classification by using the ML.GENERATE_TEXT function on content stored in BigQuery object tables.
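As a rough illustration of that flow, the sketch below builds the two SQL statements involved: one that creates a remote model over the gemini-2.0-flash-exp endpoint and one that calls ML.GENERATE_TEXT against an object table. The model, connection, and table names are hypothetical placeholders, and the helpers only assemble SQL text; they do not call BigQuery.

```python
def create_remote_model_sql(model: str, connection: str,
                            endpoint: str = "gemini-2.0-flash-exp") -> str:
    """Build a CREATE MODEL statement for a remote model over a Vertex AI endpoint."""
    return (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"  REMOTE WITH CONNECTION `{connection}`\n"
        f"  OPTIONS (ENDPOINT = '{endpoint}')"
    )


def generate_text_sql(model: str, object_table: str, prompt: str) -> str:
    """Build a query that applies one prompt to every object in an object table."""
    # Escape backslashes and single quotes so the prompt is a valid SQL string literal.
    escaped = prompt.replace("\\", "\\\\").replace("'", "\\'")
    return (
        "SELECT *\n"
        "FROM ML.GENERATE_TEXT(\n"
        f"  MODEL `{model}`,\n"
        f"  TABLE `{object_table}`,\n"
        f"  STRUCT('{escaped}' AS prompt, TRUE AS flatten_json_output)\n"
        ")"
    )


# Hypothetical resource names, for illustration only.
MODEL = "my_project.my_dataset.gemini_model"
CONNECTION = "us.my_connection"
OBJECT_TABLE = "my_project.my_dataset.call_recordings"

if __name__ == "__main__":
    print(create_remote_model_sql(MODEL, CONNECTION))
    print(generate_text_sql(MODEL, OBJECT_TABLE, "Transcribe this audio file."))
```

The printed statements could be pasted into the BigQuery console or submitted through a client library, once the placeholder names are replaced with real resources.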
Build RAG workflows in BigQuery with Document AI
New capabilities in BigQuery and Document AI let you parse and analyze documents with SQL. You can extract key information, generate embeddings, and combine those results with LLM prompts to help interpret documents accurately and quickly. This streamlined approach to document processing lets you run retrieval-augmented generation (RAG) directly in BigQuery.
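To make the retrieval step of RAG concrete, here is a toy sketch in plain Python: it ranks document chunks by cosine similarity to a query embedding and folds the best matches into a prompt. The chunk texts and three-dimensional embeddings are made up for illustration; in BigQuery, the equivalent steps would run in SQL over embeddings produced by a model rather than in Python.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_embedding, chunks, top_k=2):
    """Return the top_k chunks most similar to the query embedding."""
    ranked = sorted(
        chunks,
        key=lambda c: cosine_similarity(query_embedding, c["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, retrieved):
    """Combine the retrieved chunk texts with the question into a single prompt."""
    context = "\n".join(f"- {c['text']}" for c in retrieved)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up chunks and embeddings, standing in for parsed document text.
CHUNKS = [
    {"text": "Invoice 1001 total: $250", "embedding": [0.9, 0.1, 0.0]},
    {"text": "Shipping policy: 5 business days", "embedding": [0.0, 0.2, 0.9]},
    {"text": "Invoice 1002 total: $75", "embedding": [0.8, 0.3, 0.1]},
]
QUERY_EMBEDDING = [1.0, 0.2, 0.0]  # pretend embedding of the question below

if __name__ == "__main__":
    top = retrieve(QUERY_EMBEDDING, CHUNKS)
    print(build_prompt("What are the invoice totals?", top))
```

The resulting prompt is what would be sent to the LLM, so the model answers from the retrieved document text rather than from its general knowledge alone.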
AI-powered data preparation
BigQuery AI-assisted data preparation is now in preview. It leverages AI to analyze your data and provide intelligent suggestions for cleaning, transforming, and enriching it. BigQuery data preparation is part of Gemini in BigQuery, a set of intelligent AI-powered capabilities including assistive experiences, operations, and optimization features.
Automatic discovery and cataloging
Dataplex automatic discovery, available in public preview, lets you scan data in Cloud Storage buckets to extract and catalog metadata. Automatic discovery creates tables you can use for analytics and AI, and catalogs that data in Dataplex Catalog.
Enhanced Dataproc Serverless runtimes
Dataproc Serverless, Google Cloud’s serverless Spark offering, now has expanded libraries including XGBoost, Hugging Face Transformers, and PyTorch, eliminating the hassle of manual configuration.
Dataproc Serverless is now faster, easier, and smarter
New capabilities including native query execution, real-time job progress tracking, troubleshooting of batch jobs, and autotuning and assisted troubleshooting with Gemini make running Dataproc Serverless even faster, easier, and more intelligent.
Faster data exploration with semantic search
New Dataplex Semantic Search enhances data exploration and retrieval within BigQuery data canvas by letting you use natural language queries to find relevant assets. The result is faster, more intuitive data exploration and more efficient data analysis workflows.
BigQuery managed disaster recovery is now generally available
Disaster recovery provides coordinated failover of compute and storage, enabling business continuity in the unlikely event of a total regional infrastructure outage.
Synthetic data generation with Gretel and BigQuery DataFrames
Dive into this technical how-to for synthetic data generation using BigQuery DataFrames and Gretel to drive data workflows, while ensuring high-quality data, privacy protection, and adherence to compliance requirements.
Create an Apache Iceberg table with BigQuery
Learn how to create, load, manage, and query BigQuery tables for Apache Iceberg that offer the same fully managed experience as BigQuery tables with additional features like autonomous storage optimizations, clustering, and high-throughput streaming ingestion.
How to overcome streaming analytics challenges
PayPal migrated the streaming logs-based pipelines of their observability platform from a self-managed Apache Flink-based infrastructure to Dataflow, and solved several challenges they had been experiencing along the way. PUMA used BigQuery and ML to identify advanced audiences based on high purchase propensity, and achieved a 149.8% increase in click-through rate and significant improvements in conversion rates and average order value as a result.
Featured Learning Path
Cloud Skills Boost: Baseline Data, ML, and AI. Big data, machine learning, and artificial intelligence are today’s hot computing topics, but these fields are quite specialized and introductory material is hard to come by. This introductory-level quest will help you take your first steps with tools like BigQuery, Cloud Speech API, and Vertex AI.
We want to hear from you! Share your BigQuery tips with the Google Cloud Community, join our Google Cloud Innovators program, or share feedback on BigQuery with our engineering team. Hoping to see something that wasn’t included? Drop a comment below on what other content you want to see from us.