How to stream a BigQuery table in Node.js?

I need to stream a BigQuery table for testing. I am trying:

const {BigQuery} = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

const query = 'SELECT url FROM `publicdata.samples.github_nested`'; // this is not the actual table I am querying

bigquery.createQueryStream(query)
  .on('error', console.error)
  .on('data', function(row) {
    // row is a result from your query.
  })
  .on('end', function() {
    // All rows retrieved.
  });

 

but it gives me:

 

code: 403,
errors: [
  {
    message: 'Response too large to return. Consider specifying a destination table in your job configuration. For more details, see https://cloud.google.com/bigquery/troubleshooting-errors',
    domain: 'global',
    reason: 'responseTooLarge'
  }
],

It is true the table is very big. How can I stream it bit by bit? I know I should use Dataflow for a full table scan, but this is for some ad hoc testing and I will Control-C the application after some time. I want to stream some data from BigQuery to Pub/Sub for development purposes until we have a real stream available in production.

Solved
1 ACCEPTED SOLUTION

Roderick
Community Manager

You can stream a BigQuery table bit by bit with bigquery.createQueryStream(), but for a result set this large you need to give the query a destination table in its job configuration, which is exactly what the error message suggests. In the Node.js client you do that by passing an options object (the query plus a destination table, and for legacy SQL also allowLargeResults) instead of a bare query string.

When the job runs, BigQuery writes the full result set to the destination table, and the client then pages through those results for you, fetching them in batches behind the scenes.

On the Node.js side you consume the results as an ordinary readable stream: the 'data' event fires once per row, and 'end' fires when all rows have been retrieved.

Here is an example of how to stream a BigQuery table bit by bit:

const {BigQuery} = require('@google-cloud/bigquery');
const bigquery = new BigQuery();

// The query whose results you want to stream.
const query = 'SELECT url FROM `publicdata.samples.github_nested`';

// Write the large result set to a destination table, as the error message suggests.
// The dataset must already exist; the table is created by the query job if needed.
const destinationTable = bigquery.dataset('my_dataset').table('my_streaming_results');

// Put the destination table in the job configuration.
const options = {
  query: query,
  destination: destinationTable,
  // For a legacy SQL query you would also need:
  // allowLargeResults: true,
  // useLegacySql: true,
};

// Create the query stream and listen for rows.
bigquery.createQueryStream(options)
  .on('error', console.error)
  .on('data', function(row) {
    // row is one row of the query result.
  })
  .on('end', function() {
    // All rows have been retrieved.
  });

This runs the query as a job that writes its results to my_dataset.my_streaming_results and then streams those results back to your application. The 'data' handler fires once per row, so you can process rows as they arrive and stop the process whenever you have seen enough.
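
Since your goal is to push rows into Pub/Sub for development, here is a minimal sketch of how you might forward each row as it arrives. This is only a sketch: it assumes the @google-cloud/pubsub client (a recent version with topic.publishMessage(); older versions publish a Buffer via topic.publish()) and a topic named my-dev-topic that already exists in your project.

const {BigQuery} = require('@google-cloud/bigquery');
const {PubSub} = require('@google-cloud/pubsub');

const bigquery = new BigQuery();
const pubsub = new PubSub();

// 'my-dev-topic' is a placeholder; create the topic in your project first.
const topic = pubsub.topic('my-dev-topic');

// Same job configuration as above: the query plus a destination table.
const options = {
  query: 'SELECT url FROM `publicdata.samples.github_nested`',
  destination: bigquery.dataset('my_dataset').table('my_streaming_results'),
};

bigquery.createQueryStream(options)
  .on('error', console.error)
  .on('data', function(row) {
    // Publish each row as a JSON message. Publish errors are only logged,
    // since this is for ad hoc testing.
    topic.publishMessage({data: Buffer.from(JSON.stringify(row))})
      .catch(console.error);
  })
  .on('end', function() {
    console.log('All rows retrieved.');
  });

For anything beyond ad hoc testing you would want some form of backpressure (for example, pausing the stream until each publish resolves), but for a run you plan to Control-C after a while this is usually enough.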

I would love to hear your thoughts!

 


Thanks Roderick. I am going to check it out. Why do we need to create a destination table btw? It looks like technically there should be no reason for it. Could you explain?