Hi
I'm trying to migrate Kafka JSON messages to BigQuery, including the Kafka `key` value, following the docs here: https://cloud.google.com/dataflow/docs/guides/templates/provided/kafka-to-bigquery . The `key` is not part of the JSON payload itself; it travels with every message as Kafka metadata (alongside partition, offset, timestamp, etc.). The messages themselves are migrated properly (the `msg` column contains the expected data), but the `_key` column is NULL for all rows. Any ideas what could be wrong? I run the flex template like this:
```
gcloud dataflow flex-template run kafka-to-bq \
--template-file-gcs-location gs://dataflow-templates/latest/flex/Kafka_to_BigQuery_Flex \
--additional-experiments=use_runner_v2 \
--region europe-west1 \
--service-account-email=<SA_MAIL> \
--network=<NETWORK> \
--subnetwork=<SUBNETWORK> \
--additional-experiments=use_network_tags=<NET_TAG> \
--additional-experiments=use_network_tags_for_flex_templates=<NET_TAG> \
--parameters '^&^readBootstrapServerAndTopic=<KAFKA_SERVERS>;<TOPIC>' \
--parameters outputTableSpec=<PROJECT.DATASET.TABLE>,\
outputDeadletterTable=<PROJECT.DATASET2.TABLE2>,\
writeDisposition=WRITE_APPEND,\
writeMode=SINGLE_TABLE_NAME,\
useBigQueryDLQ=false,\
messageFormat=JSON,\
kafkaReadAuthenticationMode=NONE,\
javascriptTextTransformGcsPath=gs://<BUCKET>/js/udf.js,\
javascriptTextTransformFunctionName=transformRow,\
persistKafkaKey=true
```
My UDF function looks like this (simplified; as far as I can tell, the UDF receives only the message value as a string, so I'm relying on `persistKafkaKey=true` to populate `_key`):
```
// Wrap the raw message value in a JSON object with a single `msg` field.
function transformRow(inStr) {
  return JSON.stringify({ msg: inStr });
}
```
My BigQuery table:
```
CREATE OR REPLACE TABLE `<PROJECT.DATASET.TABLE>`
(
  _key BYTES,
  msg STRING
);
```
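For completeness, a query along these lines is how I see the NULLs (table name is a placeholder; `TO_BASE64` just makes the BYTES column readable):
```
SELECT TO_BASE64(_key) AS key_b64, msg
FROM `<PROJECT.DATASET.TABLE>`
LIMIT 10;
```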
What I tried so far:
- removing the UDF and creating the BigQuery table with columns for all of the JSON fields. Result: all of those columns are populated properly with their respective JSON values, but the `_key` column is still NULL.
Thank you for any tips/help.
Hi @Kossak,
Welcome to the Google Cloud Community!
It seems that the Kafka key isn't making it into your BigQuery table, which leaves the `_key` column NULL even though the message payload loads fine.

Here are a few things you may want to check that might help resolve your use case:

- Kafka message keys are optional: they are set by the producer, not attached by Kafka itself the way partition, offset, and timestamp are. If your producers publish records without a key, the template has nothing to persist and `_key` will be NULL on every row. It's worth confirming that keys actually exist on the topic; see the sketch below.
- `persistKafkaKey=true` is documented to persist the message key into a `_key` field of type BYTES, which matches your table schema, so that part of the setup looks correct.
- The UDF only sees the message value (your `msg` column confirms it receives the raw JSON string), so removing it, as you already tried, should not change the key handling.
- As a side note, I believe repeating a gcloud list flag such as `--additional-experiments` overrides earlier occurrences rather than appending to them, so you may want to merge the three occurrences into one comma-separated flag, though this is unlikely to be related to the key issue.

You may also refer to this documentation, which might help you understand how to Write data from Kafka to BigQuery with Dataflow.
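A minimal way to check for keys, assuming you have the stock Kafka CLI tools and can reach the brokers (placeholders match your command):
```
kafka-console-consumer.sh \
  --bootstrap-server <KAFKA_SERVERS> \
  --topic <TOPIC> \
  --from-beginning \
  --max-messages 5 \
  --property print.key=true \
  --property key.separator=" | "
```
If the left-hand side of each printed record is `null`, the producers are not setting keys, and `_key` in BigQuery will be NULL regardless of `persistKafkaKey`.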
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.