I need to create a Datastream stream to pull data from PostgreSQL (Cloud SQL) and insert it into BigQuery. However, when primary keys are not specified, the BQ tables are automatically created with all columns as clustering keys. I'd like to change that and either not add any clustering key in BQ, or specify which PG column is the primary key. When I try to do that by passing the PG columns and the primaryKey flag, I run into an issue. I also can't find the PK option in the console UI.
Here is what I've tried so far:
gcloud datastream streams create test \
    --location=us-central1 \
    --display-name=my-stream \
    --source=source_test_connection_profile_id2 \
    --postgresql-source-config=source_config.json \
    --destination=destination_test_connection_profile_id2 \
    --bigquery-destination-config=destination_config.json \
    --backfill-none
source_config.json
{
  "includeObjects": {
    "postgresqlSchemas": [
      {
        "postgresqlTables": [
          {
            "postgresqlColumns": [
              { "column": "id", "primaryKey": true },
              { "column": "name" },
              { "column": "address" }
            ],
            "table": "buildings"
          }
        ],
        "schema": "public"
      }
    ]
  },
  "publication": "test_publication",
  "replicationSlot": "test_replication"
}
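To rule out a malformed file on my side, here is a minimal sanity-check sketch in Python. It only confirms that the source config parses as JSON and lists which columns carry "primaryKey": true. The helper name primary_key_columns is my own, and the config is pasted inline as an assumption, so this says nothing about how Datastream itself interprets the flag.

```python
import json

# Inline copy of source_config.json from above (assumed identical to the file on disk).
SOURCE_CONFIG = json.loads("""
{
  "includeObjects": {
    "postgresqlSchemas": [
      {
        "postgresqlTables": [
          {
            "postgresqlColumns": [
              { "column": "id", "primaryKey": true },
              { "column": "name" },
              { "column": "address" }
            ],
            "table": "buildings"
          }
        ],
        "schema": "public"
      }
    ]
  },
  "publication": "test_publication",
  "replicationSlot": "test_replication"
}
""")


def primary_key_columns(source_config: dict) -> dict:
    """Hypothetical helper: map "schema.table" to the columns flagged primaryKey: true."""
    result = {}
    schemas = source_config.get("includeObjects", {}).get("postgresqlSchemas", [])
    for schema in schemas:
        for table in schema.get("postgresqlTables", []):
            name = f'{schema.get("schema")}.{table.get("table")}'
            result[name] = [
                col["column"]
                for col in table.get("postgresqlColumns", [])
                if col.get("primaryKey") is True
            ]
    return result


if __name__ == "__main__":
    print(primary_key_columns(SOURCE_CONFIG))
```

The file parses and only id is flagged for public.buildings, so the config at least matches what I intended to send.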
destination_config.json
{
  "dataFreshness": "900s",
  "sourceHierarchyDatasets": {
    "datasetTemplate": {
      "location": "us-central1"
    }
  }
}
This is the response:
{
  "protoPayload": {
    "@type": "type.googleapis.com/google.cloud.audit.AuditLog",
    "status": {
      "code": 13,
      "message": "An unknown error has occurred"
    },
    "authenticationInfo": {
      "principalEmail": "email",
      "principalSubject": "user:email"
    },
    "requestMetadata": {
      "requestAttributes": {},
      "destinationAttributes": {}
    },
    "serviceName": "datastream.googleapis.com",
    "methodName": "google.cloud.datastream.v1.Datastream.CreateStream",
    "resourceName": "project",
    "resourceLocation": {
      "currentLocations": [ "us-central1" ]
    }
  },
  "insertId": "id",
  "resource": {
    "type": "audited_resource",
    "labels": {
      "method": "google.cloud.datastream.v1.Datastream.CreateStream",
      "service": "datastream.googleapis.com",
      "project_id": "projectid"
    }
  },
  "timestamp": "2023-03-01T18:45:12.357362572Z",
  "severity": "ERROR",
  "logName": "projects/projectid/logs/cloudaudit.googleapis.com%2Factivity",
  "operation": {
    "id": "projects/projectid/locations/us-central1/operations/operation-2-5f5db1ca53242-6b60b33a-7800af10",
    "producer": "datastream.googleapis.com",
    "last": true
  },
  "receiveTimestamp": "2023-03-01T18:45:12.656614692Z"
}
Any advice? Is there another way to create the BigQuery tables without clustering keys?