How to de-identify data before logging to BigQuery Table

Hi Everyone, 

Could anyone help me with how to de-identify PII (personal information such as credit card numbers) before it gets logged to a BigQuery table?

I have been thinking of creating a Pub/Sub topic that receives a message on every conversation request and triggers a Cloud Function, which would de-identify the user and agent utterances and insert them into a BigQuery table. However, I am unable to find a way to trigger an event on every conversation.

Please share any resources or documentation on this.

Thanks. 





I would recommend checking Google's Cloud Data Loss Prevention (DLP) API, a fully managed service designed to help you discover, classify, and protect your most sensitive data.

Also, here is documentation describing a pipeline for de-identifying large datasets, which might be helpful for your use case:

https://cloud.google.com/architecture/de-identification-re-identification-pii-using-cloud-dlp
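
To make the suggestion concrete, here is a minimal Python sketch of the Cloud Function described in the question: it decodes a Pub/Sub message, masks credit card numbers with the Cloud DLP `deidentify_content` call, and streams the result into BigQuery. Several things here are assumptions and not part of the original thread: the message fields (`session_id`, `speaker`, `utterance`), the `PROJECT_ID` and `TABLE_ID` placeholders, the helper names, and the choice to replace each finding with its infoType name. It also assumes something upstream already publishes each utterance to the topic, which is the part the question has not solved yet.

```python
import base64
import json

# Hypothetical placeholders: substitute your own project and table.
PROJECT_ID = "my-project"
TABLE_ID = "my-project.my_dataset.conversation_logs"


def build_deidentify_request(project_id, text, info_types=("CREDIT_CARD_NUMBER",)):
    """Build a DLP request that replaces each finding with its infoType name."""
    return {
        "parent": f"projects/{project_id}/locations/global",
        "inspect_config": {"info_types": [{"name": name} for name in info_types]},
        "deidentify_config": {
            "info_type_transformations": {
                "transformations": [
                    {"primitive_transformation": {"replace_with_info_type_config": {}}}
                ]
            }
        },
        "item": {"value": text},
    }


def deidentify_text(project_id, text):
    """Send the text to Cloud DLP and return the de-identified string."""
    # Imported lazily so the pure helpers work without the client library.
    from google.cloud import dlp_v2

    client = dlp_v2.DlpServiceClient()
    response = client.deidentify_content(
        request=build_deidentify_request(project_id, text)
    )
    return response.item.value


def decode_pubsub_event(event):
    """Decode the base64-encoded JSON body of a Pub/Sub-triggered event."""
    return json.loads(base64.b64decode(event["data"]).decode("utf-8"))


def handle_utterance(event, context):
    """Background function entry point for a Pub/Sub trigger."""
    from google.cloud import bigquery

    payload = decode_pubsub_event(event)  # assumed message schema
    row = {
        "session_id": payload["session_id"],
        "speaker": payload["speaker"],
        "utterance": deidentify_text(PROJECT_ID, payload["utterance"]),
    }
    # insert_rows_json returns a list of per-row errors (empty on success).
    errors = bigquery.Client().insert_rows_json(TABLE_ID, [row])
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")
```

Because the raw utterance is only ever passed to DLP and never written anywhere else, the BigQuery table receives only the masked text. Additional infoTypes (for example `EMAIL_ADDRESS` or `PHONE_NUMBER`) can be passed through the `info_types` argument.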