
Sensitive Data Protection (DLP) Discovery/Scan Configuration/Action/Pub Sub

Hi Experts,
I am quite new to Sensitive Data Protection (DLP).
When creating a scan configuration, the Action section offers a "Publish to Pub/Sub" option.

In a Python Cloud Function, I tried to parse the Pub/Sub message that is published when a data profile is generated, using:

 

base64.b64decode(cloud_event.data["message"]["data"])

 

The decoded result included all the information about the scan result, but I was confused by its format: it contained many non-printable characters, which made the output hard to process in a script.

 

b'\n\xa9\x14\n\\projects/my-project123/locations/asia-northeast1/tableDataProfiles/xxxxxx\x12^projects/my-project123/locations/asia-northeast1/projectDataProfiles/xxxxxx\x1aW//bigquery.googleapis.com/projects/my-project123/datasets/SDP_test/tables/table62*\x02\x08\x142\x02\x08\x14:\xd0\x0f\x12\xe2\x0e\n\x10\n\x0eADVERTISING_ID\n\x05\n\x03AGE\n\x16\n\x14ARGENTINA_DNI_NUMBER\n\x1b\n\x19AUSTRALIA_TAX_FILE_NUMBER\n!\n\x1fBELGIUM_NATIONAL_ID_CARD_NUMBER\n\x13\n\x11BRAZIL_CPF_NUMBER\n \n\x1eCANADA_SOCIAL_INSURANCE_NUMBER.....,US_INDIVIDUAL_TAXPAYER_IDENTIFICATION_NUMBER\n\x1b\n\x19US_SOCIAL_SECURITY_NUMBER\n\x1f\n\x1dVEHICLE_IDENTIFICATION_NUMBER\n\x16\n\x14VENEZUELA_CDI_NUMBER\n\x14\n\x12WEAK_PASSWORD_HASH\n\x0c\n\nAUTH_TOKEN\n\x11\n\x0fAWS_CREDENTIALS\n\x12\n\x10AZURE_AUTH_TOKEN\n\x13\n\x11BASIC_AUTH_HEADER\n\x10\n\x0eENCRYPTION_KEY\n\r\n\x0bGCP_API_KEY\n\x11\n\x0fGCP_CREDENTIALS\n\x10\n\x0eJSON_WEB_TOKEN\n\r\n\x0bHTTP_COOKIE\n\x0c\n\nXSRF_TOKEN\x10\x03\x1a\x00\x1ai*\x13my-project123:Rprojects/my-project123/locations/global/inspectTemplates/xxxxxxxxxxxx\x0c\x08\xc9\xdc\x8c\xb5\x06\x10\xc0\....\x01\x13my-project123\xca\x01\x08SDP_test\xd2\x01\x07table62\xda\x01\x15\n\x13\n\rEMAIL_ADDRESS\x1a\x02\x08\x14\xda\x01\x13\n\x11\n\x0bPERSON_NAME\x1a\x02\x08\x14\xda\x01\x14\n\x12\n\x0cPHONE_NUMBER\x1a\x02\x08\x14\xe2\x01\x15\n\x13\n\rEMAIL_ADDRESS\x1a\x02\x08\x14\xe2\x01\x13\n\x11\n\x0bPERSON_NAME\x1a\x02\x08\x14\xe2\x01\x14\n\x12\n\x0cPHONE_NUMBER\x1a\x02\x08\x14\xe2\x01\x16\n\x14\n\x0eSTREET_ADDRESS\x1a\x02\x08\x14\xea\x01\x0fasia-northeast1\xa2\x02\x17\n\x15google/bigquery/table\x10\x01'

 

Am I handling the Pub/Sub message incorrectly?

Any feedback is appreciated.

Lee

1 ACCEPTED SOLUTION

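Resolved. The base64-decoded payload is a serialized protobuf message, not text, so it has to be parsed with the matching dlp_v2 message class instead of being read as a string.

import base64
from google.cloud import dlp_v2

# Pub/Sub delivers the payload base64-encoded; decoding yields serialized protobuf bytes.
payload = base64.b64decode(cloud_event.data["message"]["data"])
# dlp_v2 message classes are proto-plus wrappers, so parse with the deserialize() classmethod.
dlp_msg = dlp_v2.DataProfilePubSubMessage.deserialize(payload)
print(dlp_msg)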
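
For anyone wondering why the decoded bytes look like noise: they appear to be plain protobuf wire format, where each field is a tag byte, a length, and then the value. The stdlib-only sketch below is not part of the DLP client library; `read_varint` and `iter_fields` are hypothetical helper names used only for illustration. It walks a short fragment copied from the output in the question to show that bytes like `\n\x10` are just field tags and lengths wrapped around the readable strings.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear marks the last byte
            return result, pos
        shift += 7


def iter_fields(buf):
    """Yield (field_number, wire_type, value) for each top-level field in buf."""
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint (ints, enums, bools)
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited (string, bytes, nested message)
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        elif wire_type == 1:  # fixed 64-bit
            value = buf[pos:pos + 8]
            pos += 8
        elif wire_type == 5:  # fixed 32-bit
            value = buf[pos:pos + 4]
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        yield field_number, wire_type, value


# Fragment copied from the decoded payload in the question:
# two wrapper messages, each holding one string in its field 1.
fragment = b"\n\x10\n\x0eADVERTISING_ID\n\x05\n\x03AGE"

info_type_names = []
for _, _, wrapper in iter_fields(fragment):
    for _, _, name in iter_fields(wrapper):
        info_type_names.append(name.decode("utf-8"))

print(info_type_names)
```

This is only meant to explain the format. Without the message definition you get field numbers rather than field names, so for real processing the dlp_v2 message class above is the better route.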
