Hi Experts,
I am quite new to Sensitive Data Protection (DLP).
While creating a scan configuration, in the Action section, there is a choice to "Publish to Pub/Sub".
In a Python Cloud Function, I tried to parse the Pub/Sub message that is published when a data profile is generated, using:
base64.b64decode(cloud_event.data["message"]["data"])
The decoded result appears to include all the information about the scan result, but I was confused by its format. It is a bytes object full of non-printable characters, so my script cannot read it as text.
b'\n\xa9\x14\n\\projects/my-project123/locations/asia-northeast1/tableDataProfiles/xxxxxx\x12^projects/my-project123/locations/asia-northeast1/projectDataProfiles/xxxxxx\x1aW//bigquery.googleapis.com/projects/my-project123/datasets/SDP_test/tables/table62*\x02\x08\x142\x02\x08\x14:\xd0\x0f\x12\xe2\x0e\n\x10\n\x0eADVERTISING_ID\n\x05\n\x03AGE\n\x16\n\x14ARGENTINA_DNI_NUMBER\n\x1b\n\x19AUSTRALIA_TAX_FILE_NUMBER\n!\n\x1fBELGIUM_NATIONAL_ID_CARD_NUMBER\n\x13\n\x11BRAZIL_CPF_NUMBER\n \n\x1eCANADA_SOCIAL_INSURANCE_NUMBER.....,US_INDIVIDUAL_TAXPAYER_IDENTIFICATION_NUMBER\n\x1b\n\x19US_SOCIAL_SECURITY_NUMBER\n\x1f\n\x1dVEHICLE_IDENTIFICATION_NUMBER\n\x16\n\x14VENEZUELA_CDI_NUMBER\n\x14\n\x12WEAK_PASSWORD_HASH\n\x0c\n\nAUTH_TOKEN\n\x11\n\x0fAWS_CREDENTIALS\n\x12\n\x10AZURE_AUTH_TOKEN\n\x13\n\x11BASIC_AUTH_HEADER\n\x10\n\x0eENCRYPTION_KEY\n\r\n\x0bGCP_API_KEY\n\x11\n\x0fGCP_CREDENTIALS\n\x10\n\x0eJSON_WEB_TOKEN\n\r\n\x0bHTTP_COOKIE\n\x0c\n\nXSRF_TOKEN\x10\x03\x1a\x00\x1ai*\x13my-project123:Rprojects/my-project123/locations/global/inspectTemplates/xxxxxxxxxxxx\x0c\x08\xc9\xdc\x8c\xb5\x06\x10\xc0\....\x01\x13my-project123\xca\x01\x08SDP_test\xd2\x01\x07table62\xda\x01\x15\n\x13\n\rEMAIL_ADDRESS\x1a\x02\x08\x14\xda\x01\x13\n\x11\n\x0bPERSON_NAME\x1a\x02\x08\x14\xda\x01\x14\n\x12\n\x0cPHONE_NUMBER\x1a\x02\x08\x14\xe2\x01\x15\n\x13\n\rEMAIL_ADDRESS\x1a\x02\x08\x14\xe2\x01\x13\n\x11\n\x0bPERSON_NAME\x1a\x02\x08\x14\xe2\x01\x14\n\x12\n\x0cPHONE_NUMBER\x1a\x02\x08\x14\xe2\x01\x16\n\x14\n\x0eSTREET_ADDRESS\x1a\x02\x08\x14\xea\x01\x0fasia-northeast1\xa2\x02\x17\n\x15google/bigquery/table\x10\x01'
Am I handling the Pub/Sub message incorrectly?
Any feedback is appreciated.
Lee
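Resolved by using dlp_v2 to parse the bytes decoded from base64. The payload is a serialized DataProfilePubSubMessage protobuf, not text, so it has to be deserialized rather than read as a string.
import base64
from google.cloud import dlp_v2
# Pub/Sub delivers the payload base64-encoded; decode it to the raw protobuf bytes.
raw = base64.b64decode(cloud_event.data["message"]["data"])
# In current google-cloud-dlp releases the dlp_v2 types are proto-plus messages,
# which are parsed with the deserialize() classmethod.
dlp_msg = dlp_v2.DataProfilePubSubMessage.deserialize(raw)
print(dlp_msg)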
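For anyone wondering why the decoded bytes look like noise: the "meaningless characters" are protobuf field tags and length prefixes sitting between the readable strings. Below is a rough, standard-library-only sketch that walks one small fragment copied from the dump in the question. The names read_varint, walk_fields and fake_event are hypothetical helpers made up for this illustration, and the sketch only handles the two wire types that appear in that fragment. It is meant to show the structure, not to replace the dlp_v2 parsing above.

```python
import base64
from types import SimpleNamespace


def read_varint(buf: bytes, pos: int) -> tuple:
    """Decode one base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear means this was the last byte
            return value, pos
        shift += 7


def walk_fields(buf: bytes) -> list:
    """List (field_number, wire_type, raw_value) for one message level."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited: string, bytes or nested message
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        fields.append((field_number, wire_type, value))
    return fields


# Fragment copied from the dump in the question: one nested entry.
fragment = b"\n\x10\n\x0eADVERTISING_ID"

# Hypothetical stand-in for the CloudEvent object the function receives.
fake_event = SimpleNamespace(
    data={"message": {"data": base64.b64encode(fragment).decode("ascii")}}
)

raw = base64.b64decode(fake_event.data["message"]["data"])
outer = walk_fields(raw)          # the outer field wraps a nested message
inner = walk_fields(outer[0][2])  # the nested message holds the name string
print(outer)
print(inner)
```

Here the leading \n is just byte 0x0a, meaning "field 1, length-delimited", and \x10 is the length of what follows. Without the message definition you only get field numbers, which is why deserializing with the dlp_v2 type is the practical route.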