Masking XML payload before sending to logs

Hello,

We have a requirement that the XML payload data sent to logs be masked. Is there an easy way to mask this data? I am not referring to the masking option available on the trace; I need to mask the data going into the logs.

Thanks

 

ACCEPTED SOLUTION

I understand what you're asking.

I can think of two ways to accomplish what you describe. One is a special-case arrangement, and the other is a more general solution.

  1. Special solution. If you have a specific XML payload, and you know the specific fields (elements or attributes) you want to mask, then you could apply an XSL transform to that payload to mask those elements. For example, you might start with <Email>marcia@example.com</Email> and end with <Email>EMAIL_REDACTED</Email>, and so on for the other elements you want to mask. The XSL to do this is relatively simple, if you are handy with XSL. This won't scale well if you have lots of different kinds of XML payloads, or if those payloads vary over time.
  2. General solution. Google Cloud offers the Data Loss Prevention API, which is part of Sensitive Data Protection. You can invoke that from within the API proxy to mask data in your XML document. Pseudo code for a REST request to do this is here.

    In an Apigee policy, you would need a ServiceCallout, configured to send a POST to dlp.googleapis.com/v2/projects/PROJECT/content:deidentify, with the specific required payload. Basically you'd wrap your XML into this kind of JSON:

    {
      "inspectConfig": {
        "infoTypes": [ { "name": "EMAIL_ADDRESS" }, { "name": "PHONE_NUMBER" }, { "name": "URL" } ... ]
      },
      "deidentifyTemplateName": "projects/your-project-here/deidentifyTemplates/3816555555",
      "item": {
        "value": "XML-GOES-HERE"
      }
    }

    To do that you need to escape the XML, via a Message Template with escapeJSON. This will do the right thing with any quotes that appear in the XML.

    When I did that with this request payload:

    {
      "inspectConfig": {
        "infoTypes": [ { "name": "EMAIL_ADDRESS" }, { "name": "PHONE_NUMBER" }, { "name": "URL" } ]
      },
      "deidentifyTemplateName": "projects/PROJECT/deidentifyTemplates/TEMPLATEID",
      "item": {
        "value": "<doc> <Name>Marcia</Name> <URL>https://marcia.com/my-home-page</URL>  <Email>marcia@example.com</Email>  <Phone>412-343-0919</Phone> </doc>"
      }
    }
    

    ...I got this in response:

    {
      "item": {
        "value": "<doc> <Name>Marcia</Name> <URL>https://***********************</URL>  <Email>******@*******.com</Email>  <Phone>***-***-***9</Phone> </doc>"
      },
      ...
    

    How that all works: the de-identify template I created masks URLs, phone numbers, and email addresses, each with a different approach. My template does NOT mask first names or last names. As the DLP user, you have full control over how this works and can choose the way things get masked. To get the XML out, you would then use a message template like {jsonPath($.item.value,dlpResponse.content)}.
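To make the wrap-and-unwrap steps of the general solution concrete, here is a minimal Python sketch of the same data flow outside of Apigee. It is only an illustration: the function names are hypothetical, PROJECT and TEMPLATEID are the same placeholders used above, and the response is a canned string in the shape shown above rather than a live call to the DLP API. json.dumps stands in for escapeJSON, and the dictionary lookup stands in for the jsonPath expression.

```python
import json

# Placeholders, as in the payloads above; substitute your own values.
PROJECT = "PROJECT"
TEMPLATE_ID = "TEMPLATEID"


def build_deidentify_request(xml_payload: str) -> str:
    """Wrap an XML string in the JSON body sent to content:deidentify."""
    body = {
        "inspectConfig": {
            "infoTypes": [
                {"name": "EMAIL_ADDRESS"},
                {"name": "PHONE_NUMBER"},
                {"name": "URL"},
            ]
        },
        "deidentifyTemplateName": (
            f"projects/{PROJECT}/deidentifyTemplates/{TEMPLATE_ID}"
        ),
        "item": {"value": xml_payload},
    }
    # json.dumps escapes quotes, backslashes, and newlines inside the XML,
    # which is the job escapeJSON does in the Apigee message template.
    return json.dumps(body)


def extract_masked_xml(dlp_response_body: str) -> str:
    """Pull item.value out of the response, like jsonPath($.item.value, ...)."""
    return json.loads(dlp_response_body)["item"]["value"]


if __name__ == "__main__":
    xml = '<doc id="1"> <Email>marcia@example.com</Email> </doc>'
    print(build_deidentify_request(xml))

    # Canned response for illustration only; not produced by a real API call.
    canned = json.dumps(
        {"item": {"value": '<doc id="1"> <Email>******@*******.com</Email> </doc>'}}
    )
    print(extract_masked_xml(canned))
```

The point of the sketch is the escaping: an XML attribute such as id="1" contains double quotes, so pasting the raw XML into the JSON body without escaping would produce invalid JSON.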

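For the special-case approach (option 1), the answer suggests an XSL transform. The sketch below shows the same idea in Python with the standard library's ElementTree instead of XSL, purely to illustrate what "mask the known elements" means. The function name and the REDACTIONS mapping are hypothetical, and PHONE_REDACTED is an invented replacement string alongside the EMAIL_REDACTED example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping of element names to replacement text.
REDACTIONS = {"Email": "EMAIL_REDACTED", "Phone": "PHONE_REDACTED"}


def mask_known_elements(xml_payload, redactions=None):
    """Replace the text of every element whose tag is listed in redactions."""
    if redactions is None:
        redactions = REDACTIONS
    root = ET.fromstring(xml_payload)
    # iter() walks the root and all descendants in document order.
    # Namespaced tags appear as "{uri}Name" and would need matching keys.
    for element in root.iter():
        if element.tag in redactions:
            element.text = redactions[element.tag]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    doc = "<doc><Name>Marcia</Name><Email>marcia@example.com</Email></doc>"
    print(mask_known_elements(doc))
```

As with the XSL version, this only works when you know the element names in advance, which is why it does not scale to many payload shapes.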
Thanks Dino