Expense Parser issue importing labels

wissil

We're trying to import labels for the Expense Parser, and this is the example JSON request we're using:

{
"uri": "",
"mimeType": "application/pdf",
"text": "11/10/2021, 16:42\nOffice DEPOT\nOfficeMax\nLOS GATOS (408) 356-3757\n05/30/2020 9:42 AM\nV2VT5X3P5XY56YX66\nSALE\n5379432 PRNTER, ET-4760\nSubtotal:\n***\nSales Tax:\nTotal:\nVisa\n950-1-1844-473229-20.5.2\noffice-depot-redacted.png\nAUTH CODE 083396\nTDS Chip Read\nAID A0000000031010 CITI VISA\nTVR 0800008000\nCVS PIN Verified\n499.99 SS\n90499.99\n19 45.00\n544.99\n544.99\nhttps://mail.google.com/chat/u/0/#chat/dm/qimyvgAAAAE\n5730827812\nPlease create your online rewards\naccount at officedepot.com/rewards.\nYou must complete your account to\nclaim your rewards and view your\nstatus.\nShop online at www.officedepot.com\nWE WANT TO HEAR FROM YOU!\nVisit survey.officedepot.com\nand enter the survey code below:\n1508 QP9G OY41\n****\n**\n****\n1/1",
"page": [],
"entities": [
{
"mentionText": "receipt_number",
"type": "receipt_number"
},
{
"mentionText": "amount",
"type": "total_amount"
},
{
"mentionText": "currency",
"type": "currency"
},
{
"mentionText": "date",
"type": "receipt_date"
}
]
}

This errors out in the following way:

"inputGcsSource": "gs://receipts321/jsonimport/output 2/2301149.json",
"status": {
  "code": 3,
  "message": "Request contains an invalid argument."
}

It doesn't give any details about what is wrong in the request. Any help appreciated!

Hi @wissil,

Welcome to Google Cloud Community!

From your description, it sounds like you are importing pre-labeled documents by sending a processing request to the Expense Parser processor. If so, I recommend following the example in the documentation, which shows how to send the request and what the request data should contain: https://cloud.google.com/document-ai/docs/send-request#async-processor

The request data contains the following fields:
 - skipHumanReview: Set to true to disable human review or false to enable it. This is only supported by Human-in-the-Loop processors.
 - MIME_TYPE: One of the supported MIME types, for example application/pdf.
 - OUTPUT_BUCKET_FOLDER: The Cloud Storage bucket/path where the output files are saved.
 - INPUT_BUCKET_FOLDER: The Cloud Storage bucket/path from which the input files are read.

{
  "inputDocuments": {
    "gcsPrefix": {
      "gcsUriPrefix": "INPUT_BUCKET_FOLDER"
    }
  },
  "documentOutputConfig": {
    "gcsOutputConfig": {
      "gcsUri": "OUTPUT_BUCKET_FOLDER",
      "fieldMask": "FIELD_MASK"
    }
  },
  "skipHumanReview": BOOLEAN
}
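
If it helps, here is a minimal Python sketch that builds the request body above and prints it as JSON. The function name and the gs://my-bucket/... paths are hypothetical placeholders for illustration, not values from your project.

```python
import json


def build_batch_request(input_prefix, output_uri, skip_human_review=True, field_mask=None):
    """Return the request body shown above as a plain dict."""
    gcs_output = {"gcsUri": output_uri}
    # fieldMask is only included when a value is supplied.
    if field_mask:
        gcs_output["fieldMask"] = field_mask
    return {
        "inputDocuments": {"gcsPrefix": {"gcsUriPrefix": input_prefix}},
        "documentOutputConfig": {"gcsOutputConfig": gcs_output},
        "skipHumanReview": skip_human_review,
    }


if __name__ == "__main__":
    # Hypothetical bucket paths; replace them with your own.
    body = build_batch_request("gs://my-bucket/input/", "gs://my-bucket/output/")
    print(json.dumps(body, indent=2))
```

Saving the printed output as request.json gives a file in the same shape as the request data above.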

Here is how you can send the request. Note that you need to replace LOCATION, PROJECT_ID, and PROCESSOR_ID in the endpoint with your own values.

curl -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json; charset=utf-8" \
    -d @request.json \
    "https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:batchProcess"
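
The same call can be sketched in Python using only the standard library. This is an illustration under assumptions: the helper names are hypothetical, and the access token is assumed to come from `gcloud auth print-access-token`, as in the curl command above. The sketch only builds the request; it does not send it.

```python
import json
import urllib.request


def batch_process_url(location, project_id, processor_id):
    """Fill the LOCATION, PROJECT_ID and PROCESSOR_ID slots of the endpoint."""
    return (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"processors/{processor_id}:batchProcess"
    )


def build_request(url, body, access_token):
    """Build a POST request with the same headers as the curl command."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

Passing the resulting object to `urllib.request.urlopen` would send the request. Printing the URL first is a quick way to confirm that every placeholder has been replaced.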

For more information regarding importing pre-labeled documents, you can visit this link: 
https://cloud.google.com/document-ai/docs/workbench/label-documents#import-labels
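
Since the error message gives no detail, a local sanity check on the labeled JSON may help narrow things down. The sketch below is only one possible check and is not confirmed to be the cause of the error: it lists entities whose mentionText does not appear anywhere in the document text. In the example you posted, values such as "receipt_number" look like placeholders rather than text taken from the receipt. The function name is hypothetical.

```python
import json


def find_unmatched_entities(document):
    """Return the entities whose mentionText is not a substring of the document text."""
    text = document.get("text", "")
    return [
        entity
        for entity in document.get("entities", [])
        if entity.get("mentionText", "") not in text
    ]


if __name__ == "__main__":
    # Small made-up document, not the receipt from the question.
    sample = json.loads(
        '{"text": "Total: 544.99", "entities": ['
        '{"mentionText": "544.99", "type": "total_amount"},'
        '{"mentionText": "amount", "type": "total_amount"}]}'
    )
    for entity in find_unmatched_entities(sample):
        print("mentionText not found in text:", entity)
```

The match is an exact, case-sensitive substring test, so an entity that differs from the text only in whitespace or casing is also reported.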

Hope this helps!