We're trying to import labels for the Expense Parser, and this is the example JSON request we're using:
{
"uri": "",
"mimeType": "application/pdf",
"text": "11/10/2021, 16:42\nOffice DEPOT\nOfficeMax\nLOS GATOS (408) 356-3757\n05/30/2020 9:42 AM\nV2VT5X3P5XY56YX66\nSALE\n5379432 PRNTER, ET-4760\nSubtotal:\n***\nSales Tax:\nTotal:\nVisa\n950-1-1844-473229-20.5.2\noffice-depot-redacted.png\nAUTH CODE 083396\nTDS Chip Read\nAID A0000000031010 CITI VISA\nTVR 0800008000\nCVS PIN Verified\n499.99 SS\n90499.99\n19 45.00\n544.99\n544.99\nhttps://mail.google.com/chat/u/0/#chat/dm/qimyvgAAAAE\n5730827812\nPlease create your online rewards\naccount at officedepot.com/rewards.\nYou must complete your account to\nclaim your rewards and view your\nstatus.\nShop online at www.officedepot.com\nWE WANT TO HEAR FROM YOU!\nVisit survey.officedepot.com\nand enter the survey code below:\n1508 QP9G OY41\n****\n**\n****\n1/1",
"page": [],
"entities": [
{
"mentionText": "receipt_number",
"type": "receipt_number"
},
{
"mentionText": "amount",
"type": "total_amount"
},
{
"mentionText": "currency",
"type": "currency"
},
{
"mentionText": "date",
"type": "receipt_date"
}
]
}
This errors out in the following way:
"inputGcsSource": "gs://receipts321/jsonimport/output 2/2301149.json",
"status": {
"code": 3,
"message": "Request contains an invalid argument."
}
It doesn't give any details about what is wrong in the request. Any help appreciated!
Hi @wissil,
Welcome to Google Cloud Community!
Based on the information you provided, it looks like you are importing pre-labeled documents by sending a processing request to the Expense Parser processor. If that is correct, I recommend following the example in the documentation, which shows how to send a batch request and what the request body should contain: https://cloud.google.com/document-ai/docs/send-request#async-processor
Here are the contents of the request data:
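- skipHumanReview: Set this to true to disable human review or to false to enable it. This is only supported by Human-in-the-Loop processors.
- MIME_TYPE: The MIME type of the input documents, for example application/pdf. It is only needed when input files are listed individually under gcsDocuments rather than through a gcsPrefix.
- OUTPUT_BUCKET_FOLDER: The Cloud Storage bucket/path where the output files are saved.
- INPUT_BUCKET_FOLDER: The Cloud Storage bucket/path from which the input files are read.
- FIELD_MASK: The fields to include in the output documents.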
{
  "inputDocuments": {
    "gcsPrefix": {
      "gcsUriPrefix": "INPUT_BUCKET_FOLDER"
    }
  },
  "documentOutputConfig": {
    "gcsOutputConfig": {
      "gcsUri": "OUTPUT_BUCKET_FOLDER",
      "fieldMask": "FIELD_MASK"
    }
  },
  "skipHumanReview": BOOLEAN
}
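If it helps, here is a minimal Python sketch that fills in the placeholders above and prints the resulting request body, so you can check it is valid JSON before saving it as request.json. The function name build_batch_request and the gs://example-bucket/... paths are made up for illustration, and the sketch assumes fieldMask can be left out when you do not want to restrict the output.

```python
import json


def build_batch_request(input_prefix, output_prefix, field_mask=None,
                        skip_human_review=True):
    """Return the request body shown above as a Python dict.

    build_batch_request is a hypothetical helper, not part of any
    Google library. fieldMask is only added when a value is given
    (assumption: the field is optional).
    """
    gcs_output = {"gcsUri": output_prefix}
    if field_mask:
        gcs_output["fieldMask"] = field_mask
    return {
        "inputDocuments": {"gcsPrefix": {"gcsUriPrefix": input_prefix}},
        "documentOutputConfig": {"gcsOutputConfig": gcs_output},
        "skipHumanReview": skip_human_review,
    }


# Placeholder bucket paths; replace them with your own.
body = build_batch_request(
    "gs://example-bucket/input/",
    "gs://example-bucket/output/",
)
# json.dumps writes booleans as true/false, as the API expects.
print(json.dumps(body, indent=2))
```

Redirecting the printed output to request.json gives a file that can be passed to curl with -d @request.json.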
Here is how you can send your request. Please note that you need to replace the Location, Project ID, and Processor ID in the endpoint with your own values.
curl -X POST \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json; charset=utf-8" \
-d @request.json \
"https://LOCATION-documentai.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/processors/PROCESSOR_ID:batchProcess"
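If you prefer to script the call, the sketch below builds the same endpoint and an equivalent POST request using only the Python standard library. It constructs the request object without sending it. The helper names and the example values (my-project, us, abc123, TOKEN) are placeholders for illustration, and the access token is assumed to come from gcloud auth print-access-token, as in the curl command.

```python
import json
import urllib.request


def batch_process_url(project_id, location, processor_id):
    """Build the batchProcess endpoint in the same shape as the curl URL."""
    return (
        f"https://{location}-documentai.googleapis.com/v1/"
        f"projects/{project_id}/locations/{location}/"
        f"processors/{processor_id}:batchProcess"
    )


def build_post_request(url, body, access_token):
    """Return an unsent urllib Request mirroring the curl headers and body."""
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


# Placeholder values; substitute your own project, location and processor.
url = batch_process_url("my-project", "us", "abc123")
request = build_post_request(url, {"skipHumanReview": True}, "TOKEN")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```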
For more information regarding importing pre-labeled documents, you can visit this link:
https://cloud.google.com/document-ai/docs/workbench/label-documents#import-labels
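Since the error message does not say which argument is invalid, one quick check you can run locally is to compare the top-level keys of your JSON file against the field names of the Document resource. The sketch below is only a rough aid: the DOCUMENT_FIELDS set is my assumption of the v1 Document top-level fields and should be verified against the reference, the sample string is a shortened stand-in for your file, and a flagged key is not confirmed to be the cause of the error.

```python
import json

# Assumption: top-level field names of the v1 Document resource.
# Verify this list against the Document AI reference before relying on it.
DOCUMENT_FIELDS = {
    "uri", "content", "mimeType", "text", "textStyles", "pages",
    "entities", "entityRelations", "textChanges", "shardInfo",
    "error", "revisions",
}


def unknown_top_level_fields(document_json):
    """Return the top-level keys that are not in DOCUMENT_FIELDS, sorted."""
    doc = json.loads(document_json)
    return sorted(set(doc) - DOCUMENT_FIELDS)


# Shortened stand-in for the JSON in the question (text abbreviated).
sample = (
    '{"uri": "", "mimeType": "application/pdf", '
    '"text": "...", "page": [], "entities": []}'
)
print(unknown_top_level_fields(sample))
```

With the keys from the question this flags "page", which may be worth comparing with the "pages" field in the Document reference.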
Hope this helps!