Failure to enable imageless mode for Document AI form parser

I am passing a PDF to the Form Parser processor (most recent stable release) from a Google Apps Script. No matter what I do, I cannot enable the imageless option, which should permit synchronous processing of PDFs that are 16 to 30 pages long. I get a server error 500 with the message:

"This processor version only supports up to 15 pages per document shard. If using the UI or sync pre..."

Below is my code.  I am a novice at this, so it is entirely possible that I have a fundamental misunderstanding of how this should work.

 

function callDocumentAI(file, imageless = false) {
  const token = ScriptApp.getOAuthToken();
  const url = `https://${PROJECT_LOCATION}-documentai.googleapis.com/v1/projects/${PROJECT_ID}` +
    `/locations/${PROJECT_LOCATION}/processors/${FORM_PARSER_PROCESSOR_ID}:process`;
  const inlineDoc = {
    content: Utilities.base64Encode(file.getBlob().getBytes()),
    mimeType: 'application/pdf'
  };
  // Build request body
  const body = { inlineDocument: inlineDoc };
  if (imageless) {
    // Enable imageless mode flag
    body.imageless_mode = true;
  }

  // Prepare HTTP options
  const options = {
    method: 'post',
    contentType: 'application/json',
    headers: { Authorization: `Bearer ${token}` },
    payload: JSON.stringify(body)
  };

  const resp = UrlFetchApp.fetch(url, options);
  return JSON.parse(resp.getContentText());
}
 
Thank you in advance for any insight you might have.

Hi oregonoutpt,

Welcome to the Google Cloud Community!

Here are some suggestions you can try that might help resolve your issue:

  • Check that the PROJECT_LOCATION and FORM_PARSER_PROCESSOR_ID variables in the Apps Script code are set correctly and match the Google Cloud project and the Document AI Form Parser processor you intend to use. Even a small typo can cause this error.
  • In the line of the Apps Script code that enables imageless mode, change body.imageless_mode = true; to body.imagelessMode = true;
  • Try processing a different PDF document that has between 16 and 30 pages, is well-organized, and mostly contains text. This will help determine if the issue is related to a specific document.
  • If you still hit the page limit after correcting the case of imagelessMode and confirming that it is actually being sent, consider asynchronous (batch) processing as an alternative. It is designed for larger documents and breaks them into smaller chunks for processing.
  • Also, make sure that your Google Apps Script project has the necessary permissions to access the Document AI API.
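
The second suggestion above could look roughly like the sketch below. It is a minimal sketch rather than a confirmed fix: it assumes the REST field is spelled imagelessMode, as suggested above, and the helper names buildProcessBody and callDocumentAIImageless are hypothetical. Note also that the function in the question defaults imageless to false, so the flag is only sent when the caller passes true explicitly.

```javascript
// Hypothetical helper: builds the JSON body for the :process call.
// Assumption: the REST field name is the camelCase `imagelessMode`.
function buildProcessBody(base64Content, imageless) {
  const body = {
    inlineDocument: {
      content: base64Content,
      mimeType: 'application/pdf'
    }
  };
  if (imageless) {
    body.imagelessMode = true;
  }
  return body;
}

// Hypothetical Apps Script wrapper. PROJECT_LOCATION, PROJECT_ID and
// FORM_PARSER_PROCESSOR_ID are the same script-level constants used in the question.
function callDocumentAIImageless(file) {
  const url = `https://${PROJECT_LOCATION}-documentai.googleapis.com/v1/projects/${PROJECT_ID}` +
    `/locations/${PROJECT_LOCATION}/processors/${FORM_PARSER_PROCESSOR_ID}:process`;
  const content = Utilities.base64Encode(file.getBlob().getBytes());
  const resp = UrlFetchApp.fetch(url, {
    method: 'post',
    contentType: 'application/json',
    headers: { Authorization: `Bearer ${ScriptApp.getOAuthToken()}` },
    payload: JSON.stringify(buildProcessBody(content, true)), // always request imageless mode
    muteHttpExceptions: true // return error responses instead of throwing
  });
  if (resp.getResponseCode() !== 200) {
    // Surface the full error body, which is truncated in the default exception message
    throw new Error(`Document AI returned ${resp.getResponseCode()}: ${resp.getContentText()}`);
  }
  return JSON.parse(resp.getContentText());
}
```

Setting muteHttpExceptions lets the script read the complete error text, which may help confirm whether the flag was actually received.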

