Document AI extraction with individual page selector failure

I have a Node.js script that worked until today. It processes a single page of a multipage PDF file (8 pages, for example) and extracts entities from it with a Custom Extractor.

  const name = `projects/${projectId}/locations/${location}/processors/${processorId}`;
  const imageFile = await readFile(filePath);
  const encodedImage = Buffer.from(imageFile).toString('base64');

  const request = {
    name,
    rawDocument: {
       content: encodedImage,
       mimeType: 'application/pdf',
    },
    skipHumanReview: true,
    fieldMask: { paths: ['entities'] },
    imagelessMode: true,
    processOptions: {
      individualPageSelector: {
        pages: [1]
      }
    }
  };

  let result;
  try {
    [result] = await client.processDocument(request);
  } catch (error) {
    console.error(error);
    return;
  }

 

 

Now I have an issue with the individualPageSelector option: the request fails whenever the selected page is anything other than 1.

For example:

 

 

individualPageSelector: {
    pages: [1] // WORKS
}
individualPageSelector: {
    pages: [2] // FAILS with code 400: Request contains an invalid argument.
}
individualPageSelector: {
    pages: [1, 2] // WORKS, but the result of extracting multiple pages is not the same as for a single page
}
individualPageSelector: {
    pages: [3] // FAILS with code 400: Request contains an invalid argument.
}

I am sure my PDF file has a second page, and the third example (pages: [1, 2]) confirms it.
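
To make the failure easier to reproduce and report, the per-page behavior can be collected in one run. The following is only a minimal sketch: `probePages` is a hypothetical helper name (not part of the Document AI client library), and it assumes that errors thrown by `client.processDocument` carry `code` and `message` properties and that the response exposes entities under `document.entities`.

```javascript
// Hypothetical helper (not a Document AI API): sends one processDocument
// request per page number and records whether each request succeeded.
async function probePages(client, baseRequest, pageNumbers) {
  const results = [];
  for (const page of pageNumbers) {
    // Copy the base request and override only the page selector,
    // leaving the caller's request object untouched.
    const request = {
      ...baseRequest,
      processOptions: {
        ...baseRequest.processOptions,
        individualPageSelector: { pages: [page] },
      },
    };
    try {
      const [result] = await client.processDocument(request);
      const entities = (result.document && result.document.entities) || [];
      results.push({ page, ok: true, entityCount: entities.length });
    } catch (error) {
      // Assumption: the thrown error exposes `code` and `message`.
      results.push({ page, ok: false, code: error.code, message: error.message });
    }
  }
  return results;
}
```

With the `client` and `request` from the script above, something like `console.table(await probePages(client, request, [1, 2, 3]))` would show which single-page selections are rejected and with which error code, which may be useful to attach to a bug report.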

This strange behavior just appeared today. Is it possible that this is a problem with the latest update of Document AI?
