I have a Node.js script that worked until today. It processes one page of a multipage PDF file (8 pages, for example) and extracts entities from it with a Custom Extractor.
const name = `projects/${projectId}/locations/${location}/processors/${processorId}`;
const imageFile = await readFile(filePath);
const encodedImage = Buffer.from(imageFile).toString('base64');
const request = {
  name,
  rawDocument: {
    content: encodedImage,
    mimeType: 'application/pdf',
  },
  skipHumanReview: true,
  fieldMask: { paths: ['entities'] },
  imagelessMode: true,
  processOptions: {
    individualPageSelector: {
      pages: [1]
    }
  }
};
let result;
try {
  [result] = await client.processDocument(request);
} catch (error) {
  console.error(error);
  return;
}
Now I have an issue with the individualPageSelector option: the request fails if the selected page is anything other than 1.
For example:
individualPageSelector: {
  pages: [1] // WORKS
}
individualPageSelector: {
  pages: [2] // FAILS with code 400: Request contains an invalid argument.
}
individualPageSelector: {
  pages: [1, 2] // WORKS, but the result of extracting multiple pages is not the same as extracting one page
}
individualPageSelector: {
  pages: [3] // FAILS with code 400: Request contains an invalid argument.
}
I am sure my PDF file has a second page, and the third example confirms it.
This strange behavior just appeared today. Is it possible that this is a problem with the latest update of Document AI?
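To narrow down which page numbers are rejected, the single-page request can be sent once per page and the outcomes collected. The following is only a minimal sketch under assumptions: `buildPageRequest` and `probePages` are hypothetical helper names, `client` is assumed to be the same object used in the script above, and the error is assumed to expose `code` and `details` properties (if it does not, the sketch falls back to `error.message`).

```javascript
// Hypothetical helper: builds the same request as above for one 1-based page number.
function buildPageRequest(name, encodedPdf, page) {
  return {
    name,
    rawDocument: { content: encodedPdf, mimeType: 'application/pdf' },
    skipHumanReview: true,
    fieldMask: { paths: ['entities'] },
    imagelessMode: true,
    processOptions: { individualPageSelector: { pages: [page] } },
  };
}

// Hypothetical helper: sends one request per page and records success or failure.
// `client` is anything with a processDocument(request) method resolving to [result].
async function probePages(client, name, encodedPdf, pageCount) {
  const outcomes = [];
  for (let page = 1; page <= pageCount; page++) {
    try {
      const [result] = await client.processDocument(
        buildPageRequest(name, encodedPdf, page)
      );
      const entities = (result.document && result.document.entities) || [];
      outcomes.push({ page, ok: true, entityCount: entities.length });
    } catch (error) {
      // Assumption: the thrown error carries `code` and `details`.
      outcomes.push({
        page,
        ok: false,
        code: error.code,
        details: error.details || error.message,
      });
    }
  }
  return outcomes;
}
```

Calling `probePages(client, name, encodedImage, 8)` inside the same async function and logging the returned array with `console.table` would show, for each page, whether the request succeeded and which error was returned.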