Enterprise Document OCR returning empty "content" key in API response for text segments

ivanherselman · 10-17-2024 12:54 PM

Hi guys,

Hoping somebody can help me. I am making a callout to the Enterprise Document OCR. My understanding is that the content key in the JSON response is supposed to return the text extracted from that block/space in the document. However, it is being returned empty. See this extract from the API response:

"pages": [
  {
    "detectedLanguages": [
      {
        "languageCode": "en",
        "confidence": 0.9901396632194519
      }
    ],
    "blocks": [
      {
        "detectedLanguages": [],
        "layout": {
          "textAnchor": {
            "textSegments": [
              {
                "startIndex": "0",
                "endIndex": "28"
              }
            ],
            "content": ""
          },
          "confidence": 0.9784985184669495,
          "boundingPoly": {
            "vertices": [
              { "x": 1199, "y": 39 },
              { "x": 1480, "y": 37 },
              { "x": 1481, "y": 117 },
              { "x": 1200, "y": 119 }
            ],
            "normalizedVertices": [
              { "x": 0.7132658958435059, "y": 0.01640033721923828 },
              { "x": 0.8804283142089844, "y": 0.015559293329715729 },
              { "x": 0.8810232281684875, "y": 0.049201007932424545 },
              { "x": 0.713860809803009, "y": 0.0500420518219471 }
            ]
          },
          "orientation": "PAGE_UP"
        },
        "provenance": null
      },

Is this expected behaviour? Hoping somebody can shed some light on the matter.

ibaui

Hi @ivanherselman,

Welcome to Google Cloud Community!

From the JSON response you've shared, the "content" field is empty for the text block, which isn't expected behavior: the content key in the Enterprise Document OCR response should return the text extracted from the corresponding block or space in the document. Here are some potential reasons for the empty value, with suggestions for addressing each:

OCR Configuration: Ensure that your OCR configuration is set up correctly and that you’re using the appropriate settings for your document type.

Block Content is Not Text: The block might contain elements that are not considered text, such as images, drawings, or other non-textual content. In such cases, the content key will be empty. To verify this, check the bounding box of the block and visually inspect the content within it. If it's primarily non-textual, it's expected for the content key to be empty.

Image Quality: Poor image quality or resolution can significantly affect OCR accuracy. Ensure the document image is clear, well-lit, and of high quality. Also make sure the uploaded file is in one of the supported formats.

Language Support: Verify that the language code you are using is correct and supported by the OCR service.
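Text Segments: One more thing worth checking, offered as an assumption rather than something confirmed in this thread. In Document AI responses the recognized text is normally stored once, in the document's top-level text field, and each layout's textAnchor.textSegments holds startIndex/endIndex offsets into that string. If that applies here, an empty content on a block does not by itself mean nothing was recognized, and the block text can be recovered by slicing. Below is a minimal Python sketch of that lookup on the parsed JSON. The names layout_text and sample_document are made up for illustration, and the sample text is invented, not taken from the response above.

```python
def layout_text(document: dict, layout: dict) -> str:
    """Join the slices of document["text"] referenced by a layout's textSegments."""
    full_text = document.get("text", "")
    segments = layout.get("textAnchor", {}).get("textSegments", [])
    parts = []
    for segment in segments:
        # Offsets arrive as strings in the JSON; treat a missing index as 0.
        start = int(segment.get("startIndex", 0))
        end = int(segment.get("endIndex", 0))
        parts.append(full_text[start:end])
    return "".join(parts)


# Invented sample shaped like the response in the question (assumption:
# the full response also carries a top-level "text" field).
sample_document = {
    "text": "ACME Corp\nInvoice 42\n",
    "pages": [
        {
            "blocks": [
                {
                    "layout": {
                        "textAnchor": {
                            "textSegments": [{"startIndex": "0", "endIndex": "10"}],
                            "content": "",
                        }
                    }
                },
                {
                    "layout": {
                        "textAnchor": {
                            "textSegments": [{"startIndex": "10", "endIndex": "21"}],
                            "content": "",
                        }
                    }
                },
            ]
        }
    ],
}

for page in sample_document["pages"]:
    for block in page["blocks"]:
        print(repr(layout_text(sample_document, block["layout"])))
```

If the top-level text field in your full response is also empty, the slices will be empty too, and the other points in this list are the more likely place to look.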


I hope the above information is helpful.
