
Layout Parser in Document AI: Bounding box issue

I'm using the Layout Parser in Document AI, but it doesn't seem to return bounding box details. Is there a way to retrieve these details from the API?

Currently, the response I receive looks like this:

{
  "document": {
    "documentLayout": {
      "blocks": [
        {
          "blockId": "1",
          "textBlock": {
            "text": "Firefox",
            "type": "header"
          },
          "pageSpan": {
            "pageStart": 1,
            "pageEnd": 1
          }
        }
      ]
    }
  }
}

 


Hi @vguleria,

Welcome to Google Cloud Community!

Here's what you should look for:

  • boundingPoly: This is the most common field name for the bounding box. It usually contains normalized coordinates relative to the page. 
  • layout (Optional): In some older versions or slightly different API implementations, bounding box information might be nested within a layout object.

You may try these approaches to fix your code:

  1. Check your API response carefully: Examine the complete JSON response from the Document AI Layout Parser API. Search for "boundingPoly" or "layout" within the blocks array. The boundingPoly is usually a direct sibling of blockId, textBlock, and pageSpan.
  2. Update your code to access the bounding box: Once you've located the boundingPoly field, update your code to extract the vertex coordinates. 
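The first step above can be sketched as a small recursive search over the parsed JSON. This is only an illustration: `findKeyPaths` is a hypothetical helper, not part of any Google client library, and the field names `boundingPoly` and `layout` are taken from this reply rather than confirmed against the Layout Parser's actual output. The sample input is the response quoted in the question.

```typescript
// Minimal JSON type so the helper stays self-contained.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Hypothetical helper: walks a parsed JSON value and returns the path of
// every property whose name appears in `keys` (exact match only).
function findKeyPaths(value: Json, keys: string[], path: string = "$"): string[] {
  const hits: string[] = [];
  if (value === null || typeof value !== "object") {
    return hits; // primitives cannot contain the fields we are looking for
  }
  if (Array.isArray(value)) {
    for (let i = 0; i < value.length; i++) {
      hits.push(...findKeyPaths(value[i], keys, `${path}[${i}]`));
    }
    return hits;
  }
  for (const name of Object.keys(value)) {
    const childPath = `${path}.${name}`;
    if (keys.indexOf(name) !== -1) {
      hits.push(childPath);
    }
    hits.push(...findKeyPaths(value[name], keys, childPath));
  }
  return hits;
}

// The response from the question, used as sample input.
const response: Json = {
  document: {
    documentLayout: {
      blocks: [
        {
          blockId: "1",
          textBlock: { text: "Firefox", type: "header" },
          pageSpan: { pageStart: 1, pageEnd: 1 },
        },
      ],
    },
  },
};

// Field names assumed from the reply above; adjust them to whatever the API returns.
const paths = findKeyPaths(response, ["boundingPoly", "layout"]);
console.log(paths.length > 0 ? paths : "no bounding box fields found");
```

Run against the response in the question, the search finds nothing (`documentLayout` is not an exact match for `layout`), which is consistent with what the original poster reported.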

Additionally, you may check these links:

I also found this case that you might find useful.

I hope the above information is helpful.

Hello friend, I have the same problem. Have you solved it yet? If so, could you please tell me how you fixed it?

Has anyone found a solution to the original question?

It appears that `returnBoundingBoxes` is exposed in `layoutConfig`, but I'm still not seeing any bounding boxes in the returned payload. `returnImages` is working as expected.

  // `client`, `name`, `documentContent`, and `mimeType` are defined elsewhere.
  const request: protos.google.cloud.documentai.v1beta3.IProcessRequest = {
    name,
    rawDocument: {
      content: documentContent,
      mimeType,
    },
    processOptions: {
      layoutConfig: {
        returnBoundingBoxes: true, // set, but no bounding boxes come back
        returnImages: true, // works as expected
      },
    },
  };

  const [result] = await client.processDocument(request);

@ruthseki Any ideas?