
Using Document AI to detect shaded-in text boxes

Hello, I am attempting to use Document AI to extract data from PDF forms.

The problem is that there are some questions where the answer is shaded in.

Here is an example:

[Screenshot: Screenshot 2024-01-23 at 3.42.30 PM.png]

I have tried the Form Parser, OCR, and a custom extractor, but none of them can detect whether a box is shaded in. Document AI only recognizes the text itself. For example, the output contains "YES" and "NO", but there is no way to tell which box is shaded in.

Does anyone know whether it is possible to detect the shaded-in boxes?


Hi @AkshatPrakash1

Welcome and thank you for reaching out to our community.

Based on your description, I believe Document AI is working as intended, since it still detected the text enclosed in the boxes.

You will need a different approach for this use case. Consider using the Cloud Vision API for this scenario.
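As a rough illustration of what a "different approach" could look like (this is only a sketch, not a built-in feature of Document AI or the Cloud Vision API): if the OCR output gives you a bounding box for each answer word, you can compare how dark the page image is inside each box and treat the darkest one as the shaded answer. The example below assumes the page is already available as a grayscale pixel grid and that the boxes use normalized coordinates in the range 0 to 1. The names `mean_brightness`, `pick_shaded`, and `min_gap`, as well as the threshold value, are hypothetical and would need tuning on real scans.

```python
from typing import Dict, List, Optional, Tuple

# Grayscale page image as rows of pixel values (0 = black, 255 = white).
Image = List[List[int]]
# Normalized bounding box (x_min, y_min, x_max, y_max), each value in [0, 1].
Box = Tuple[float, float, float, float]


def mean_brightness(image: Image, box: Box) -> float:
    """Average pixel value inside a normalized bounding box."""
    height, width = len(image), len(image[0])
    x0, y0, x1, y1 = box
    # Convert normalized coordinates to pixel indices, clamped to the image
    # and always covering at least one pixel.
    left = min(int(x0 * width), width - 1)
    top = min(int(y0 * height), height - 1)
    right = max(left + 1, min(int(x1 * width), width))
    bottom = max(top + 1, min(int(y1 * height), height))
    pixels = [p for row in image[top:bottom] for p in row[left:right]]
    return sum(pixels) / len(pixels)


def pick_shaded(
    image: Image, boxes: Dict[str, Box], min_gap: float = 20.0
) -> Optional[str]:
    """Return the label of the darkest box, or None if no box stands out.

    min_gap is an assumed threshold: the darkest box must be at least this
    much darker (on the 0-255 scale) than the runner-up.
    """
    scores = {label: mean_brightness(image, box) for label, box in boxes.items()}
    ranked = sorted(scores, key=lambda label: scores[label])
    if len(ranked) > 1 and scores[ranked[1]] - scores[ranked[0]] < min_gap:
        return None
    return ranked[0]


# Synthetic 10x20 page: the left half has light grey shading (180),
# the right half is plain white (255).
page = [[180] * 10 + [255] * 10 for _ in range(10)]
answer_boxes = {"YES": (0.0, 0.0, 0.5, 1.0), "NO": (0.5, 0.0, 1.0, 1.0)}
print(pick_shaded(page, answer_boxes))  # prints "YES"
```

On a real form you would render the PDF page to an image, take the box coordinates from the OCR result for "YES" and "NO", and probably pad each box slightly so that it covers the shaded cell rather than only the letters.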

Hope this helps.