Hello, I am attempting to use Document AI to extract data from PDF forms.
The problem is that there are some questions where the answer is shaded in.
Here is an example:
I have tried the Form Parser, OCR, and a custom extractor, but I am unable to detect whether or not a box is shaded in. Document AI only recognizes the text itself. For example, the output contains "YES" and "NO", but there is no way to tell which box is shaded in.
Does anyone know whether it is possible to detect the shaded-in boxes?
Hi @AkshatPrakash1,
Welcome and thank you for reaching out to our community.
Based on your description, I believe Document AI is working as intended, since it still detected the text enclosed in the boxes.
You need a different approach for this use case. Consider using the Cloud Vision API for this scenario.
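Independently of which API you end up using, one possible way to tell the boxes apart is to compare how dark the page is around each detected word. The sketch below is only an illustration of that heuristic, not a Document AI or Cloud Vision feature. It assumes you have already rendered the page to a grayscale pixel grid and converted the bounding boxes of "YES" and "NO" to pixel coordinates, widened enough to cover the surrounding box. The names `region_darkness` and `pick_shaded` and the `min_gap` threshold are hypothetical and would need tuning on real scans.

```python
from statistics import mean


def region_darkness(pixels, box):
    """Mean darkness of a rectangle: 0.0 is pure white, 1.0 is pure black.

    pixels: 2D list of grayscale values (0..255), indexed as pixels[row][col].
    box: (left, top, right, bottom) in pixel coordinates; right and bottom
         are exclusive. Assumed to come from the OCR bounding boxes.
    """
    left, top, right, bottom = box
    values = [pixels[y][x] for y in range(top, bottom) for x in range(left, right)]
    return 1.0 - mean(values) / 255.0


def pick_shaded(pixels, boxes, min_gap=0.1):
    """Return the label of the darkest box, or None if no box stands out.

    boxes: dict mapping a label such as "YES" to its pixel rectangle.
    min_gap: assumed minimum darkness difference between the darkest and
             second-darkest box before we trust the result.
    """
    scores = {label: region_darkness(pixels, box) for label, box in boxes.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    if len(ranked) > 1 and scores[ranked[0]] - scores[ranked[1]] < min_gap:
        return None  # ambiguous: neither box is clearly darker
    return ranked[0]


# Synthetic 10x20 page: white background, with the "NO" box filled mid-gray.
page = [[255] * 20 for _ in range(10)]
for y in range(2, 8):
    for x in range(11, 18):
        page[y][x] = 128

boxes = {"YES": (2, 2, 9, 8), "NO": (11, 2, 18, 8)}
print(pick_shaded(page, boxes))  # prints NO
```

Returning None when the two scores are close is a deliberate choice here: it lets you route unclear cases, such as a form where nothing was shaded, to manual review instead of guessing.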
Hope this helps.