
Parsing single digit in table with Form parser

I am trying to scan the following image with the Form Parser (pretrained-form-parser-v2.0-2022-11-10), but it does not parse the single-digit number correctly. Is there a good way to work around this issue? Can I expect it to be fixed in an upcoming version?

Attachments: DocAITest-sheet.png, doc-ai-test.png


Aris_O
Former Googler

Hi @anonaka

Welcome to Google Cloud Community.

The Form Parser uses OCR (Optical Character Recognition) to find text in images, so adjusting the image before you submit it may improve accuracy. Try changing the brightness, contrast, or sharpness of the image in a preprocessing step and compare the results.
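To make the contrast adjustment above concrete, here is a minimal, dependency-free sketch of a linear contrast stretch on grayscale pixel values. The function name `stretch_contrast` and its percentile parameters are hypothetical and are not part of Document AI; in practice you would more likely apply an image library (for example Pillow's `ImageEnhance`) to the file before sending it to the Form Parser. Whether this helps with single-digit cells is something to verify on your own images.

```python
def stretch_contrast(pixels, low_pct=0.02, high_pct=0.98):
    """Linearly rescale grayscale values (0-255) so that the chosen
    low/high percentiles map to 0 and 255. Values outside are clipped.

    `pixels` is a list of rows, each a list of ints.
    """
    flat = sorted(v for row in pixels for v in row)
    lo = flat[int(low_pct * (len(flat) - 1))]
    hi = flat[int(high_pct * (len(flat) - 1))]
    if hi == lo:
        # Uniform image: nothing to stretch, return a copy unchanged.
        return [row[:] for row in pixels]
    scale = 255.0 / (hi - lo)
    return [
        [min(255, max(0, round((v - lo) * scale))) for v in row]
        for row in pixels
    ]


# A faint, low-contrast patch: values only span 100..160.
faint = [[100, 120], [140, 160]]
# Stretch over the full observed range (0th to 100th percentile).
stretched = stretch_contrast(faint, low_pct=0.0, high_pct=1.0)
```

The stretched patch uses the full 0 to 255 range, which can make thin strokes such as an isolated digit stand out more against the background before OCR runs.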

  • If the Form Parser still has trouble reading the single-digit number in your image, try creating your own labeled data and training a custom model.
  • Training a custom model lets you tune recognition for the particular kinds of text that appear in your images.


Here is some documentation that might help:
https://cloud.google.com/vision/docs?_gl=1*p539ow*_ga*MTM5Mjc1MzQzNS4xNjc2NjU1Njg2*_ga_WH2QY8WWF5*MT...

https://cloud.google.com/functions/docs/tutorials/ocr?_gl=1*huz756*_ga*MTM5Mjc1MzQzNS4xNjc2NjU1Njg2*...

Unfortunately, the current custom parser does not recognize Japanese text, which is mandatory for my project. I am waiting for the next release.