
Checkbox in custom Form parser not working

Hi, I am using a custom Document AI processor to label and train on my documents, but it misses the checkboxes every time. I have labelled 40 documents for training and 10 for testing. Even after training, the F1 score for the checkboxes is 0. Can anyone please help me out with this? Thanks.

[Attachment: train file.png]


Hi @Hami1437 ,

Welcome back to Google Cloud Community,

Training a machine learning model can be a complex and iterative process. It may take several configurations before you get good performance on checkbox detection.

There are several possible reasons why your model is not detecting the checkboxes.

You can try the following checks to narrow down the issue:

1. Check your labeling: double-check that you have correctly labeled all of the checkboxes in your training set.

2. Fine-tune on a smaller set of examples that are similar to your target domain. This can help the model learn to recognize checkboxes more accurately.

3. Try adding more labeled examples to your training set.

4. Check that your training set includes a variety of different types of documents and checkboxes.

5. Try experimenting with different model architectures to see if you can improve the calibration of checkbox detection.

6. Increase the complexity of your model: if none of the above steps work, you can try increasing the complexity of your model.

Here are some references that might help you:
https://cloud.google.com/recaptcha-enterprise/docs/instrument-web-pages-with-checkbox?_ga=2.21438216...

https://cloud.google.com/vision-ai/docs/object-detector-model?_ga=2.214382167.-1392753435.1676655686

https://cloud.google.com/document-ai/docs/overview?_ga=2.214382167.-1392753435.1676655686

Thanks for the reply. Actually, I am using a Document AI custom processor and have labelled 50 documents. The overall score of the trained model is almost 0.94, but the 3 checkboxes in it all have a score of 0 and are never extracted. I have labelled them the same way as in the documentation. One more thing: all of my documents use the same template. I am attaching an image of the document. Thanks.

[Attachment: train file.png]

Just to give additional information, @Aris_O: we have labelled about 50 documents with checkboxes. Although the overall F-score is good, the F-score for the checkboxes is 0.

The above image shows the labelling; the data is dummy for now. @Aris_O, can you please help us work out what exactly we are doing wrong here?
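On the overall score being good while the checkbox score is 0: one possible explanation is that the overall number pools results across all labels, so a few labels that always fail barely move it. The sketch below is only an illustration of that arithmetic. It assumes micro-averaging (which may not be how Document AI aggregates its scores), and the label names and counts are hypothetical, not taken from the processor in this thread.

```python
def f1(tp, fp, fn):
    """F1 score from true-positive, false-positive and false-negative counts."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(counts):
    """Pool the counts of all labels, then compute a single F1 (assumed aggregation)."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f1(tp, fp, fn)


# Hypothetical (tp, fp, fn) counts per label on a 10-document test set:
# 20 text fields that are always extracted, 3 checkboxes that never are.
counts = {f"text_field_{i}": (10, 0, 0) for i in range(20)}
counts.update({f"checkbox_{i}": (0, 0, 10) for i in range(3)})

per_label = {label: f1(*c) for label, c in counts.items()}
overall = micro_f1(counts)

print(f"overall F1: {overall:.2f}")
print(f"checkbox_0 F1: {per_label['checkbox_0']:.2f}")
```

With these made-up counts the pooled score stays above 0.9 even though every checkbox label scores 0, so the per-label scores are the ones worth watching.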

Have you found any solution? I have the same problem.

 

Facing the same issue. Despite having a very high F1 score, the model is unable to detect checkboxes.
Even when it does detect a checkbox successfully (2 out of 30 in my case), the API call does not return the detected value (i.e. checked or unchecked). It just returns the label name and a confidence, which is always less than 5%.
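To see exactly what comes back for each label, a small helper like the one below can list every entity with its confidence and whatever value was returned. This is a rough sketch, not an official recipe. It assumes the processed document is available as a plain dict in the JSON form of the Document AI `Document` (camelCase keys such as `entities`, `mentionText`, `normalizedValue`). The helper names, the sample labels and the sample values are hypothetical.

```python
def summarize_entities(document):
    """List type, text, confidence and normalized value for each entity."""
    rows = []
    for entity in document.get("entities", []):
        rows.append({
            "type": entity.get("type", ""),
            "mention_text": entity.get("mentionText", ""),
            # JSON output omits zero values, so a missing confidence is read as 0.0
            "confidence": entity.get("confidence", 0.0),
            # None when the response carries no normalized value for this entity
            "normalized_value": entity.get("normalizedValue"),
        })
    return rows


def below_threshold(rows, threshold):
    """Keep only the entities whose confidence is under the threshold."""
    return [r for r in rows if r["confidence"] < threshold]


# Hypothetical response fragment: label names and values are made up.
sample_document = {
    "entities": [
        {"type": "applicant_name", "mentionText": "Jane Doe", "confidence": 0.98},
        {"type": "consent_checkbox", "confidence": 0.03},
    ]
}

rows = summarize_entities(sample_document)
for row in below_threshold(rows, threshold=0.05):
    print(row["type"], row["confidence"], row["normalized_value"])
```

If the checkbox entities show up here with an empty text and no normalized value, that at least confirms the checked/unchecked state is missing from the response itself rather than being dropped by the calling code.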

I am facing a similar issue. It seems that if the checkbox is too small, Document AI just can't detect it as a checkbox.
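One way to check the size theory is to measure how large the labelled checkbox boxes are relative to the page and compare documents where detection works with ones where it fails. The sketch below assumes each entity is a dict in the Document AI JSON form, with its box under `pageAnchor.pageRefs[].boundingPoly.normalizedVertices` (coordinates between 0 and 1). The `box_size` helper and the sample entity are hypothetical, and no particular size cutoff is implied.

```python
def box_size(entity):
    """Return the (width, height) of the entity's first bounding box as page fractions."""
    refs = entity.get("pageAnchor", {}).get("pageRefs", [])
    if not refs:
        return None
    vertices = refs[0].get("boundingPoly", {}).get("normalizedVertices", [])
    if not vertices:
        return None
    # JSON output omits zero values, so a missing coordinate is read as 0.0
    xs = [v.get("x", 0.0) for v in vertices]
    ys = [v.get("y", 0.0) for v in vertices]
    return max(xs) - min(xs), max(ys) - min(ys)


# Hypothetical labelled checkbox covering about 2% x 1.5% of the page.
sample_entity = {
    "type": "consent_checkbox",
    "pageAnchor": {"pageRefs": [{"boundingPoly": {"normalizedVertices": [
        {"x": 0.10, "y": 0.50}, {"x": 0.12, "y": 0.50},
        {"x": 0.12, "y": 0.515}, {"x": 0.10, "y": 0.515},
    ]}}]},
}

width, height = box_size(sample_entity)
print(f"{sample_entity['type']}: {width:.3f} x {height:.3f} of the page")
```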