Hi Team,
My use case: I want to extract handwritten content from a PDF, along with some checkbox data, using the Google Vertex AI LLM API.
I tried changing the prompts and the Gemini model version, but the results are still not accurate. How can we improve the accuracy of checkbox extraction? Even when a checkbox is extracted incorrectly, the reported confidence is 1.0.
Kindly help.
Regards
Hi @PravinShelake15,
Welcome to Google Cloud Community!
Gemini on Vertex AI is a general-purpose generative model. It is not specialized for structured form extraction such as reading checkbox states. The high confidence score despite incorrect results indicates a limitation of the model: it can report full confidence in a prediction even when that prediction is wrong, so the score should not be read as a calibrated probability. The model is making educated guesses based on patterns it has learned rather than reliably measuring whether each box is marked.
Instead of using Google Vertex AI's Gemini, I suggest trying Document AI.
Document AI is specifically designed for extracting information from documents, including handwritten text and form fields like checkboxes. It leverages advanced computer vision and machine learning models optimized for this task. It also includes powerful OCR capabilities, making the text extraction process more accurate.
Here’s how to use Document AI for your use case:
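As a minimal sketch of that workflow, assuming you have already created a Form Parser processor and installed the `google-cloud-documentai` Python client: the snippet below sends a PDF for processing and then reads each form field's `value_type`, which Document AI sets to `filled_checkbox` or `unfilled_checkbox` for checkbox fields. The helper names (`Checkbox`, `extract_checkboxes`, `needs_review`) and the 0.9 review threshold are illustrative assumptions, not part of the API.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class Checkbox:
    label: str         # text of the form field name next to the box
    checked: bool      # True when value_type is "filled_checkbox"
    confidence: float  # confidence reported on the field value layout


def process_pdf(project_id: str, location: str, processor_id: str, pdf_path: str) -> Any:
    """Send one PDF to a Document AI processor and return the parsed Document."""
    # Imported here so the parsing helpers below work without the client installed.
    from google.api_core.client_options import ClientOptions
    from google.cloud import documentai

    client = documentai.DocumentProcessorServiceClient(
        client_options=ClientOptions(
            api_endpoint=f"{location}-documentai.googleapis.com"
        )
    )
    name = client.processor_path(project_id, location, processor_id)
    with open(pdf_path, "rb") as f:
        raw = documentai.RawDocument(content=f.read(), mime_type="application/pdf")
    request = documentai.ProcessRequest(name=name, raw_document=raw)
    return client.process_document(request=request).document


def layout_text(layout: Any, full_text: str) -> str:
    """Resolve a layout's text anchor against the document's full text."""
    parts = []
    for seg in layout.text_anchor.text_segments:
        parts.append(full_text[int(seg.start_index):int(seg.end_index)])
    return "".join(parts).strip()


def extract_checkboxes(document: Any) -> List[Checkbox]:
    """Collect every form field whose value_type marks it as a checkbox."""
    found = []
    for page in document.pages:
        for field in page.form_fields:
            if field.value_type not in ("filled_checkbox", "unfilled_checkbox"):
                continue  # ordinary text field, not a checkbox
            found.append(Checkbox(
                label=layout_text(field.field_name, document.text),
                checked=field.value_type == "filled_checkbox",
                confidence=field.field_value.confidence,
            ))
    return found


def needs_review(boxes: List[Checkbox], threshold: float = 0.9) -> List[Checkbox]:
    """Return checkboxes below an (assumed) confidence threshold for manual checking."""
    return [b for b in boxes if b.confidence < threshold]
```

To try it, call `process_pdf("my-project", "us", "my-processor-id", "form.pdf")` with your own values (these are placeholders), pass the result to `extract_checkboxes`, and route anything returned by `needs_review` to a human check instead of trusting the score outright.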
The Google Cloud Document AI documentation has detailed instructions on setting up your project, creating a processor, and using the API.
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.