Hi All,
We are currently using Document AI to form-parse some PDF documents, and about half of the time the default form processor either misses a column or messes up the column structure.
Let's say the expected table header is:
Sales | Dollar Volume | Average Price
For example, I have seen cases like:
1. Missing header
Sales|Average Price
2. Wrong structure
SalesDollar|Volume|Average Price
The contents of the first two columns are messed up as well: cells can have missing or incomplete values.
Any recommendations to improve this? If there is no easy way, is there any guidance with examples for training or deploying one's own form processor? PS: all the documents have the same structure.
You can improve the results by using the Document AI parser with AI Platform Notebooks. You can also use Vision AI to create your own parser.
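Since the documents all share the same structure, it may also help to flag bad parses automatically before deciding whether a custom parser is needed. The following is only a minimal sketch, not part of Document AI: it assumes you have already pulled the header cells out of the parser response into a plain list of strings, and the names `EXPECTED_HEADER` and `diagnose_header` are hypothetical. It distinguishes the two failure cases from the question, a dropped column and shifted column boundaries.

```python
# Hypothetical post-processing check; it does not call any Document AI API.
# Assumption: header cells were already extracted into a list of strings.
EXPECTED_HEADER = ["Sales", "Dollar Volume", "Average Price"]


def _norm(text):
    """Lowercase and drop all whitespace so 'Dollar Volume' == 'dollarvolume'."""
    return "".join(text.split()).lower()


def diagnose_header(parsed, expected=EXPECTED_HEADER):
    """Return (status, missing_columns) for one parsed header row.

    status is 'ok', 'misaligned' (same text, wrong cell boundaries),
    'missing' (one or more expected columns dropped), or 'unknown'.
    """
    parsed_norm = [_norm(cell) for cell in parsed]
    expected_norm = [_norm(cell) for cell in expected]

    if parsed_norm == expected_norm:
        return "ok", []

    # Expected columns that do not appear as a whole cell in the parse.
    missing = [col for col, n in zip(expected, expected_norm)
               if n not in parsed_norm]

    # All the header text is present, but the column splits have moved.
    if "".join(parsed_norm) == "".join(expected_norm):
        return "misaligned", missing

    # Every parsed cell is a valid column, but some columns are absent.
    if missing and all(cell in expected_norm for cell in parsed_norm):
        return "missing", missing

    return "unknown", missing


print(diagnose_header(["Sales", "Average Price"]))
print(diagnose_header(["SalesDollar", "Volume", "Average Price"]))
```

Pages flagged as `misaligned` or `missing` can then be routed to manual review or collected as training examples, rather than silently producing bad rows.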