Re: Document AI Form Processor Parse Table Structure Incorrectly

ZinY · 08-01-2022 08:17 PM

Hi All,

We are currently using Document AI to parse forms in some PDF documents, and about half the time the default form processor either misses a column or garbles the column structure.

Let's say the expected file header is:

Sales | Dollar Volume | Average Price

For example, I saw cases like

1. Missing Header

Sales|Average Price

2. Wrong structure

SalesDollar|Volume|Average Price

The contents of the first two columns are messed up as well: cells can have missing or incomplete values.

Any recommendations to improve this? If there is no easy way, is there any guidance with examples for training or deploying one's own form processor? PS: the documents all have the same structure.

josegutierrez

You can improve the data results by using the Document AI parser with AI Platform Notebooks. You can also use Vision AI to create your own parser.
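
Since the documents all share the same structure, one low-effort step before training anything is to flag the bad parses automatically so they can be re-processed or reviewed. The following is only a minimal sketch in plain Python, not part of Document AI: it assumes you have already extracted the header cells of a parsed table as a list of strings, and the names `EXPECTED_HEADER` and `diagnose_header` are hypothetical. It distinguishes the two failure cases described in the question.

```python
# Hypothetical helper: none of these names come from the Document AI API.
# Assumes the header cell texts were already pulled out of the parser output.
EXPECTED_HEADER = ["Sales", "Dollar Volume", "Average Price"]


def _norm(text):
    """Lowercase and drop all whitespace so 'Dollar Volume' == 'dollarvolume'."""
    return "".join(text.split()).lower()


def diagnose_header(parsed, expected=EXPECTED_HEADER):
    """Classify a parsed header row against the expected one.

    Returns one of: 'ok', 'wrong_structure', 'missing_column', 'unknown'.
    """
    parsed_norm = [_norm(cell) for cell in parsed]
    expected_norm = [_norm(cell) for cell in expected]

    if parsed_norm == expected_norm:
        return "ok"

    # Same characters overall, but the cell boundaries fall in the wrong
    # places (e.g. 'SalesDollar' | 'Volume').
    if "".join(parsed_norm) == "".join(expected_norm):
        return "wrong_structure"

    # Every parsed cell is an expected cell, in the expected order, but at
    # least one expected cell is absent (in-order subsequence check).
    remaining = iter(expected_norm)
    if all(cell in remaining for cell in parsed_norm):
        return "missing_column"

    return "unknown"


if __name__ == "__main__":
    for row in ("Sales|Dollar Volume|Average Price",
                "Sales|Average Price",
                "SalesDollar|Volume|Average Price"):
        print(row, "->", diagnose_header(row.split("|")))
```

Pages flagged as anything other than 'ok' could then be routed to a retry or to manual review. The same idea extends to body rows, for example by checking that every row has as many cells as the expected header.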
