
Extract data from previous page

Hello community!

I'm trying to create a custom processor for a particular bill of lading format that we receive from a supplier.

Each document (PDF) contains several pages, which are just scans of the original bill of lading.

Each bill of lading contains the following data:

  • Package number: each shipment can contain several packages, and each package can contain one or more products
  • Product Code
  • Quantity
  • Product Description

which, visually, are structured as follows:

Package number: 123456
- Prod cod 1 | Qty | Description
- Prod cod 2 | Qty | Description
- Prod cod 3 | Qty | Description
- Prod cod 4 | Qty | Description

etc.

The rows can span several pages. Here's an example:

--- PAGE 1 ---

Package number: 123456
- Prod cod 1 | Qty | Description
- Prod cod 2 | Qty | Description

--- END OF PAGE 1 ---

--- PAGE 2 ---

- Prod cod 3 | Qty | Description
- Prod cod 4 | Qty | Description

Package number: 78910
- Prod cod 5 | Qty | Description

--- END OF PAGE 2 ---

As you can see in the example, the package number for "Prod cod 3" and "Prod cod 4" (which is "123456") is not present on page 2.

So my question is: how can I tell the processor something like "if package number is not present above the product code, then take the latest Package Number from the previous row OR, in case the Package Number is not present in the previous row, check the last package number of the previous page"?
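To make the rule concrete, here is a minimal sketch of the same logic done as post-processing on the extracted output rather than inside the processor. It assumes the extracted entities have already been flattened into a list that keeps top-to-bottom order within each page. The `Item` type, its fields, and `assign_packages` are hypothetical names used only for illustration; they are not part of any Document AI API.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Item:
    """One extracted line (hypothetical shape, not a Document AI type)."""
    page: int
    kind: str  # "package" for a package-number header, "product" for a product row
    text: str  # the package number, or the product row content


def assign_packages(items: List[Item]) -> List[Tuple[Optional[str], str]]:
    """Pair each product row with the most recent package number seen above it.

    The last package number is never reset at a page boundary, so rows at
    the top of a page inherit the last package number of the previous page.
    """
    # sorted() is stable, so the within-page order of the input is preserved.
    ordered = sorted(items, key=lambda item: item.page)

    last_package: Optional[str] = None
    result: List[Tuple[Optional[str], str]] = []
    for item in ordered:
        if item.kind == "package":
            last_package = item.text
        elif item.kind == "product":
            # None means no package number appeared before this row at all.
            result.append((last_package, item.text))
    return result


# The two-page example from above, in reading order.
items = [
    Item(1, "package", "123456"),
    Item(1, "product", "Prod cod 1"),
    Item(1, "product", "Prod cod 2"),
    Item(2, "product", "Prod cod 3"),
    Item(2, "product", "Prod cod 4"),
    Item(2, "package", "78910"),
    Item(2, "product", "Prod cod 5"),
]
result = assign_packages(items)
```

With this input, "Prod cod 3" and "Prod cod 4" end up paired with "123456" even though that number only appears on page 1, which is the behavior I'm after.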

I hope the question is clear enough. Unfortunately, I can't provide the original document.

Thank you!
