
Trying to understand the role of Document AI within Gen AI RAG architecture

I'm relatively new to AI and still trying to wrap my head around some concepts. I've followed this lab, which uses Document AI OCR to scan PDFs so that you can have conversations against them.

https://cloud.google.com/blog/products/ai-machine-learning/ask-your-documents-document-ai-and-palm2-... 

What I'm trying to understand is: what if I have an entire shared Google Drive or GCS bucket that I want a broad user community to be able to have conversations against? Do all of those documents need to go through Document AI first? How are new documents handled?


Hi @JasonC

Welcome and thank you for reaching out to our community.

Here are the things you need to consider when working on data collections and user interactions.

Processing of Documents:

  • Existing:  
     - Full OCR extraction: Runs every document in your Drive or storage bucket through Document AI to extract its text and data
     - Light extraction: Fetches only basic information, such as document type, title, and author, for quick categorization
  • New:
     - Set up scheduled processing or a trigger to capture newly added documents in your Drive or storage bucket
     - Apply your chosen processing method, either full OCR or light extraction, depending on your needs
     - Update your database and indexes by adding the extracted information from your new documents
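The handling of new documents described above can be sketched as an incremental sync: list what is currently in the Drive or bucket, skip anything already processed, and run the chosen extraction on the rest. This is only a minimal sketch under stated assumptions. `sync_new_documents` and `light_extract` are hypothetical names, `light_extract` is a stand-in that reads the file name rather than calling Document AI, and the in-memory `store` stands in for your real database.

```python
from typing import Callable, Dict, Iterable, List, Set


def sync_new_documents(
    listed_names: Iterable[str],
    processed: Set[str],
    extract: Callable[[str], dict],
    store: Dict[str, dict],
) -> List[str]:
    """Run `extract` only on documents that have not been processed yet."""
    newly_processed = []
    for name in sorted(set(listed_names) - processed):
        # In a real pipeline this is where full OCR or light extraction
        # would be called; here `extract` is whatever callable is passed in.
        store[name] = extract(name)
        processed.add(name)
        newly_processed.append(name)
    return newly_processed


def light_extract(name: str) -> dict:
    """Hypothetical stand-in: derives a title and type from the file name."""
    title, _, ext = name.rpartition(".")
    return {"title": title or name, "type": ext.lower() if title else "unknown"}


processed: Set[str] = set()      # names already handled (assumed persisted elsewhere)
store: Dict[str, dict] = {}      # stand-in for the database of extracted data

# First scheduled run sees two documents; the second run sees one new one.
first_run = sync_new_documents(["a.pdf", "b.pdf"], processed, light_extract, store)
second_run = sync_new_documents(["a.pdf", "b.pdf", "c.pdf"], processed, light_extract, store)
```

The same function could be called from a scheduler or from a trigger that fires when a file is added; only the source of `listed_names` would change.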

Storage:

  • You can store the extracted information in a database like Cloud SQL or BigQuery
  • You can also create indexes for faster search and retrieval based on supplied keywords
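To illustrate the keyword index idea, here is a minimal sketch of an inverted index built over already-extracted text. It is plain in-memory Python, not Cloud SQL or BigQuery, and `build_keyword_index`, `search`, and the sample documents are hypothetical names and data chosen for the example.

```python
import re
from collections import defaultdict
from typing import Dict, Set


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_keyword_index(texts: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each keyword to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in texts.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index: Dict[str, Set[str]], query: str) -> Set[str]:
    """Return the documents that contain every keyword in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Sample extracted text, keyed by document name (made-up data).
extracted = {
    "invoice-001.pdf": "Invoice for cloud storage services",
    "memo.pdf": "Internal memo about storage quotas",
}
index = build_keyword_index(extracted)
storage_hits = search(index, "storage")
cloud_storage_hits = search(index, "cloud storage")
```

Because lookups go through the index rather than scanning every document, adding a newly processed document only requires inserting its words into the index.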

Considerations:

  • Document AI pricing: Full OCR costs more than light extraction, so be sure which features you want to offer your users before choosing a method.

I found a post that discusses Document AI extraction. You can check it out, as the solution might be helpful to you.