
RAG on GCP

Hi folks,

I was looking into RAG implementation on GCP and ran into a small point of confusion.

I found two possibilities:
1. RAG API: the RAG API on GCP requires the Vertex AI API.
2. Custom RAG implementation on GCP: a RAG flow using Document AI, Cloud Storage, embeddings, and generation models, which needs the Document AI API, the Cloud Storage API, and the Vertex AI API.

In this case, which is more feasible? If the RAG API fulfils all requirements through a single Vertex AI API, what is the use of the custom RAG flow? Can anyone explain these approaches and the reasons for choosing between them?

Thanks.


Hi @sailochana,

Welcome to Google Cloud Community!

Let's take a look at the differences between the two approaches to building a Retrieval Augmented Generation (RAG) system on Google Cloud Platform (GCP), using the Vertex AI RAG API directly or building a custom RAG pipeline from individual GCP services, and when each approach is more feasible:

  1. Vertex AI RAG API is a higher-level, managed service. It simplifies RAG implementation by abstracting away much of the underlying infrastructure and complexity. You provide your documents, and the API handles embedding generation, retrieval, and integration with a large language model (LLM) for generation. (A minimal code sketch of this approach follows the pros and cons below.)

Some of its advantages are:

  • Ease of use: Significantly faster development and deployment. Less code to write and maintain.
  • Managed infrastructure: GCP handles scaling, reliability, and updates to the underlying components.
  • Simplified workflow: A single API call handles the entire RAG process.

While some of its disadvantages are:

  • Less control: You have limited control over the individual components (embedding model, retrieval method, LLM). You might not be able to optimize as precisely for your specific needs.
  • Potentially higher cost: While it simplifies things, it might be more expensive than a custom solution if you can highly optimize your individual components.
  • Limited customization: You're constrained by the options offered by the API.
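
For context, here is what the managed route can look like in code. This is a minimal sketch based on the preview Python SDK (the RAG Engine classes under `vertexai.preview.rag` at the time of writing); the project ID, bucket path, corpus name, and Gemini model ID are placeholder assumptions, and exact parameter names may differ between SDK versions.

```python
# Minimal sketch of the managed approach (Vertex AI RAG Engine, preview SDK).
# Project ID, bucket path, display name, and model ID are placeholders.
import vertexai
from vertexai.preview import rag
from vertexai.preview.generative_models import GenerativeModel, Tool

vertexai.init(project="my-project", location="us-central1")

# 1. Create a corpus and import documents straight from Cloud Storage;
#    chunking, embedding, and indexing are handled by the service.
corpus = rag.create_corpus(display_name="my-rag-corpus")
rag.import_files(corpus.name, paths=["gs://my-bucket/docs/"])

# 2. Expose the corpus as a retrieval tool and let Gemini ground its answer on it.
rag_tool = Tool.from_retrieval(
    retrieval=rag.Retrieval(
        source=rag.VertexRagStore(
            rag_resources=[rag.RagResource(rag_corpus=corpus.name)],
            similarity_top_k=5,
        )
    )
)
model = GenerativeModel("gemini-1.5-pro", tools=[rag_tool])
response = model.generate_content("What does the onboarding guide say about access requests?")
print(response.text)
```

Note that the entire retrieve-then-generate step collapses into one `generate_content` call, which is what the "single API" convenience of this approach boils down to.
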
  2. Custom RAG Pipeline (Document AI, Cloud Storage, Vertex AI)

You build your RAG system from the ground up using individual GCP services. This gives you granular control over every step (an end-to-end sketch follows the component list below). You'd typically use:

  • Document AI: For extracting text from various document formats.
  • Cloud Storage: To store your documents.
  • Vertex AI Embeddings: To generate embeddings for your documents.
  • Vertex AI Generative Models: To generate text responses based on the retrieved context.
  • A vector database (e.g., Cloud Spanner, Firestore): To efficiently store and search embeddings.
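
To make the contrast concrete, here is a stripped-down sketch of that do-it-yourself flow. The text extraction step is stubbed with hard-coded chunks (see the Document AI snippet further below), the vector database is replaced by a brute-force in-memory cosine-similarity search, and the model IDs are placeholder assumptions you would adapt to your project.

```python
# Sketch of a hand-rolled RAG flow on Vertex AI. Document AI extraction and the
# vector database are stubbed or simplified; model IDs are placeholders.
import numpy as np
import vertexai
from vertexai.language_models import TextEmbeddingModel
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")

# 1. Chunks you would normally produce from Document AI output stored in Cloud Storage.
chunks = [
    "Refunds are processed within 14 days of the return request.",
    "Support is available Monday to Friday, 9am to 5pm CET.",
    "Enterprise plans include a dedicated account manager.",
]

# 2. Embed the chunks once and keep them in a simple in-memory matrix
#    (a stand-in for a real vector database like the ones listed above).
embedding_model = TextEmbeddingModel.from_pretrained("text-embedding-004")
chunk_vectors = np.array([e.values for e in embedding_model.get_embeddings(chunks)])

def retrieve(question: str, top_k: int = 2) -> list[str]:
    """Brute-force cosine similarity over the chunk embeddings."""
    q = np.array(embedding_model.get_embeddings([question])[0].values)
    scores = chunk_vectors @ q / (np.linalg.norm(chunk_vectors, axis=1) * np.linalg.norm(q))
    return [chunks[i] for i in np.argsort(scores)[::-1][:top_k]]

# 3. Generate an answer grounded on the retrieved context.
question = "How long do refunds take?"
context = "\n".join(retrieve(question))
model = GenerativeModel("gemini-1.5-pro")
response = model.generate_content(
    f"Answer using only this context:\n{context}\n\nQuestion: {question}"
)
print(response.text)
```

Every step in that sketch is a component you now own, including extraction quality, chunking, the index, and prompt construction, which is exactly where both the extra control and the extra maintenance come from.
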

Some of its advantages are:

  • Fine-grained control: You can choose the optimal embedding model, retrieval method, and LLM based on your specific needs and data.
  • Potential cost optimization: If you carefully select and optimize individual components, you might achieve lower costs than the managed RAG API.
  • Flexibility and extensibility: You can easily adapt and extend the pipeline to integrate with other services or custom components.

While some of its disadvantages are:

  • Increased complexity: Requires significantly more engineering effort, expertise, and time to build and maintain.
  • Higher operational overhead: You are responsible for managing the entire infrastructure and its scalability.
  • More error-prone: More moving parts increase the chance of errors and require more robust monitoring and error handling.
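
Since the pipeline sketch above stubs out text extraction, here is roughly what the Document AI step looks like in isolation. The project ID, location, and processor ID are placeholders (you create the processor in the Document AI console first), so treat this as an illustrative sketch rather than a drop-in snippet.

```python
# Rough sketch of the Document AI extraction step that would feed the pipeline above.
# Project, location, and processor IDs are placeholders.
from google.cloud import documentai

client = documentai.DocumentProcessorServiceClient()
processor_name = client.processor_path("my-project", "us", "my-processor-id")

# Read a local PDF; in a real pipeline you would typically pull it from Cloud Storage.
with open("contract.pdf", "rb") as f:
    raw_document = documentai.RawDocument(content=f.read(), mime_type="application/pdf")

result = client.process_document(
    request=documentai.ProcessRequest(name=processor_name, raw_document=raw_document)
)

# The extracted text is what you would then chunk and embed.
print(result.document.text[:500])
```
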

Which is more feasible?

The best approach depends on your priorities and resources:

  • Choose the Vertex AI RAG API if:
    • You need a quick, easy-to-implement solution.
    • You prioritize ease of use and managed infrastructure over granular control and cost optimization.
    • You don't have extensive expertise in building and managing complex machine learning pipelines.
  • Choose the custom RAG pipeline if:
    • You need fine-grained control over every aspect of the RAG system.
    • You require very specific optimizations for your data and application.
    • You have the engineering expertise to build and maintain a complex pipeline.
    • Cost optimization is a critical factor.

Additionally, you may check this article for more information.

I hope the above information is helpful.