My use case:
- Uploading 500 GB of documents (structured and unstructured) into a vector DB / data store.
- Querying restricted to provided document_ids only (to make it multi-user).
- Conversational style (thread-like: if a user comes back after a year, they should be able to continue that thread if they want).
- Importing documents: a Pub/Sub implementation for real-time ingestion updates for each document.
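For context, the per-document ingestion-update flow I have in mind is roughly the sketch below. The topic name, project ID, and message shape are placeholders of mine, not an existing setup:

```python
import json


def make_status_message(document_id: str, status: str) -> bytes:
    """Serialize one ingestion-status event (e.g. QUEUED/RUNNING/DONE/FAILED)."""
    return json.dumps({"document_id": document_id, "status": status}).encode("utf-8")


def publish_status(project_id: str, topic_id: str, document_id: str, status: str) -> None:
    """Publish one status event to a Pub/Sub topic that the UI subscribes to."""
    from google.cloud import pubsub_v1  # deferred: pip install google-cloud-pubsub

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, topic_id)
    future = publisher.publish(topic_path, make_status_message(document_id, status))
    future.result()  # block until Pub/Sub accepts the message
```

The UI side would just subscribe to the same topic and update each document's status in real time.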
What I've tried
- Vector Search: https://cloud.google.com/vertex-ai/generative-ai/docs/use-vertexai-vector-search
- It's costly: I need to keep a machine on even when no user is querying. It doesn't support xlsx or larger document sizes. I can't create an index with anything other than 768-dim embeddings because the corpus doesn't support it. It doesn't provide page numbers in retrieval.
- A few observations:
- To reindex the same file name, use the file path instead of the folder path in GCS.
- A corpus with the same name can be created.
- I don't know how many vectors an index can hold (10B, according to the pricing calculator).
- No machine was assigned when I deployed an index (I don't know what happened exactly).
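On the multi-user requirement: as far as I understand, Vector Search can filter at query time with namespace restricts, which is how I'd expect per-document_id scoping to work. A minimal sketch of what I mean (the endpoint name, deployed index ID, and the "document_id" namespace are my placeholders; it assumes each datapoint was upserted with a matching restrict):

```python
def query_for_user(endpoint_name, deployed_index_id, embedding, document_ids):
    """Return neighbors only from vectors whose 'document_id' restrict is allowed."""
    from google.cloud import aiplatform  # deferred: pip install google-cloud-aiplatform
    from google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint import (
        Namespace,
    )

    endpoint = aiplatform.MatchingEngineIndexEndpoint(index_endpoint_name=endpoint_name)
    return endpoint.find_neighbors(
        deployed_index_id=deployed_index_id,
        queries=[embedding],  # one query vector (e.g. 768-dim)
        num_neighbors=10,
        # allow-list restrict: only this user's document_ids are searched
        filter=[Namespace(name="document_id", allow_tokens=list(document_ids))],
    )
```

This is the behavior I'm trying to reproduce on the cheaper data-store backend below.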
- Vertex AI Search X RAG: Colab Docs
- Since the Vector Search index was costly for me, I switched to data stores in Vertex AI Search as the backend. But RAG here doesn't support any tweaking of the data stores, such as importing via the corpus, listing file_ids, and so on. The docs mention that one can pass rag_file_ids to make it search only those vectors, but instead it throws the error:
  Rag file ids are not supported for Vertex AI Search
  What's the purpose of using a corpus then?
- It created a schema automatically, which is visible under Activity in the console. How can I make use of this schema? Can I put my metadata in it to filter by file IDs? (Please don't just share links; give some explanation as well.)
- Is there any cost for creating a corpus if I have attached a data store to it?
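For reference, this is the shape of the retrieval call that fails for me. It's based on the preview `vertexai.preview.rag` API, so treat it as a sketch; the corpus resource name and file IDs are placeholders:

```python
def query_with_file_ids(corpus_name, file_ids, question):
    """The call the docs suggest; with a Vertex AI Search data store attached
    to the corpus it raises: 'Rag file ids are not supported for Vertex AI Search'."""
    from vertexai.preview import rag  # deferred: pip install google-cloud-aiplatform

    return rag.retrieval_query(
        rag_resources=[
            # rag_file_ids is what I want scoped retrieval for
            rag.RagResource(rag_corpus=corpus_name, rag_file_ids=list(file_ids)),
        ],
        text=question,
        similarity_top_k=5,
    )
```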
- Vertex AI Search X Grounding API: Docs
```python
import vertexai
from vertexai.preview.generative_models import (
    GenerationConfig,
    GenerativeModel,
    Tool,
    grounding,
)

# TODO(developer): Update and un-comment below lines
PROJECT_ID = "PRO"
data_store_id = "PRO_ID"

vertexai.init(project=PROJECT_ID, location="us-central1")

model = GenerativeModel("gemini-1.5-flash-001")

tool = Tool.from_retrieval(
    grounding.Retrieval(
        grounding.VertexAISearch(
            datastore=data_store_id,
            project=PROJECT_ID,
            location="global",
        )
    )
)

prompt = "What is the name of the company?"
response = model.generate_content(
    prompt,
    tools=[tool],
    generation_config=GenerationConfig(temperature=0.0),
)
print(response.text)
print(response)
```
- This works quite well, but I want to achieve the rag_file_ids functionality. Is that possible here?
- How can I apply Pub/Sub to the data store import?
- What's the best way to achieve conversational style (multi-turn)?
- How do I calculate the cost for this combo, Vertex AI Search x Grounding API? (The pricing page lists two kinds of pricing, so I'm confused.)
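To clarify what I mean by conversational style: I'd store each thread's turns in my own DB and rebuild a grounded chat session from that history when the user returns, roughly like below. The model name, data store ID, and history schema are my placeholders; this is a sketch of the pattern I have in mind, not a confirmed approach:

```python
def resume_thread(project_id, data_store_id, history):
    """Rebuild a grounded chat session from a thread's saved turns.

    `history` is my own storage format: [{"role": "user"|"model", "text": ...}].
    """
    import vertexai
    from vertexai.preview.generative_models import (
        Content,
        GenerativeModel,
        Part,
        Tool,
        grounding,
    )

    vertexai.init(project=project_id, location="us-central1")
    tool = Tool.from_retrieval(
        grounding.Retrieval(
            grounding.VertexAISearch(
                datastore=data_store_id,
                project=project_id,
                location="global",
            )
        )
    )
    # Attach the grounding tool at model construction so every turn is grounded.
    model = GenerativeModel("gemini-1.5-flash-001", tools=[tool])
    saved = [
        Content(role=turn["role"], parts=[Part.from_text(turn["text"])])
        for turn in history
    ]
    return model.start_chat(history=saved)
```

Then `resume_thread(...).send_message("follow-up question")` would continue the thread, and I'd append both new turns back to storage. Is this the intended pattern, or is there a managed way?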
- Miscellaneous
The miscellaneous questions may be vague, as I haven't deep-dived into them yet. Thanks in advance.