My use case:
- Uploading 500 GB of documents (structured and unstructured) into a vector DB / data store.
- Querying restricted to provided document_ids only (to make it multi-user).
- Conversational style (thread-like: if a user comes back after a year, they should be able to continue that thread if they want).
- Importing documents: a Pub/Sub implementation for real-time ingestion updates for each document.
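For context, the per-document ingestion-update flow I have in mind is roughly the sketch below. The topic name, project ID, and message shape are placeholders of mine, not an existing setup:

```python
import json


def make_status_message(document_id: str, status: str) -> bytes:
    """Serialize one ingestion-status event (e.g. QUEUED/RUNNING/DONE/FAILED)."""
    return json.dumps({"document_id": document_id, "status": status}).encode("utf-8")


def publish_status(project_id: str, topic_id: str, document_id: str, status: str) -> None:
    """Publish one status event to a Pub/Sub topic that the UI subscribes to."""
    from google.cloud import pubsub_v1  # deferred: pip install google-cloud-pubsub

    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path(project_id, topic_id)
    future = publisher.publish(topic_path, make_status_message(document_id, status))
    future.result()  # block until Pub/Sub accepts the message
```

The UI side would just subscribe to the same topic and update each document's status in real time.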
What I've tried
- Vector Search: https://cloud.google.com/vertex-ai/generative-ai/docs/use-vertexai-vector-search
- It's costly: I need to keep a machine on even when no user is querying. It doesn't support xlsx or larger document sizes. I can't create an index with anything other than 768-dim embeddings because the corpus doesn't support it. It doesn't provide page numbers in retrieval.
- A few observations:
- To reindex the same file name, use the file path instead of the folder path in GCS.
- A corpus with the same name can be created.
- I don't know how many vectors an index can hold (10B, according to the pricing calculator).
- No machine was assigned when I deployed an index (I don't know what happened exactly).
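On the multi-user requirement: as far as I understand, Vector Search can filter at query time with namespace restricts, which is how I'd expect per-document_id scoping to work. A minimal sketch of what I mean (the endpoint name, deployed index ID, and the "document_id" namespace are my placeholders; it assumes each datapoint was upserted with a matching restrict):

```python
def query_for_user(endpoint_name, deployed_index_id, embedding, document_ids):
    """Return neighbors only from vectors whose 'document_id' restrict is allowed."""
    from google.cloud import aiplatform  # deferred: pip install google-cloud-aiplatform
    from google.cloud.aiplatform.matching_engine.matching_engine_index_endpoint import (
        Namespace,
    )

    endpoint = aiplatform.MatchingEngineIndexEndpoint(index_endpoint_name=endpoint_name)
    return endpoint.find_neighbors(
        deployed_index_id=deployed_index_id,
        queries=[embedding],  # one query vector (e.g. 768-dim)
        num_neighbors=10,
        # allow-list restrict: only this user's document_ids are searched
        filter=[Namespace(name="document_id", allow_tokens=list(document_ids))],
    )
```

This is the behavior I'm trying to reproduce on the cheaper data-store backend below.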
- Vertex AI Search X RAG: Colab Docs
- Since the Vector Search index was costly for me, I switched to data stores in Vertex AI Search as the backend. But RAG here doesn't support any tweaking of the data stores, such as importing via the corpus, listing file_ids, and so on. The docs mention that one can pass rag_file_ids to make it search only those vectors, but instead it throws the error:
  Rag file ids are not supported for Vertex AI Search
  What's the purpose of using a corpus then?
- It created a schema automatically, which is visible under Activity in the console. How can I make use of this schema? Can I put my metadata in it to filter by file IDs? (Please don't just share links; give some explanation as well.)
- Is there any cost for creating a corpus if I have attached a data store to it?
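For reference, this is the shape of the retrieval call that fails for me. It's based on the preview `vertexai.preview.rag` API, so treat it as a sketch; the corpus resource name and file IDs are placeholders:

```python
def query_with_file_ids(corpus_name, file_ids, question):
    """The call the docs suggest; with a Vertex AI Search data store attached
    to the corpus it raises: 'Rag file ids are not supported for Vertex AI Search'."""
    from vertexai.preview import rag  # deferred: pip install google-cloud-aiplatform

    return rag.retrieval_query(
        rag_resources=[
            # rag_file_ids is what I want scoped retrieval for
            rag.RagResource(rag_corpus=corpus_name, rag_file_ids=list(file_ids)),
        ],
        text=question,
        similarity_top_k=5,
    )
```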
- Vertex AI Search X Grounding API: Docs
```python
import vertexai
from vertexai.preview.generative_models import (
    GenerationConfig,
    GenerativeModel,
    Tool,
    grounding,
)

# TODO(developer): Update and un-comment below lines
PROJECT_ID = "PRO"
data_store_id = "PRO_ID"

vertexai.init(project=PROJECT_ID, location="us-central1")

model = GenerativeModel("gemini-1.5-flash-001")

tool = Tool.from_retrieval(
    grounding.Retrieval(
        grounding.VertexAISearch(
            datastore=data_store_id,
            project=PROJECT_ID,
            location="global",
        )
    )
)

prompt = "What is the name of the company?"
response = model.generate_content(
    prompt,
    tools=[tool],
    generation_config=GenerationConfig(temperature=0.0),
)
print(response.text)
print(response)
```
- This works quite well, but I want to achieve the rag_file_ids functionality. Is that possible here?
- How can I apply Pub/Sub to the data store import?
- What's the best way to achieve conversational style (multi-turn)?
- How do I calculate the cost for this combo, Vertex AI Search x Grounding API? (The pricing page lists two kinds of pricing, so I'm confused.)
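To clarify what I mean by conversational style: I'd store each thread's turns in my own DB and rebuild a grounded chat session from that history when the user returns, roughly like below. The model name, data store ID, and history schema are my placeholders; this is a sketch of the pattern I have in mind, not a confirmed approach:

```python
def resume_thread(project_id, data_store_id, history):
    """Rebuild a grounded chat session from a thread's saved turns.

    `history` is my own storage format: [{"role": "user"|"model", "text": ...}].
    """
    import vertexai
    from vertexai.preview.generative_models import (
        Content,
        GenerativeModel,
        Part,
        Tool,
        grounding,
    )

    vertexai.init(project=project_id, location="us-central1")
    tool = Tool.from_retrieval(
        grounding.Retrieval(
            grounding.VertexAISearch(
                datastore=data_store_id,
                project=project_id,
                location="global",
            )
        )
    )
    # Attach the grounding tool at model construction so every turn is grounded.
    model = GenerativeModel("gemini-1.5-flash-001", tools=[tool])
    saved = [
        Content(role=turn["role"], parts=[Part.from_text(turn["text"])])
        for turn in history
    ]
    return model.start_chat(history=saved)
```

Then `resume_thread(...).send_message("follow-up question")` would continue the thread, and I'd append both new turns back to storage. Is this the intended pattern, or is there a managed way?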
- Miscellaneous
The miscellaneous questions may be vague, as I haven't deep-dived into them yet. Thanks in advance.