My use case:
What I've tried:
RAG file IDs are not supported for Vertex AI Search; data stores are used instead. What's the purpose of using the corpus then?
import vertexai
from vertexai.preview.generative_models import (
    GenerationConfig,
    GenerativeModel,
    Tool,
    grounding,
)

# TODO(developer): Update the project and data store IDs below
PROJECT_ID = "PRO"
data_store_id = "PRO_ID"

vertexai.init(project=PROJECT_ID, location="us-central1")
model = GenerativeModel("gemini-1.5-flash-001")

# Ground the model on a Vertex AI Search data store
tool = Tool.from_retrieval(
    grounding.Retrieval(
        grounding.VertexAISearch(
            datastore=data_store_id,
            project=PROJECT_ID,
            location="global",
        )
    )
)

prompt = "What is the name of the company?"
response = model.generate_content(
    prompt,
    tools=[tool],
    generation_config=GenerationConfig(
        temperature=0.0,
    ),
)
print(response.text)
print(response)
Is the Grounding API built into the Grounded Generation API? When should each be used, and which one corresponds to the links shared above?
My miscellaneous questions may be vague, as I haven't deep-dived into them yet. Thanks in advance.
Hi @hardikitis,
Welcome to the Google Cloud Community!
I see you have a detailed set of questions regarding your document search implementation using Vertex AI Search and related services. This includes vector search, data stores, the Grounding API, and Pub/Sub integration. I understand you've also included details on cost, multi-user access, and conversational aspects. Let's go through each of your questions for possible solutions.
1. What's the purpose of using corpus?
When using Vertex AI Search data stores with the RAG pipeline, direct filtering based on rag_file_ids is not supported. The corpus is the underlying storage and indexing mechanism. Even if you create a data store, the corpus is still there. Data stores provide an abstraction layer to interact with the corpus, not a method to bypass it.
2. Automatic Schema
3. Corpus Costs with Data Store
In your scenario, attaching your data store to the corpus and indexing your documents does not bypass any costs. The underlying corpus is still what stores and processes your documents. The cost depends on the amount of indexed data, the amount of storage, and the number of queries you perform.
4. Vertex AI Search x Grounding API and rag_file_ids
While you can't use rag_file_ids directly, you can achieve the same outcome with the alternative approach we discussed.
5. Pub/Sub for Data Store Import
Create a Pub/Sub topic to publish updates about new documents. Then, set up a Cloud Function subscriber that listens for these messages. When a new document is uploaded or updated, the Cloud Function is triggered, and it uses the Vertex AI Search API to ingest, update, or delete the document in your data store.
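As a rough sketch of that flow (the Pub/Sub payload shape and its `gcs_uri` field are hypothetical — adapt them to whatever your publisher actually sends), the Cloud Function could decode the message and assemble the parameters for an incremental import of the one file:

```python
import base64
import json

def build_import_request(event, project, location, data_store_id):
    """Decode a Pub/Sub event and build the parameters for an incremental
    import of a single GCS file into a Vertex AI Search data store."""
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    parent = (
        f"projects/{project}/locations/{location}/collections/default_collection"
        f"/dataStores/{data_store_id}/branches/default_branch"
    )
    return {
        "parent": parent,
        "gcs_source": {"input_uris": [payload["gcs_uri"]], "data_schema": "content"},
        "reconciliation_mode": "INCREMENTAL",
    }

# Inside the Cloud Function, these parameters would then go to the
# Discovery Engine client, e.g.:
#   from google.cloud import discoveryengine_v1 as discoveryengine
#   client = discoveryengine.DocumentServiceClient()
#   operation = client.import_documents(request=build_import_request(...))
```

Using INCREMENTAL reconciliation means existing documents stay in place and only the referenced file is added or updated.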
6. Conversational Style (Multi-Turn)
Store previous prompts and responses for each user/thread in a database or cache. Then, include this history in your next prompt. Be mindful of context window limits.
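A minimal sketch of that bookkeeping (in-memory only; a real app would persist to a database or cache, and the character budget below is a crude stand-in for a proper token count):

```python
from collections import defaultdict

MAX_HISTORY_CHARS = 8000  # crude stand-in for a real token budget

_histories = defaultdict(list)  # user_id -> [(prompt, answer), ...]

def record_turn(user_id, prompt, answer):
    _histories[user_id].append((prompt, answer))

def build_prompt(user_id, new_prompt):
    """Prepend prior turns, dropping the oldest ones when over budget."""
    turns = list(_histories[user_id])
    while turns and sum(len(p) + len(a) for p, a in turns) > MAX_HISTORY_CHARS:
        turns.pop(0)  # drop the oldest turn first
    history = "\n".join(f"User: {p}\nAssistant: {a}" for p, a in turns)
    return f"{history}\nUser: {new_prompt}" if history else new_prompt
```

The string returned by build_prompt would then be passed to generate_content as usual.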
7. Cost Calculation (Vertex AI Search x Grounding API)
These costs are usually billed per 1,000 requests. The input prompt cost covers token processing of the prompt by the model, while the Data Retrieval cost covers the request to Vertex AI Search.
8. File Personas (Metadata)
Yes, you can create personas by adding metadata fields such as "file_category", "author", and "user_group". At query time, you can filter on these metadata fields to tailor results to specific use cases.
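For instance, a small helper can assemble a filter string in the `field: ANY("value")` syntax that appears later in this thread (the field names here are hypothetical; check the filter expression syntax against your data store's schema):

```python
def build_filter(**fields):
    """Build a Vertex AI Search filter expression such as:
    file_category: ANY("hr","legal") AND author: ANY("alice")"""
    clauses = []
    for name, values in fields.items():
        if isinstance(values, str):
            values = [values]  # allow a single value as shorthand
        quoted = ",".join(f'"{v}"' for v in values)
        clauses.append(f"{name}: ANY({quoted})")
    return " AND ".join(clauses)
```

The resulting string would be passed as the `filter` argument of a search request.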
9. Conversational Agents and Generative Fallback
Conversational Agents could be useful if you intend to expand the application beyond document Q&A. For this particular use case they would need heavy customization, so grounding is the better fit.
Generative Fallback is useful when search results have low relevancy: you can still provide an answer using the model.
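One way to wire that up, assuming you read the `grounding_score` off the grounded response as in the output shown later in this thread (the threshold is a hypothetical value to tune):

```python
GROUNDING_THRESHOLD = 0.5  # hypothetical cutoff; tune against your own data

def pick_answer(grounded_text, grounding_score, fallback_fn):
    """Return the grounded answer when retrieval looked relevant enough,
    otherwise fall back to a plain (ungrounded) model call."""
    if grounding_score >= GROUNDING_THRESHOLD:
        return grounded_text
    return fallback_fn()  # e.g. a generate_content call without the retrieval tool
```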
10. Grounding Multi-Turn and Cost
11. Grounded Generation API vs. Grounding API
What you are using is the Grounding API; it is distinct from the Grounded Generation API. It uses the underlying search engine to fetch data relevant to your prompt and grounds the response on it.
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.
Appreciate your solution, @MJane
I tried unstructured documents with JSONL while creating the data store and made the id field in structData indexable.
I tried both APIs. First, I implemented the filter method using the Grounded Generation API (for rag_file_ids). Second, for the Grounding API, I wasn't able to include a filter. As you suggested, when ingesting my docs to GCS I set my documentId as metadata on each document, but I couldn't figure out how to filter on it when querying.
The Grounded Generation API uses an engine_id/app_id, while the Grounding API uses a data store ID. Since the Grounding API doesn't need an app attached to it, can I save cost by using it instead of the Grounded Generation API?
Cost:
Scenario 1: Data store ($5 per GiB) + Layout Parser + Search App (required by the Grounded Generation API) [search app ($2 standard + $4 because by default it uses the basic LLM? How can I remove it?) + Grounded Generation API ($2.50)] per 1,000 requests = data store + $8.50 per 1,000 requests.
Scenario 2: Data store ($5 per GiB) + Layout Parser + Gemini 1.5 Flash | input prompt of 2,500 chars ($0.3125 per 1,000 requests) + output: Grounding API ($0.150 per 1,000 requests) = data store + $0.4625 per 1,000 requests.
Do both scenarios cover the use case I mentioned earlier, and are the calculations correct?
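Sanity-checking the per-1,000-request arithmetic with the rates quoted above (the rates themselves come from this discussion and should be verified against the current Vertex AI pricing page; data store and Layout Parser charges are on top in both cases):

```python
# Scenario 1: search app (standard + LLM add-on) + Grounded Generation API
search_app = 2.00 + 4.00
grounded_generation = 2.50
scenario_1 = search_app + grounded_generation
print(f"Scenario 1: ${scenario_1:.2f} per 1,000 requests")  # $8.50

# Scenario 2: Gemini 1.5 Flash input prompt (2,500 chars) + Grounding API
input_prompt = 0.3125
grounding_api = 0.150
scenario_2 = input_prompt + grounding_api
print(f"Scenario 2: ${scenario_2:.4f} per 1,000 requests")  # $0.4625
```

Both sums match the totals stated above, so the additions themselves check out.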
###Grounded Generation API
from google.cloud import discoveryengine_v1 as discoveryengine

project_number = "80"
engine_id = "unstructured-jsonl-17"

client = discoveryengine.GroundedGenerationServiceClient()

request = discoveryengine.GenerateGroundedContentRequest(
    # The full resource name of the location.
    # Format: projects/{project_number}/locations/{location}
    location=client.common_location_path(project=project_number, location="global"),
    generation_spec=discoveryengine.GenerateGroundedContentRequest.GenerationSpec(
        model_id="gemini-1.5-flash-001",
    ),
    # Conversation between user and model
    contents=[
        discoveryengine.GroundedGenerationContent(
            role="user",
            parts=[
                discoveryengine.GroundedGenerationContent.Part(
                    text="What are the names of CHAPTER VI and CHAPTER VII ?"
                )
            ],
        )
    ],
    system_instruction=discoveryengine.GroundedGenerationContent(
        parts=[
            discoveryengine.GroundedGenerationContent.Part(
                text="Don't make up new information by yourself. Add a smiley emoji after the answer."
            )
        ],
    ),
    # What to ground on.
    grounding_spec=discoveryengine.GenerateGroundedContentRequest.GroundingSpec(
        grounding_sources=[
            discoveryengine.GenerateGroundedContentRequest.GroundingSource(
                search_source=discoveryengine.GenerateGroundedContentRequest.GroundingSource.SearchSource(
                    # The full resource name of the serving config for a Vertex AI Search App
                    filter='id: ANY("6c310a44-a02f-4210-a48d-53568afca394")',
                    max_result_count=10,
                    serving_config=f"projects/{project_number}/locations/global/collections/default_collection/engines/{engine_id}/servingConfigs/default_search",
                ),
            ),
        ]
    ),
)

response = client.generate_grounded_content(request)

# Handle the response
print(response)
"""Output"""
(base) apple@Apples-MacBook-Pro vertex % python xbdg.py
candidates {
  content {
    role: "model"
    parts {
      text: "The document you provided does not contain the names of CHAPTER VI and CHAPTER VII. It does, however, contain information about CHAPTER XVII, CHAPTER XXIX, and CHAPTER XXVIII. 😊 \n"
    }
  }
  grounding_score: 0.196530014
  grounding_metadata {
  }
}
### Grounding API (searching the full data store; I want to use documentId as a filter)
### It takes a data store ID, not an engine_id/app_id
import vertexai
from vertexai.preview.generative_models import (
    GenerationConfig,
    GenerativeModel,
    Tool,
    grounding,
)

# TODO(developer): Update the project and data store IDs below
PROJECT_ID = "80"
data_store_id = "unstructured-jsonl-17"

vertexai.init(project=PROJECT_ID, location="us-central1")
model = GenerativeModel("gemini-1.5-flash-001")

tool = Tool.from_retrieval(
    grounding.Retrieval(
        grounding.VertexAISearch(
            datastore=data_store_id,
            project=PROJECT_ID,
            location="global",
        )
    )
)

prompt = "What are the names of CHAPTER VI and CHAPTER VII ?"
response = model.generate_content(
    prompt,
    tools=[tool],
    generation_config=GenerationConfig(
        temperature=0.0,
    ),
)
print(response.text)
print(response)
"""Output"""
The provided source does not contain information about the names of CHAPTER VI and CHAPTER VII.
The source does mention CHAPTER X and CHAPTER XVII.
candidates {
  content {
    role: "model"
    parts {
      text: "The provided source does not contain information about the names of CHAPTER VI and CHAPTER VII. \n\nThe source does mention CHAPTER X and CHAPTER XVII. \n"
    }
  }
  finish_reason: STOP
  safety_ratings {
    category: HARM_CATEGORY_HATE_SPEECH
    probability: NEGLIGIBLE
    probability_score: 0.166992188
    severity: HARM_SEVERITY_LOW
    severity_score: 0.245117188
  }
  safety_ratings {
    category: HARM_CATEGORY_DANGEROUS_CONTENT
    probability: NEGLIGIBLE
    probability_score: 0.34765625
    severity: HARM_SEVERITY_NEGLIGIBLE
    severity_score: 0.099609375
  }
  safety_ratings {
    category: HARM_CATEGORY_HARASSMENT
    probability: NEGLIGIBLE
    probability_score: 0.20703125
    severity: HARM_SEVERITY_LOW
    severity_score: 0.215820312
  }
  safety_ratings {
    category: HARM_CATEGORY_SEXUALLY_EXPLICIT
    probability: NEGLIGIBLE
    probability_score: 0.27734375
    severity: HARM_SEVERITY_LOW
    severity_score: 0.296875
  }
  grounding_metadata {
    retrieval_queries: "what are the names of CHAPTER VI and CHAPTER VII"
    grounding_chunks {
      retrieved_context {
        uri: "gs://doc-trying1/doc-trying1/Almost Done/CompaniesAct2013.pdf"
        title: "CompaniesAct2013"
        text: "(2) The Central Government may, by rules, prescribe the manner and the intervals in which the internal audit shall be conducted and reported to the Board. # CHAPTER X \n...\n154………….."
      }
    }
    grounding_supports {
      segment {
        start_index: 98
        end_index: 149
        text: "The source does mention CHAPTER X and CHAPTER XVII."
      }
      grounding_chunk_indices: 0
      confidence_scores: 0.617767155
    }
  }
  avg_logprobs: -0.35637292554301603
}
usage_metadata {
  prompt_token_count: 11
  candidates_token_count: 31
  total_token_count: 42
  prompt_tokens_details {
    modality: TEXT
    token_count: 11
  }
  candidates_tokens_details {
    modality: TEXT
    token_count: 31
  }
}
model_version: "gemini-1.5-flash-001"
This is the document. Still, neither implementation answers my simple question. I have used Layout Parser + Document Chunking, yet it can't retrieve a basic chunk that is just normal text in the PDF. How can I make it reliable? [PS: the Grounding API searches the whole data store, as I wasn't able to figure out how to use a filter while querying.]
Pub/Sub for Data Store Import: On the console, when I ingest a metadata.jsonl, the activity log shows 15 docs imported; it doesn't report each document individually. I suppose the same applies to the APIs. I'm interested in a per-file status instead of 15 docs at once. (I may be wrong about all this.)
Miscellaneous:
Since we have all the document metadata and content indexed (as embeddings, though), is there a way to do on-the-fly keyword full-text search, similar to Elasticsearch?
tl;dr-
@MJane, can you answer?