I'm trying to build a solution that accomplishes the following:
So I guess the first question is, are the steps I've listed above the appropriate steps to build a RAG solution from data in a GCS bucket?
I've gone through several notebooks in the Google Gen AI GitHub repo. I can get those to work just fine, but I can't seem to get anywhere when I try to customize them to accomplish what I've listed above. Is anyone aware of any good step-by-step documentation or code samples that do what I'm trying to do?
Hi Jason, Google Cloud has a couple of different offerings for building a RAG app. Based on your description above, Vertex AI Search & Conversation (VASC) might be a good pick. (This product previously went by a few other names: Discovery Engine, Gen AI App Builder, Enterprise Search.)
Here's a code example in Java Spring, from some recent experimentation I did.
import com.google.cloud.discoveryengine.v1.SearchRequest;
import com.google.cloud.discoveryengine.v1.SearchResponse;
import com.google.cloud.discoveryengine.v1.SearchServiceClient;
import com.google.cloud.discoveryengine.v1.SearchServiceSettings;
import com.google.cloud.discoveryengine.v1.ServingConfigName;
import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.preview.ChatSession;
import com.google.cloud.vertexai.generativeai.preview.GenerativeModel;
import com.google.cloud.vertexai.generativeai.preview.ResponseHandler;
import com.google.protobuf.ListValue;
import com.google.protobuf.Struct;
import com.google.protobuf.Value;
import java.util.Map;
...
    @PostMapping(value = "/chat", consumes = "application/json", produces = "application/json")
    public ChatMessage message(@RequestBody ChatMessage message) {
        String userPrompt = message.getPrompt();
        logger.info("💬 POST /chat, prompt: " + userPrompt);

        // 1 - Query Vertex AI Search (VASC aka discoveryengine API) for matching
        // documents
        String projectId = "YOUR_PROJECT_ID";
        String location = "global";
        String collectionId = "default_collection";
        String dataStoreId = "YOUR_VASC_DATASTORE";
        String servingConfigId = "default_search";
        String searchQuery = userPrompt;
        // The global location uses the bare hostname; regional locations use a
        // "<location>-" prefix.
        String endpoint = "global".equals(location)
                ? "discoveryengine.googleapis.com:443"
                : String.format("%s-discoveryengine.googleapis.com:443", location);
        String augment = "";
        try {
            SearchServiceSettings settings = SearchServiceSettings.newBuilder().setEndpoint(endpoint).build();
            // try-with-resources closes the client once the search is done
            try (SearchServiceClient searchServiceClient = SearchServiceClient.create(settings)) {
                SearchRequest request = SearchRequest.newBuilder()
                        .setServingConfig(
                                ServingConfigName.formatProjectLocationCollectionDataStoreServingConfigName(
                                        projectId, location, collectionId, dataStoreId, servingConfigId))
                        .setQuery(searchQuery)
                        .setPageSize(10)
                        .build();
                SearchResponse response = searchServiceClient.search(request).getPage().getResponse();
                for (SearchResponse.SearchResult element : response.getResultsList()) {
                    Struct derivedStructData = element.getDocument().getDerivedStructData();
                    Map<String, Value> fields = derivedStructData.getFieldsMap();
                    Value extractiveAnswersValue = fields.get("extractive_answers");
                    // Skip results that carry no extractive answers
                    if (extractiveAnswersValue == null
                            || extractiveAnswersValue.getListValue().getValuesCount() == 0) {
                        continue;
                    }
                    // Take the first extractive answer of each result
                    ListValue listValue = extractiveAnswersValue.getListValue();
                    Value firstValue = listValue.getValues(0);
                    Struct structValue = firstValue.getStructValue();
                    Map<String, Value> innerFields = structValue.getFieldsMap();
                    Value contentValue = innerFields.get("content");
                    if (contentValue != null) {
                        augment += contentValue.getStringValue();
                    }
                }
            }
        } catch (Exception e) {
            logger.error("⚠️ Vertex AI ERROR: " + e);
        }

        // 2 - Use augmented prompt to query Gemini (Vertex AI API)
        String geminiPrompt = "You are a helpful car manual chatbot. Answer the car owner's question about their car. Human prompt: "
                + userPrompt
                + ",\n Use the following grounding data as context. This came from the relevant vehicle owner's manual: "
                + augment;
        logger.info("🔮 GEMINI PROMPT: " + geminiPrompt);
        String geminiLocation = "us-central1";
        String modelName = "gemini-pro";
        // try-with-resources closes the Vertex AI client after the call
        try (VertexAI vertexAI = new VertexAI(projectId, geminiLocation)) {
            GenerateContentResponse response;
            GenerativeModel model = new GenerativeModel(modelName, vertexAI);
            ChatSession chatSession = new ChatSession(model);
            response = chatSession.sendMessage(geminiPrompt);
            String strResp = ResponseHandler.getText(response);
            logger.info("🔮 GEMINI RESPONSE: " + strResp);
            message.setResponse(strResp);
        } catch (Exception e) {
            logger.error("⚠️ GEMINI ERROR: " + e);
        }
        return message;
    }
}
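For anyone who wants to try the "augment" step in isolation, here is a rough Python sketch of the same idea: pull the first extractive answer out of each search result and splice it into the prompt. It assumes the results have already been converted to plain dicts shaped like the JSON response (`document.derivedStructData.extractive_answers[].content`); the function name and the sample data are hypothetical, and no Google client library is involved.

```python
def build_augmented_prompt(user_prompt, results):
    """Collect the first extractive answer of each result and build the prompt.

    `results` is assumed to be a list of dicts mirroring the JSON search
    response; results without extractive answers are skipped.
    """
    snippets = []
    for result in results:
        derived = result.get("document", {}).get("derivedStructData", {})
        answers = derived.get("extractive_answers") or []
        if answers and answers[0].get("content"):
            snippets.append(answers[0]["content"])
    augment = "\n".join(snippets)
    return (
        "You are a helpful car manual chatbot. Answer the car owner's "
        "question about their car. Human prompt: "
        + user_prompt
        + ",\n Use the following grounding data as context. This came from "
        "the relevant vehicle owner's manual: "
        + augment
    )


# Hypothetical sample data, only to show the expected shape.
sample_results = [
    {"document": {"derivedStructData": {
        "extractive_answers": [{"content": "Check tire pressure monthly."}]}}},
    {"document": {"derivedStructData": {}}},  # no extractive answers
]
prompt = build_augmented_prompt("How often should I check my tires?", sample_results)
print(prompt)
```

Skipping results without answers keeps one sparse document from breaking the whole request.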
Hi,
Using VASC from the console, it's possible to set the search type to "search with a response" in the widget configuration tab.
Is it possible to enable this feature using the SearchServiceClient? The SearchResponse object has a getSummary() method, but it returns an empty string.
Thanks
[SOLVED]
Looking at the curl call in the integration section, I found the missing parameters. Here is the example code:
SearchRequest request =
    SearchRequest.newBuilder()
        .setServingConfig(
            ServingConfigName.formatProjectLocationCollectionDataStoreServingConfigName(
                projectId, location, collectionId, dataStoreId, servingConfigId))
        .setQuery(searchQuery)
        .setPageSize(5)
        .setQueryExpansionSpec(
            SearchRequest.QueryExpansionSpec.newBuilder()
                .setCondition(SearchRequest.QueryExpansionSpec.Condition.AUTO)
                .build())
        .setContentSearchSpec(
            SearchRequest.ContentSearchSpec.newBuilder()
                .setSummarySpec(
                    SearchRequest.ContentSearchSpec.SummarySpec.newBuilder()
                        .setSummaryResultCount(5)
                        // build() keeps the preamble; getDefaultInstanceForType() would discard it
                        .setModelPromptSpec(
                            SearchRequest.ContentSearchSpec.SummarySpec.ModelPromptSpec.newBuilder()
                                .setPreamble(prompt)
                                .build())
                        .setModelSpec(
                            SearchRequest.ContentSearchSpec.SummarySpec.ModelSpec.newBuilder()
                                .setVersion("preview")
                                .build())
                        .setIncludeCitations(true)
                        .setIgnoreAdversarialQuery(true)
                        .build())
                .build())
        .build();
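Since the parameters came from the curl call, it may help to see the same request as a JSON body. The sketch below builds that body as a plain Python dict, using the camelCase names of the fields set in the builder above; the helper name and the sample query and preamble are made up for illustration, and the dict is only printed, not sent.

```python
import json


def build_search_body(query, preamble, page_size=5, summary_result_count=5):
    """Return a search request body that asks for a generated summary."""
    return {
        "query": query,
        "pageSize": page_size,
        "queryExpansionSpec": {"condition": "AUTO"},
        "contentSearchSpec": {
            "summarySpec": {
                "summaryResultCount": summary_result_count,
                "modelPromptSpec": {"preamble": preamble},
                "modelSpec": {"version": "preview"},
                "includeCitations": True,
                "ignoreAdversarialQuery": True,
            }
        },
    }


# Hypothetical query and preamble, for illustration only.
body = build_search_body(
    query="How do I reset the tire pressure warning?",
    preamble="Answer briefly, citing the owner's manual.",
)
print(json.dumps(body, indent=2))
```

Without a `summarySpec` in the request, the summary in the response stays empty, which matches what I saw before adding these parameters.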
You can also use Vector Search with LangChain, but in my tests it turned out to be much more expensive than Search & Conversation.
I'm using Vertex Generic Search with structured data. The moment I use order_by on a numeric field, the search results go haywire: everything seems to be returned, in no order whatsoever.
Could anyone steer me in the right direction? The modelYear field below is numeric. In the sample below I tried boosting instead, but that also influences everything.
# Refer to the `SearchRequest` reference for all supported fields:
# https://cloud.google.com/python/docs/reference/discoveryengine/latest/google.cloud.discoveryengine_v1.types.SearchRequest
request = discoveryengine.SearchRequest(
    serving_config=serving_config,
    query=search_query,
    page_size=10,
    content_search_spec=content_search_spec,
    query_expansion_spec=discoveryengine.SearchRequest.QueryExpansionSpec(
        condition=discoveryengine.SearchRequest.QueryExpansionSpec.Condition.DISABLED,
    ),
    spell_correction_spec=discoveryengine.SearchRequest.SpellCorrectionSpec(
        mode=discoveryengine.SearchRequest.SpellCorrectionSpec.Mode.MODE_UNSPECIFIED,
    ),
    # Optional: Boost search results based on conditions
    boost_spec=discoveryengine.SearchRequest.BoostSpec(
        condition_boost_specs=[
            discoveryengine.SearchRequest.BoostSpec.ConditionBoostSpec(
                condition="modelYear = 2023",
                boost=1,
            ),
        ]
    ),
    # Optional: Use fine-tuned model for this request
    # custom_fine_tuning_spec=discoveryengine.CustomFineTuningSpec(
    #     enable_search_adaptor=True
    # ),
    # order_by="modelYear desc",
    # filter="modelYear = 2025",
)
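This doesn't explain the order_by behaviour, but as a debugging aid it can be useful to sort a page of results client-side and compare that with what the API returns. A minimal sketch, assuming each result has been converted to a dict with the structured fields under `document.structData` (that shape, the helper name, and the sample data are assumptions):

```python
def sort_by_model_year(results, descending=True):
    """Sort results by numeric modelYear; results without one go last."""
    def year(result):
        return result.get("document", {}).get("structData", {}).get("modelYear")

    with_year = [r for r in results if isinstance(year(r), (int, float))]
    without_year = [r for r in results if not isinstance(year(r), (int, float))]
    return sorted(with_year, key=year, reverse=descending) + without_year


# Hypothetical results, only to show the expected shape.
sample = [
    {"id": "a", "document": {"structData": {"modelYear": 2021}}},
    {"id": "b", "document": {"structData": {"modelYear": 2025}}},
    {"id": "c", "document": {"structData": {}}},
    {"id": "d", "document": {"structData": {"modelYear": 2023}}},
]
ordered_ids = [r["id"] for r in sort_by_model_year(sample)]
print(ordered_ids)
```

If the client-side order and the API order disagree on the same page of results, that at least narrows the problem down to how the field is being interpreted on the server side.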