Hi, I am having trouble working with a data store tool that uses data indexed from a CSV file. I'm planning to invoke this code from a Tool via the Vertex AI Agent Builder.
When I query the data store, I get more results than I expect.
For example, csv data:
id,pets,enriched_result
1,I have a question about dogs,please check out dog information https://acme/dogs
2,I have a question about cats,please check out cat information https://acme/cats
My tests:
Dogs, returns both cats and dogs:
query: "any info about dogs?",
want: "please check out dog information https://acme/dogs",
got: "please check out dog information https://acme/dogs,please check out cat information https://acme/cats"
Cats, returns both cats and dogs:
query: "any info about cats?",
want: "please check out cat information https://acme/cats",
got: "please check out dog information https://acme/dogs,please check out cat information https://acme/cats"
Birds, returns just cats:
query: "any info about birds?",
want: "no results found",
got: "please check out cat information https://acme/cats"
My code is in Go. I can convert it to another language if that is easier, but I hope it demonstrates the approach. I used this example as a base:
package pets

import (
	"context"
	"errors"
	"fmt"
	"strings"

	discoveryengine "cloud.google.com/go/discoveryengine/apiv1beta"
	"cloud.google.com/go/discoveryengine/apiv1beta/discoveryenginepb"
	"github.com/chainguard-dev/clog"
	"google.golang.org/api/iterator"
	"google.golang.org/api/option"
)

const (
	projectID      = "foo"
	searchEngineID = "pets-ds_123"
	location       = "us"
	endpointBase   = "discoveryengine.googleapis.com:443"
)

// SearchQuery runs query against the data store and returns the
// enriched_result field of every matching document, one per line.
func SearchQuery(ctx context.Context, query string) (string, error) {
	log := clog.FromContext(ctx)

	// Non-global locations use a location-prefixed endpoint.
	endpoint := endpointBase
	if location != "global" {
		endpoint = fmt.Sprintf("%s-%s", location, endpointBase)
	}

	client, err := discoveryengine.NewSearchClient(ctx, option.WithEndpoint(endpoint))
	if err != nil {
		return "", fmt.Errorf("creating Vertex AI Search client: %w", err)
	}
	defer client.Close()

	// Full resource name of search engine serving config
	servingConfig := fmt.Sprintf("projects/%s/locations/%s/collections/default_collection/dataStores/%s/servingConfigs/default_serving_config",
		projectID, location, searchEngineID)

	searchRequest := &discoveryenginepb.SearchRequest{
		ServingConfig:      servingConfig,
		Query:              query,
		RelevanceThreshold: discoveryenginepb.SearchRequest_HIGH,
	}

	extraPetInfo := []string{}
	count := 0
	it := client.Search(ctx, searchRequest)
	for {
		resp, err := it.Next()
		if errors.Is(err, iterator.Done) {
			log.Infof("%d No more results", count)
			break
		}
		if err != nil {
			return "", err
		}
		// Count each result so the final log line reports the real total.
		count++
		extraPetInfo = append(extraPetInfo, resp.GetDocument().GetStructData().GetFields()["enriched_result"].GetStringValue())
		log.Infof("%+v\n", resp)
	}

	if len(extraPetInfo) == 0 {
		return "No results found", nil
	}
	return strings.Join(extraPetInfo, "\n"), nil
}
This is just an example, but it demonstrates the same behaviour I'm seeing in a real-world scenario.
Are my assumptions correct about how this should fit together? Should I be able to pass a query to the search, match it against the pets column, and return another field from the matched row?
Any suggestions on how I can adapt the code? Thanks.
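To make the behaviour I'm expecting concrete, here is a minimal in-memory sketch of it in plain Go, with no Vertex AI Search calls at all. It is not a claim about how the service matches documents: `Row`, `stopwords`, `keywords` and `MatchPets` are hypothetical names, and the stopword list is hand-picked for this example so that filler words such as "about" do not make every row match.

```go
package main

import (
	"strings"
	"unicode"
)

// Row mirrors one line of the example CSV.
type Row struct {
	ID             string
	Pets           string
	EnrichedResult string
}

// stopwords is an assumed, hand-picked list of filler words to ignore.
var stopwords = map[string]bool{
	"a": true, "i": true, "any": true, "about": true,
	"have": true, "info": true, "question": true,
}

// keywords lower-cases s, splits it on anything that is not a letter or
// digit, and returns the remaining words minus the stopwords.
func keywords(s string) map[string]bool {
	out := map[string]bool{}
	words := strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	for _, w := range words {
		if !stopwords[w] {
			out[w] = true
		}
	}
	return out
}

// MatchPets returns the EnrichedResult of every row whose Pets column
// shares at least one keyword with query, joined by newlines.
func MatchPets(rows []Row, query string) string {
	q := keywords(query)
	var hits []string
	for _, r := range rows {
		for w := range keywords(r.Pets) {
			if q[w] {
				hits = append(hits, r.EnrichedResult)
				break
			}
		}
	}
	if len(hits) == 0 {
		return "No results found"
	}
	return strings.Join(hits, "\n")
}
```

With the two CSV rows above, the "dogs" query returns only the dog row, the "cats" query only the cat row, and the "birds" query returns "No results found". That is the result I would like to get back from the data store.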
Hi @rawlingsj,
Welcome to Google Cloud Community!
Based on your description, you're seeing unexpected behavior in your data querying logic. The query is probably not targeting the specific field you intend to search against. Vertex AI Search requires you to specify which field to match against in your queries; defining the field avoids generic matches and makes the results more precise and relevant to your data. With this in mind, you can consider the following, which might help with your current scenario:
I hope the above information is helpful.