Hi,
I have recently been looking into Agent Builder, creating data stores and agents. However, I'm really struggling to find a way to connect my BigQuery table to an agent or to the Gemini API (it can easily be connected to a search app, though). I can create a data store from BigQuery, but it is incompatible with both the agent and the Gemini API.
Is there any way around this?
Is it advisable to go full DIY and try out the embedding and vector search APIs for BigQuery data? (Although that seems a lot more complicated.)
Hi,
Data stores have a direct connection to BigQuery: https://cloud.google.com/dialogflow/vertex/docs/concept/data-store
If that option does not fit your needs or does not work very well, you can create a pipeline that bulk-exports the data to a bucket, and then connect that bucket to a data store that gets re-indexed.
Best,
Xavi
Hi @xavidop ,
Thanks for the response.
You can indeed create a data store directly from a BigQuery table, but that data store cannot be used in an agent app or with the Gemini API; it is only compatible with the search app.
If I try to push the BigQuery table to GCS instead, I run into the following scenarios:
1. If I export the table directly to GCS, the object is of type "application/octet-stream", which is not a supported type for data store import, as shown below.
2. If I somehow manage to upload a CSV to GCS and then try to create a data store from Cloud Storage, it shows the warning below. I believe CSV is only supported for FAQs and nothing more.
What would be your suggestion? Use JSONL, or some other way?
I'm confused, as importing data from BigQuery into a data store for an agent app is a pretty common use case. Either I'm overcomplicating it or this feature is not available at all.
Update:
It does not work with JSONL either when connecting it to the Gemini API. I get the error below.
Hi,
You have to create the data store from the agent, not from Vertex AI. That way, the data store is linked to your agent and you will be able to use it!
Best,
Xavi
I created the data store from Agent Builder. The screenshot I shared earlier is from the Agent Builder data store console.
As you suggested, I tried a workaround: pushing the BQ table as CSV or JSON through a GCS bucket.
However, this still doesn't work. CSV is only for FAQ data and not for arbitrary BQ table content, as mentioned in the console warning below.
Also, even when I upload it in CSV or JSON format, I can only integrate it as FAQ under the data store for the agent,
and this tool does not work at all when you actually connect it to the agent and query it. This might be because the data I provide is expected to be in FAQ format and is not.
Yeah, you will need to preprocess your CSV file:
But the above link only covers preprocessing CSV for FAQ data, which means that only FAQ data can be uploaded to a data store, using the schema given in that link.
The point I'm trying to make is that if you want any structured type of data store, it can only contain FAQ data, not a BigQuery table or other formats.
Hence I'm unable to use BigQuery tables for a data store in an agent. Below is an example, just to clarify what type of data I might want to upload.
You can create a data store from that BQ table without issues. The fact that the UI mentions a structured FAQ table with CSV files is just an example; data stores can be created from BQ as well. I would recommend trying that out.
I'm not sure I understand correctly.
Yes, there is an option to create a data store from a BigQuery table, like below.
But once created, the data store can only be connected to a search app, not to the agent or the Gemini API. That is the issue.
No, I have a BigQuery data store linked to an agent. You have to create the data store from the data store handler.
Can you share some steps, with screenshots, of how to do this? It would be really helpful for sorting this out.
I only know about the data store page in Agent Builder, linked below:
https://console.cloud.google.com/gen-app-builder/data-stores
I'm not sure if there is any other place to look for connecting a data store to the agent.
Sorry about that,
I just checked, and from conversational apps I can only create data stores using websites and storage buckets as data sources.
Any updates on this? Thanks
Unfortunately nothing for now.
BigQuery data store connectivity with Agent Builder and the Gemini API is one of the most prominent use cases, and yet it is unavailable.
Maybe they will include it in a future update.
There's still a possibility to retrieve structured data stored in BQ from a data store agent. You can create a synthetic question-answer CSV file from the structured dataset. Let me give you an example. Let's say you have the following record:
{
  "COD_Ateneo": "00101",
  "ANNO_VALIDITA": 2023,
  "NomeOperativo": "Torino",
  "NUMERO": "L\/DS",
  "DES": "Scienze della difesa e della sicurezza",
  "NOME_CORSO": "Scienze Strategiche e della Sicurezza",
  "DESC_SEDE": "Corso Regina Margherita, 60\/A - Torino",
  "COMUNE": "TORINO",
  "PROVINCIA": "TORINO",
  "LINGUA_DEL_CORSO": "italiano",
  "TELEMATICA": "didattica tradizionale in presenza",
  "REGIONE": "PIEMONTE",
  "CODICE_UNIVOCO": 0,
  "ISCRITTI": 77820,
  "STATALE": "Statale",
  "ZONA": "NORD OVEST",
  "URL": "www.unito.it",
  "CITY": "TORINO"
},
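You could generate a question-answer pair like the following from that data (this particular pair is not specifically related to the record above):
Question: Quali università offrono "Culture e Letterature del Mondo Moderno" come campo di studio? (Which universities offer "Culture e Letterature del Mondo Moderno" as a field of study?)
Answer: Le università che offrono "Culture e Letterature del Mondo Moderno" come campo di studio includono: Torino (The universities that offer "Culture e Letterature del Mondo Moderno" as a field of study include: Torino)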
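A minimal Python sketch of the synthetic question-answer idea, assuming the BigQuery rows are already available as a list of dicts shaped like the sample record. Treating `DES` as the field of study and `NomeOperativo` as the university name is my reading of the sample, not something confirmed in this thread, and `build_qa_pairs` is a hypothetical helper name.

```python
from collections import defaultdict


def build_qa_pairs(records):
    """Group universities by field of study and emit one Q&A pair per field.

    Assumption: "DES" holds the field of study and "NomeOperativo" the
    university name, as in the sample record.
    """
    by_course = defaultdict(list)
    for rec in records:
        names = by_course[rec["DES"]]
        if rec["NomeOperativo"] not in names:  # skip duplicate universities
            names.append(rec["NomeOperativo"])

    pairs = []
    for course in sorted(by_course):
        universities = ", ".join(sorted(by_course[course]))
        pairs.append({
            "question": f'Quali università offrono "{course}" come campo di studio?',
            "answer": (
                f'Le università che offrono "{course}" come campo di studio '
                f"includono: {universities}"
            ),
        })
    return pairs


# Only the two columns used above are kept from the sample record.
records = [
    {"NomeOperativo": "Torino", "DES": "Scienze della difesa e della sicurezza"},
]
pairs = build_qa_pairs(records)
```

Each distinct field of study yields one pair, so a table with many rows per course collapses into far fewer questions. Other question templates (by region, by language, and so on) would follow the same grouping pattern.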
@alessiasacchi
Do you think I can create a CSV with the column headings "question, answer, title, url" and then create the data store as unstructured?
Please point me to resources with examples of setting up a Vertex AI agent with a data store. Any references for structured/unstructured data would be of great help.
I am facing the same issue. I created a structured data store using BigQuery, and later created a tool for this data store with the FAQ type, but I am having difficulty setting the input and output parameters.
@alessiasacchi can you help here?
@ammar_hanif That is correct. At the moment, a data store agent only supports structured data in BigQuery in the CSV format below, where title and url are optional.
"question","answer","title","url"
Reading through the thread, I noticed that you mixed up search and chat apps. Data stores used by search apps and chat apps are different, and you cannot interchange them. So if you are trying to build a chat app, you must create a data store that chat apps can fetch (you cannot use a data store that you previously created for a search app).
@mayurrathicg There are many possible reasons why the tool is not fetching any data. Are you using the CSV format in your BQ table? If so, since debugging a data store agent is easier than debugging an agent app, I would try the same query in Dialogflow and check Cloud Logging (or even just the original JSON response in the simulator).
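For the "question","answer","title","url" format mentioned above, here is a small Python sketch that serializes question-answer rows with every field quoted. `qa_rows_to_csv` is a hypothetical helper name. Writing empty strings for a missing title or url is an assumption on my part; the thread only says those two columns are optional.

```python
import csv
import io

FIELDS = ["question", "answer", "title", "url"]


def qa_rows_to_csv(rows):
    """Serialize Q&A dicts to CSV text with a fully quoted header and rows."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=FIELDS, quoting=csv.QUOTE_ALL, lineterminator="\n"
    )
    writer.writeheader()
    for row in rows:
        if not row.get("question") or not row.get("answer"):
            raise ValueError("each row needs a non-empty question and answer")
        # Assumption: optional columns are written as empty strings.
        writer.writerow({name: row.get(name, "") for name in FIELDS})
    return buf.getvalue()


question = 'Quali università offrono "Culture e Letterature del Mondo Moderno" come campo di studio?'
answer = (
    'Le università che offrono "Culture e Letterature del Mondo Moderno" '
    "come campo di studio includono: Torino"
)
csv_text = qa_rows_to_csv(
    [{"question": question, "answer": answer, "url": "www.unito.it"}]
)
print(csv_text)
```

Using the `csv` module rather than joining strings by hand means that double quotes inside a question or answer are escaped (doubled) correctly.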
Thanks for the insight @alessiasacchi,
I have a list of 20 products (a food menu), with around 30 fields each. I am creating a chat app (an ordering assistant).
What should my approach be for creating the data store, and how do I query it? Any example would be of great help.
I have the same issue. My case is much harder because my client wants to ask questions in natural language about his BQ database, which has more than 60K rows. Agent Search does not work properly. If you don't have that much data, you could export it to a Cloud Storage bucket and then create a Cloud Storage data store.