I'm genuinely excited about the release of Gen AI App Builder, which is now generally available. I've tried to set up a chat app that answers questions using a data store I indexed from BigQuery. I also tried integrating it with Dialogflow and adding the data store there, but to no avail.
The documentation has been a challenge to decipher, and most tutorials I found focus on data sources from sites, which I can't verify. If anyone has insights or suggestions to make this process easier, I'd appreciate the guidance.
Thanks,
Ludovico
I managed to get the data from BigQuery into the data store, but using a different approach.
The Gen AI App Builder documentation lists the supported formats with examples (see below). As you can see, CSV is supported only for Q&A data, provided as pairs of questions and answers.
With that in mind, I exported the data from BigQuery as CSV and converted each row to a PDF, and it works fine.
https://cloud.google.com/generative-ai-app-builder/docs/agent-data-store
There are three types of data stores, and each supports different types of data.
Data store type | Supported formats | Examples | Input mechanism |
---|---|---|---|
Web | html, pdf | www.example.com/* | collected from Google Search index |
Unstructured | html, pdf | mydoc.pdf, private.html | uploaded through Cloud Storage bucket or BigQuery |
Structured | csv | "why?","why not" | uploaded through Cloud Storage bucket or BigQuery |
I am also facing a similar issue while adding structured data from BigQuery to create a chatbot.
My use case is simple transaction data with columns like date, Store, and sales.
I want to pass in this data and have the chatbot answer questions such as: which store has the highest sales? Is there any correlation between sales and the date?
Glad you were able to find another approach - thanks for coming back and updating your question with the solution @l-cesaro!
Hi @l-cesaro, thanks for sharing this. A question on the conversion from BigQuery to CSV and then PDF before using it as the data store: were you able to do this programmatically? I'd appreciate it if you could share the details as well. Thanks.
@l-cesaro any specific tool you used to do the conversion from row to PDF?
I wrote a script myself, but you can use this one, which worked pretty well.
https://cloudconvert.com/csv-to-pdf
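For anyone who wants to script the row-by-row conversion, here is a minimal sketch using only the Python standard library. It is not the script mentioned above. Producing PDFs needs a third-party library or an external tool, so this sketch swaps in one small HTML file per row instead, on the basis that the formats table in this thread lists html alongside pdf for unstructured data stores. The function names, the `row_00000.html` naming scheme, and the list-style layout are all assumptions made for illustration.

```python
import csv
import html
import io
from pathlib import Path


def row_to_html(row):
    """Render one CSV row (a dict of column -> value) as a small HTML document."""
    items = "\n".join(
        f"      <li><b>{html.escape(str(key))}</b>: {html.escape(str(value))}</li>"
        for key, value in row.items()
    )
    return f"<html>\n  <body>\n    <ul>\n{items}\n    </ul>\n  </body>\n</html>\n"


def csv_to_html_docs(csv_text):
    """Turn CSV text (with a header row) into a list of HTML strings, one per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row_to_html(row) for row in reader]


def write_docs(docs, out_dir):
    """Write each document to out_dir as row_00000.html, row_00001.html, ..."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, doc in enumerate(docs):
        path = out / f"row_{index:05d}.html"
        path.write_text(doc, encoding="utf-8")
        paths.append(path)
    return paths
```

The resulting folder would then be uploaded to a Cloud Storage bucket and used as the source of an unstructured data store. Whether HTML retrieves as well as PDF for this kind of data is not something this thread establishes.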
thank you so much!
@l-cesaro I'm seeing now that they also accept a GCS bucket full of JSON files. Perhaps converting each row to JSON would be a better solution than converting to PDF. Hmm, I wonder which one is better in terms of retrieval.
For our use case we needed a single data store for both unstructured and structured data, which is why I converted the CSVs to unstructured documents. If you can keep the search applications and their related data stores separate, you can go with that. Hopefully they will enable search applications across multiple data stores.
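The JSON idea could look roughly like the sketch below, which turns each row of a CSV export into one JSON object keyed by the column headers. The function name is hypothetical, and the exact JSON schema a data store expects is not stated in this thread, so treat the plain column-to-value layout as an assumption to check against the documentation.

```python
import csv
import io
import json


def csv_to_json_docs(csv_text):
    """Return one JSON string per CSV row, keyed by the header row.

    Values stay strings, exactly as they appear in the CSV export;
    no type conversion is attempted.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [json.dumps(row, ensure_ascii=False) for row in reader]


# Example with the columns mentioned earlier in the thread.
sample = "date,Store,sales\n2023-01-01,North,120\n2023-01-02,South,95\n"
for doc in csv_to_json_docs(sample):
    print(doc)
```

Each string can be written to its own file, or the strings can be joined with newlines into a single file, depending on what the import step accepts.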
I am facing the same problem. What I am trying to understand is this: if we have to convert structured data into unstructured data before using this new Gen AI chat, why is the BigQuery integration offered in the first place?
I simply want to use BigQuery as the data store in my Dialogflow CX agent without any conversion.
Is there any solution to this?
@xavidop I feel like you might be able to help on this one. 😉
@AndrewB you were right!
Hi @amitagarg22, no conversion is required. You just need to select BigQuery as the source and the data will be indexed automatically: https://cloud.google.com/generative-ai-app-builder/docs/create-data-store-es#bigquery
Hi @xavidop, I tried it and it's working for search but not for chat.
Hi @RVinegas, for me it's not working for search either. Can you share some screenshots of the steps you followed?
I created a search app and a new data store with BigQuery structured data. After some time I started asking questions from the preview section.
Hi @amitagarg22, I just followed the steps from this reference; no extra steps were needed. Were you able to confirm that your data from BigQuery was successfully ingested (Step 12)?
@RVinegas I followed the same steps. I'm not sure whether it is related to the type of data.
Below is the data I have in BigQuery. I am asking questions like "details for customerCode 1533584" and so on.
@amitagarg22 The screenshot you shared is from BigQuery. Were you able to confirm that your data from BigQuery was successfully ingested?
To check the status of your ingestion, go to the Data Stores page and click your data store name to see details about it on its Data page. When the status column on the Activity tab changes from In progress to Import completed, the ingestion is complete. Depending on the size of your data, ingestion can take several minutes to several hours.
Yes, I was successful
And there are only 50 rows in the database.
Hi @amitagarg22, were you able to find out what the issue was?
I'm facing a similar issue here. Please respond.
To this day the issue remains.
There is no way to directly connect a BigQuery table as a data store for your chat app, agent, or the Gemini API. I mean, why?
One can create a data store from unstructured documents (which, I would say, do not perform up to the mark), but not from a well-structured BigQuery table?
I'd be glad to know if someone has found a way around this, or whether any future integration is coming to support BigQuery tables in chat, agents, and the Gemini API.
You can create an automation that dumps the BigQuery table into a bucket and have that bucket connected to a data store.
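A minimal sketch of that kind of automation is below, assuming the `google-cloud-bigquery` Python client and its `Client.extract_table` method, which exports a table to Cloud Storage (CSV by default). The helper names, bucket name, prefix, and table ID are placeholders invented for illustration. Scheduling the job and connecting or refreshing the data store are not covered here.

```python
def build_destination_uri(bucket_name, prefix, table_id):
    """Build a wildcard gs:// URI so a large export can be split into several files."""
    table_name = table_id.split(".")[-1]
    return f"gs://{bucket_name}/{prefix.strip('/')}/{table_name}-*.csv"


def export_table_to_gcs(client, table_id, destination_uri):
    """Start a BigQuery extract job and block until it finishes.

    `client` is expected to be a google.cloud.bigquery.Client (or anything
    with a compatible extract_table method).
    """
    extract_job = client.extract_table(table_id, destination_uri)
    extract_job.result()  # waits for completion and raises if the job failed
    return destination_uri


# Usage (requires google-cloud-bigquery and credentials; all names are placeholders):
#   from google.cloud import bigquery
#   client = bigquery.Client()
#   uri = build_destination_uri("my-bucket", "exports", "my-project.my_dataset.sales")
#   export_table_to_gcs(client, "my-project.my_dataset.sales", uri)
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub. Running it on a schedule would keep the bucket, and therefore whatever is re-imported from it, in step with the table.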