Gen AI App Builder - cannot properly integrate a datastore from BigQuery

I'm genuinely excited about the release of the Gen AI App Builder, now available for general access. I've attempted to set up a chat app that responds to questions using a datastore I had indexed from BigQuery. I even tried integrating it with Dialogflow and added the datastore there, but to no avail.

The documentation has been a challenge to decipher, and most tutorials I found focus on data sources from sites, which I can't verify. If anyone has insights or suggestions to make this process easier, I'd appreciate the guidance.

Thanks,

Ludovico

1 ACCEPTED SOLUTION

I managed to integrate the data from BigQuery into the datastore, but using a different approach.

As reported in the Gen AI App Builder documentation, here are the supported formats with examples. As you can see, CSV is supported only for Q&A, provided as pairs of questions and answers.
Given this, I exported the data from BigQuery as CSV and converted every single row to a PDF, and it works fine.

https://cloud.google.com/generative-ai-app-builder/docs/agent-data-store

Supported content

There are three types of data stores, and each supports different types of data.

| Data store type | Supported formats | Examples | Input mechanism |
|---|---|---|---|
| Web | html, pdf | www.example.com/* | collected from Google Search index |
| Unstructured | html, pdf | mydoc.pdf, private.html | uploaded through Cloud Storage bucket or BigQuery |
| Structured | csv | "why?","why not" | uploaded through Cloud Storage bucket or BigQuery |
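
For anyone who wants to script the row-by-row conversion, here is a minimal sketch. It is not the script used in this post: it writes one HTML file per CSV row instead of a PDF, because html is also listed as a supported unstructured format and needs only the Python standard library. The function name, file naming, and "column: value" layout are assumptions made for illustration.

```python
import csv
import html
from pathlib import Path


def rows_to_html_files(csv_path, out_dir):
    """Write one small HTML document per CSV row and return the file paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for i, row in enumerate(csv.DictReader(f), start=1):
            # Render every column as "name: value" so the text stays searchable.
            items = "\n".join(
                f"<p>{html.escape(str(col))}: {html.escape(str(val or ''))}</p>"
                for col, val in row.items()
            )
            # Hypothetical naming scheme: row_00001.html, row_00002.html, ...
            path = out / f"row_{i:05d}.html"
            path.write_text(
                f"<html><body>\n{items}\n</body></html>\n", encoding="utf-8"
            )
            written.append(path)
    return written
```

The resulting folder could then be uploaded to a Cloud Storage bucket and used as the source of an unstructured data store.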


19 REPLIES

I am also facing a similar issue while adding structured data from BigQuery to create a chatbot.

My use case is simple transaction data with columns like date, Store, and sales.
I want to pass in this data and have the chatbot answer questions like: which store has the maximum sales? Is there any correlation between sales and the date?

Glad you were able to find another approach - thanks for coming back and updating your question with the solution @l-cesaro!

Hi @l-cesaro, thanks for sharing this. A question on the conversion from BigQuery to CSV and PDF for use as the data store: were you able to do this programmatically? I'd appreciate it if you could share the details as well. Thanks.

@l-cesaro any specific tool you used to do the conversion from row to PDF?

I wrote a script myself, but you can use this one, which worked pretty well.

https://cloudconvert.com/csv-to-pdf

thank you so much!

@l-cesaro I'm seeing now that they also accept a GCS bucket full of JSON files. Perhaps converting each row to JSON would be a better solution than converting to PDF. Hmm, I wonder which one is better in terms of retrieval.
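
A rough sketch of the per-row JSON idea, assuming a CSV export with a header row. The function name and the file naming are made up for illustration, and the thread does not say what JSON schema the data store expects, so check the documentation before uploading.

```python
import csv
import json
from pathlib import Path


def rows_to_json_files(csv_path, out_dir, id_prefix="row"):
    """Write one JSON file per CSV row and return the file paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with open(csv_path, newline="", encoding="utf-8") as f:
        for i, row in enumerate(csv.DictReader(f), start=1):
            # Each row becomes {"column name": "value", ...}; values stay strings.
            path = out / f"{id_prefix}_{i:05d}.json"
            path.write_text(json.dumps(row, ensure_ascii=False), encoding="utf-8")
            written.append(path)
    return written
```

Note that csv.DictReader returns every value as a string, so numeric columns such as sales would need an explicit conversion if the types matter.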

For our use case we needed a single datastore for both unstructured and structured data, which is why I converted the CSVs to unstructured documents. If you can keep the search applications and their related datastores separate, you can go for that. Hopefully they will enable search applications on multiple datastores.

I am facing the same problem, and I am trying to understand: if we have to convert structured data into unstructured data before using this new GenAI chat, why is the BigQuery integration offered in the first place?
I simply want to use BigQuery as a datastore in my Dialogflow CX agent without any conversion.
Is there any solution to this?

@xavidop I feel like you might be able to help on this one. 😉

@AndrewB you were right!

Hi @amitagarg22, no conversion is required. You just need to select BigQuery as the source and it will be indexed automatically: https://cloud.google.com/generative-ai-app-builder/docs/create-data-store-es#bigquery
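
If you prefer the API over the console, the import can also be requested over REST. The sketch below only builds the URL and JSON body and sends nothing. The endpoint path and field names reflect my understanding of the Discovery Engine documents:import method and should be verified against the linked documentation; the function name, the "custom" schema value, and the "INCREMENTAL" mode are assumptions chosen for illustration.

```python
import json


def build_bigquery_import_request(project, data_store, dataset, table,
                                  location="global"):
    """Return (url, body) for a documents:import call. Nothing is sent."""
    # Assumption: the "global" location uses the default endpoint; regional
    # locations may need a different host, so check the documentation.
    url = (
        "https://discoveryengine.googleapis.com/v1/"
        f"projects/{project}/locations/{location}/"
        "collections/default_collection/"
        f"dataStores/{data_store}/branches/0/documents:import"
    )
    body = {
        "bigquerySource": {
            "projectId": project,
            "datasetId": dataset,
            "tableId": table,
            # Assumption: "custom" is used for structured table rows.
            "dataSchema": "custom",
        },
        "reconciliationMode": "INCREMENTAL",
    }
    return url, body


if __name__ == "__main__":
    # Placeholder identifiers, not real resources.
    url, body = build_bigquery_import_request(
        "my-project", "my-data-store", "my_dataset", "my_table"
    )
    print(url)
    print(json.dumps(body, indent=2))
```

The body would be sent as an authenticated POST, for example with an access token from gcloud.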

Hi @xavidop, I tried it and it's working for search but not for chat.

Hi @RVinegas, for me it's not working for search either. Can you share some screenshots of the steps you followed?

I created a search app and a new data store with BigQuery structured data. After some time, I started asking questions from the preview section.

Hi @amitagarg22, I just followed the steps from this reference; no extra steps were done. Were you able to confirm that your data from BigQuery was successfully ingested (Step 12)?

@RVinegas I followed the same steps. I'm not sure if it is related to the type of data.
Below is the data I have in BigQuery. I am asking questions like "details for customerCode 1533584", etc.

[Screenshot: amitagarg22_0-1707964686918.png]

@amitagarg22 The screenshot you shared is from BigQuery. Were you able to confirm that your data from BigQuery was successfully ingested?

To check the status of your ingestion, go to the Data Stores page and click your data store name to see details about it on its Data page. When the status column on the Activity tab changes from In progress to Import completed, the ingestion is complete. Depending on the size of your data, ingestion can take several minutes to several hours.
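
If you would rather not keep refreshing the Activity tab, the same wait can be scripted. This is only a sketch of a polling loop: get_status is a hypothetical callable that you would have to implement yourself (for example on top of an API call), and the only thing taken from the steps above is the "Import completed" status string.

```python
import time


def wait_for_import(get_status, poll_seconds=30.0, timeout_seconds=3600.0):
    """Poll get_status() until it returns "Import completed" or time runs out.

    get_status is a hypothetical zero-argument callable returning the current
    status text. Returns True on completion and False on timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if get_status() == "Import completed":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
```

Since ingestion can take several hours for large tables, pick the timeout to match the size of your data.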

Yes, I was successful

[Screenshot: amitagarg22_0-1707968907398.png]

And there are only 50 rows in the database.