
Dialogflow CX data store - structured BigQuery data - agent is not retrieving the content

Hi, @alessiasacchi 

When I use a structured BigQuery table as the data store in Dialogflow CX, the agent does not retrieve content from the BigQuery table. Do I need to tweak anything else for the agent to work? Below is the original response generated by the agent. It simply says the code was not found, even though document generation in the data store succeeded.

 

[Screenshot of the agent's response: Stephengcp_0-1703073203247.png]

 


Hello Stephen, thank you for your message. I haven't had the chance to test BQ data sources myself. So I have asked internally. Will let you know. 

Thanks for the prompt response, will wait for further updates

Hi, is the data store trained and ready to use?

Hi @xavidop ,

Yes it is. The data store import is completed and I can see the documents being created in the data store. It just has ~200 records and it was created yesterday.

Hi Stephen, please ensure you're using the correct schema in BQ. Please check this out to learn more. One of our customer engineers will reach out shortly; in the meantime, he asked me to verify that. Thanks

Hi Alessia, I double-checked, and it is the correct schema in BQ. I also tried a CSV file for RAG, and the agent still cannot answer the questions I am asking. I will wait for the customer engineer to triage this further.

I would keep the two cases separate (CSV and BQ).

Regarding CSV: a while ago another user submitted a bug just like the one you're reporting. The file had been successfully uploaded and indexed, but DF didn't retrieve or return any data. It responded with "Sorry, etc etc", which sounds very much like a generative response from the LLM (this works as intended if zero results are returned from the DS). After I made a few small changes (I added quotes to the header and rows, and made the header row lower case), the same file produced accurate responses. When uploading data to the data store, the CSV format must be used, and each file must have a header row describing the columns. Check out the official doc. Try the example data and see if the error persists.

Regarding BQ: if you're using the correct schema, consider filing a product issue.
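In case it helps to reproduce the CSV changes described above (quoting every field and lower-casing the header row), here is a minimal Python sketch using only the standard library. The function name `normalize_faq_csv` and the sample row are made up for illustration; this is not an official tool, and whether these changes fix a given file is not guaranteed.

```python
import csv
import io


def normalize_faq_csv(raw_text: str) -> str:
    """Return CSV text with a lower-cased header and every field quoted.

    Hypothetical helper: it only mirrors the manual edits described in
    the post (quotes around header and rows, lower-case header).
    """
    reader = csv.reader(io.StringIO(raw_text))
    rows = [row for row in reader if row]  # skip completely empty lines
    if not rows:
        return ""

    # Lower-case and trim the header names; data rows are left unchanged.
    header = [name.strip().lower() for name in rows[0]]

    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows[1:])
    return out.getvalue()


# Made-up sample data, only to show the transformation.
sample = "Question,Answer\nWhat is code 42?,It means the order shipped.\n"
normalized = normalize_faq_csv(sample)
print(normalized)
```

The result can be written to a new file and re-imported into the data store to check whether the agent's answers change.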

Hi Alessia,

Thanks for the detailed response, it helps. I have one further question: are structured data stores in Dialogflow CX only for FAQs with the schema "Question" and "Answers"? I was under the impression that any structured data in the data store could be embedded, vectorized, and used for RAG with Gemini Pro or other LLMs. Please let me know your suggestions on best practices in GCP for creating embeddings from arbitrary structured data and using them for RAG with Gemini Pro or other LLMs.

Hi Stephen,

That is correct. You can use the Data Store API to retrieve information from those data stores and use those values with other LLMs in other tools:
https://cloud.google.com/generative-ai-app-builder/docs/apis

Hi Stephen, 

There are multiple levels in the stack, from high level to low level:
- Chat / Conversational AI: Vertex AI Conversation, Dialogflow CX
- Search / Vector DB / Embeddings: Vertex AI Search, Vertex AI Vector Search (Matching Engine)
- LLMs: Vertex AI Language (e.g., LLMs, PaLM 2, text-bison, chat-bison, Model Garden)

The lower you get in the stack (closer to the actual LLM calls and Vertex API calls), the more durable it can be.

You will find this colab useful if you want to dive deeper into RAG and use LLM to query an external system. Please let me know if you have issues accessing it.

Hope this helps,

Alessia

Hi Alessia,

Thanks for the response. I don't have access to the colab notebook you shared. Can you please grant me access to it?

Oops, I am sorry. Let me try. What is your email address?

<PII removed by Staff>.

Please try again and let me know if it works

Apologies, can you grant access to <PII removed by Staff>? For some reason Google Docs is disabled for the account I provided earlier.

Ok done. Unfortunately I am a commenter and can't manage access. So my request will go to the owner. 

In regards to the initial question, BigQuery as a data store has to be in a specific schema: question, answer, title, url; it can't be arbitrary data. So the considerations I wrote yesterday about CSV files seem to apply for BQ as well. I cannot find an official doc/sample. Meanwhile please recreate a different schema that adheres to the required format and let us know!
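To sanity-check a table before re-importing, a small Python sketch like the one below can compare the table's column names against the four columns named above. The column list comes from this post, not from an official doc, so treat it as an assumption. The helper name `check_faq_schema` is hypothetical, and the check is deliberately case-sensitive because header casing mattered in the CSV case.

```python
# Assumed required columns, taken from this thread (no official doc found).
REQUIRED_COLUMNS = ("question", "answer", "title", "url")


def check_faq_schema(column_names):
    """Return (missing, extra) column names relative to REQUIRED_COLUMNS.

    Hypothetical helper: it only compares names, exactly as given;
    it does not check column types or connect to BigQuery.
    """
    present = set(column_names)
    missing = [c for c in REQUIRED_COLUMNS if c not in present]
    extra = [c for c in column_names if c not in REQUIRED_COLUMNS]
    return missing, extra


# Made-up example: a table with arbitrary columns instead of the FAQ ones.
missing, extra = check_faq_schema(["code", "description"])
print("missing:", missing)
print("extra:", extra)
```

The column names could be copied from the table's schema view in the BigQuery console. Whether extra columns are tolerated is not stated in this thread, so the sketch only reports them.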