Gemini /Bison Model performance

I'm not getting particularly good performance from the Gemini or Bison models on a text-generation task.
For a bit of context: we have a Dialogflow CX agent that answers specific questions based on text content we pass to a generator. This approach generally works OK; we extract information from some text to answer simple questions. However, I have one use case that doesn't work so well.

Prompt:
You are a holiday assistant. Your goal is to politely answer questions about transfers based on the provided transfer Information
transfer_information: $transfer_information
question: $transfer_question
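For reference, the generator resolves the `$` placeholders before the prompt reaches the model. A minimal sketch of that substitution (the transfer details below are invented for illustration, not the actual session values):

```python
from string import Template

# Prompt template mirroring the Dialogflow CX generator prompt above;
# $transfer_information and $transfer_question are the generator placeholders.
PROMPT = Template(
    "You are a holiday assistant. Your goal is to politely answer questions "
    "about transfers based on the provided transfer Information\n"
    "transfer_information: $transfer_information\n"
    "question: $transfer_question"
)

# Hypothetical values for illustration only.
prompt = PROMPT.substitute(
    transfer_information=(
        "Pickup: SB Diagonal Zero Barcelona, 13/02/2024 at 13:25. "
        "Be ready 15 minutes prior."
    ),
    transfer_question="What time will I be picked up?",
)
print(prompt)
```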


Neither Bison[latest] nor Gemini gives a correct answer:
"I'm sorry, but the transfer information provided does not contain the pickup time."

However, if I try with Bison[unicorn] I get a correct answer:
"You will be picked up at SB Diagonal Zero Barcelona on 13/02/2024 at 13:25. Please be ready 15 minutes prior to this."

Unfortunately, Bison[unicorn] is not available in Dialogflow generators.

I've iterated on the prompt a lot and still get the same results. As usual, I went over to GPT-4 and got perfect results.

Any thoughts?

thanks

Adrian


Every LLM works better for specific tasks. In this case, if Bison/Gemini doesn't perform as expected, I would suggest moving from the generator approach to a webhook that calls the OpenAI API with the same prompt.
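A minimal sketch of that webhook approach, assuming the transfer details are stored as session parameters named `transfer_information` and `transfer_question`, and with the actual model call stubbed out (replace `call_llm` with your real API call, e.g. OpenAI chat completions):

```python
def call_llm(prompt: str) -> str:
    """Placeholder for the external model call (e.g. OpenAI chat
    completions). Stubbed here so the sketch runs offline."""
    return "You will be picked up at 13:25."


def handle_webhook(request_json: dict) -> dict:
    """Build the same generator prompt from the webhook request's session
    parameters and wrap the model's answer in a Dialogflow CX webhook
    response body (fulfillment_response / messages / text)."""
    params = request_json["sessionInfo"]["parameters"]
    prompt = (
        "You are a holiday assistant. Your goal is to politely answer "
        "questions about transfers based on the provided transfer "
        "Information\n"
        f"transfer_information: {params['transfer_information']}\n"
        f"question: {params['transfer_question']}"
    )
    answer = call_llm(prompt)
    return {
        "fulfillment_response": {
            "messages": [{"text": {"text": [answer]}}]
        }
    }
```

You'd expose `handle_webhook` behind an HTTPS endpoint (Cloud Functions, Cloud Run, etc.) and point the Dialogflow CX fulfillment at it.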

Thanks for the response @xavidop. A quick follow-up: I spent more time prompt engineering and I'm finally getting the responses I'm looking for, so I'm happy. I need to test a lot more, but at least now I'm getting better results.
A wise LLM once said: "Pointing fingers at a language model for bad answers is like scolding a toaster for burnt toast. Maybe it's time to check the person pushing the buttons!"
cheers
A

Wow, this is great to hear! Curious about the prompt!