Gemini /Bison Model performance

I'm not getting particularly good performance from the Gemini or Bison models on a text-generation task.
For a bit of context: we have a Dialogflow CX agent that answers specific questions based on text content we pass to a generator. This approach generally works OK; we extract information from some text to answer simple questions. However, I have one use case that doesn't work so well.

Prompt:
You are a holiday assistant. Your goal is to politely answer questions about transfers based on the provided transfer Information
transfer_information: $transfer_information
question: $transfer_question
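For reference, the generator resolves the `$` placeholders before the prompt reaches the model. A minimal sketch of that substitution (the transfer details below are invented for illustration, not the actual session values):

```python
from string import Template

# Prompt template mirroring the Dialogflow CX generator prompt above;
# $transfer_information and $transfer_question are the generator placeholders.
PROMPT = Template(
    "You are a holiday assistant. Your goal is to politely answer questions "
    "about transfers based on the provided transfer Information\n"
    "transfer_information: $transfer_information\n"
    "question: $transfer_question"
)

# Hypothetical values for illustration only.
prompt = PROMPT.substitute(
    transfer_information=(
        "Pickup: SB Diagonal Zero Barcelona, 13/02/2024 at 13:25. "
        "Be ready 15 minutes prior."
    ),
    transfer_question="What time will I be picked up?",
)
print(prompt)
```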


Neither Bison[latest] nor Gemini gives a correct answer:
"I'm sorry, but the transfer information provided does not contain the pickup time."

However, if I try with Bison[unicorn] I get a correct answer:
"You will be picked up at SB Diagonal Zero Barcelona on 13/02/2024 at 13:25. Please be ready 15 minutes prior to this."

Unfortunately, Bison[unicorn] is not available in Dialogflow generators.

I've iterated on the prompt a lot and still get the same results. As usual, I went over to GPT-4 and got perfect results.

Any thoughts?

thanks

Adrian


Every LLM works better for specific tasks. In this case, if Bison/Gemini doesn't perform as expected, I would suggest moving from the generator approach to a webhook that calls the OpenAI API with the same prompt.
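A minimal sketch of that webhook approach, assuming the transfer details are stored as session parameters named `transfer_information` and `transfer_question`, and with the actual model call stubbed out (replace `call_llm` with your real API call, e.g. OpenAI chat completions):

```python
def call_llm(prompt: str) -> str:
    """Placeholder for the external model call (e.g. OpenAI chat
    completions). Stubbed here so the sketch runs offline."""
    return "You will be picked up at 13:25."


def handle_webhook(request_json: dict) -> dict:
    """Build the same generator prompt from the webhook request's session
    parameters and wrap the model's answer in a Dialogflow CX webhook
    response body (fulfillment_response / messages / text)."""
    params = request_json["sessionInfo"]["parameters"]
    prompt = (
        "You are a holiday assistant. Your goal is to politely answer "
        "questions about transfers based on the provided transfer "
        "Information\n"
        f"transfer_information: {params['transfer_information']}\n"
        f"question: {params['transfer_question']}"
    )
    answer = call_llm(prompt)
    return {
        "fulfillment_response": {
            "messages": [{"text": {"text": [answer]}}]
        }
    }
```

You'd expose `handle_webhook` behind an HTTPS endpoint (Cloud Functions, Cloud Run, etc.) and point the Dialogflow CX fulfillment at it.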

Thanks for the response @xavidop. A quick follow-up: I spent more time prompt engineering and I'm finally getting the responses I'm looking for, so I'm happy. I need to test a lot more, but at least now I'm getting better results.
A wise LLM once said: "Pointing fingers at a language model for bad answers is like scolding a toaster for burnt toast. Maybe it's time to check the person pushing the buttons!"
cheers
A

Wow, this is great to hear! Curious about the prompt!