Hi,
Currently, I'm testing the Gemini model in both Vertex AI and Google AI Studio, but the answers are not stable. I think this is related to the hyperparameter settings.
So, I want to know the details of the temperature, top-p, and top-k parameters and how they work in the text generation process. Thank you!
Hello,
for all the details I recommend referring to the official Vertex AI documentation.
Quoting the documentation, the big picture is: "For each token selection step (by the model), the top-K tokens with the highest probabilities are sampled. Then tokens are further filtered based on top-P, with the final token selected using temperature sampling."
As a very simplified example:
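Here is a minimal Python sketch of that top-K → top-P → temperature pipeline. The tokens and probabilities below are invented purely for illustration, not taken from any real model:

```python
import random

# Invented next-token probabilities, for illustration only.
probs = {"in": 0.50, "on": 0.30, "under": 0.15, "beside": 0.05}

def sample_next_token(probs, top_k=3, top_p=0.9, temperature=0.8):
    # 1. top-K: keep only the K most probable candidate tokens.
    candidates = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # 2. top-P: keep the smallest prefix whose cumulative probability
    #    reaches top_p (nucleus sampling).
    kept, cumulative = [], 0.0
    for token, p in candidates:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # 3. temperature: rescale the survivors and sample.
    #    As temperature -> 0 this approaches greedy selection (always the
    #    top token); at temperature = 1 the probabilities are used as-is.
    if temperature == 0:
        return kept[0][0]
    weights = [p ** (1.0 / temperature) for _, p in kept]
    return random.choices([token for token, _ in kept], weights)[0]

print(sample_next_token(probs))
```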
Hope it's useful.
Ciao
Thank you so much! But I haven't fully understood yet. Since I need to explain this to my co-workers in detail, I need more deep-dive examples.
I will explore the link you mentioned. Thank you.
LLMs are not deterministic, so even with the same hyperparameter settings it is normal to get different answers to the same question.
In practice, these hyperparameters shape the probabilities the model assigns to each candidate next word, given the text that precedes it. For example, think of a simple completion of the sentence "I slept", with four candidates ordered by probability: "I slept in bed" is the most probable completion, "I slept on a cloud" the least probable.
If you set the temperature to 0, the model will almost always answer "I slept in bed". If you set it to the opposite extreme, i.e. to 1, the model will pick more randomly among the four candidates. Therefore a temperature from 0.7 to 1 is more suitable for creative contexts, while a temperature between 0 and 0.3 is preferable if you want more technically precise answers.
top_k and top_p also control randomness, but through a different mechanism: instead of rescaling the probabilities, they truncate the list of candidate tokens (top_k keeps only the k most probable tokens; top_p keeps the smallest set whose cumulative probability reaches p). The same rule of thumb applies to both: a lower value for less random responses, a higher value for more random ones.
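To make the temperature effect concrete, here is a small sketch that rescales a distribution over those completions at different temperatures. The fourth completion and all the probabilities are invented for illustration:

```python
import math

# Invented probabilities (and an invented fourth completion) for "I slept ...".
completions = {"in bed": 0.70, "on the couch": 0.20,
               "on the floor": 0.07, "on a cloud": 0.03}

def apply_temperature(probs, temperature):
    # Dividing the log-probabilities by the temperature sharpens the
    # distribution when temperature < 1 and flattens it when temperature > 1.
    scaled = {tok: math.exp(math.log(p) / temperature) for tok, p in probs.items()}
    total = sum(scaled.values())
    return {tok: v / total for tok, v in scaled.items()}

for t in (0.1, 0.5, 1.0):
    print(t, {tok: round(p, 3) for tok, p in apply_temperature(completions, t).items()})
# At 0.1 almost all the probability mass sits on "in bed";
# at 1.0 the original distribution is returned unchanged.
```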
I'll try to give you a real example performed with Gemini 1.0 Pro. I set top_k = 40 and top_p = 1 (their maximum values) and kept them fixed, then ran three tests at each of several temperature values. The task is to generate a title for an article.
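For reference, this is roughly how those settings are passed with the Vertex AI Python SDK. Treat it as a sketch: the SDK surface changes between versions, and the project id below is a placeholder:

```python
import vertexai
from vertexai.generative_models import GenerativeModel, GenerationConfig

vertexai.init(project="your-project-id", location="us-central1")  # placeholder project

model = GenerativeModel("gemini-1.0-pro")
prompt = "Create 1 title for the following article.\n\narticle: <article text from below>"
response = model.generate_content(
    prompt,
    generation_config=GenerationConfig(temperature=0.0, top_p=1.0, top_k=40),
)
print(response.text)
```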
PROMPT
Create 1 title for the following article.
article:
As many businesses figure out new ways to go digital, one thing is clear: talent continues to be one of the key ways to enable an inclusive digital economy. Employers in Asia Pacific list technology as the leading in-demand skill, with digital marketing and e-commerce following close behind. Simultaneously, many people are looking to learn new skills that will help them meet the requirements of the evolving job market. So we must create new ways to help businesses and job seekers alike.
RESULTS
- temperature = 0.0 [top_k = 40, top_p = 1]
1. Bridging the Digital Divide: Talent as the Key to an Inclusive Digital Economy
2. Bridging the Digital Divide: Talent as the Key to an Inclusive Digital Economy
3. Bridging the Digital Divide: Talent as the Key to an Inclusive Digital Economy
- temperature = 0.3 [top_k = 40, top_p = 1]
1. Unlocking Digital Inclusion: Talent as the Key to an Equitable Economy
2. Bridging the Digital Divide: Talent and Skills for an Inclusive Digital Economy
3. Bridging the Digital Divide: Talent and Skills for an Inclusive Digital Economy
- temperature = 0.6 [top_k = 40, top_p = 1]
1. Bridging the Digital Divide: Talent as a Catalyst for an Inclusive Economy
2. Unlocking Digital Inclusion: The Role of Talent Acquisition and Reskilling
3. Unlocking an Inclusive Digital Economy: The Essential Role of Talent
- temperature = 1.0 [top_k = 40, top_p = 1]
1. Bridging the Digital Talent Gap: Empowering Businesses and Job Seekers in a Connected Future
2. Unlocking the Future of Digital Success: Talent as the Cornerstone of an Inclusive Economy
3. Bridging the Digital Divide: Talent as the Gateway to an Inclusive Digital Economy
Ciao
My deepest thanks for your consideration.
It's really useful. I've got the answer.
Phew!