
Tokenize function/endpoint for PaLM 2 models

Hi all,

I am trying to use the PaLM 2 models with the Vertex AI Python SDK. It works as expected with the text-bison model via the predict method.

I am looking for a method to tokenize the text before calling the predict method. This is useful for truncating the input when it exceeds the model's context size. Other LLM services, such as OpenAI and Cohere, provide tokenizers for this purpose.

Is there a way to do this with the Python SDK, or some other Pythonic approach? I could not find one.

Thanks.


Exact same problem here; I am also looking for a solution.

Hi @aliacar,

Welcome and thank you for reaching out to our community.

I understand that you are trying to tokenize your input text, and we appreciate your eagerness to learn. You are right that you could not find any reference to this: tokenization is not currently offered by the Vertex AI Python SDK for the PaLM 2 models.

You can, however, submit this as a feature request through our Issue Tracker for product feature requests.
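In the meantime, a rough workaround is to truncate by an estimated token budget. This sketch assumes an average of about 4 characters per token for English text; the heuristic, the function name, and the default ratio are all illustrative assumptions, not the model's real tokenizer, so leave some headroom below the actual context limit:

```python
def truncate_by_estimated_tokens(text: str, max_tokens: int,
                                 chars_per_token: int = 4) -> str:
    """Truncate text so its estimated token count stays within max_tokens.

    ASSUMPTION: ~4 characters per token on average; this is only a rough
    heuristic and can under- or over-estimate the true token count.
    """
    max_chars = max_tokens * chars_per_token
    if len(text) <= max_chars:
        return text
    # Cut at the last whitespace before the limit to avoid splitting a word.
    cut = text.rfind(" ", 0, max_chars)
    return text[:cut] if cut > 0 else text[:max_chars]
```

You would then pass the truncated string to the predict method as usual. It is conservative by design: accurate token counts would require the model's own tokenizer.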

Hope this helps.

Hey @lsolatorio thanks for your answer.