Hi,
Here's my scenario -
I have a set of system prompts, and each one needs to be run as a separate inference against each element of a set of user prompts. The user prompts consist of large PDFs, so I want to cache those.
So if I have
System Prompt A, B, C
and
UserPrompt 1, 2, 3
then I will run inferences
SP_A + UP_1
SP_A + UP_2
SP_A + UP_3
SP_B + UP_1
SP_B + UP_2
etc.
As each user prompt has a large number of tokens, I want to cache them.
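Conceptually I'm aiming for something like the sketch below: one cache per user prompt, reused with every system prompt. (The prompt texts, bucket URIs and dict layout are just placeholders, and I've used generate_content instead of a chat to keep it short.)

system_prompts = {"A": "...", "B": "...", "C": "..."}
user_prompt_uris = {
    "1": ["gs://my-bucket/doc1.pdf"],
    "2": ["gs://my-bucket/doc2.pdf"],
    "3": ["gs://my-bucket/doc3.pdf"],
}

# Create one cache per user prompt (done once)...
caches = {}
for up_id, uris in user_prompt_uris.items():
    caches[up_id] = client.caches.create(
        model=llm_id,
        config=CreateCachedContentConfig(
            contents=[Part.from_uri(file_uri=u, mime_type="application/pdf") for u in uris],
            ttl=PROMPT_CACHE_TTL,
        ),
    )

# ...then reuse it with each system prompt.
for sp_id, system_prompt in system_prompts.items():
    for up_id, cache in caches.items():
        response = client.models.generate_content(
            model=llm_id,
            contents="Summarize the documents",
            config=GenerateContentConfig(
                system_instruction=system_prompt,  # this is the part the API rejects
                cached_content=cache.name,
            ),
        )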
However, I can't seem to cache just the user prompt. Here's what I have:
from google.genai.types import CreateCachedContentConfig, GenerateContentConfig

# Cache the large PDF user prompt so it can be reused across inferences
cached_content = client.caches.create(
    model=llm_id,
    config=CreateCachedContentConfig(
        contents=[...],  # list of Part.from_uri for the PDFs
        ttl=PROMPT_CACHE_TTL,
    ),
)
...
# Per-inference config: the system prompt plus a reference to the cached PDFs
generation_config = GenerateContentConfig(
    temperature=params.temperature,
    max_output_tokens=params.max_tokens,
    system_instruction="You are a helpful assistant.",
    cached_content=cache_entry_id,  # resource name of the cached content
)
generation_response = client.chats.create(
    model=params.llm_id,
    config=generation_config,
).send_message(message="Summarize the documents")
I get
google.genai.errors.ClientError: 400 INVALID_ARGUMENT. {
'error': {
'code': 400,
'message': 'Tool config, tools and system instruction should not be set in therequest when using cached content.',
'status': 'INVALID_ARGUMENT'
}}
(As an aside, there's a typo in the error message itself: "therequest".)
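As far as I can tell, the only place a system instruction is allowed when caching is on the cache itself (CreateCachedContentConfig appears to take a system_instruction field, though I may be wrong about that), which would mean something like this, with a separate cache per system prompt / user prompt combination:

cached_content = client.caches.create(
    model=llm_id,
    config=CreateCachedContentConfig(
        system_instruction=system_prompt_a,  # assuming this field exists; it bakes SP_A into the cache
        contents=[...],                      # the same PDFs duplicated for every system prompt
        ttl=PROMPT_CACHE_TTL,
    ),
)

That defeats the point of caching each set of PDFs once. Is there a way to keep a single cache per user prompt and vary the system instruction per request?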
Thanks
-Darren