Does Context Caching change the Tokens-Per-Minute (TPM) rate limit?

Posted on 07-18-2024 02:00 PM

piercelamb

If I use Context Caching to cache a million tokens, then run n small prompts that reference those cached tokens, am I still limited to 2 million tokens per minute?

For example, if I cache a million tokens and send 3 prompts of 500 tokens each that all reference the cache, running all 3 in parallel, does the 3rd prompt get rate limited?

I assume that is how it would work if I _were not_ using the cache. Does the cache change anything?
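To make the arithmetic behind the question concrete, here is a minimal sketch that totals the tokens under both possible accounting rules. It does not call any Gemini or Vertex AI API, and it does not say which rule the service actually applies (that is the open question). The function names and the `cached_count_toward_tpm` flag are hypothetical, introduced only for illustration; the numbers come from the scenario above.

```python
# Figures taken from the question; nothing here queries a real quota.
TPM_LIMIT = 2_000_000      # tokens-per-minute limit quoted in the question
CACHED_TOKENS = 1_000_000  # size of the cached context
PROMPT_TOKENS = 500        # size of each small prompt


def tokens_per_request(cached_count_toward_tpm: bool) -> int:
    """Tokens one request consumes under the chosen (assumed) accounting rule."""
    if cached_count_toward_tpm:
        return CACHED_TOKENS + PROMPT_TOKENS
    return PROMPT_TOKENS


def total_tokens(n_requests: int, cached_count_toward_tpm: bool) -> int:
    """Total tokens consumed by n parallel requests within one minute."""
    return n_requests * tokens_per_request(cached_count_toward_tpm)


def max_requests_per_minute(cached_count_toward_tpm: bool) -> int:
    """How many whole requests fit under the limit with strict accounting."""
    return TPM_LIMIT // tokens_per_request(cached_count_toward_tpm)


if __name__ == "__main__":
    for rule in (True, False):
        label = "cached tokens counted" if rule else "cached tokens not counted"
        used = total_tokens(3, rule)
        print(f"{label}: 3 requests use {used:,} tokens, "
              f"over limit: {used > TPM_LIMIT}, "
              f"requests that fit per minute: {max_requests_per_minute(rule):,}")
```

If cached tokens count toward the limit, three requests total 3,001,500 tokens, which is above 2 million. If only the new prompt tokens count, the same three requests total 1,500 tokens. How a real rate limiter treats a request that crosses the limit partway through is a separate detail that this sketch does not model.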

Labels:
Gemini
