Re: Monitor the usage of Gemini API on Vertex AI

hilmanarc · 10-31-2024 11:56 PM

Hi,

I am currently running the Gemini API on Vertex AI. I would like to monitor the API usage, specifically the number of tokens used over a specific time period or on a daily basis. While I understand that the billing reflects usage in terms of cost, how can I track it in terms of token consumption?

Thank you!

TimotiSburv

Hello,

Try logging the token counts for each request within your app, then use Cloud Logging or BigQuery to view daily or periodic usage. Alternatively, check the usage reports or Cloud Monitoring for any token-based metrics that are available.
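A rough sketch of the "log it, then total it per day" idea, in case it helps. It assumes your app writes one JSON line per request with a timezone-aware `timestamp` plus `input_tokens` and `output_tokens` fields; those field names and the `daily_token_totals` helper are made up for illustration, not anything Vertex AI defines. In Cloud Logging or BigQuery you would do the same grouping with a query instead.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone


def daily_token_totals(log_lines):
    """Sum input/output token counts per UTC day from JSON log lines.

    Each line is assumed (hypothetical schema) to look like:
    {"timestamp": "<ISO 8601 with offset>", "input_tokens": int, "output_tokens": int}
    """
    totals = defaultdict(lambda: {"input_tokens": 0, "output_tokens": 0})
    for line in log_lines:
        record = json.loads(line)
        # Normalise to UTC so requests are bucketed by the same calendar day.
        moment = datetime.fromisoformat(record["timestamp"]).astimezone(timezone.utc)
        day = moment.date().isoformat()
        totals[day]["input_tokens"] += record["input_tokens"]
        totals[day]["output_tokens"] += record["output_tokens"]
    return dict(totals)


if __name__ == "__main__":
    # Hand-written sample lines, not real usage data.
    sample = [
        '{"timestamp": "2024-10-31T08:00:00+00:00", "input_tokens": 80, "output_tokens": 30}',
        '{"timestamp": "2024-10-31T23:56:00+00:00", "input_tokens": 120, "output_tokens": 45}',
        '{"timestamp": "2024-11-01T00:10:00+00:00", "input_tokens": 200, "output_tokens": 10}',
    ]
    for day, counts in sorted(daily_token_totals(sample).items()):
        print(day, counts)
```

Bucketing by UTC is just one choice; if you want to compare against billing, you may need to match whatever day boundary the billing report uses.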

Hope it helps

dawnberdan

Hi @hilmanarc,

Welcome to Google Cloud Community!

To track token consumption for your Gemini API on Vertex AI, you can use these methods:

CountTokens API: Before sending requests, use this API to directly calculate input tokens. This will help you estimate costs and avoid exceeding the context window.
Cloud Monitoring: Set up custom metrics to track your token usage. You can visualize the data and configure alerts to notify you when usage reaches certain thresholds.
Logging: Log token usage data from your application. This will provide detailed information for analysis and insights.
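For the logging option, here is a minimal sketch of turning a response into a structured log record. As far as I know, the `generateContent` REST response carries a `usageMetadata` object with `promptTokenCount`, `candidatesTokenCount`, and `totalTokenCount`; please check the field names against the response you actually get. The `usage_log_entry` helper, the output field names, and the sample response are hypothetical and only there to show the shape.

```python
import json
from datetime import datetime, timezone


def usage_log_entry(response, model):
    """Build a JSON-serialisable usage record from a response dict.

    `response` is assumed to be the parsed JSON body of a generateContent
    call; missing fields fall back to 0 rather than raising.
    """
    usage = response.get("usageMetadata", {})
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "input_tokens": usage.get("promptTokenCount", 0),
        "output_tokens": usage.get("candidatesTokenCount", 0),
        "total_tokens": usage.get("totalTokenCount", 0),
    }


if __name__ == "__main__":
    # Hand-written stand-in for a real response, for illustration only.
    sample_response = {
        "usageMetadata": {
            "promptTokenCount": 12,
            "candidatesTokenCount": 34,
            "totalTokenCount": 46,
        }
    }
    # One JSON object per line is easy to ingest into Cloud Logging or BigQuery.
    print(json.dumps(usage_log_entry(sample_response, "gemini-1.5-flash")))
```

Writing one record per request keeps the raw data around, so you can later group by model, by day, or by whatever dimension you need.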

To ensure cost-effectiveness, remember to regularly monitor your usage, select the appropriate Gemini model for your requirements, and optimize your prompts.

I hope the above information is helpful.

andycross

How do I do this and guarantee I get the same answer as Google for billing purposes? Other LLM providers will return a 'usage' key with exact token counts.

dtinth

In the billing report, set “Group by” to SKU. The report will then show the exact number of tokens, the type of token, and the name of the model for each SKU. For example:

SKU: Gemini 1.5 Flash Text Output - Predictions
Service: Vertex AI
SKU ID: 34CE-11E0-D24D
Usage: 221,496 count

Here, 221,496 is the number of tokens.