Hi team,
I've been experimenting with Vertex AI function calling and the context caching preview feature. Following the documentation and Python examples (codelabs), everything works as expected. However, when implementing the same flow in Node.js, I've noticed a major discrepancy, particularly with context caching.
Looking forward to your thoughts!
Hi @lovee93,
Welcome to Google Cloud Community!
The cached_content.create call fails in Node.js because the calculated token count falls below the required minimum for context caching. This is likely caused by a discrepancy in token counts between Python and Node.js, potentially due to differing tokenization algorithms in their respective Vertex AI SDKs.
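One way to verify where the count lands is to ask the Node.js SDK how many tokens the content resolves to before attempting to create the cache. Below is a minimal sketch, assuming the @google-cloud/vertexai package; the project ID, model name, and GCS URI are placeholders:

```javascript
// Sketch: check the token count Node.js reports for the content you
// intend to cache. Project, model, and file URI below are placeholders.
const { VertexAI } = require('@google-cloud/vertexai');

const vertexAI = new VertexAI({ project: 'your-project-id', location: 'us-central1' });
const model = vertexAI.getGenerativeModel({ model: 'gemini-1.5-pro-001' });

async function checkTokenCount() {
  const response = await model.countTokens({
    contents: [{
      role: 'user',
      parts: [{
        fileData: { fileUri: 'gs://your-bucket/your-doc.pdf', mimeType: 'application/pdf' },
      }],
    }],
  });
  // This total is what must meet the context caching minimum
  // (32,768 tokens per the preview documentation).
  console.log('totalTokens:', response.totalTokens);
}

checkTokenCount();
```

If the total printed here is below the documented minimum, the create call is expected to fail.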
Here are some potential reasons and suggestions you might consider to address the issue:
You can also refer to the following documents for more details:
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.
Thank you so much for replying and suggesting things to try. I am currently using the latest version of the SDK (1.9.3) and have tried a minimal example. What confuses me is that when I use just 1 PDF file, the token count is higher, while when I use 2 PDF files, the token count is lower. Attaching screenshots for your reference:
Testing with 1 PDF file, the cached content is 23258 tokens.
Testing with 2 PDF files, the cached content is 19904 tokens.
Here's the repository with this example: https://github.com/Lovee93/context-caching-bug
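For quick reference, the comparison boils down to something like the sketch below. It's not the exact cache-creation call from the repo (the screenshots show the cache's reported token count); it just runs the same inputs through countTokens. Project, model, and GCS URIs are placeholders:

```javascript
// Sketch of the 1-PDF vs 2-PDF comparison; bucket and file names are placeholders.
const { VertexAI } = require('@google-cloud/vertexai');

const vertexAI = new VertexAI({ project: 'your-project-id', location: 'us-central1' });
const model = vertexAI.getGenerativeModel({ model: 'gemini-1.5-pro-001' });

const pdfPart = (uri) => ({ fileData: { fileUri: uri, mimeType: 'application/pdf' } });

async function compareCounts() {
  const onePdf = await model.countTokens({
    contents: [{ role: 'user', parts: [pdfPart('gs://your-bucket/doc1.pdf')] }],
  });
  const twoPdfs = await model.countTokens({
    contents: [{
      role: 'user',
      parts: [pdfPart('gs://your-bucket/doc1.pdf'), pdfPart('gs://your-bucket/doc2.pdf')],
    }],
  });
  // Expectation: adding a second PDF should only increase the count,
  // yet the cached content reports 23258 tokens for one file and 19904 for two.
  console.log('1 PDF :', onePdf.totalTokens);
  console.log('2 PDFs:', twoPdfs.totalTokens);
}

compareCounts();
```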
Thank you!