Solved: Custom metadata labels for embedding models

mannetjie1 · 02-17-2025 06:45 AM

Vertex AI generative models support custom metadata labels on requests, but embedding models don't.

Does anyone know how this can be done or if this will be available in the future?

dawnberdan

Welcome to Google Cloud Community!

The ability to add custom metadata labels directly to embedding model requests in Vertex AI is not currently supported. This feature is available for generative models, but not yet for embedding models. There is no publicly available information regarding if or when this functionality will be added for embeddings.

To work around this limitation, you'll need to manage the metadata separately:

Pre-processing: Add the metadata to your input data before you generate the embeddings. This could involve adding metadata fields directly to the text you're embedding or storing the metadata in a separate file or database, linked to unique identifiers for your embeddings.

Post-processing: Generate the embeddings and store the metadata with them in a database (like BigQuery, Cloud SQL, or a NoSQL database). This method needs a shared key, such as a document ID, to keep track of which metadata goes with which embedding.

The best method depends on your use case. If you need to search or filter embeddings based on metadata, a database approach is necessary. For information on future updates, refer to this document and the release notes.
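As a rough illustration of the post-processing approach, here is a minimal sketch that stores each embedding next to its metadata, keyed by an ID. Everything in it is an assumption for illustration: `store_embeddings`, `find_by_label`, and `fake_embed` are hypothetical names, SQLite stands in for whichever database you actually use, and `fake_embed` is a placeholder you would replace with your real call to the embedding model.

```python
import json
import sqlite3
from typing import Callable, Dict, Iterable, List, Sequence, Tuple

# Hypothetical signature: takes a batch of texts, returns one vector per text.
Embedder = Callable[[Sequence[str]], List[List[float]]]


def store_embeddings(
    conn: sqlite3.Connection,
    embed: Embedder,
    items: Iterable[Tuple[str, str, Dict[str, str]]],
) -> None:
    """Embed (doc_id, text, labels) items and store vector and labels together."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS embeddings ("
        "doc_id TEXT PRIMARY KEY, vector TEXT NOT NULL, labels TEXT NOT NULL)"
    )
    items = list(items)
    vectors = embed([text for _, text, _ in items])
    if len(vectors) != len(items):
        raise ValueError("embedder returned a different number of vectors")
    rows = [
        (doc_id, json.dumps(vector), json.dumps(labels, sort_keys=True))
        for (doc_id, _, labels), vector in zip(items, vectors)
    ]
    with conn:  # commit all rows in one transaction
        conn.executemany("INSERT OR REPLACE INTO embeddings VALUES (?, ?, ?)", rows)


def find_by_label(
    conn: sqlite3.Connection, key: str, value: str
) -> List[Tuple[str, List[float]]]:
    """Return (doc_id, vector) pairs whose labels contain key == value."""
    matches = []
    query = "SELECT doc_id, vector, labels FROM embeddings ORDER BY doc_id"
    for doc_id, vector, labels in conn.execute(query):
        if json.loads(labels).get(key) == value:
            matches.append((doc_id, json.loads(vector)))
    return matches


def fake_embed(texts: Sequence[str]) -> List[List[float]]:
    """Placeholder embedder (NOT a real model); swap in your own client call."""
    return [[float(len(t)), float(t.count(" "))] for t in texts]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store_embeddings(
        conn,
        fake_embed,
        [
            ("doc-1", "hello world", {"team": "search"}),
            ("doc-2", "hi", {"team": "ads"}),
        ],
    )
    print(find_by_label(conn, "team", "search"))
```

Filtering in Python keeps the sketch portable. In a real database you would more likely store the labels in their own columns and filter in the query itself.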

Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.

mannetjie1

Thank you, @dawnberdan. We wanted to label embedding model requests for billing-tracking purposes. We'll have to see what we can do as a workaround.
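For the billing-tracking case, one possible workaround is to tally usage per label on the client side. The sketch below is only an assumption about how that could look: `LabeledUsageTracker` is a hypothetical wrapper, the wrapped `embed` callable stands in for your real embedding call, and counting characters is a guess at a useful unit, since how usage maps to actual charges depends on the model's pricing.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence

# Hypothetical signature: takes a batch of texts, returns one vector per text.
Embedder = Callable[[Sequence[str]], List[List[float]]]


class LabeledUsageTracker:
    """Wraps an embedding call and tallies usage per caller-supplied label."""

    def __init__(self, embed: Embedder) -> None:
        self._embed = embed
        self.usage: Dict[str, Dict[str, int]] = defaultdict(
            lambda: {"requests": 0, "texts": 0, "characters": 0}
        )

    def embed(self, texts: Sequence[str], label: str) -> List[List[float]]:
        vectors = self._embed(texts)
        # Record usage only after the call succeeds, so failed requests
        # are not attributed to the label.
        entry = self.usage[label]
        entry["requests"] += 1
        entry["texts"] += len(texts)
        entry["characters"] += sum(len(t) for t in texts)
        return vectors


if __name__ == "__main__":
    # Placeholder embedder (NOT a real model) so the sketch runs offline.
    tracker = LabeledUsageTracker(lambda texts: [[0.0] for _ in texts])
    tracker.embed(["first document", "second document"], label="team-search")
    tracker.embed(["ad copy"], label="team-ads")
    for label, counts in tracker.usage.items():
        print(label, counts)
```

The per-label totals are held in memory only. To use them for cost attribution you would need to persist them somewhere and reconcile them against the billing export.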