How is Cost Calculated for GCP Vertex AI Vector Search?

trk · 07-08-2024 04:38 AM

Hi everyone,

I'm working on a production-scale Retrieval-Augmented Generation (RAG) application using Google Cloud Platform (GCP), and I'm trying to understand how the costs are calculated for Vertex AI Vector Search (MatchingEngine/Datastore). Here are the specific details of my use case and questions:

### Context and Requirements

1. **Data Size**:
- I have 20GB of data for ingestion.

2. **Data Ingestion**:
- How is the cost calculated for ingesting this data?

3. **Index Hosting**:
- Is the service serverless (pay-as-you-go) or fully managed? What are the hosting costs associated with maintaining the index?

4. **Retriever**:
- If my number of queries per second (QPS) is 10, that translates to a total of 1,000 queries per day or 30,000 queries per month. How is the cost calculated for querying data from the vector search?

### Pricing Calculation Details

From what I understand, Vector Search pricing is determined by:
- The size of your data.
- The amount of queries per second (QPS) you want to run.
- The number of nodes you use.

For my specific use case:
- **Data Size**: 20GB.
- **Queries**: 10 QPS (Total 30,000 queries per month).
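A quick sanity check on the query figures above, as a minimal sketch: a sustained 10 QPS and a total of 30,000 queries per month describe very different loads, so it helps to convert between the two before estimating cost. The function names and the 30-day month are assumptions for illustration only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def queries_per_day(qps: float) -> float:
    """Total queries in one day at a sustained rate of `qps`."""
    return qps * SECONDS_PER_DAY


def average_qps(queries_per_month: float, days_per_month: int = 30) -> float:
    """Average queries per second implied by a monthly total (30-day month assumed)."""
    return queries_per_month / (days_per_month * SECONDS_PER_DAY)


# Sustained 10 QPS, around the clock
sustained_daily = queries_per_day(10)

# Average rate implied by 30,000 queries per month
implied_qps = average_qps(30_000)

print(f"10 QPS sustained = {sustained_daily:,.0f} queries/day")
print(f"30,000 queries/month = {implied_qps:.4f} QPS on average")
```

If 10 QPS is the expected peak rather than the average, the peak is likely the figure that matters when sizing nodes and replicas, while the monthly total describes overall volume.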

Could someone help me with a detailed cost breakdown based on the above parameters along with serving cost? Any insights or examples from those who have used this service would be highly appreciated!

Thank you in advance for your help!

ruthseki

Hi @trk,

Welcome to Google Cloud Community!

Based on this documentation, you are right that:

Vector Search pricing is determined by the size of your data, the amount of queries per second (QPS) you want to run, and the number of nodes you use.

In addition, I’m not seeing any straightforward way to calculate the exact cost of ingesting 20GB of data into Vertex AI Vector Search.

To get your estimated serving cost, you first need to calculate your total data size. Your data size is the number of embeddings/vectors * the number of dimensions * 4 bytes per dimension. Once you have the data size, you can calculate the serving cost and the building cost. The serving cost plus the building cost equals your total monthly cost.
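As a rough sketch of the data size formula above (vectors * dimensions * 4 bytes), here is the arithmetic in Python. The function names are illustrative, and the 100M vectors at 128 dimensions are just example inputs, not a recommendation. Note that this size comes from the embeddings, so the size of the raw source data alone is not enough to compute it.

```python
BYTES_PER_DIMENSION = 4   # 4 bytes per dimension, per the formula above
BYTES_PER_GIB = 2 ** 30   # 1 GiB = 1,073,741,824 bytes


def index_size_bytes(num_vectors: int, num_dimensions: int) -> int:
    """Data size in bytes: vectors * dimensions * 4 bytes per dimension."""
    return num_vectors * num_dimensions * BYTES_PER_DIMENSION


def bytes_to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return num_bytes / BYTES_PER_GIB


# Example inputs: 100M embeddings with 128 dimensions each
size_bytes = index_size_bytes(100_000_000, 128)
size_gib = bytes_to_gib(size_bytes)

print(f"{size_bytes:,} bytes = {size_gib:.2f} GiB")
```

With these inputs the index data comes to 51.2 billion bytes, or roughly 47.68 GiB.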

Serving cost: # replicas/shard * # shards (~data size/shard size) * hourly cost * 730 hours

Building cost: data size (in GiB) * $3/GiB * # of updates/month
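The two formulas above can be sketched in Python as follows. This is only an illustration under stated assumptions: the hourly node cost and shard size used in the example are placeholders, not real prices (actual values depend on machine type, shard size, and region), and rounding the shard count up with `ceil` is my reading of "~data size/shard size". Only the $3/GiB build price and the 730 hours per month come from the formulas above.

```python
import math

HOURS_PER_MONTH = 730        # from the serving cost formula above
BUILD_PRICE_PER_GIB = 3.0    # $3/GiB, from the building cost formula above


def serving_cost(data_size_gib: float, shard_size_gib: float,
                 replicas_per_shard: int, hourly_node_cost: float) -> float:
    """# replicas/shard * # shards * hourly cost * 730 hours."""
    # Assumption: shard count is data size / shard size, rounded up, at least 1.
    num_shards = max(1, math.ceil(data_size_gib / shard_size_gib))
    return replicas_per_shard * num_shards * hourly_node_cost * HOURS_PER_MONTH


def building_cost(data_size_gib: float, updates_per_month: int) -> float:
    """data size (GiB) * $3/GiB * # of updates/month."""
    return data_size_gib * BUILD_PRICE_PER_GIB * updates_per_month


def monthly_total(data_size_gib: float, shard_size_gib: float,
                  replicas_per_shard: int, hourly_node_cost: float,
                  updates_per_month: int) -> float:
    """Serving cost plus building cost."""
    return (serving_cost(data_size_gib, shard_size_gib,
                         replicas_per_shard, hourly_node_cost)
            + building_cost(data_size_gib, updates_per_month))


# Placeholder inputs: 20 GiB of index data, a hypothetical 20 GiB shard size,
# 2 replicas per shard, a made-up $1.00/hour node price, 1 full build per month.
example_total = monthly_total(
    data_size_gib=20, shard_size_gib=20,
    replicas_per_shard=2, hourly_node_cost=1.00, updates_per_month=1,
)
print(f"Estimated monthly cost: ${example_total:,.2f}")
```

In this sketch QPS does not appear as a direct input. It would affect cost indirectly, through the number of replicas and the machine type needed to sustain the query load.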

Here are sample Vector Search pricing examples, including QPS:

Vertex Search pricing examples.png

While I don’t have visibility on how this has been calculated, I would highly suggest connecting with our sales team to get a custom quote for your application. You can also use the Pricing calculator to generate a cost estimate based on your projected usage.

Vertex AI Vector Search uses a pay-as-you-go pricing model: you only pay for the services you use. Based on the pricing calculator, I’m not seeing any index hosting costs associated with maintaining the index.

Instead, I’m seeing this Vertex AI Vector Search configuration, which includes:

Number of vectors
Index size
Number of vector dimensions
Shard size
Region
Machine type
Number of replicas per shard
Index update type
Streaming updates per month

Here is the sample cost estimate summary from Pricing Calculator for reference:

Sample Pricing Calculator.png

Cost Estimate Summary.png

I hope the above information is helpful.

MiguelEspino00

Hi!

I'm using the Python SDK for Vertex AI Vector Search. Is there any way to see the number of vectors I have in my index?

Thanks in advance!

cdaniele

Hi team,
The pricing calculator does not have any reference to the QPS parameter, so I'm wondering how QPS affects the actual cost?

Thanks in advance,
Carmelo

hardikitis

What's the cost for 100M embeddings with 128 dimensions at 100 QPS? Will 100M embeddings need 2 nodes of e2-highmem-16, or can it run on a single node?
