
Adding data to Vector Search Index

 

I am familiar with vector DBs like Weaviate and Pinecone, and I am trying to migrate my vectors to GCP. I have uploaded data to a bucket, created an index from it, and deployed an endpoint using Python, following the Vector Search Quickstart documentation. Querying the index works fine.

Now I want to add new data to the same index (a batch upsert, or the equivalent of what Weaviate or Pinecone offer). As far as I understand, this can be done in the GUI with a batch update under the Edit Index option, but I can't figure out how to achieve it with the Python client.

I apologise if this is a stupid question. I have been trying to figure it out for a while now, and the docs are confusing me more than helping.

I looked at "update and rebuild Index" and updateIndex, but I can't tell whether they rebuild the entire index from scratch with the new data or upsert it like Weaviate or Pinecone do.

PS: Use case:
I want to be able to query my embeddings with text such as 'a dog' and get back the best matches.

 
1 ACCEPTED SOLUTION

Hey @someOne2, I'm not sure about the naming in Python, but in .NET there is a class IndexServiceClient that has all the methods for managing an index. It has a method UpsertDatapoints that processes an UpsertDatapointsRequest, which represents a batch of vectors.


5 REPLIES

Hi @someOne2

Welcome and thank you for reaching out to our community.

I believe neither of these operations does batch data processing (upserting).

  • Update index updates the metadata associated with the index, such as the description, labels, or settings, while the actual data content (vectors and document IDs) remains untouched.
  • Rebuild index rebuilds the entire index from scratch: it deletes all existing data in the index, then recreates it using the data provided in the request.

I hope I was able to provide you with useful insights.

 

Hey @lsolatorio, thanks for replying.

Is there an equivalent of upserting (preferably batch upserts, like in Pinecone and Weaviate) in Google AI Platform?
Use case: we have multiple collections with millions of records/vectors each, which we frequently need to add to, update, or delete from in bulk.

Hey @someOne2, I'm not sure about the naming in Python, but in .NET there is a class IndexServiceClient that has all the methods for managing an index. It has a method UpsertDatapoints that processes an UpsertDatapointsRequest, which represents a batch of vectors.
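For reference, a rough Python sketch of the same idea, assuming the `google-cloud-aiplatform` package, whose `aiplatform_v1` module exposes `IndexServiceClient`, `UpsertDatapointsRequest`, and `IndexDatapoint`. The helper names (`batched`, `to_datapoint_dicts`, `upsert_vectors`), the batch size of 100, and the shape of the `vectors` dict are illustrative assumptions, not part of the library. As far as I know, upserting datapoints this way only works on an index created with streaming updates enabled, so treat this as a starting point rather than a drop-in solution.

```python
from itertools import islice
from typing import Dict, Iterator, List, Sequence


def batched(items: Sequence, size: int) -> Iterator[list]:
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    it = iter(items)
    while chunk := list(islice(it, size)):
        yield chunk


def to_datapoint_dicts(vectors: Dict[str, Sequence[float]]) -> List[dict]:
    """Turn {id: vector} into keyword dicts matching IndexDatapoint fields."""
    return [
        {"datapoint_id": dp_id, "feature_vector": [float(x) for x in vec]}
        for dp_id, vec in vectors.items()
    ]


def upsert_vectors(
    index_name: str,      # e.g. "projects/<project>/locations/<region>/indexes/<id>"
    api_endpoint: str,    # e.g. "<region>-aiplatform.googleapis.com"
    vectors: Dict[str, Sequence[float]],
    batch_size: int = 100,  # hypothetical value, tune to your quota
) -> int:
    """Upsert vectors in batches; returns the number of datapoints sent."""
    # Imported lazily so the helpers above work without the SDK installed.
    from google.cloud import aiplatform_v1

    client = aiplatform_v1.IndexServiceClient(
        client_options={"api_endpoint": api_endpoint}
    )
    sent = 0
    for chunk in batched(to_datapoint_dicts(vectors), batch_size):
        request = aiplatform_v1.UpsertDatapointsRequest(
            index=index_name,
            datapoints=[aiplatform_v1.IndexDatapoint(**dp) for dp in chunk],
        )
        client.upsert_datapoints(request=request)
        sent += len(chunk)
    return sent
```

Datapoints whose IDs already exist in the index are overwritten and new IDs are added, which is the upsert behaviour asked about above.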

Hi @lsolatorio, I guess you somehow missed the batch update option, which upserts exactly the vectors you provide.

Hey @esthereklund, can you elaborate a bit on what you mean by batch update here? And thanks for the answer above.
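To illustrate what a batch update might look like from Python, here is a minimal sketch. It assumes the high-level SDK class `aiplatform.MatchingEngineIndex` and its `update_embeddings` method, and the JSON-lines input format with `id` and `embedding` fields. The helper names, file names, and bucket paths are hypothetical, and uploading the delta file to Cloud Storage (for example with `gsutil cp`) is left out.

```python
import json
from pathlib import Path
from typing import Dict, Sequence


def write_delta_file(vectors: Dict[str, Sequence[float]], out_path: str) -> int:
    """Write one JSON object per line: {"id": ..., "embedding": [...]}."""
    path = Path(out_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as fh:
        for dp_id, vec in vectors.items():
            record = {"id": dp_id, "embedding": [float(x) for x in vec]}
            fh.write(json.dumps(record) + "\n")
    return len(vectors)


def start_batch_update(index_name: str, delta_gcs_dir: str):
    """Apply the files under `delta_gcs_dir` on top of the existing index.

    `delta_gcs_dir` is assumed to be a gs:// directory that already holds
    the uploaded delta file(s). Assumes aiplatform.init(project=..., location=...)
    was called, or that `index_name` is a full resource name.
    """
    # Imported lazily so write_delta_file works without the SDK installed.
    from google.cloud import aiplatform

    index = aiplatform.MatchingEngineIndex(index_name=index_name)
    # is_complete_overwrite=False keeps existing datapoints and applies the
    # delta on top; True would replace the whole index contents.
    return index.update_embeddings(
        contents_delta_uri=delta_gcs_dir,
        is_complete_overwrite=False,
    )
```

A typical flow would be `write_delta_file(new_vectors, "delta/embeddings.json")`, an upload of that file to a bucket directory, and then `start_batch_update(index_name, "gs://<bucket>/delta/")`. Batch updates run as a long operation, so this path suits large periodic loads better than frequent small writes.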