Hi,
I'm using Vertex AI Vector Search (streaming update method) and having trouble indexing datapoints. My data is cleaned and Vertex AI-ready in accordance with the online guidance. The initial CSV was uploaded to BigQuery and then exported as JSON to build the vectors.
Despite successful 200 OK responses from the API, the index remains empty:
No vectors are showing in vectorCount
Queries return empty []
Even when using the exact same 384-dim vector used during indexing
It seems that no vectors are actually being ingested into the index, even though the API call completes successfully. The index is created and deployed fine, but querying returns nothing, and checking the index info confirms that the vectors simply aren't there.
Index created with dimensions: 384, DOT_PRODUCT_DISTANCE, STREAM_UPDATE
Data successfully transformed from BigQuery CSV → valid JSON
Sample rows of 10 contain properly formatted vectors (embedding_0 to embedding_383)
All restricts fields (e.g. restricts_keywords, restricts_devices) were formatted, validated, and below the permitted character length.
JSON payload looks correct — matches API spec
Used curl POST
The deployedIndexId is valid and matches what we see from
The vectors may be silently rejected, either due to formatting (e.g. an invalid datapoint ID or malformed restricts) or because the endpoint isn't fully enabled for streaming upserts. But no error message is returned and no logs appear in Logs Explorer, making this very hard to debug.
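For reference, here is a minimal sketch of how we build and sanity-check the request body locally before the curl POST. The field names follow the v1 `upsertDatapoints` REST body; the helper function, row shape, and sample values are purely illustrative:

```python
import json

EXPECTED_DIM = 384  # must match the index's configured dimensions


def build_upsert_payload(rows):
    """Turn rows into an upsertDatapoints request body.

    `rows` is an illustrative shape: an iterable of dicts with keys
    'id', 'embedding' (a list of 384 floats), and 'keywords'.
    """
    datapoints = []
    for row in rows:
        vec = row["embedding"]
        # A vector whose length differs from the index's dimensions is a
        # classic silent-rejection candidate, so fail loudly here instead.
        if len(vec) != EXPECTED_DIM:
            raise ValueError(
                f"{row['id']}: got {len(vec)} dims, expected {EXPECTED_DIM}"
            )
        datapoints.append({
            "datapointId": str(row["id"]),
            "featureVector": [float(v) for v in vec],
            "restricts": [
                {"namespace": "keywords", "allowList": row["keywords"]},
            ],
        })
    return {"datapoints": datapoints}


# Example: one well-formed row; `body` is what goes in the curl -d '...' payload.
payload = build_upsert_payload(
    [{"id": "doc-1", "embedding": [0.1] * 384, "keywords": ["mobile"]}]
)
body = json.dumps(payload)
```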
We’d really appreciate guidance on:
How to verify whether vectors were actually stored
Any known issues that cause silent failure of upsertDatapoints
How to confirm if our deployed index is truly ready for streaming ingestion
Thanks so much in advance. We've exhausted the documentation and reached out to GCP support last week without a response, so we'd love any help getting this unblocked.
Hi @KJ_24,
Welcome to Google Cloud Community!
To address your questions about Vertex AI Vector Search and streaming upserts:
1. How to Verify Whether Vectors Were Actually Stored:
Query the Index: This is the most direct method. After upserting, immediately query the index using the exact same vector (or a very similar vector) that you just indexed. Be sure your query is set up to return all results, not just the top N. If you get an empty result, that confirms the vector was not successfully stored. Remember to account for potential latency; wait a minute or two after upserting before querying.
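As a sketch of that check, a small helper can query with the just-indexed vector and look for its own datapoint ID among the neighbors. This assumes the `google-cloud-aiplatform` SDK, where `MatchingEngineIndexEndpoint.find_neighbors` returns one list of neighbors per query and each neighbor carries an `id`; the parameter names below are illustrative:

```python
def vector_was_stored(endpoint, deployed_index_id, datapoint_id, vector,
                      num_neighbors=10):
    """Return True if `datapoint_id` comes back when querying with its own vector.

    `endpoint` is expected to behave like
    aiplatform.MatchingEngineIndexEndpoint.
    """
    matches = endpoint.find_neighbors(
        deployed_index_id=deployed_index_id,
        queries=[vector],          # query with the exact vector just upserted
        num_neighbors=num_neighbors,
    )
    neighbors = matches[0] if matches else []
    return any(n.id == datapoint_id for n in neighbors)
```

With a real endpoint you would construct `aiplatform.MatchingEngineIndexEndpoint(index_endpoint_name=...)` and, as noted above, wait a minute or two after upserting before trusting a negative result.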
Get Index Statistics: The primary indicator is the index resource's indexStats.vectorsCount field. Use the Google Cloud Console or the gcloud command-line tool to fetch the index's details:
gcloud ai indexes describe INDEX_ID --region=REGION
Look for the vectorsCount field under indexStats in the output. This value should increment after successful upserts. Be aware that there can be a delay (minutes to hours) before this figure is fully updated, so it is not a real-time measurement of what has just been upserted.
Metric Monitoring: Use Cloud Monitoring to track the aiplatform.googleapis.com/index/vectors metric for your index. This shows the number of vectors in the index over time. Again, expect a delay in seeing updates.
Sampling with Larger Datasets: If you're indexing a large dataset, don't try to check every vector. Instead, create a set of "known good" vectors and their corresponding datapoint IDs. Upsert these, then query specifically for them to verify ingestion is working at all.
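That "known good" probe can be sketched as follows. The function assumes objects behaving like the SDK's `MatchingEngineIndex` (with `upsert_datapoints`) and `MatchingEngineIndexEndpoint` (with `find_neighbors`); the datapoints are built as plain dicts here for brevity, where the real SDK expects `IndexDatapoint` messages:

```python
def check_ingestion(index, endpoint, deployed_index_id, probes):
    """Upsert a small set of known probe vectors, then query each one back.

    `probes` maps datapoint_id -> 384-dim vector.
    Returns the set of probe IDs that could NOT be found afterwards;
    an empty set suggests streaming ingestion is working at all.
    """
    # Upsert the probes (dict shape is illustrative; see lead-in note).
    index.upsert_datapoints(
        datapoints=[
            {"datapoint_id": pid, "feature_vector": vec}
            for pid, vec in probes.items()
        ]
    )
    missing = set()
    for pid, vec in probes.items():
        matches = endpoint.find_neighbors(
            deployed_index_id=deployed_index_id,
            queries=[vec],
            num_neighbors=10,
        )
        neighbors = matches[0] if matches else []
        if not any(n.id == pid for n in neighbors):
            missing.add(pid)
    return missing
```

In practice, leave a short pause between the upsert and the queries to absorb indexing latency before concluding a probe is missing.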
2. Any Known Issues That Cause Silent Failure of upsertDatapoints:
Here are the most commonly reported causes of silent failures with upsertDatapoints in Vertex AI Vector Search, especially using streaming updates:
3. How to Confirm if Our Deployed Index Is Truly Ready for Streaming Ingestion:
Was this helpful? If so, please accept this answer as “Solution”. If you need additional assistance, reply here within 2 business days and I’ll be happy to help.