Bronze 1
Since 03-17-2024

My Stats

  • 5 Posts
  • 0 Solutions
  • 0 Likes given
  • 6 Likes received

Badges Yash2384 Earned


Recent Activity

I was looking into the code:

# Set docker and quantization for AWQ quantized models
VLLM_DOCKER_URI = "us-docker.pkg.dev/vertex-ai/vertex-vision-model-garden-dockers/pytorch-vllm-serve:20231127_0916_RC00"
quantized_model_id = "TheBloke/Llama-2-70B-chat...
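
For context, a minimal sketch of how a container URI and quantized model id like these are typically wired together on Vertex AI, assuming the google-cloud-aiplatform SDK and the Model Garden vLLM serving container; the container args, port, and routes below are assumptions that can differ by container version, and the model id is a placeholder for the truncated one above:

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders

VLLM_DOCKER_URI = (
    "us-docker.pkg.dev/vertex-ai/vertex-vision-model-garden-dockers/"
    "pytorch-vllm-serve:20231127_0916_RC00"
)
QUANTIZED_MODEL_ID = "<huggingface-model-id>"  # placeholder for the AWQ model id

model = aiplatform.Model.upload(
    display_name="llama2-awq-vllm",
    serving_container_image_uri=VLLM_DOCKER_URI,
    serving_container_args=[
        "--host=0.0.0.0",
        "--port=7080",
        f"--model={QUANTIZED_MODEL_ID}",
        "--quantization=awq",   # tell vLLM the weights are AWQ-quantized
    ],
    serving_container_ports=[7080],
    serving_container_predict_route="/generate",  # assumed route for this container
    serving_container_health_route="/ping",       # assumed route for this container
)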
I am using this library to make a prediction request to the model deployed on Vertex AI. I am getting a timeout exception; I'm not sure whether I need to increase the timeout and, if so, up to what value. Also, what is the default value? I can find nothing in the do...
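
For reference, a hedged sketch of passing an explicit per-call timeout with the lower-level aiplatform_v1 prediction client; the endpoint path and request payload below are placeholders, and the payload shape depends on the serving container:

from google.cloud import aiplatform_v1
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value

ENDPOINT = "projects/123456789/locations/us-central1/endpoints/987654321"  # placeholder

client = aiplatform_v1.PredictionServiceClient(
    client_options={"api_endpoint": "us-central1-aiplatform.googleapis.com"}
)

# Payload shape is an assumption for a vLLM-style server.
instance = json_format.ParseDict({"prompt": "Hello", "max_tokens": 128}, Value())

# timeout is in seconds per RPC; large models can exceed the client's default
# deadline, so it can be raised explicitly on the call.
response = client.predict(
    endpoint=ENDPOINT,
    instances=[instance],
    timeout=600.0,
)
print(response.predictions)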
I've integrated an LLM model into the Model Registry using a custom Docker container. The model is hosted correctly, and I can consistently execute prediction requests. However, occasionally I encounter a '503 Service Unavailable' error. This issue be...
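
One common mitigation for intermittent 503s is client-side retry with exponential backoff. A minimal sketch using google.api_core's retry helpers; the endpoint path, payload, and policy values are illustrative placeholders:

from google.api_core import exceptions, retry
from google.cloud import aiplatform_v1
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value

# Retry only on 503 ServiceUnavailable, with exponential backoff.
retry_on_503 = retry.Retry(
    predicate=retry.if_exception_type(exceptions.ServiceUnavailable),
    initial=1.0,     # seconds before the first retry
    maximum=30.0,    # cap on the backoff interval
    multiplier=2.0,  # exponential growth factor
)

client = aiplatform_v1.PredictionServiceClient(
    client_options={"api_endpoint": "us-central1-aiplatform.googleapis.com"}
)
instance = json_format.ParseDict({"prompt": "Hello"}, Value())  # assumed payload shape

response = client.predict(
    endpoint="projects/123456789/locations/us-central1/endpoints/987654321",  # placeholder
    instances=[instance],
    retry=retry_on_503,
)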
I've deployed a container hosting a customized model in Vertex AI. I encounter connection timeout exceptions, particularly when there are 5 or more concurrent requests. I'm exploring an alternative approach that is cost-effective and capable of autosc...
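
A hedged sketch of deploying with autoscaling via the high-level SDK; the machine type, accelerator, and replica counts below are placeholders, not sizing advice:

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders

model = aiplatform.Model(
    "projects/my-project/locations/us-central1/models/1234567890"  # placeholder
)

endpoint = model.deploy(
    machine_type="g2-standard-12",           # assumed GPU machine type
    accelerator_type="NVIDIA_L4",
    accelerator_count=1,
    min_replica_count=1,                     # keep at least one replica warm
    max_replica_count=4,                     # scale out under concurrent load
    autoscaling_target_cpu_utilization=60,   # illustrative scaling target
)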