Solved: update AutoscalingMetricSpec for online prediction

tanay1 · 08-23-2024 04:20 AM

I want to update the autoscaling metric so that new replicas are triggered by the volume of incoming prediction requests (incoming traffic). I found the AutoscalingMetricSpec documentation, but it doesn't list the metricName values that can be set to scale on incoming requests. Please share any resources or documentation for this. Also, let me know what needs to be considered for autoscaling.

jaia

Hello,

Thank you for contacting the Google Cloud Community.

To ensure a faster resolution and dedicated support for your issue, I kindly request you to file a support ticket by clicking here. Our support team will prioritize your request and provide you with the assistance you need.

For individual support issues, it is best to utilize the support ticketing system. We appreciate your cooperation!

jaia

Hello,

Thank you for contacting Google Cloud Community!

Here's the relevant documentation: AutoscalingMetricSpec

Regards,
Jai Ade

tanay1

Hi, thanks for the response.

But I don't think this covers the autoscaling metrics for a "Vertex online prediction endpoint"; it is for a group of VMs. Is there any dedicated documentation for the autoscaling metrics on a Vertex prediction endpoint?

leogri

Hello,

I have the same question. What was the answer in the end?

Thanks a lot!
Leo