Hello there, I am fairly new to machine learning deployment. I have multiple models to deploy: a basic PyTorch image classifier, a TensorFlow regression model, and a basic sentiment analysis model. So far, the best approach I have seen is to use ONNX Runtime with a Triton server. Given that there are multiple models, would Docker be a good solution? Also, if I get traffic of 100,000 users/month with an ingress of 8 MB/user and an egress of 40 KB/user, what costs am I looking at? How much vCPU, RAM, and storage should I expect to need? Should I use Docker Compose to run all of the models together, or use AWS SageMaker or Azure Functions?
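Before pricing anything, it helps to turn the traffic numbers in the question into total monthly data transfer. A back-of-the-envelope sketch (assuming decimal units, i.e. 1 GB = 1000 MB, and uniform per-user traffic):

```python
# Rough monthly data-transfer estimate for the numbers in the question:
# 100,000 users/month, 8 MB ingress and 40 KB egress per user.
users_per_month = 100_000
ingress_per_user_mb = 8
egress_per_user_kb = 40

total_ingress_gb = users_per_month * ingress_per_user_mb / 1_000      # MB -> GB
total_egress_gb = users_per_month * egress_per_user_kb / 1_000_000    # KB -> GB

print(f"Monthly ingress: {total_ingress_gb:.0f} GB")
print(f"Monthly egress:  {total_egress_gb:.0f} GB")
```

That works out to roughly 800 GB of ingress and 4 GB of egress per month. On the major clouds, ingress is typically free and egress is billed per GB, so at these volumes bandwidth cost is usually negligible next to compute cost; the vCPU/RAM sizing will depend on model size and requests per second, not on users per month alone.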
I would advise using Azure Functions: Azure offers multiple function options, so it is likely to fit a multi-model setup like yours.
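For comparison with the Triton + ONNX route the question mentions, a single Triton container can serve all three models from one model repository, so a separate service per model is not required. A minimal Docker Compose sketch, assuming the models have already been exported to ONNX into a local `model_repository/` directory (image tag and paths are placeholders to adjust):

```yaml
# Hypothetical docker-compose.yml: one Triton container serving all three
# ONNX models from a shared model repository.
services:
  triton:
    image: nvcr.io/nvidia/tritonserver:24.08-py3   # pick a current tag
    command: tritonserver --model-repository=/models
    volumes:
      - ./model_repository:/models   # e.g. classifier/, regressor/, sentiment/
    ports:
      - "8000:8000"   # HTTP inference
      - "8001:8001"   # gRPC inference
      - "8002:8002"   # Prometheus metrics
```

Each model lives in its own subdirectory of the repository with a `config.pbtxt` and a versioned model file, which is how Triton distinguishes and versions the models it serves.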