Hi team,
I have been looking for best practices, examples, and tutorials for deploying a local RAG-based pipeline using Ollama and LangChain (both open-source tools) on a GCP Compute Engine instance.
My question: are there any examples or tutorials that show how to set up this pipeline on Compute Engine?
Note that I do not want to use Vertex AI.
Thank you.