Hi team,
I have been looking for best practices, examples, and tutorials for deploying a local RAG-based pipeline using Ollama and LangChain (open-source tools) on GCP Compute Engine.
My question: are there any examples or tutorials that show how to build this pipeline on Compute Engine?
Note that I do not want to use Vertex AI.
Thank you.
Hello Aj1985,
Welcome to Google Cloud Community!
For tutorials and documentation on running a local RAG setup with Ollama, see Setup Ollama on GCP. The Ollama with Open WebUI on Ubuntu 24.04 guide covers two deployment approaches.
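As a rough sanity check once Ollama is installed on the Compute Engine VM (this is just a sketch, not taken from the linked guide, and the model name is only an example), you can query the Ollama REST API, which listens on localhost:11434 by default:

```python
# Verify the local Ollama daemon on the VM responds before wiring up LangChain.
# Assumes `ollama serve` is running and a model has been pulled, e.g. `ollama pull llama3`.
import requests

# List the models available to the local daemon.
tags = requests.get("http://localhost:11434/api/tags", timeout=10).json()
print([m["name"] for m in tags.get("models", [])])

# One non-streaming generation to confirm the model loads and answers.
resp = requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3", "prompt": "Say hello in one sentence.", "stream": False},
    timeout=120,
)
print(resp.json()["response"])
```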
For examples and guides on building with LangChain, here is a blog post about Gen AI apps: Deploy LangChain on Cloud Run with LangServe.
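To give you an idea of how the pieces fit together on the VM itself, here is a minimal local RAG sketch with LangChain and Ollama. The package names, model names, and sample documents are assumptions based on current LangChain integrations, so adjust them to your own versions and data:

```python
# Minimal local RAG pipeline: Ollama serves the embeddings and the chat model,
# LangChain handles indexing, retrieval, and prompting. Everything runs on the VM.
# pip install langchain langchain-ollama langchain-community faiss-cpu
from langchain_community.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_ollama import ChatOllama, OllamaEmbeddings

# 1. Embed a few example documents and index them in a local FAISS store.
docs = [
    "Compute Engine VMs can run Ollama as a background service.",
    "LangChain chains a retriever, a prompt, and an LLM into one pipeline.",
]
vectorstore = FAISS.from_texts(docs, embedding=OllamaEmbeddings(model="nomic-embed-text"))
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})

# 2. Prompt that grounds the answer in the retrieved context.
prompt = ChatPromptTemplate.from_template(
    "Answer the question using only this context:\n{context}\n\nQuestion: {question}"
)
llm = ChatOllama(model="llama3", base_url="http://localhost:11434")


def format_docs(results):
    # Join the retrieved document texts into one context string.
    return "\n\n".join(doc.page_content for doc in results)


# 3. Retrieval -> prompt -> local LLM -> plain-text answer.
chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

print(chain.invoke("How do Ollama and LangChain fit together here?"))
```

Since both the vector store and the models live on the instance, nothing in this sketch depends on Vertex AI or any managed service.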
I hope the above information is helpful.