
Build RAG on Compute Engine using Ollama

Hi team,

 

I have been looking for best practices, examples, and tutorials for deploying a local RAG pipeline built with Ollama and LangChain (open-source tools) onto a GCP Compute Engine instance.

My question is: are there any examples or tutorials that show how to run this pipeline on Compute Engine?

Note that I do not want to use Vertex AI.

Thank you. 


Hello Aj1985,

Welcome to Google Cloud Community!

For tutorials and documentation on running a local RAG setup with Ollama, see Setup Ollama on GCP. The Ollama with Open WebUI on Ubuntu 24.04 guide covers two deployment options:

  • Console deployment
    You manually provide the VM specification in the Google Cloud console.
    (screenshots of the console configuration omitted)
  • Command-line deployment
    Command-line deployment uses Terraform, HashiCorp's infrastructure-as-code configuration language. Learn more

    Check out the documentation on how to deploy VM products through the CLI by using Terraform.
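Whichever path you choose, the resulting VM runs an Ollama server, which by default exposes a REST API on port 11434. As a rough sketch (the `llama3` model name and the localhost endpoint are assumptions, not details from the guide), you could call that API from Python using only the standard library:

```python
import json
import urllib.request

# Default Ollama endpoint; from another machine you would use the VM's IP
# instead of localhost (and a firewall rule allowing the port).
OLLAMA_URL = "http://localhost:11434"

def build_generate_request(prompt: str, model: str = "llama3") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint.

    `stream=False` asks for a single JSON response instead of chunks.
    """
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, model: str = "llama3") -> str:
    """Send a one-shot generation request and return the response text."""
    body = json.dumps(build_generate_request(prompt, model)).encode()
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Requires a running Ollama server with the model pulled:
# print(generate("Why use RAG?"))
```

This is a minimal sketch; in practice you would likely let LangChain's Ollama integration handle these calls for you.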

For examples and guides on building a RAG app with LangChain, here is a blog post on Gen AI apps: Deploy LangChain on Cloud Run with LangServe.
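Before wiring in LangChain, it may help to see the retrieve-then-generate shape that any RAG pipeline follows. Below is a toy, dependency-free sketch: the keyword-overlap retriever and the sample documents are illustrative stand-ins for a real embedding store and for the generation call to Ollama.

```python
# Toy RAG shape: score documents against the query, keep the top-k,
# and assemble an augmented prompt for the LLM (e.g. Ollama).

def score(query: str, doc: str) -> int:
    """Count query words that appear in the document (toy relevance score)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents with the highest overlap score."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Assemble the augmented prompt the LLM would receive."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Compute Engine provides configurable virtual machines on Google Cloud.",
    "Ollama serves open LLMs locally over a REST API on port 11434.",
    "Cloud Run runs stateless containers.",
]
print(build_prompt("What port does Ollama use?", docs))
```

In a real deployment, LangChain would replace the toy retriever with an embedding-backed vector store, and the assembled prompt would be sent to the Ollama model running on your Compute Engine VM.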

I hope the above information is helpful.