Building and Deploying GenAI GraphRAG Applications and AI Agents Using Google Cloud’s Vertex AI Reasoning Engine with LangChain and Neo4j
Authors:
Michael Hunger, Head of Product Innovation, Neo4j
Maruti C, Partner Engineering Lead, Google
Generative AI developers not familiar with orchestration tools and architectures often face challenges when deploying their work to production. Google’s LangChain on Vertex AI Reasoning Engine Runtime (Public preview) now offers an easy way to deploy, scale, and monitor Gen AI applications and APIs without in-depth knowledge of containerization, serverless compute or cloud configurations.
In this article, we demonstrate how to use Vertex AI Reasoning Engine to deploy a Gen AI application that answers questions about companies, people, and industries using GraphRAG on a Neo4j knowledge graph.
Neo4j is the most popular open source graph database, which allows you to store, manage, and query billions of entities and their relationships. It is used by thousands of customers and hundreds of thousands of developers worldwide to build applications that use rich domain models to gain new insights, manage organizations, produce recommendations, prevent fraud, and manage energy and goods distribution in real time.
A knowledge graph is a rich network of information representing people, processes, organizations, products and more at high fidelity. You can think of it as a digital twin of your organization (or of reality).
The importance of knowledge graphs in GenAI development cannot be overstated. Gartner considers knowledge graphs essential to the development of GenAI and has urged data leaders to “leverage the power of LLMs with the robustness of knowledge graphs to build fault-tolerant AI applications.”
GraphRAG combines two powerful technologies: retrieval-augmented generation (RAG) and knowledge graphs. RAG allows GenAI applications to access and query external datasets, while knowledge graphs make the data smarter by enriching the contextual information with entities and capturing the complex relationships between them. This enriched context enables LLMs to reason, infer, and accurately answer questions and execute tasks, anchoring their responses and actions in factual information.
Reasoning Engine is a Vertex AI service that has all the benefits of a cloud-native execution environment: security, privacy, observability, and scalability. You can productionize and scale your Gen AI Agents and applications with a simple API call, quickly turning locally-tested prototypes into enterprise-ready deployments. LangChain on Vertex AI lets you deploy your application to a Reasoning Engine managed runtime.
This deployment solution allows developers to use the Vertex AI Python SDK for setup, testing and deployment.
The workflow is as follows (a minimal class skeleton is sketched after the list):
1. Create a Python class for your Gen AI application; its constructor provides the necessary environment information and static data, restricted to serializable values (strings and numbers).
2. Implement a set_up() method that constructs the LLM, retrievers, and chains, deferring imports so their state isn't serialized.
3. Implement a query() method that takes a user question and returns the generated answer.
4. Test the class locally, then deploy it with the Vertex AI Python SDK and query the managed runtime remotely.
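A minimal skeleton of such a class might look like this (the method names are the ones the Reasoning Engine runtime calls; the class name and bodies are placeholders):

import vertexai

class MyGenAIApp:
    def __init__(self):
        # serializable configuration only (strings and numbers)
        self.model_name = "gemini-1.5-pro-001"

    def set_up(self):
        # deferred imports and chain construction happen here at runtime
        ...

    def query(self, query: str) -> str:
        # handle a single user request and return the answer
        ...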
Neo4j’s integrations with Google Cloud, combined with the extensive LangChain integrations, allow you to seamlessly incorporate Neo4j knowledge graphs into your Gen AI infrastructure. You can use LangChain and other orchestration frameworks to build and deploy RAG architectures, like GraphRAG, with the Reasoning Engine runtime.
Below is an example of a GraphRAG application on a Company News Knowledge Graph using LangChain, Neo4j, and Gemini 1.5 Pro.
Knowledge Graph Companies Dataset
The dataset we use as an example is a graph database, which we explored in a previous article, containing news articles about companies and their associations with industries and people. The database is a subset of the Diffbot knowledge graph.
The example database is publicly available with a read-only user:
https://demo.neo4jlabs.com:7473/browser/
URI: neo4j+s://demo.neo4jlabs.com
Username, password, and database name: companies
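To verify you can reach the demo database, here is a quick read-only sanity check with the neo4j Python driver, using the credentials above:

from neo4j import GraphDatabase

# connect to the public demo database with the read-only user
driver = GraphDatabase.driver("neo4j+s://demo.neo4jlabs.com", auth=("companies", "companies"))
with driver.session(database="companies") as session:
    count = session.run("MATCH (a:Article) RETURN count(a) AS articles").single()["articles"]
    print(f"articles in the demo graph: {count}")
driver.close()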
The knowledge graph was already prepared for use in a GraphRAG system by augmenting it with vector embeddings and indexes for vector and fulltext search. Here is a high-level description of the process.
The text of each article was split into manageable chunks for later processing. Each chunk was stored as an attribute of a Chunk node in the graph, connected to its Article node.
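A hypothetical sketch of that chunking step, assuming LangChain's RecursiveCharacterTextSplitter and illustrative chunk sizes:

from langchain_text_splitters import RecursiveCharacterTextSplitter

# split one article's text into overlapping chunks (sizes are assumptions)
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
article_text = "..."  # full text of one Article node
chunks = splitter.split_text(article_text)  # chunk strings to store on Chunk nodes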
Embeddings were computed for each text chunk with a Vertex AI embedding model and stored on the chunk node. A Neo4j vector index ‘news_google’ and a fulltext index ‘news_fulltext’ (for hybrid search) were also added so they can be used in the GraphRAG setup.
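The following is a hedged sketch of how that preparation could look with the neo4j driver and Vertex AI embeddings. The connection details are placeholders and the property name `embedding` is an assumption; run this only against your own writable instance, since the demo database is read-only:

from langchain_google_vertexai import VertexAIEmbeddings
from neo4j import GraphDatabase

embeddings = VertexAIEmbeddings("textembedding-gecko@001")
driver = GraphDatabase.driver("neo4j+s://<your-instance>", auth=("<user>", "<password>"))

with driver.session() as session:
    # embed each chunk's text and store the vector on the Chunk node
    rows = list(session.run(
        "MATCH (c:Chunk) WHERE c.embedding IS NULL RETURN elementId(c) AS id, c.text AS text"))
    for row in rows:
        session.run("MATCH (c:Chunk) WHERE elementId(c) = $id SET c.embedding = $vector",
                    id=row["id"], vector=embeddings.embed_query(row["text"]))
    # vector index for similarity search (textembedding-gecko vectors are 768-dimensional)
    session.run("""CREATE VECTOR INDEX news_google IF NOT EXISTS
                   FOR (c:Chunk) ON (c.embedding)
                   OPTIONS {indexConfig: {`vector.dimensions`: 768,
                                          `vector.similarity_function`: 'cosine'}}""")
    # fulltext index on chunk text for the keyword side of hybrid search
    session.run("CREATE FULLTEXT INDEX news_fulltext IF NOT EXISTS FOR (c:Chunk) ON EACH [c.text]")
driver.close()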
For our GenAI application we are using the Neo4jVector LangChain integration, which allows for advanced RAG patterns including GraphRAG.
In the application configuration, we provide both the vector and fulltext index (hybrid search) as well as an additional graph retrieval query that fetches, for each chunk, the parent article (title, sentiment, site name, summary), the organizations the article mentions with their details and industry categories, and the people associated with those organizations together with their roles.
We can test-run our retrieval query with the question "What is the news about DeepMind?" in Neo4j Browser and get the following visual graph results:
// set these parameters first
:params {project: "<your-project-id>", token: "<gcloud auth print-access-token>"}

WITH "What is the news about DeepMind?" AS question
WITH genai.vector.encode(question, "VertexAI", {projectId: $project, token: $token}) AS vector
CALL db.index.vector.queryNodes('news_google', 3, vector) YIELD node, score
WITH node AS c, score
MATCH path = (c)<-[:HAS_CHUNK]-(article:Article)-[:MENTIONS]->(org:Organization)-->(p:Person|IndustryCategory)
RETURN path
You can follow along with the steps using the Jupyter Notebook in the GitHub repository.
For our initial setup, we have to initialize our Google Cloud project and provide the GCS bucket that stores the intermediate files for deployment. For your setup, you will have to provide your own values.
# replace with your own project-id, region and bucket
PROJECT_ID = "vertex-ai-neo4j-extension"
REGION = "us-central1"
STAGING_BUCKET = "gs://neo4j-vertex-ai-extension"
from google.colab import auth
auth.authenticate_user(project_id=PROJECT_ID)
!gcloud config set project {PROJECT_ID}
We also install the required dependencies: the Vertex AI Python SDK (google-cloud-aiplatform), langchain, langchain_community, langchain_google_vertexai, and neo4j. The versions you install locally should match the ones you later pass to ReasoningEngine.create(). If you run this as a notebook, you might need to restart the runtime after the installation.
!pip install --quiet neo4j==5.20.0
!pip install --quiet langchain_google_vertexai==1.0.6
!pip install --quiet --force-reinstall langchain==0.2.6 langchain_community==0.2.6
!pip install --quiet google-cloud-aiplatform==1.57.0
!pip install --quiet google-cloud-resource-manager==1.12.3
Then we can import the dependencies and get going. We use the LangChain and Neo4j dependencies later in our session for testing and local execution.
import os

import vertexai
from vertexai.preview import reasoning_engines

vertexai.init(
    project=PROJECT_ID,
    location=REGION,
    staging_bucket=STAGING_BUCKET,
)
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, SystemMessagePromptTemplate
from langchain_google_vertexai import ChatVertexAI, VertexAIEmbeddings
from langchain_community.vectorstores import Neo4jVector
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
We will use Vertex AI Gemini 1.5 Pro as the LLM. For model parameters, we use: temperature 0.1, top-k 40, top-p 0.8.
Our LangchainCode class contains the __init__() constructor for initialization, which can only hold serializable information (strings and numbers). We pass in the prompt template and the connection information for our Neo4j database, which we take from environment variables.
URI = os.getenv('NEO4J_URI', 'neo4j+s://demo.neo4jlabs.com')
USER = os.getenv('NEO4J_USERNAME','companies')
PASSWORD = os.getenv('NEO4J_PASSWORD','companies')
DATABASE = os.getenv('NEO4J_DATABASE','companies')
class LangchainCode:
    def __init__(self):
        # model parameters
        self.model_name = "gemini-1.5-pro-001"
        self.max_output_tokens = 1024
        self.temperature = 0.1
        self.top_p = 0.8
        self.top_k = 40
        # Google Cloud project information
        self.project_id = PROJECT_ID
        self.location = REGION
        # Neo4j connection details
        self.uri = URI
        self.username = USER
        self.password = PASSWORD
        self.database = DATABASE

        self.prompt_input_variables = ["query"]
        self.prompt_template = """
            You are a venture capital assistant that provides useful answers about companies,
            their boards, financing etc., only using the information from a company database
            already provided in the context. Prefer higher-rated information in your context and
            add source links in your answers.
            Context: {context}"""
In set_up(), Gemini (as the LLM), Vertex AI embeddings, and the Neo4jVector retriever are combined into a LangChain chain. We also defer the imports to set_up(), so their state doesn't have to be serialized when we deploy the agent to Reasoning Engine.
    def set_up(self):
        # deferred imports: only needed at runtime, not at serialization time
        from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, SystemMessagePromptTemplate
        from langchain_google_vertexai import VertexAIEmbeddings, ChatVertexAI
        from langchain_community.vectorstores import Neo4jVector
        from langchain_core.output_parsers import StrOutputParser
        from langchain_core.runnables import RunnableParallel, RunnablePassthrough

        llm = ChatVertexAI(
            model_name=self.model_name,
            max_output_tokens=self.max_output_tokens,
            max_input_tokens=32000,
            temperature=self.temperature,
            top_p=self.top_p,
            top_k=self.top_k,
            project=self.project_id,
            location=self.location,
            response_validation=False,
            verbose=True,
        )
        embeddings = VertexAIEmbeddings("textembedding-gecko@001")
        self.qa_chain = self.configure_qa_rag_chain(llm, embeddings)
We use the separate method configure_qa_rag_chain() to set up the GraphRAG LangChain integration with the vector and fulltext indexes for hybrid search, augmented with a retrieval query similar to the one we used in the example before, just returning a nested structure. We retrieve the top k=5 results from the vector and fulltext indexes and re-rank them.
    def configure_qa_rag_chain(self, llm, embeddings):
        # local helper to concatenate retrieved documents into one context string
        def format_docs(docs):
            return "\n\n".join(doc.page_content for doc in docs)

        qa_prompt = ChatPromptTemplate.from_messages([
            SystemMessagePromptTemplate.from_template(self.prompt_template),
            HumanMessagePromptTemplate.from_template(
                "Question: {question}"
                "\nWhat else can you tell me about it?"),
        ])
        # Vector + knowledge graph response
        kg = Neo4jVector.from_existing_index(
            embedding=embeddings,
            url=self.uri, username=self.username, password=self.password, database=self.database,
            search_type="hybrid",
            keyword_index_name="news_fulltext",
            index_name="news_google",
            retrieval_query="""
              WITH node AS c, score
              MATCH (c)<-[:HAS_CHUNK]-(article:Article)
              WITH article, collect(distinct c.text) AS texts, avg(score) AS score
              RETURN article {.title, .sentiment, .siteName, .summary,
                  organizations: [ (article)-[:MENTIONS]->(org:Organization) |
                      org { .name, .revenue, .nbrEmployees, .isPublic, .motto, .summary,
                            orgCategories: [ (org)-[:HAS_CATEGORY]->(i) | i.name ],
                            people: [ (org)-[rel]->(p:Person) |
                                      p { .name, .summary, role: replace(type(rel), "HAS_", "") }]}],
                  texts: texts} AS text,
                  score, {source: article.siteName} AS metadata
            """,
        )
        retriever = kg.as_retriever(search_kwargs={"k": 5})

        chain = (
            {"context": retriever | format_docs, "question": RunnablePassthrough()}
            | qa_prompt
            | llm
            | StrOutputParser()
        )
        return chain
The LangChain chain we configured is then called in the minimalistic query() method with chain.invoke().
    def query(self, query):
        return self.qa_chain.invoke(query)
We can test out our newly created LangchainCode class by instantiating it locally and just running a test query.
# testing locally
lc = LangchainCode()
lc.set_up()
response = lc.query('What are the news about IBM and its acquisitions and who are the people involved?')
print(response)
We get this output:
IBM acquired several companies, including:
* **Ascential Software** in 2005 to strengthen its data integration capabilities [CHINAdaily.com.cn](https://www.chinadaily.com.cn/bizchina/2012-11/01/content_15865847.htm).
* **Cognos Inc.** in 2007 for approximately $5 billion to enhance its business intelligence and performance management software offerings [CHINAdaily.com.cn](https://www.chinadaily.com.cn/bizchina/2012-11/01/content_15865847.htm).
* **Netezza** in September 2010 to boost its data warehousing capabilities [CHINAdaily.com.cn](https://www.chinadaily.com.cn/bizchina/2012-11/01/content_15865847.htm).
* **Algorithmics** in 2011 for $387 million to improve its risk analysis capabilities and financial services [CHINAdaily.com.cn](https://www.chinadaily.com.cn/bizchina/2012-11/01/content_15865847.htm).
....
Key people involved in these acquisitions include:
* **Virginia Rometty**, former IBM president and chief executive officer, who led the acquisitions of Ascential Software, Cognos Inc., Netezza, and Algorithmics.
* **Arvind Krishna**, current CEO of IBM, who spearheaded the acquisitions of Turbonomic Inc., myInvenio, and Bluetab Solutions Group.
* **Jim Whitehurst**, former president of IBM and former CEO of Red Hat, played a key role in IBM's hybrid-cloud strategy, including the acquisition of Red Hat in 2019.
These acquisitions reflect IBM's strategic focus on expanding its capabilities in data analytics, cloud computing, artificial intelligence, and security.
Cool, it seems our code works. Now let's deploy it as a Reasoning Engine with the Google Vertex AI Python SDK. For the deployment, you provide an instance of the class (whose instantiation captures the relevant environment variables and configuration) and the list of dependencies; in our case: google-cloud-aiplatform, langchain, langchain_community, langchain_google_vertexai, and neo4j.
reasoning_engine = reasoning_engines.ReasoningEngine.create(
    LangchainCode(),
    requirements=[
        "google-cloud-aiplatform==1.57.0",
        "langchain_google_vertexai==1.0.6",
        "langchain==0.2.6",
        "langchain_community==0.2.6",
        "neo4j==5.20.0",
    ],
    display_name="Neo4j Vertex AI RE Companies",
    description="Neo4j Vertex AI RE Companies",
    sys_version="3.10",
    extra_packages=[],
)
This starts deploying the application as follows:
INFO:vertexai.reasoning_engines._reasoning_engines:Using bucket neo4j-vertex-ai-extension
INFO:vertexai.reasoning_engines._reasoning_engines:Writing to gs://neo4j-vertex-ai-extension/reasoning_engine/reasoning_engine.pkl
INFO:vertexai.reasoning_engines._reasoning_engines:Writing to gs://neo4j-vertex-ai-extension/reasoning_engine/requirements.txt
INFO:vertexai.reasoning_engines._reasoning_engines:Creating in-memory tarfile of extra_packages
INFO:vertexai.reasoning_engines._reasoning_engines:Writing to gs://neo4j-vertex-ai-extension/reasoning_engine/dependencies.tar.gz
INFO:vertexai.reasoning_engines._reasoning_engines:Creating ReasoningEngine
INFO:vertexai.reasoning_engines._reasoning_engines:Create ReasoningEngine backing LRO: projects/PPPPPPPPPPPPP/locations/us-central1/reasoningEngines/EEEEEEEEEEEEE/operations/OOOOOOOOOOO
INFO:vertexai.reasoning_engines._reasoning_engines:ReasoningEngine created. Resource name: projects/PPPPPPPPPPPPP/locations/us-central1/reasoningEngines/EEEEEEEEEEEEE
To use this ReasoningEngine in another session:
reasoning_engine = vertexai.preview.reasoning_engines.ReasoningEngine('projects/PPPPPPPPPPPPP/locations/us-central1/reasoningEngines/EEEEEEEEEEEEE')
We see the different steps that Reasoning Engine takes in the output (we masked the IDs used in the resource names).
After successful deployment, we can use the resulting reasoning_engine object via its query method, passing in our user question. If we want to access the deployed ReasoningEngine later, e.g. from a client, we can do so via its resource name.
# if needed, access the ReasoningEngine later by its resource name
# reasoning_engine = vertexai.preview.reasoning_engines.ReasoningEngine('projects/PPPPPPPPPPPPP/locations/us-central1/reasoningEngines/EEEEEEEEEEEEE')
response = reasoning_engine.query(query="Who is on the board of Siemens?")
print(response)
Which generates the following response:
The following people are on the board of Siemens:
* Jim Hagemann Snabe, Manager and chairman [source](https://www.google.com/search?q=Siemens)
* Dominika Bettman, CEO at Siemens [source](https://www.google.com/search?q=Siemens)
* Alejandro Preinfalk, CEO at Siemens [source](https://www.google.com/search?q=Siemens)
* Miguel Angel Lopez Borrego, CEO at Siemens [source](https://www.google.com/search?q=Siemens)
* Hanna Hennig, CIO at Siemens [source](https://www.google.com/search?q=Siemens)
* Barbara Humpton, CEO at Siemens Bank [source](https://www.google.com/search?q=Siemens)
* Matthias Rebellius, CEO at Siemens Corporate Technology [source](https://www.google.com/search?q=Siemens)
* Cedrik Neike, CEO at Siemens [source](https://www.google.com/search?q=Siemens)
* Ralf P. Thomas, CFO at Siemens [source](https://www.google.com/search?q=Siemens)
* Michael Sigmund, Chairman at Siemens [source](https://www.google.com/search?q=Siemens)
* Horst J. Kayser, Chairman at Siemens [source](https://www.google.com/search?q=Siemens)
* Michael Diekmann, Business person [source](https://www.google.com/search?q=Siemens)
* Norbert Reithofer, German businessman [source](https://www.google.com/search?q=Siemens)
* Werner Brandt, German manager [source](https://www.google.com/search?q=Siemens)
* Birgit Steinborn, Chairman at Siemens [source](https://www.google.com/search?q=Siemens)
Siemens is a German multinational conglomerate in the following industries: Networking Companies, Wind Energy Companies, Engine Manufacturers, Computer Hardware Companies, Electrical Equipment Manufacturers, Electronic Products Manufacturers, Software Companies, Energy Companies, Home Appliance Manufacturers, Turbine Manufacturers, Nuclear Energy Companies, Manufacturing Companies, Machine Manufacturers, Renewable Energy Companies, Tool Manufacturers. [source](https://www.google.com/search?q=Siemens)
Generative AI agents are AI-powered software entities that use a combination of a generative model’s capabilities, tools that connect to the external world, and high-level reasoning to achieve a desired end goal or state. These agents enhance enterprise processes by automating tasks, assisting human workers, and digitizing services, ultimately leading to higher productivity and improved customer satisfaction. Organizations use AI agents to achieve specific goals and more efficient business outcomes.
There are various ways to build AI agents on Google Cloud Platform, including managed offerings like the no-code agent builder platform as well as open-source frameworks like LangChain, OneTwo, and LlamaIndex.
In this blog, we focus on building and deploying agents with LangChain on Vertex AI, which involves four distinct layers, each catering to specific development needs.
The Gen AI application that we built in the previous section can be extended into an agentic application with the changes below.
# inside the LangchainCode class, e.g. at the end of set_up()
from langchain.agents import Tool

tools = [
    Tool(
        name='Knowledge Base',
        func=self.qa_chain.invoke,
        description=('use this tool when answering specific news queries '
                     'to get more information about the topic')
    )
]
remote_app = reasoning_engines.ReasoningEngine.create(
    reasoning_engines.LangchainAgent(   # prebuilt LangChain agent template
        model=model,                    # e.g. "gemini-1.5-pro-001"
        tools=tools,
    ),
    requirements=[
        "google-cloud-aiplatform[reasoningengine,langchain]",
    ],
    display_name="Neo4j Vertex AI RE Companies",
)
remote_app
remote_app = reasoning_engines.ReasoningEngine("projects/PROJECT_ID/locations/LOCATION/reasoningEngines/REASONING_ENGINE_ID")
response = remote_app.query(input="What is the news about DeepMind?")
The above steps let you use Reasoning Engine to deploy a Gen AI application that is agentic in nature. In this example we wrapped our Python QA chain as a tool, but you can use any LangChain tool, including function calling, and build multi-agent applications. We will dive deeper into this aspect in the future.
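For example, a plain Python function can be handed to the prebuilt LangchainAgent template as a tool; the function below is our own illustration, not part of the original application:

# illustrative only: a Python function as a tool; its name, docstring,
# and type hints guide the model's function calling
def get_company_news(company: str) -> str:
    """Look up recent news about a company in the knowledge graph."""
    lc = LangchainCode()
    lc.set_up()
    return lc.query(f"What is the news about {company}?")

agent = reasoning_engines.LangchainAgent(
    model="gemini-1.5-pro-001",
    tools=[get_company_news],
)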
You can also debug and optimize your agents by enabling tracing in Reasoning Engine. Here is a notebook that explains how to use Cloud Trace to explore the tracing data and get insights.
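With the prebuilt LangchainAgent template, tracing can be switched on at creation time; a minimal sketch, assuming the template's enable_tracing flag:

agent = reasoning_engines.LangchainAgent(
    model="gemini-1.5-pro-001",
    enable_tracing=True,  # export traces to Cloud Trace for debugging
)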
The Neo4j LangChain integrations offer features for many use cases, such as vector search with contextual retrieval, natural language to graph query language (Cypher) generation, knowledge graph construction, conversational memory, and LangServe templates for advanced RAG patterns.
Try it out yourself: enable Vertex AI on your Google Cloud account and get going. Here are the repository with the Jupyter Notebook and the documentation for the Neo4j LangChain integrations, the Vertex AI Python SDK, and Reasoning Engine (with LangChain).
Michael: I've been working with Vertex AI Extensions and Reasoning Engine for quite some months now - so I really enjoyed the Cloud Next ‘24 talk by Julia Wiesinger, Kristopher Overholt, and JC Escalante on building and deploying agents with Reasoning Engine and LangChain on Vertex AI (resources). The Google Vertex AI team ran a webinar on June 6 on "Building and Deploying AI Agents with LangChain on Vertex AI" (blog post) that goes deeper into the agentic aspects of Reasoning Engine.
Please give us feedback on this article, on the Google Cloud documentation, or on GitHub issues. Let us know if you run into any issues using Vertex AI or want to see new features at goo.gle/vertex-ai-issues. And of course, if you found this useful, share it with your friends and colleagues.