Building and Deploying GenAI GraphRAG Applications and AI Agents Using Google Cloud’s Vertex AI Reasoning Engine with LangChain and Neo4j
Authors:
Michael Hunger, Head of Product Innovation, Neo4j
Maruti C, Partner Engineering Lead, Google
Generative AI developers not familiar with orchestration tools and architectures often face challenges when deploying their work to production. Google’s LangChain on Vertex AI Reasoning Engine Runtime (Public preview) now offers an easy way to deploy, scale, and monitor Gen AI applications and APIs without in-depth knowledge of containerization, serverless compute or cloud configurations.
In this article, we demonstrate how to use Vertex AI Reasoning Engine to deploy a Gen AI application that answers questions about companies, people, and industries using GraphRAG on a Neo4j knowledge graph.
Neo4j is the most popular open source graph database, which allows you to store, manage, and query billions of entities and their relationships. It is used by thousands of customers and hundreds of thousands of developers worldwide to build applications that use rich domain models to gain new insights, manage organizations, produce recommendations, prevent fraud, and manage energy and goods distribution in real time.
A knowledge graph is a rich network of information representing people, processes, organizations, products and more at high fidelity. You can think of it as a digital twin of your organization (or of reality).
The importance of knowledge graphs in GenAI development cannot be overstated. Gartner considers knowledge graphs essential to the development of GenAI and has urged data leaders to “leverage the power of LLMs with the robustness of knowledge graphs to build fault-tolerant AI applications.”
GraphRAG combines two powerful technologies: retrieval-augmented generation (RAG) and knowledge graphs. RAG allows GenAI applications to access and query external datasets, while knowledge graphs make the data smarter by enriching the contextual information with entities and capturing the complex relationships between them. This enriched context enables LLMs to reason, infer, and accurately answer questions and execute tasks, anchoring their responses and actions in factual information.
Reasoning Engine is a Vertex AI service that has all the benefits of a cloud-native execution environment: security, privacy, observability, and scalability. You can productionize and scale your Gen AI Agents and applications with a simple API call, quickly turning locally-tested prototypes into enterprise-ready deployments. LangChain on Vertex AI lets you deploy your application to a Reasoning Engine managed runtime.
This deployment solution allows developers to use the Vertex AI Python SDK for setup, testing and deployment.
The workflow is as follows (a minimal class skeleton is sketched after the list):
1. Create a Python class for your Gen AI application; its constructor provides the necessary environment information and static data, restricted to serializable values (strings and numbers).
2. Implement a set_up() method that constructs the LLM, retrievers, and chains, deferring imports so their state isn't serialized.
3. Implement a query() method that takes a user question and returns the generated answer.
4. Test the class locally, then deploy it with the Vertex AI Python SDK and query the managed runtime remotely.
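A minimal skeleton of such a class might look like this (the method names are the ones the Reasoning Engine runtime calls; the class name and bodies are placeholders):

import vertexai

class MyGenAIApp:
    def __init__(self):
        # serializable configuration only (strings and numbers)
        self.model_name = "gemini-1.5-pro-001"

    def set_up(self):
        # deferred imports and chain construction happen here at runtime
        ...

    def query(self, query: str) -> str:
        # handle a single user request and return the answer
        ...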
Neo4j’s integrations with Google Cloud, combined with the extensive LangChain integrations, allow you to seamlessly incorporate Neo4j knowledge graphs into your Gen AI infrastructure. You can use LangChain and other orchestration frameworks to build and deploy RAG architectures, like GraphRAG, with the Reasoning Engine runtime.
Below is an example of a GraphRAG application on a Company News Knowledge Graph using LangChain, Neo4j, and Gemini 1.5 Pro.
Knowledge Graph Companies Dataset
The dataset we use as an example is a graph database, which we explored in a previous article, containing news articles about companies and their associations with industries and people. The database is a subset of the Diffbot knowledge graph.
The example database is publicly available with a read-only user:
https://demo.neo4jlabs.com:7473/browser/
URI: neo4j+s://demo.neo4jlabs.com
Username, password, and database name: companies
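To verify you can reach the demo database, here is a quick read-only sanity check with the neo4j Python driver, using the credentials above:

from neo4j import GraphDatabase

# connect to the public demo database with the read-only user
driver = GraphDatabase.driver("neo4j+s://demo.neo4jlabs.com", auth=("companies", "companies"))
with driver.session(database="companies") as session:
    count = session.run("MATCH (a:Article) RETURN count(a) AS articles").single()["articles"]
    print(f"articles in the demo graph: {count}")
driver.close()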
The knowledge graph was already prepared for use in a GraphRAG system by augmenting it with vector embeddings and indexes for vector and fulltext search. Here is a high-level description of the process.
The text of each article was split into manageable chunks for later processing. Each chunk was stored as an attribute of a Chunk node in the graph, connected to its Article node.
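A hypothetical sketch of that chunking step, assuming LangChain's RecursiveCharacterTextSplitter and illustrative chunk sizes:

from langchain_text_splitters import RecursiveCharacterTextSplitter

# split one article's text into overlapping chunks (sizes are assumptions)
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
article_text = "..."  # full text of one Article node
chunks = splitter.split_text(article_text)  # chunk strings to store on Chunk nodes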
Embeddings were computed for each text chunk with a Vertex AI embedding model and stored on the chunk node. A Neo4j vector index ‘news_google’ and a fulltext index ‘news_fulltext’ (for hybrid search) were also added so they can be used in the GraphRAG setup.
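The following is a hedged sketch of how that preparation could look with the neo4j driver and Vertex AI embeddings. The connection details are placeholders and the property name `embedding` is an assumption; run this only against your own writable instance, since the demo database is read-only:

from langchain_google_vertexai import VertexAIEmbeddings
from neo4j import GraphDatabase

embeddings = VertexAIEmbeddings("textembedding-gecko@001")
driver = GraphDatabase.driver("neo4j+s://<your-instance>", auth=("<user>", "<password>"))

with driver.session() as session:
    # embed each chunk's text and store the vector on the Chunk node
    rows = list(session.run(
        "MATCH (c:Chunk) WHERE c.embedding IS NULL RETURN elementId(c) AS id, c.text AS text"))
    for row in rows:
        session.run("MATCH (c:Chunk) WHERE elementId(c) = $id SET c.embedding = $vector",
                    id=row["id"], vector=embeddings.embed_query(row["text"]))
    # vector index for similarity search (textembedding-gecko vectors are 768-dimensional)
    session.run("""CREATE VECTOR INDEX news_google IF NOT EXISTS
                   FOR (c:Chunk) ON (c.embedding)
                   OPTIONS {indexConfig: {`vector.dimensions`: 768,
                                          `vector.similarity_function`: 'cosine'}}""")
    # fulltext index on chunk text for the keyword side of hybrid search
    session.run("CREATE FULLTEXT INDEX news_fulltext IF NOT EXISTS FOR (c:Chunk) ON EACH [c.text]")
driver.close()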
For our GenAI application we are using the Neo4jVector LangChain integration, which allows for advanced RAG patterns including GraphRAG.
In the application configuration, we provide both the vector and fulltext index (hybrid search) as well as an additional graph retrieval query that fetches, for each chunk, the parent article (title, sentiment, site name, summary), the organizations the article mentions with their details and industry categories, and the people associated with those organizations together with their roles.
We can test-run our retrieval query with the question "What is the news about DeepMind?" in Neo4j Browser and get the following visual graph results:
// set these parameters first
:params {project: "<your-project-id>", token: "<gcloud auth print-access-token>"}

WITH "What is the news about DeepMind?" AS question
WITH genai.vector.encode(question, "VertexAI", {projectId: $project, token: $token}) AS vector
CALL db.index.vector.queryNodes('news_google', 3, vector) YIELD node, score
WITH node AS c, score
MATCH path = (c)<-[:HAS_CHUNK]-(article:Article)-[:MENTIONS]->(org:Organization)-->(p:Person|IndustryCategory)
RETURN path
You can follow along with the steps using the Jupyter Notebook in the GitHub repository.
For our initial setup, we have to initialize our Google Cloud project and provide the GCS bucket that stores the intermediate files for deployment. For your setup, you will have to provide your own values.
# replace with your own project-id, region and bucket
PROJECT_ID = "vertex-ai-neo4j-extension"
REGION = "us-central1"
STAGING_BUCKET = "gs://neo4j-vertex-ai-extension"
from google.colab import auth
auth.authenticate_user(project_id=PROJECT_ID)
!gcloud config set project {PROJECT_ID}
We also install the required dependencies: the Vertex AI Python SDK (google-cloud-aiplatform), langchain, langchain_community, langchain_google_vertexai, and neo4j. The versions you install locally should match the ones you later pass to ReasoningEngine.create(). If you run this as a notebook, you might need to restart the runtime after the installation.
!pip install --quiet neo4j==5.20.0
!pip install --quiet langchain_google_vertexai==1.0.6
!pip install --quiet --force-reinstall langchain==0.2.6 langchain_community==0.2.6
!pip install --quiet google-cloud-aiplatform==1.57.0
!pip install --quiet google-cloud-resource-manager==1.12.3
Then we can import the dependencies and get going. We use the LangChain and Neo4j dependencies later in our session for testing and local execution.
import os

import vertexai
from vertexai.preview import reasoning_engines

vertexai.init(
    project=PROJECT_ID,
    location=REGION,
    staging_bucket=STAGING_BUCKET,
)
from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, SystemMessagePromptTemplate
from langchain_google_vertexai import ChatVertexAI, VertexAIEmbeddings
from langchain_community.vectorstores import Neo4jVector
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
We will use Vertex AI Gemini 1.5 Pro as the LLM. For model parameters, we use: temperature 0.1, top-k 40, top-p 0.8.
Our LangchainCode class contains the __init__() constructor for initialization, which can only hold serializable information (strings and numbers). We pass in the prompt template and the connection information for our Neo4j database, which we take from environment variables.
URI = os.getenv('NEO4J_URI', 'neo4j+s://demo.neo4jlabs.com')
USER = os.getenv('NEO4J_USERNAME','companies')
PASSWORD = os.getenv('NEO4J_PASSWORD','companies')
DATABASE = os.getenv('NEO4J_DATABASE','companies')
class LangchainCode:
    def __init__(self):
        # model parameters
        self.model_name = "gemini-1.5-pro-001"
        self.max_output_tokens = 1024
        self.temperature = 0.1
        self.top_p = 0.8
        self.top_k = 40
        # Google Cloud project information
        self.project_id = PROJECT_ID
        self.location = REGION
        # Neo4j connection details
        self.uri = URI
        self.username = USER
        self.password = PASSWORD
        self.database = DATABASE

        self.prompt_input_variables = ["query"]
        self.prompt_template = """
            You are a venture capital assistant that provides useful answers about companies,
            their boards, financing etc., only using the information from a company database
            already provided in the context. Prefer higher-rated information in your context and
            add source links in your answers.
            Context: {context}"""
In set_up(), Gemini (as the LLM), Vertex AI embeddings, and the Neo4jVector retriever are combined into a LangChain chain. We also defer the imports to set_up(), so their state doesn't have to be serialized when we deploy the agent to Reasoning Engine.
    def set_up(self):
        # deferred imports: only needed at runtime, not at serialization time
        from langchain.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, SystemMessagePromptTemplate
        from langchain_google_vertexai import VertexAIEmbeddings, ChatVertexAI
        from langchain_community.vectorstores import Neo4jVector
        from langchain_core.output_parsers import StrOutputParser
        from langchain_core.runnables import RunnableParallel, RunnablePassthrough

        llm = ChatVertexAI(
            model_name=self.model_name,
            max_output_tokens=self.max_output_tokens,
            max_input_tokens=32000,
            temperature=self.temperature,
            top_p=self.top_p,
            top_k=self.top_k,
            project=self.project_id,
            location=self.location,
            response_validation=False,
            verbose=True,
        )
        embeddings = VertexAIEmbeddings("textembedding-gecko@001")
        self.qa_chain = self.configure_qa_rag_chain(llm, embeddings)
We use the separate method configure_qa_rag_chain() to set up the GraphRAG LangChain integration with the vector and fulltext indexes for hybrid search, augmented with a retrieval query similar to the one we used in the example before, just returning a nested structure. We retrieve the top k=5 results from the vector and fulltext indexes and re-rank them.
    def configure_qa_rag_chain(self, llm, embeddings):
        # local helper to concatenate retrieved documents into one context string
        def format_docs(docs):
            return "\n\n".join(doc.page_content for doc in docs)

        qa_prompt = ChatPromptTemplate.from_messages([
            SystemMessagePromptTemplate.from_template(self.prompt_template),
            HumanMessagePromptTemplate.from_template(
                "Question: {question}"
                "\nWhat else can you tell me about it?"),
        ])
        # Vector + knowledge graph response
        kg = Neo4jVector.from_existing_index(
            embedding=embeddings,
            url=self.uri, username=self.username, password=self.password, database=self.database,
            search_type="hybrid",
            keyword_index_name="news_fulltext",
            index_name="news_google",
            retrieval_query="""
              WITH node AS c, score
              MATCH (c)<-[:HAS_CHUNK]-(article:Article)
              WITH article, collect(distinct c.text) AS texts, avg(score) AS score
              RETURN article {.title, .sentiment, .siteName, .summary,
                  organizations: [ (article)-[:MENTIONS]->(org:Organization) |
                      org { .name, .revenue, .nbrEmployees, .isPublic, .motto, .summary,
                            orgCategories: [ (org)-[:HAS_CATEGORY]->(i) | i.name ],
                            people: [ (org)-[rel]->(p:Person) |
                                      p { .name, .summary, role: replace(type(rel), "HAS_", "") }]}],
                  texts: texts} AS text,
                  score, {source: article.siteName} AS metadata
            """,
        )
        retriever = kg.as_retriever(search_kwargs={"k": 5})

        chain = (
            {"context": retriever | format_docs, "question": RunnablePassthrough()}
            | qa_prompt
            | llm
            | StrOutputParser()
        )
        return chain
The LangChain chain we configured is then called in the minimalistic query() method with chain.invoke().
    def query(self, query):
        return self.qa_chain.invoke(query)
We can test out our newly created LangchainCode class by instantiating it locally and just running a test query.
# testing locally
lc = LangchainCode()
lc.set_up()
response = lc.query('What are the news about IBM and its acquisitions and who are the people involved?')
print(response)
We get this output:
IBM acquired several companies, including:
* **Ascential Software** in 2005 to strengthen its data integration capabilities [CHINAdaily.com.cn](https://www.chinadaily.com.cn/bizchina/2012-11/01/content_15865847.htm).
* **Cognos Inc.** in 2007 for approximately $5 billion to enhance its business intelligence and performance management software offerings [CHINAdaily.com.cn](https://www.chinadaily.com.cn/bizchina/2012-11/01/content_15865847.htm).
* **Netezza** in September 2010 to boost its data warehousing capabilities [CHINAdaily.com.cn](https://www.chinadaily.com.cn/bizchina/2012-11/01/content_15865847.htm).
* **Algorithmics** in 2011 for $387 million to improve its risk analysis capabilities and financial services [CHINAdaily.com.cn](https://www.chinadaily.com.cn/bizchina/2012-11/01/content_15865847.htm).
....
Key people involved in these acquisitions include:
* **Virginia Rometty**, former IBM president and chief executive officer, who led the acquisitions of Ascential Software, Cognos Inc., Netezza, and Algorithmics.
* **Arvind Krishna**, current CEO of IBM, who spearheaded the acquisitions of Turbonomic Inc., myInvenio, and Bluetab Solutions Group.
* **Jim Whitehurst**, former president of IBM and former CEO of Red Hat, played a key role in IBM's hybrid-cloud strategy, including the acquisition of Red Hat in 2019.
These acquisitions reflect IBM's strategic focus on expanding its capabilities in data analytics, cloud computing, artificial intelligence, and security.
Cool, it seems our code works. Now let's deploy it as a Reasoning Engine with the Google Vertex AI Python SDK. For the deployment, you provide an instance of the class (whose instantiation captures the relevant environment variables and configuration) and the list of dependencies; in our case: google-cloud-aiplatform, langchain, langchain_community, langchain_google_vertexai, and neo4j.
reasoning_engine = reasoning_engines.ReasoningEngine.create(
    LangchainCode(),
    requirements=[
        "google-cloud-aiplatform==1.57.0",
        "langchain_google_vertexai==1.0.6",
        "langchain==0.2.6",
        "langchain_community==0.2.6",
        "neo4j==5.20.0",
    ],
    display_name="Neo4j Vertex AI RE Companies",
    description="Neo4j Vertex AI RE Companies",
    sys_version="3.10",
    extra_packages=[],
)
This starts deploying the application as follows:
INFO:vertexai.reasoning_engines._reasoning_engines:Using bucket neo4j-vertex-ai-extension
INFO:vertexai.reasoning_engines._reasoning_engines:Writing to gs://neo4j-vertex-ai-extension/reasoning_engine/reasoning_engine.pkl
INFO:vertexai.reasoning_engines._reasoning_engines:Writing to gs://neo4j-vertex-ai-extension/reasoning_engine/requirements.txt
INFO:vertexai.reasoning_engines._reasoning_engines:Creating in-memory tarfile of extra_packages
INFO:vertexai.reasoning_engines._reasoning_engines:Writing to gs://neo4j-vertex-ai-extension/reasoning_engine/dependencies.tar.gz
INFO:vertexai.reasoning_engines._reasoning_engines:Creating ReasoningEngine
INFO:vertexai.reasoning_engines._reasoning_engines:Create ReasoningEngine backing LRO: projects/PPPPPPPPPPPPP/locations/us-central1/reasoningEngines/EEEEEEEEEEEEE/operations/OOOOOOOOOOO
INFO:vertexai.reasoning_engines._reasoning_engines:ReasoningEngine created. Resource name: projects/PPPPPPPPPPPPP/locations/us-central1/reasoningEngines/EEEEEEEEEEEEE
To use this ReasoningEngine in another session:
reasoning_engine = vertexai.preview.reasoning_engines.ReasoningEngine('projects/PPPPPPPPPPPPP/locations/us-central1/reasoningEngines/EEEEEEEEEEEEE')
We see the different steps that Reasoning Engine takes in the output (we masked the IDs used in the resource names).
After successful deployment, we can use the resulting reasoning_engine object via its query method, passing in our user question. If we want to access the deployed ReasoningEngine later, e.g. from a client, we can do so via its resource name.
# if needed, access the ReasoningEngine later by its resource name
# reasoning_engine = vertexai.preview.reasoning_engines.ReasoningEngine('projects/PPPPPPPPPPPPP/locations/us-central1/reasoningEngines/EEEEEEEEEEEEE')
response = reasoning_engine.query(query="Who is on the board of Siemens?")
print(response)
Which generates the following response:
The following people are on the board of Siemens:
* Jim Hagemann Snabe, Manager and chairman [source](https://www.google.com/search?q=Siemens)
* Dominika Bettman, CEO at Siemens [source](https://www.google.com/search?q=Siemens)
* Alejandro Preinfalk, CEO at Siemens [source](https://www.google.com/search?q=Siemens)
* Miguel Angel Lopez Borrego, CEO at Siemens [source](https://www.google.com/search?q=Siemens)
* Hanna Hennig, CIO at Siemens [source](https://www.google.com/search?q=Siemens)
* Barbara Humpton, CEO at Siemens Bank [source](https://www.google.com/search?q=Siemens)
* Matthias Rebellius, CEO at Siemens Corporate Technology [source](https://www.google.com/search?q=Siemens)
* Cedrik Neike, CEO at Siemens [source](https://www.google.com/search?q=Siemens)
* Ralf P. Thomas, CFO at Siemens [source](https://www.google.com/search?q=Siemens)
* Michael Sigmund, Chairman at Siemens [source](https://www.google.com/search?q=Siemens)
* Horst J. Kayser, Chairman at Siemens [source](https://www.google.com/search?q=Siemens)
* Michael Diekmann, Business person [source](https://www.google.com/search?q=Siemens)
* Norbert Reithofer, German businessman [source](https://www.google.com/search?q=Siemens)
* Werner Brandt, German manager [source](https://www.google.com/search?q=Siemens)
* Birgit Steinborn, Chairman at Siemens [source](https://www.google.com/search?q=Siemens)
Siemens is a German multinational conglomerate in the following industries: Networking Companies, Wind Energy Companies, Engine Manufacturers, Computer Hardware Companies, Electrical Equipment Manufacturers, Electronic Products Manufacturers, Software Companies, Energy Companies, Home Appliance Manufacturers, Turbine Manufacturers, Nuclear Energy Companies, Manufacturing Companies, Machine Manufacturers, Renewable Energy Companies, Tool Manufacturers. [source](https://www.google.com/search?q=Siemens)
Generative AI agents are AI-powered software entities that use a combination of a generative model’s capabilities, tools that connect to the external world, and high-level reasoning to achieve a desired end goal or state. These agents enhance enterprise processes by automating tasks, assisting human workers, and digitizing services, ultimately leading to higher productivity and improved customer satisfaction. Organizations use AI agents to achieve specific goals and more efficient business outcomes.
There are various ways to build AI agents on Google Cloud Platform, including managed offerings like the no-code agent builder platform as well as open-source frameworks like LangChain, OneTwo, and LlamaIndex.
In this blog, we focus on building and deploying agents with LangChain on Vertex AI, which involves four distinct layers, each catering to specific development needs.
The Gen AI application that we built in the previous section can be extended into an agentic application with the changes below.
# inside the LangchainCode class, e.g. at the end of set_up()
from langchain.agents import Tool

tools = [
    Tool(
        name='Knowledge Base',
        func=self.qa_chain.invoke,
        description=('use this tool when answering specific news queries '
                     'to get more information about the topic')
    )
]
remote_app = reasoning_engines.ReasoningEngine.create(
    reasoning_engines.LangchainAgent(   # prebuilt LangChain agent template
        model=model,                    # e.g. "gemini-1.5-pro-001"
        tools=tools,
    ),
    requirements=[
        "google-cloud-aiplatform[reasoningengine,langchain]",
    ],
    display_name="Neo4j Vertex AI RE Companies",
)
remote_app
remote_app = reasoning_engines.ReasoningEngine("projects/PROJECT_ID/locations/LOCATION/reasoningEngines/REASONING_ENGINE_ID")
response = remote_app.query(input="What is the news about DeepMind?")
The above steps let you use Reasoning Engine to deploy a Gen AI application that is agentic in nature. In this example we wrapped our Python QA chain as a tool, but you can use any LangChain tool, including function calling, and build multi-agent applications. We will dive deeper into this aspect in the future.
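For example, a plain Python function can be handed to the prebuilt LangchainAgent template as a tool; the function below is our own illustration, not part of the original application:

# illustrative only: a Python function as a tool; its name, docstring,
# and type hints guide the model's function calling
def get_company_news(company: str) -> str:
    """Look up recent news about a company in the knowledge graph."""
    lc = LangchainCode()
    lc.set_up()
    return lc.query(f"What is the news about {company}?")

agent = reasoning_engines.LangchainAgent(
    model="gemini-1.5-pro-001",
    tools=[get_company_news],
)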
You can also debug and optimize your agents by enabling tracing in Reasoning Engine. Here is a notebook that explains how to use Cloud Trace to explore the tracing data and get insights.
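With the prebuilt LangchainAgent template, tracing can be switched on at creation time; a minimal sketch, assuming the template's enable_tracing flag:

agent = reasoning_engines.LangchainAgent(
    model="gemini-1.5-pro-001",
    enable_tracing=True,  # export traces to Cloud Trace for debugging
)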
The Neo4j LangChain integrations offer features for many use cases, such as vector search with contextual retrieval, natural language to graph query language (Cypher) generation, knowledge graph construction, conversational memory, and LangServe templates for advanced RAG patterns.
Try it out yourself: enable Vertex AI on your Google Cloud account and get going. Here are the repository with the Jupyter Notebook and the documentation for the Neo4j LangChain integrations, the Vertex AI Python SDK, and Reasoning Engine (with LangChain).
Michael: I've been working with Vertex AI Extensions and Reasoning Engine for quite some months now - so I really enjoyed the Cloud Next ‘24 talk by Julia Wiesinger, Kristopher Overholt, and JC Escalante on building and deploying agents with Reasoning Engine and LangChain on Vertex AI (resources). The Google Vertex AI team ran a webinar on June 6 on "Building and Deploying AI Agents with LangChain on Vertex AI" (blog post) that goes deeper into the agentic aspects of Reasoning Engine.
Please give us feedback on this article, on the Google Cloud documentation, or on GitHub issues. Let us know if you run into any issues using Vertex AI or want to see new features at goo.gle/vertex-ai-issues. And of course, if you found this useful, share it with your friends and colleagues.