Neo4j is a popular graph database that is well suited to storing and querying connected data. Google Kubernetes Engine (GKE) is a managed Kubernetes service for deploying and managing containerized applications. In this article, we'll walk through deploying a standalone Neo4j instance on a GKE cluster using the Neo4j Helm chart.
What is Google Kubernetes Engine (GKE)?
GKE provides a managed environment for deploying, managing, and scaling your containerized applications using Kubernetes. Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications. GKE simplifies setting up and maintaining a Kubernetes cluster by handling the underlying infrastructure for you.
What are Helm Charts?
Helm is a package manager for Kubernetes that simplifies the deployment and management of applications. Helm charts are packages of pre-configured Kubernetes resource templates that can be deployed with a single command. Helm's templating engine lets you customize a deployment by supplying your own values, typically in a values file.
Prerequisites
Add the Neo4j Helm chart repository.
helm repo add neo4j https://helm.neo4j.com/neo4j
Update the repository:
helm repo update
Check that the 5.26.1 charts are available:
helm search repo neo4j/ --versions | grep 5.26.1
The output should be similar to the following:
neo4j/neo4j                    5.26.1  5.26.1  Neo4j is the world's leading graph database
neo4j/neo4j-admin              5.26.1  5.26.1  Neo4j is the world's leading graph database
neo4j/neo4j-headless-service   5.26.1  -       Neo4j is the world's leading graph database
neo4j/neo4j-persistent-volume  5.26.1  -       Sets up persistent disks suitable for a Neo4j H...
neo4j/neo4j-reverse-proxy      5.26.1  5.26.1  Sets up an http server and a reverse proxy for ...
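Because grep treats each dot in 5.26.1 as "any character", the filter above can also match other version strings. For a stricter check in a script, something like the following sketch may help. The function name has_chart_version is hypothetical (it is not part of Helm), and the sketch assumes the default helm search column order of NAME, CHART VERSION, APP VERSION, DESCRIPTION.

```shell
# Hypothetical helper: read `helm search repo ... --versions` output on stdin
# and succeed only if the given chart is listed with exactly the given version.
# Assumes the default column order: NAME, CHART VERSION, APP VERSION, DESCRIPTION.
has_chart_version() {
  # Appending "" forces awk to compare the fields as strings, not numbers.
  awk -v chart="$1" -v version="$2" \
    '$1 == chart && ($2 "") == (version "") { found = 1 } END { exit !found }'
}
```

A possible use: helm search repo neo4j/ --versions | has_chart_version neo4j/neo4j 5.26.1 && echo "chart available"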
This quickstart guide walks through the basics of deploying a Neo4j standalone instance to a cloud or a local Kubernetes cluster using the Neo4j Helm chart.
In the GCP console, open Cloud Shell and click Authorise when prompted.
All the shell commands in this guide assume that the GCP Project, compute zone, and region to use have been set using the CLOUDSDK_CORE_PROJECT, CLOUDSDK_COMPUTE_ZONE, and CLOUDSDK_COMPUTE_REGION environment variables, for example:
export CLOUDSDK_CORE_PROJECT="my-neo4j-project"
export CLOUDSDK_COMPUTE_ZONE="europe-west2-a"
export CLOUDSDK_COMPUTE_REGION="europe-west2"
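Since every later command depends on these three variables, it can be useful to confirm they are set before going further. The following is a minimal sketch; check_gcloud_env is a hypothetical helper name, not a gcloud command.

```shell
# Hypothetical helper (not part of gcloud): report any of the three
# CLOUDSDK_* variables used in this guide that is unset or empty.
check_gcloud_env() {
  missing=0
  for name in CLOUDSDK_CORE_PROJECT CLOUDSDK_COMPUTE_ZONE CLOUDSDK_COMPUTE_REGION; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Running check_gcloud_env && echo "environment OK" prints the confirmation only when all three variables have values. Otherwise the missing names are written to standard error.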
If you do not have a Google Kubernetes Engine (GKE) cluster, you can create a single-node one using:
gcloud container clusters create my-neo4j-gke-cluster --num-nodes=1 --machine-type "e2-standard-2"
e2-standard-2 is the minimum machine type required to run the examples in this quickstart guide on GKE.
Configure kubectl to use your GKE cluster using:
gcloud container clusters get-credentials my-neo4j-gke-cluster
The output should be similar to the following:
Fetching cluster endpoint and auth data.
kubeconfig entry generated for my-neo4j-gke-cluster.
Create a YAML values file for your standalone instance, for example my-neo4j.values.yaml, using the GKE example below:
neo4j:
  name: my-standalone
  resources:
    cpu: "0.5"
    memory: "2Gi"

  # Uncomment to set the initial password
  #password: "my-initial-password"

  # Uncomment to use enterprise edition
  #edition: "enterprise"
  #acceptLicenseAgreement: "yes"

volumes:
  data:
    mode: "dynamic"
    dynamic:
      # In GKE;
      # * premium-rwo provisions SSD disks (recommended)
      # * standard-rwo provisions balanced SSD-backed disks
      # * standard provisions HDD disks
      storageClassName: premium-rwo
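In Cloud Shell, one way to create this file without opening an editor is a here-document. The sketch below writes only the uncommented settings from the example. The function name write_values_file is hypothetical, and the default file name is taken from the helm install command used later in this guide.

```shell
# Hypothetical helper: write the uncommented values from this guide to a file.
# The default name matches the file passed to `helm install -f` later on.
write_values_file() {
  target="${1:-my-neo4j.values.yaml}"
  # The quoted 'EOF' keeps the shell from expanding anything inside the YAML.
  cat > "$target" <<'EOF'
neo4j:
  name: my-standalone
  resources:
    cpu: "0.5"
    memory: "2Gi"
volumes:
  data:
    mode: "dynamic"
    dynamic:
      storageClassName: premium-rwo
EOF
}
```

Calling write_values_file with no argument creates my-neo4j.values.yaml in the current directory. Pass a path to write somewhere else.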
Install Neo4j using the my-neo4j.values.yaml file created in the previous step and the neo4j/neo4j Helm chart.
Create a neo4j namespace and configure it to be used in the current context:
kubectl create namespace neo4j
kubectl config set-context --current --namespace=neo4j
Install the Neo4j standalone server:
helm install my-neo4j-release neo4j/neo4j --namespace neo4j -f my-neo4j.values.yaml
Example output
LAST DEPLOYED: Wed Oct 26 15:19:17 2022
NAMESPACE: neo4j
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
Thank you for installing neo4j.

Your release "my-neo4j-release" has been installed in namespace "neo4j".

The neo4j user's password has been set to "my-password".
To view the progress of the rollout try:

  $ kubectl --namespace "neo4j" rollout status --watch --timeout=600s statefulset/my-neo4j-release

Once rollout is complete you can log in to Neo4j at "neo4j://my-neo4j-release.neo4j.svc.cluster.local:7687". Try:

  $ kubectl run --rm -it --namespace "neo4j" --image "neo4j:5.1.0" cypher-shell \
     -- cypher-shell -a "neo4j://my-neo4j-release.neo4j.svc.cluster.local:7687" -u neo4j -p "my-password"

Graphs are everywhere!
Run the kubectl rollout command provided in the output of helm install to watch Neo4j's rollout until it is complete:
kubectl rollout status --watch --timeout=600s statefulset/my-neo4j-release
Check that the StatefulSet is OK:
kubectl get statefulsets
NAME               READY   AGE
my-neo4j-release   1/1     2m11s
Check that the pod is Running:
kubectl get pods
NAME                 READY   STATUS    RESTARTS   AGE
my-neo4j-release-0   1/1     Running   0          16m
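If you want to automate this check rather than read the table by eye, a small filter over the command's output is one option. This is only a sketch: all_pods_running is a hypothetical helper, and it assumes the default kubectl get pods columns shown above, with STATUS as the third column.

```shell
# Hypothetical helper: read `kubectl get pods` table output on stdin and
# succeed only if at least one pod is listed and every pod's STATUS is Running.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
all_pods_running() {
  awk 'NR > 1 { seen = 1; if ($3 != "Running") bad = 1 }
       END { exit (seen && !bad) ? 0 : 1 }'
}
```

For example: kubectl get pods | all_pods_running && echo "all pods are Running"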
Check that the pod logs look OK:
kubectl exec my-neo4j-release-0 -- tail -n50 /logs/neo4j.log
2022-10-26 14:19:51.728+0000 INFO  Command expansion is explicitly enabled for configuration
2022-10-26 14:19:51.733+0000 WARN  Unrecognized setting. No declared setting with name: server.panic.shutdown_on_panic.
2022-10-26 14:19:51.749+0000 INFO  Starting...
2022-10-26 14:19:53.062+0000 INFO  This instance is ServerId{cb9f2f3c} (cb9f2f3c-cd70-40b1-ac8e-13d9c4d26173)
2022-10-26 14:19:54.970+0000 INFO  ======== Neo4j 5.1.0 ========
2022-10-26 14:19:59.528+0000 INFO  Bolt enabled on 0.0.0.0:7687.
2022-10-26 14:20:01.523+0000 INFO  Remote interface available at http://localhost:7474/
2022-10-26 14:20:01.530+0000 INFO  id: EF772BAFBDCD3C4921D00A5707C88D6EDE514915DBCC7134E8704AFA15DC19C8
2022-10-26 14:20:01.530+0000 INFO  name: system
2022-10-26 14:20:01.531+0000 INFO  creationDate: 2022-10-26T14:19:56.631Z
2022-10-26 14:20:01.531+0000 INFO  Started.
Check that the services look OK:
kubectl get services
NAME                                TYPE           CLUSTER-IP    EXTERNAL-IP      PORT(S)                         AGE
service/my-neo4j-release-lb-neo4j   LoadBalancer   10.36.5.34    34.105.179.172   7474:30288/TCP,7687:30584/TCP   14m
service/kubernetes                  ClusterIP      10.36.0.1     <none>           443/TCP                         22h
service/my-neo4j-release            ClusterIP      10.36.11.18   <none>           7687/TCP,7474/TCP               14m
service/my-neo4j-release-admin      ClusterIP      10.36.3.238   <none>           6362/TCP,7687/TCP,7474/TCP      14m
Use the external IP of the LoadBalancer service to access Neo4j from an application outside the Kubernetes cluster. For more information, see "Applications accessing Neo4j from outside Kubernetes" in the Neo4j documentation.
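To pick the external IP out of the service listing in a script, a filter like the one below could be used. It is a sketch under stated assumptions: loadbalancer_ips is a hypothetical helper, and it relies on the default kubectl get services column order of NAME, TYPE, CLUSTER-IP, EXTERNAL-IP, PORT(S), AGE.

```shell
# Hypothetical helper: read `kubectl get services` table output on stdin and
# print the EXTERNAL-IP of every LoadBalancer service that has one assigned.
# Assumes the default columns: NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE.
loadbalancer_ips() {
  # Skip the header row and any load balancer still waiting for an address.
  awk 'NR > 1 && $2 == "LoadBalancer" && $4 != "<pending>" { print $4 }'
}
```

For example, kubectl get services | loadbalancer_ips prints the address to use. Based on the ports listed for the LoadBalancer service, the HTTP interface would then be reachable on port 7474 of that address and Bolt on port 7687.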
Conclusion
Deploying Neo4j on GKE provides a scalable and reliable platform for running your graph database. By using Helm, you can simplify the deployment process and manage your Neo4j deployment with ease.