
Read/Write access to GCE Persistent Disk from multiple Pods in GKE Autopilot mode

I'm trying to configure a GKE Autopilot cluster to allow multiple Pods to read and write to the same GCE Persistent Disk. I'm aware that ReadWriteMany access mode is not directly supported for GCE Persistent Disks, and I'm still learning about the best practices for shared storage in Kubernetes.

However, I'm encountering an issue where Pods fail to start after node upgrades or scaling events. The Pods get stuck in a pending state with an "already mounted" error, which seems to indicate a problem with the Persistent Disk being accessed by multiple Pods.

My goal is to have a shared volume using a GCE Persistent Disk that can be accessed by multiple Pods simultaneously. Could you please provide some guidance on how to achieve this in GKE Autopilot, considering the limitations of GCE Persistent Disks? I'm open to exploring alternative solutions or workarounds if necessary.

I appreciate any help you can offer in clarifying my understanding and resolving this issue.

Thank you.


Hi, @Nevling.

If you want to use a volume with ReadWriteMany in a Kubernetes cluster, you need a storage solution that supports multiple read-write access (such as NFS). In Google Cloud, the most commonly used option for this is Filestore, which supports ReadWriteMany (as far as I know).

You can follow these steps for this scenario:
1. Go to the Filestore section in the Google Cloud Console or use gcloud to create a Filestore instance, for example:

gcloud filestore instances create <filestore-instance-name> \
  --project=<project-id> \
  --zone=<zone> \
  --tier=STANDARD \
  --file-share=name="<share-file-name>",capacity=<capacity-size> \
  --network=name="<network-name>"

2. Obtain the NFS IP address of the Filestore instance from the console or using:

gcloud filestore instances describe <filestore-instance-name> --zone=<zone>

3. Create a PersistentVolume (PV) definition that uses the NFS protocol to point to the Filestore instance

apiVersion: v1
kind: PersistentVolume
metadata:
  name: <filestore-pv-name>
spec:
  capacity:
    storage: <capacity-size>
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    path: /<share-file-name>
    server: <NFS_SERVER_IP>
  storageClassName: ""

4. Create a PersistentVolumeClaim (PVC) to request storage from the PV you just created

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: <filestore-pvc-name>
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: <capacity-size>
  storageClassName: ""

5. Create a pod that uses the PVC to mount the volume and read/write to it

apiVersion: v1
kind: Pod
metadata:
  name: <pod-name>
spec:
  containers:
    - name: <container-name>
      image: <container-image>
      volumeMounts:
        - mountPath: <app-mount-path>
          name: <filestore-volume-name>
  volumes:
    - name: <filestore-volume-name>
      persistentVolumeClaim:
        claimName: <filestore-pvc-name>
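
Once the Pod is running, you can verify the shared access by writing from one Pod and reading from a second Pod that mounts the same PVC. The manifest file names below are just examples of wherever you saved the YAML above:

kubectl apply -f filestore-pv.yaml -f filestore-pvc.yaml -f pod.yaml

# Write a file from the first Pod ...
kubectl exec <pod-name> -- sh -c 'echo "hello from pod one" > <app-mount-path>/test.txt'

# ... and read it back from a second Pod that mounts the same PVC.
kubectl exec <second-pod-name> -- cat <app-mount-path>/test.txt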


Regards,
Mokit

Dear @mokit .

Thank you for reply.

I was afraid I'd have to use Filestore.
We had initially considered using Filestore, but we had the impression that the cost would be high, so we decided not to adopt it.

I didn't give you the background.
The application we are currently building is in the experimental phase, and our mission is to keep costs as low as possible, so we were looking for a way to enable ReadWriteMany on a GCE Persistent Disk.

We have decided to proceed with a cost estimate and a request to migrate to Filestore, and if it is approved, we will follow the procedure and references you have provided.

Thank you.

Could you provide a few more details on your actual use case?  That way we can see what the best storage solution might be.

Dear @garisingh 

Thank you for your reply.

My apologies, the use case description was insufficient.

The current use cases are as follows:

1. Persistence of DB Data
We are deploying a PostgreSQL database with GKE and need to persist it.
However, we are considering migrating to Cloud SQL for this case.

2. Storage of large size temporary data
We need to convert streaming video to MOV using ffmpeg and upload it to GCS.
Initially, the temporary data was stored in memory, but because of its large size and the resulting memory exhaustion, we believe we need to use an external volume.
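
For illustration, something like the following is roughly what we are considering: a per-Pod scratch disk for the intermediate files instead of keeping them in memory. All names here are placeholders; standard-rwo is GKE's PD-backed StorageClass.

apiVersion: v1
kind: Pod
metadata:
  name: ffmpeg-worker                 # placeholder name
spec:
  containers:
    - name: transcoder
      image: <ffmpeg-image>           # placeholder image
      volumeMounts:
        - name: scratch
          mountPath: /tmp/transcode   # intermediate MOV files land here instead of memory
  volumes:
    - name: scratch
      ephemeral:                      # per-Pod scratch disk, deleted together with the Pod
        volumeClaimTemplate:
          spec:
            accessModes: ["ReadWriteOnce"]
            storageClassName: standard-rwo   # GKE's PD-backed CSI StorageClass
            resources:
              requests:
                storage: 50Gi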

Thank you so much.

1. RWX with GCE Persistent Disk gets trickier; most GCP users go with Filestore, which can be used as NFS. Is that not an option for you?


2. If a GCS bucket is an option for you, then use the GCS Fuse CSI driver in your GKE Autopilot cluster to mount Google Cloud Storage (GCS) buckets as filesystems. GCS is an object store, so it's not a perfect replacement for block storage, but it supports concurrent access.
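
For reference, a minimal sketch of what that looks like on Autopilot. Bucket name, service account, image and mount path are placeholders, and the Pod's Kubernetes service account is assumed to have access to the bucket (e.g. via Workload Identity):

apiVersion: v1
kind: Pod
metadata:
  name: gcs-fuse-example               # placeholder name
  annotations:
    gke-gcsfuse/volumes: "true"        # tells GKE to inject the GCS Fuse sidecar
spec:
  serviceAccountName: <ksa-name>       # KSA with access to the bucket
  containers:
    - name: app
      image: <container-image>
      volumeMounts:
        - name: gcs-bucket
          mountPath: /data             # objects in the bucket appear here as files
  volumes:
    - name: gcs-bucket
      csi:
        driver: gcsfuse.csi.storage.gke.io
        volumeAttributes:
          bucketName: <bucket-name>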

3. If you must use GCE Persistent Disks and need to mitigate issues during upgrades or scaling events:
Use Node Affinity: Ensure Pods are scheduled on specific nodes where the Persistent Disk is mounted:

nodeSelector: 
  disk-ready: "true"

You can use a DaemonSet to prepare the disk on specific nodes.
Pre-Mount the Disk: Use an InitContainer to mount the disk before the main container starts.
Prevent Node Upgrades During Critical Operations: Use GKE maintenance windows to control when upgrades occur, reducing the risk of interruptions.
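
For the maintenance window part, something along these lines limits upgrades to weekend off-hours. Cluster name, region, times and recurrence are placeholders to adjust:

gcloud container clusters update <cluster-name> \
  --region=<region> \
  --maintenance-window-start="2024-01-06T01:00:00Z" \
  --maintenance-window-end="2024-01-06T05:00:00Z" \
  --maintenance-window-recurrence="FREQ=WEEKLY;BYDAY=SA,SU"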


Dear @jayeshmahajan .

Thank you for your reply.


1. RWX with GCE Persistent Disk gets trickier; most GCP users go with Filestore, which can be used as NFS. Is that not an option for you?

At the time, we opted not to use Filestore due to cost considerations.
We had the impression that Filestore was quite expensive. However, we are currently revisiting the possibility of using Filestore.

2. If a GCS bucket is an option for you, then use the GCS Fuse CSI driver in your GKE Autopilot cluster to mount Google Cloud Storage (GCS) buckets as filesystems. GCS is an object store, so it's not a perfect replacement for block storage, but it supports concurrent access.

Thanks.
We actually tried mounting GCS as a volume with Cloud Run in the past.
Unfortunately, we overlooked the fact that "When writing to Cloud Storage, the entire file is staged in Cloud Run memory before the file is writ..." This caused us some issues with memory leaks, which was a painful experience.


3. If you must use GCE Persistent Disks and need to mitigate issues during upgrades or scaling events:
Use Node Affinity: Ensure Pods are scheduled on specific nodes where the Persistent Disk is mounted:

nodeSelector: 
  disk-ready: "true"

That's interesting.
We understand this could be useful for mitigating issues during temporary events like upgrades or scaling.
However, we believe this might not be suitable for scenarios requiring redundancy, such as when we need multiple replicas of our application accessing the same persistent disk.
Is our understanding correct?


1. Filestore offers multiple tiers (Basic HDD, Basic SSD, Enterprise), with different pricing and performance levels. For applications with lower IOPS and throughput requirements, Basic HDD can be a cost-effective option.
2.  Instead of Cloud Run, GKE offers more control. You can optimize memory usage by carefully managing the data staging process. Use GCS Fuse for workloads that involve small files or read-heavy operations, minimizing the risk of memory leaks.

GCS may still be a viable option if your workload doesn't require frequent writes of large files, but it isn’t a perfect substitute for block storage.

3. Yes, that option won't work in your case with multiple apps sharing the same disk. Options 1 or 2 are better for you, in my opinion.


Hello @Nevling,

From what I understand, you don't want to use a Filestore instance because of the price. This was our problem a few months ago: we had to move away from GlusterFS, which was our previous solution, because the Kubernetes integration for connecting to our VMs was no longer maintained and ended up deprecated in k8s. We also wanted a highly available solution, since our workloads are in production, and this is what we did:

We have a GCP regional disk, which is basically a disk with a replica in a second zone of your choice in the same region (see https://cloud.google.com/compute/docs/disks/regional-persistent-disk). The price of this disk is double that of a standard disk, but it is still worth it for us, since the old solution had 3 replicated disks (one limitation is that the disk has to be at least 100GB, I think, if you want to use a standard persistent disk; otherwise you have to go for a balanced one, which is a little more expensive). From this disk we create a PersistentVolume (see https://cloud.google.com/kubernetes-engine/docs/how-to/persistent-volumes/regional-pd) and mount it into a pod running an NFS server image (in our case itsthenetwork/nfs-server-alpine works fine, but you can choose whatever image you like or even build your own). This pod is also exposed through a Service, and that's pretty much it: you can use the NFS mount point from any namespace via NFS-SERVICE.NFS-NAMESPACE.svc.cluster.local with an NFS volume (https://kubernetes.io/docs/concepts/storage/volumes/#nfs). Make sure the node pool uses the same zones as the regional persistent disk, so a node is never created in a zone where the disk can't be mounted.

Everything I said above is for a highly available setup, but if you want to cut costs further, you can do everything I just said with a standard persistent disk, and your costs will be at a minimum with one disk and one VM.
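
To make this more concrete, here is a rough sketch of the NFS server part. Names, namespace and the PVC name are placeholders, and note that this image needs a privileged container, so it assumes a node pool where that is allowed:

apiVersion: v1
kind: Pod
metadata:
  name: nfs-server
  namespace: nfs                 # placeholder namespace
  labels:
    app: nfs-server
spec:
  containers:
    - name: nfs-server
      image: itsthenetwork/nfs-server-alpine:latest
      securityContext:
        privileged: true         # the in-kernel NFS server needs elevated privileges
      env:
        - name: SHARED_DIRECTORY
          value: /nfsshare       # directory exported over NFS
      ports:
        - containerPort: 2049
      volumeMounts:
        - name: pd-data
          mountPath: /nfsshare   # the (regional) persistent disk is mounted and exported from here
  volumes:
    - name: pd-data
      persistentVolumeClaim:
        claimName: <regional-pd-pvc>   # PVC bound to the PV created from the regional disk
---
apiVersion: v1
kind: Service
metadata:
  name: nfs-server
  namespace: nfs
spec:
  selector:
    app: nfs-server
  ports:
    - port: 2049
      protocol: TCP

Other workloads then mount it with an nfs: volume pointing at nfs-server.nfs.svc.cluster.local (the path exported by this image is typically /).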
I hope this helps you find a cheap storage solution.

Best Regards,
Andrei

I am trying to do a similar thing but I am stuck 😕
I currently use GKE Autopilot with a Filestore instance for persistent storage, but we cannot continue with Filestore due to its high cost. Also, I am not looking to move to a Standard cluster for now.
I explored alternatives and found a way to host an NFS server inside a VM (using an additional 30GB disk specifically for NFS, to be used by my GKE cluster as persistent storage).
I created a test pod to see if it works, and everything works well, which means all the firewall rules are fine, the cluster can connect to the VM, and data is written from GKE to the mounted folder on the VM disk.
I have created a new StorageClass and bound my PV and PVC with it.
The problem started when I created another test pod to access the same PVC for storage: the pod failed and never came up, without displaying any error related to the PVC mount; instead, it just shows that "It failed to scale up".
I am pretty sure that GKE is not allowing me to use that VM-based NFS as ReadWriteMany.
Is there any way to resolve it? What am I doing wrong here?
Below are the details of how I created the StorageClass, PV, and PVC:
* storageclass.yaml:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nfs
provisioner: kubernetes.io/nfs
reclaimPolicy: Retain
volumeBindingMode: Immediate

* pv.yaml:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: nfs-vm-pv
spec:
  capacity:
    storage: 30Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs
  nfs:
    server: <vm-ip-here>
    path: /mnt/nfs_data

* pvc.yaml:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc
  namespace: my-namespace
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 30Gi
  volumeName: nfs-vm-pv  # Manually specify the PV
  storageClassName: nfs

Please help, anyone, or suggest any other alternative. I need to get rid of GCP Filestore ASAP.
Thanks 
