Unable to create a file on GCP Filestore.

  1. I checked that there is enough space and there are enough inodes available on the GCP Filestore.
  2. I am able to create other files and folders on the same mount point.
  3. The error message appears to be specific to one file name.
  4. I also tried to find open files using the lsof command on the same mount point, but there are no such deleted files (see the command sketch below).
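
For reference, the checks above can be reproduced with something like the following; the mount point and directory match the outputs shown later in the thread, while write_test.txt is just a placeholder name:

# Free space and free inodes on the mount
df -h /data
df -i /data
# Other file names can still be created on the same mount point
touch /data/NLP-output-directory/write_test.txt && rm /data/NLP-output-directory/write_test.txt
# Look for deleted-but-still-open files held by a process on this mount
lsof +L1 /data 2>/dev/null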

Hi @rhcechetan ,

From your description of the issue, it appears that the error occurs for only one file name, but you are able to write other files. Could you please share the text of the error message?

root@veratos-virtual-appliance-nlp-1-6f47c6b669-xpgbs:/data/NLP-output-directory# touch N56289_text.txt
touch: cannot touch 'N56289_text.txt': No space left on device

Could you please share the output of the following command, as described in https://cloud.google.com/filestore/docs/troubleshooting#no_space_left_on_device?

df -i

 Also, if the Filestore instance is Enterprise or High Scale tier, please note the known issue described in https://cloud.google.com/filestore/docs/known-issues#capacity_errors_before_reaching_full_provisione....
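
If it helps, the instance tier and provisioned capacity can also be checked with something like the following (instance name and location below are placeholders, not taken from this thread):

# Inspect the Filestore instance tier and provisioned capacity
gcloud filestore instances describe INSTANCE_NAME --location=ZONE_OR_REGION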

root@veratos-virtual-appliance-nlp-1-6f47c6b669-xpgbs:/data/NLP-output-directory# df -i
Filesystem                   Inodes    IUsed     IFree IUse% Mounted on
overlay                     6258720   118112   6140608    2% /
tmpfs                       8232716       17   8232699    1% /dev
tmpfs                       8232716       17   8232699    1% /sys/fs/cgroup
172.29.209.42:/inputdata   67108864  8047524  59061340   12% /inputdata
172.29.209.10:/outputdata 134217728 20313522 113904206   16% /data
/dev/sda1                   6258720   118112   6140608    2% /etc/hosts
shm                         8232716        1   8232715    1% /dev/shm
tmpfs                       8232716        9   8232707    1% /run/secrets/kubernetes.io/serviceaccount
tmpfs                       8232716        1   8232715    1% /proc/acpi
tmpfs                       8232716        1   8232715    1% /proc/scsi
tmpfs                       8232716        1   8232715    1% /sys/firmware

 

Thanks for sharing the output of `df -i` , @rhcechetan !  So no issue with inodes, and you already clarified that `lsof` doesn't show a deleted file with the same name. I don't know what might be wrong. I'll ask internally and share what I find. Feel free to create a support request though.

Request to the community: If anyone else has run into similar issues, please share here. Thank you!

Thanks @kumards!! Could you please let me know how to create a support request?

Hi @rhcechetan , that would depend on the support service that you have. Please review the options at https://cloud.google.com/support/docs. Also see https://console.cloud.google.com/support.

Hi,

It might be possible that there is a bad block on this NFS share, and the OS thinks that this particular block is broken and that the mentioned file name is somehow tied to that broken block. In that case, fsck would help.

Second option:
It might also be possible that some process is holding this file and lsof didn't find that process. The best option is to restart the machine (but I saw that this is an appliance, so it might be hard to do that without downtime). In my Unix career I had two cases where lsof didn't show any process holding files, yet even the umount command showed that something was keeping those files open. A reboot helped.
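
For what it's worth, a quick generic way to double-check whether any process still has files open on that mount (not specific to this appliance) is:

# Show processes (if any) holding open files on the mounted filesystem
fuser -vm /data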


Personally, I'm guessing that one of the options mentioned above is causing your problem.

best,
DamianS

@DamianS The Filestore share is mounted inside GCP Kubernetes by more than two pods. I tried deleting one pod and recreating it, but I am still getting the same error message. I have not yet tried the first option, as I am not sure it is safe to run fsck on a Filestore share that is mounted by multiple pods.
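
For anyone hitting the same issue, one way to check whether the error follows the file name across pods is to try it from a second pod that mounts the same share (pod name below is a placeholder):

# Attempt the same file name from another pod mounting the same Filestore share
kubectl exec -it <other-pod-name> -- touch /data/NLP-output-directory/N56289_text.txt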

Yep,
fsck on a mounted FS is not a good idea. You can run fsck with the -N option to do a dry run (I've never tested this on a mounted FS); at least it worked on-prem. If you don't want to fix blocks, after umount you can use -n instead of -N. It will check your FS without performing repairs, but it will print bad blocks (if any).
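
For completeness, the dry-run syntax on a local block device looks roughly like this (device and mount point are placeholders; as noted below, this does not apply to an NFS-backed Filestore mount):

# -N only shows what fsck would do, without executing any check
fsck -N /dev/sdX1
# after unmounting, -n runs the check read-only: it reports problems but repairs nothing
umount /mnt/point
fsck -n /dev/sdX1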

I don't think fsck can be run on GCP Filestore.

Yep, you're right. GCP Filestore is an NFS filesystem type; I forgot about that.