Memory leak when writing files to Filestore mounted via NFS in a Cloud Run job.
I created a task that runs for over an hour as a Cloud Run job. The resulting files are 1-10 GB in size, and the job needs a lot of memory because Cloud Run's writable file system lives in memory. So I provisioned Filestore, mounted it via NFS, and saved the files under /mnt/example, expecting the job to get by with less memory.
However, the job's memory usage still grew by roughly the size of the files being written, which is not what I expected.
How can I make the job write these files while using less memory?
The following is a simplified version of the actual commands used:
gcloud run jobs create mnt-example \
--project example \
--region asia-northeast1 \
--image asia-northeast1-docker.pkg.dev/example/example-app/mnt-example:latest \
--add-volume name=nfs,type=nfs,location=10.118.2.2:/example \
--add-volume-mount volume=nfs,mount-path=/mnt/example \
--cpu 1 \
--memory 1Gi \
--tasks 1 \
--task-timeout 3h
gcloud run jobs execute mnt-example \
--project example \
--region asia-northeast1
Hi @shikajiro,
The issue is that even though the data ends up on Filestore, your application is still buffering it in memory before it reaches the NFS mount. Writing large files to a network file system can be memory-intensive when the application holds a significant portion of the file in memory before flushing it to disk.
To reduce memory usage when writing large files to Filestore via NFS in Cloud Run jobs, the key is to avoid loading the entire dataset into memory at once: change how the data is written to disk so it is produced and written as a stream of small chunks, flushing as you go (see the sketch below). Only consider increasing the job's memory after addressing that core problem.
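As an illustration only (the data source below is a placeholder, not something taken from your job), here is a minimal sketch of writing results to the NFS mount chunk by chunk instead of accumulating them in memory first:

def produce_chunks():
    """Placeholder for however your job produces its output (DB cursor, API pages, computation, ...)."""
    for _ in range(10_000):
        yield b"x" * 1024 * 1024  # 1 MiB generated on the fly

# Write each chunk to the Filestore mount as soon as it is produced,
# so the process never holds more than one chunk in memory at a time.
with open("/mnt/example/output.bin", "wb") as out:
    for chunk in produce_chunks():
        out.write(chunk)
        out.flush()  # push buffered bytes toward the NFS server as you go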
I hope the above information is helpful.
void ZoomSDKAudioRawDataDelegate::writeToFile(const string &path, AudioRawData *data)
{
    static std::ofstream file;
    file.open(path, std::ios::out | std::ios::binary | std::ios::app);
    if (!file.is_open())
        return Log::error("failed to open audio file path: " + path);

    // Append this chunk, flush, and close so nothing stays buffered in the process.
    file.write(data->GetBuffer(), data->GetBufferLen());
    file.flush();
    file.close();

    stringstream ss;
    ss << "Writing " << data->GetBufferLen() << "b to " << path << " at " << data->GetSampleRate() << "Hz";
    Log::info(ss.str());
}
Same problem here. o/
I have even tried to open (in append mode) and close the file on every chunk. To no avail.
print(f"Downloading from {args.url}")
response = requests.get(args.url, stream=True, headers=headers)
print(f"{response.headers=}")
if 200 <= response.status_code <= 299:
print(f"Saving to {args.output}")
total_size = int(response.headers.get("content-length", 0))
chunk_size = 8192 * 1024
with tqdm(
total=total_size, unit="iB", unit_scale=True, mininterval=1.0
) as progress_bar:
for chunk in response.iter_content(chunk_size=chunk_size):
if chunk:
with open(args.output, "ab") as f:
f.write(chunk)
progress_bar.update(len(chunk))
# os.sync()
else:
print(f"Failed to download file: {response.status_code=}")