Hello! I'm trying to upload a large number of images to a bucket directly from their URLs (without downloading the files locally). I wrote my code based on this sample: https://github.com/googleapis/nodejs-storage/blob/c51cd946171e8749453eef080d2853d31a6e72c8/samples/s...
After ~3000 uploads (the number varies a lot, but it's usually around 3000), my script stops for a few seconds, then successfully uploads a few more images, and then crashes with this error:
I found other people with the same issue, but none of their solutions worked for me (neither changing the stream options nor applying a timeout).
I'm limiting requests to 50 per second, and the files are relatively small (below 500 KB). I don't think it's a performance issue, because I tried lowering the limit to 5 per second and hit the same problem.
Here's my code:
uploadFile
async function uploadFile(
  file_name: string,
  buffer: Buffer,
  contentType: string
) {
  const bucketFile = bucket.file(file_name);
  // Wrap the in-memory buffer in a stream so it can be piped to the bucket.
  const dataStream = new stream.PassThrough();
  dataStream.push(buffer, "binary");
  dataStream.push(null); // signal end of data
  return new Promise((resolve, reject) => {
    dataStream
      .pipe(
        bucketFile.createWriteStream({
          resumable: false,
          metadata: { cacheControl: "no-cache", contentType },
        })
      )
      .on("error", (error: any) => {
        reject(error);
      })
      .on("finish", () => {
        dataStream.end();
        resolve("");
      });
  }).then(async () => {
    // Once the upload has finished, generate a long-lived read URL.
    const down_url = await bucket
      .file(file_name)
      .getSignedUrl({ action: "read", expires: "03-09-2491" });
    console.log(down_url);
  });
}
downloadFile (get a buffer from url)
async function downloadFile(
  url: string
): Promise<{ picBuffer: Buffer; type: string }> {
  return new Promise((resolve, reject) => {
    let req = https.get(url);
    req.on("response", (res: any) => {
      const data: any = [];
      const imageType = res.headers["content-type"];
      if (!imageType) console.log("ERROR " + url + " NO IMAGE TYPE");
      res
        .on("data", function (chunk: any) {
          data.push(chunk);
        })
        .on("end", function () {
          resolve({
            picBuffer: Buffer.concat(data),
            type: imageType,
          });
        })
        // Reject if the response stream fails partway through the download.
        .on("error", reject);
    });
    req.on("error", (err: any) => {
      reject(err);
    });
  });
}
handleImage - main function
const handleImage = async (sneakerData: any, size: string, index: number) => {
  // Wait for a rate-limiter token before starting the download.
  await limit.removeTokens(1);
  try {
    const imageURL: string = resolveObject(`image_${size}`, sneakerData);
    if (!imageURL.includes("https://")) {
      // Skip entries without an HTTPS URL. Returning a value settles the
      // promise, so Promise.all in app() is not left waiting forever.
      return "skipped";
    }
    const data = await downloadFile(imageURL);
    const random = crypto.randomBytes(3).toString("hex");
    const file_name = `${random}_${sneakerData.sneaker_id}_${size}.${
      data.type.split("/")[1]
    }`;
    await uploadFile(file_name, data.picBuffer, data.type);
    return "success";
  } catch (err) {
    console.log(err);
    throw err;
  }
};
app
const app = async () => {
  const data = await getLinks();
  console.log("STARTING");
  let tasks: any = [];
  for (let i = 0; i < data.length; i++) {
    for (let j = 0; j < imageSizes.length; j++) {
      tasks.push(handleImage(data[i], imageSizes[j], i));
    }
  }
  await Promise.all(tasks);
};
config
const storage = new Storage({
  keyFile: "../../service-account.json",
  retryOptions: {
    autoRetry: true,
    retryDelayMultiplier: 3,
    totalTimeout: 500,
    maxRetryDelay: 60,
    maxRetries: 5,
    idempotencyStrategy: IdempotencyStrategy.RetryAlways,
  },
});
const bucket = storage.bucket("test_products");
const limit = new RateLimiter({ interval: 1000, tokensPerInterval: 50 });
Sorry that the code is unclear in many places; my priority is to get past this error as soon as possible. Thank you!
Hello @mfraczek!
Welcome to the Google Cloud Community!
You can try the following troubleshooting options:
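One thing that may be worth checking, offered only as a guess because the error text is not shown in the question: app() calls handleImage for every image up front, so the rate limiter spaces out when each task starts but does not cap how many downloads and uploads are in flight at the same time. If slow requests pile up, the number of open connections can keep growing. Below is a minimal sketch of a bounded-concurrency runner. The names `runWithConcurrency`, `Outcome`, and `worker` are hypothetical helpers written for this illustration; they are not part of @google-cloud/storage or the limiter package.

```typescript
// Hypothetical helper types and function for illustration only.
type Outcome<R> =
  | { ok: true; value: R }
  | { ok: false; error: unknown };

// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Failures are recorded per item instead of aborting the whole batch.
async function runWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<Outcome<R>[]> {
  const results = new Array<Outcome<R>>(items.length);
  let next = 0; // index of the next unclaimed item

  // Each runner repeatedly claims the next index until none remain.
  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++;
      try {
        results[i] = { ok: true, value: await worker(items[i], i) };
      } catch (error) {
        results[i] = { ok: false, error };
      }
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}
```

With something like this, app() could build a plain list of `{ sneaker, size }` descriptors instead of calling handleImage immediately, then pass that list to the runner with a worker that calls handleImage. The existing rate limiter can stay in place; the two limits are independent. Whether this resolves the crash depends on the actual error, so treat it as an experiment rather than a confirmed fix.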