
Getting Error: connect ETIMEDOUT when uploading multiple images to cloud storage using node.js

Hello! I'm trying to upload a large number of images to a bucket straight from URLs (without downloading the files locally). I based my code on this sample: https://github.com/googleapis/nodejs-storage/blob/c51cd946171e8749453eef080d2853d31a6e72c8/samples/s...

After ~3000 uploads (this number varies a lot, but it's usually around 3000), my script just stops for a few seconds, then successfully uploads a few more images, and then crashes with this error:

[attached screenshot: mfraczek_0-1698023788265.png, showing the connect ETIMEDOUT error]

I found other people having the same issue, but none of their solutions worked for me (neither changing the stream options nor applying a timeout).
I'm limiting requests to 50 per second and the files are relatively small (below 500 KB). I don't think it's a performance issue, because I tried lowering the limit to 5 per second and hit the same error.
Here's my code:
uploadFile

 

async function uploadFile(
  file_name: string,
  buffer: Buffer,
  contentType: string
) {
  const bucketFile = bucket.file(file_name);
  const dataStream = new stream.PassThrough();

  dataStream.push(buffer, "binary");
  dataStream.push(null);

  return new Promise((resolve, reject) => {
    dataStream
      .pipe(
        bucketFile.createWriteStream({
          resumable: false,
          metadata: { cacheControl: "no-cache", contentType },
        })
      )
      .on("error", (error: any) => {
        reject(error);
      })
      .on("finish", () => {
        dataStream.end();
        resolve("");
      });
  }).then(async () => {
    const down_url = await bucket
      .file(file_name)
      .getSignedUrl({ action: "read", expires: "03-09-2491" });
    console.log(down_url);
  });
}

 

downloadFile (get a buffer from url)

 

async function downloadFile(
  url: string
): Promise<{ picBuffer: Buffer; type: string }> {
  return new Promise((resolve, reject) => {
    let req = https.get(url);
    req.on("response", (res: any) => {
      const data: any = [];
      const imageType = res.headers["content-type"];
      if (!imageType) console.log("ERROR " + url + " NO IMAGE TYPE");
      res
        .on("data", function (chunk: any) {
          data.push(chunk);
        })
        .on("end", function () {
          resolve({
            picBuffer: Buffer.concat(data),
            type: imageType,
          });
        });
    });
    req.on("error", (err: any) => {
      reject(err);
    });
  });
}

 

handleImage - main function

 

const handleImage = (sneakerData: any, size: string, index: number) => {
  return new Promise(async (resolve, reject) => {
    await limit.removeTokens(1);
    try {
      const imageURL: string = resolveObject(`image_${size}`, sneakerData);
      if (!imageURL.includes("https://")) {
        // resolve before returning so Promise.all doesn't hang on this task
        return resolve("skipped");
      }
      const data = await downloadFile(imageURL);
      const random = crypto.randomBytes(3).toString("hex");
      const file_name = `${random}_${sneakerData.sneaker_id}_${size}.${
        data.type.split("/")[1]
      }`;
      await uploadFile(file_name, data.picBuffer, data.type);
      resolve("success");
    } catch (err) {
      console.log(err);
      reject(err);
    }
  });
};

 

app

 

const app = async () => {
  const data = await getLinks();
  console.log("STARTING");
  let tasks: any = [];
  for (let i = 0; i < data.length; i++) {
    for (let j = 0; j < imageSizes.length; j++) {
      tasks.push(handleImage(data[i], imageSizes[j], i));
    }
  }
  await Promise.all(tasks);
};

 

config

 

const storage = new Storage({
  keyFile: "../../service-account.json",
  retryOptions: {
    autoRetry: true,
    retryDelayMultiplier: 3,
    totalTimeout: 500,
    maxRetryDelay: 60,
    maxRetries: 5,
    idempotencyStrategy: IdempotencyStrategy.RetryAlways,
  },
});
const bucket = storage.bucket("test_products");
const limit = new RateLimiter({ interval: 1000, tokensPerInterval: 50 });

 

Sorry for the code being unclear in many places; my priority is to get past this error as soon as possible. Thank you!

1 REPLY

 

Hello @mfraczek!

Welcome to the Google Cloud Community!

You can try the following troubleshooting options:

  1. Look into Resumable Uploads. They allow you to resume a transfer to Cloud Storage after a communication failure interrupts the flow of data. A resumable upload works by sending multiple requests, each of which contains a portion of the object you're uploading.
  2. You can also try the gsutil tool for uploading to Cloud Storage. Its -m flag performs multi-threaded/multi-processing transfers, which helps when moving a large number of files.
  3. For the ETIMEDOUT issue itself, take a look at this SO post, as you might have the same problem.
  4. If the above options don't work, you can contact Google Cloud Support to look further into your case.

Let me know if it helped, thanks!

 
