
Serverless (Cloud Function) on-demand copying of files from AWS S3 to Google Cloud Storage Bucket

We have a mobile (native) and browser (JavaScript) thick-client app that accesses public data files on AWS S3. Downloading and working with these files works fine in all cases on mobile, but some of the public files our app accesses on S3 are in buckets that don't set the Access-Control-Allow-Origin: * CORS header, so our website can't access them directly, even though they're public.

As a result, we'd like to be able to make local copies of the files when accessing them from a web browser. The easiest way to do this would seem to be to have the client make a request to a Google Cloud Function, which copies the AWS file to a GCS bucket that we control (and thus can set CORS headers on), then returns the URL of that GCS file to the client.
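
Roughly what we have in mind (a minimal sketch, assuming the S3 objects are public so no AWS credentials are needed; the bucket name and the URL-to-object-name mapping are placeholders):

    import functions_framework
    import requests
    from google.cloud import storage

    GCS_BUCKET = "my-mirror-bucket"  # hypothetical bucket we control

    @functions_framework.http
    def mirror_file(request):
        # e.g. ?url=https://some-bucket.s3.amazonaws.com/path/file.bin
        s3_url = request.args.get("url")
        if not s3_url:
            return ("Missing 'url' parameter", 400)

        object_name = s3_url.split("://", 1)[-1]  # crude but stable mapping
        blob = storage.Client().bucket(GCS_BUCKET).blob(object_name)

        # Only copy if we haven't already mirrored this object.
        if not blob.exists():
            # Stream the public S3 object into GCS so a multi-hundred-MB
            # file never has to sit fully in the function's memory.
            with requests.get(s3_url, stream=True, timeout=300) as resp:
                resp.raise_for_status()
                resp.raw.decode_content = True
                blob.upload_from_file(
                    resp.raw,
                    content_type=resp.headers.get("Content-Type"),
                )

        return {"url": f"https://storage.googleapis.com/{GCS_BUCKET}/{object_name}"}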

Is doing the copy directly within a Cloud Function the right approach, or should the Cloud Function do something else, e.g. create a Storage Transfer Service job that then does the actual file transfer?

Assume that we typically only need to access a few (1-5) files on each client, but they are large (hundreds of megabytes). Grateful for any suggestions.

1 ACCEPTED SOLUTION

Hi @ferretnt,

Welcome to Google Cloud Community!

Solution: Use a server-side proxy to copy files on-demand.

Two approaches:

  1. Cloud Function directly copying: Simple and quick (essentially the flow sketched in the question above), but constrained by the function's memory and timeout limits for large files, and by concurrency under many simultaneous requests.
  2. Cloud Function triggering Storage Transfer Service: Efficient for large files and scales better, but more complex to set up. See the sketch after this list.
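
If you go the Storage Transfer Service route, a minimal sketch with the google-cloud-storage-transfer Python client (bucket names, the object key, and the credential handling are placeholders):

    from google.cloud import storage_transfer

    def mirror_via_sts(project_id, s3_bucket, object_key, gcs_bucket):
        """Create and run a one-off transfer job for a single S3 object."""
        client = storage_transfer.StorageTransferServiceClient()

        job = client.create_transfer_job(
            {
                "transfer_job": {
                    "project_id": project_id,
                    "status": storage_transfer.TransferJob.Status.ENABLED,
                    "transfer_spec": {
                        "aws_s3_data_source": {
                            "bucket_name": s3_bucket,
                            # Placeholder credentials; Storage Transfer
                            # Service needs AWS credentials or a role ARN.
                            "aws_access_key": {
                                "access_key_id": "AWS_ACCESS_KEY_ID",
                                "secret_access_key": "AWS_SECRET_ACCESS_KEY",
                            },
                        },
                        "gcs_data_sink": {"bucket_name": gcs_bucket},
                        # Copy only the requested object, not the whole bucket.
                        "object_conditions": {"include_prefixes": [object_key]},
                    },
                },
            }
        )
        # Jobs without a schedule don't run on their own; trigger a run now.
        client.run_transfer_job({"job_name": job.name, "project_id": project_id})
        return job.name

One caveat: the transfer runs asynchronously, so the function (or the client) would have to wait for the job to finish before the mirrored URL is usable, which adds latency on top of the setup complexity.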

Recommendation: Use Storage Transfer Service for its efficiency and scalability, especially with large files and potential concurrent requests. Consider simplicity and control needs when making the final decision.

Additional tips:

  • Set proper IAM permissions.
  • Implement error handling and logging.
  • Consider caching for frequently accessed files.


2 REPLIES

No answers came here in the time we needed them, but for now we've worked around the CORS issue by using AWS CloudFront as a proxy and applying its "Managed-CORS-With-Preflight" response headers policy to add the headers. This at least lets the browser app work. We'd still be interested in a way to shadow the files through to Google Cloud Storage as above (because it opens up other options, like writing to Firestore via a Cloud Function to index them), but for now we have an AWS-based band-aid.
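
For anyone following the same route, a rough boto3 sketch of attaching the managed policy to a distribution's default cache behavior (the distribution ID is a placeholder, and the policy ID is looked up by name rather than hard-coded):

    import boto3

    cf = boto3.client("cloudfront")

    # Find the managed "Managed-CORS-With-Preflight" policy by name.
    managed = cf.list_response_headers_policies(Type="managed")
    policy_id = next(
        item["ResponseHeadersPolicy"]["Id"]
        for item in managed["ResponseHeadersPolicyList"]["Items"]
        if item["ResponseHeadersPolicy"]["ResponseHeadersPolicyConfig"]["Name"]
        == "Managed-CORS-With-Preflight"
    )

    # Attach it to the distribution's default cache behavior.
    dist_id = "E1234EXAMPLE"  # placeholder
    cfg = cf.get_distribution_config(Id=dist_id)
    cfg["DistributionConfig"]["DefaultCacheBehavior"]["ResponseHeadersPolicyId"] = policy_id
    cf.update_distribution(
        Id=dist_id,
        IfMatch=cfg["ETag"],
        DistributionConfig=cfg["DistributionConfig"],
    )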
