
Is Cloud bucket to Cloud bucket transfer faster/slower than On-Prem to Cloud Bucket?

Q1) When using gcloud storage cp/mv/rsync, which is the better option for an initial upload?
CloudBucket1 -> CloudBucket2 or On-Premise (LocalWorkstation) -> CloudBucket2

Q2) Is mv faster than cp/rsync while uploading?

I am asking about a folder of 850,000 small files (10-500 KB each). I have to upload it to my bucket and am looking for the fastest way to do it.

Please help! @DamianS @cesan3 @iamawaneendra @juancarlos_la @EnterSecurity 


Hello @savetsa 


@savetsa wrote:

Q2) Is mv faster than cp/rsync while uploading?


Those commands do different things. mv moves your data from source to target, so nothing is left at the source; cp and rsync copy it, so the source stays intact. If you want to keep your data at the source and have the same data at the target, use cp or rsync. On a local filesystem mv is faster because it only changes the file's location, but Cloud Storage objects cannot be renamed in place, so there mv is a copy followed by a delete of the source and should not be expected to beat cp. You have a large dataset, so gsutil rsync is what I would use in this case.
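
To make the difference concrete, here is a minimal Python sketch that only builds the command lines and runs nothing. It assumes a bucket-to-bucket move; the helper names and any gs:// paths you pass in are hypothetical. Since Cloud Storage objects cannot be renamed in place, a move is roughly a copy followed by a delete of the source, so it moves the same amount of data as cp.

```python
from typing import List

# Sketch only: builds argument lists, does not call gcloud.
# The function names are made up for illustration.


def move_command(src: str, dst: str) -> List[str]:
    """Single command that moves objects from src to dst."""
    return ["gcloud", "storage", "mv", src, dst]


def copy_then_delete_commands(src: str, dst: str) -> List[List[str]]:
    """Rough two-step equivalent of a move: copy everything, then remove the source."""
    return [
        ["gcloud", "storage", "cp", "--recursive", src, dst],
        ["gcloud", "storage", "rm", "--recursive", src],
    ]
```

If the source has to stay in place, only the first of the two steps (or rsync) is what you want.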


@savetsa wrote:

Q1) When using gcloud storage cp/mv/rsync, which is the better option for an initial upload?
CloudBucket1 -> CloudBucket2 or On-Premise (LocalWorkstation) -> CloudBucket2


It depends on several factors including the size of the data, the network speed, and the need for synchronization. 

Bucket1 -> Bucket2
gsutil cp: optimized for transferring data within Google Cloud and handles large datasets efficiently. But if you need to synchronize changes incrementally, or if you have a large number of files, rsync will be a better choice.

OnPrem -> Bucket2
gsutil cp: if you're uploading a relatively small number of files or don't need synchronization (with a dataset as big as yours I would not use cp).
gsutil rsync: for larger datasets, or when you need to perform repeated uploads and want to minimize transfer time and bandwidth usage.

To summarize, I would go with rsync in this case.
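
For a sense of scale, here is a back-of-the-envelope sketch in Python. The file count and size range come from the question (850,000 files of 10-500 KB, so roughly 8.5 GB to 425 GB in total); the link speed, the per-object overhead and the worker count are assumptions I picked for illustration, not measurements.

```python
# Crude estimate only; the defaults below are assumptions, not measured values.

FILES = 850_000            # from the question
MIN_KB, MAX_KB = 10, 500   # from the question


def total_bytes(files: int, kb_per_file: int) -> int:
    """Total payload in bytes, taking 1 KB = 1000 bytes."""
    return files * kb_per_file * 1000


def bandwidth_seconds(n_bytes: int, mbit_per_s: int) -> float:
    """Time to push n_bytes through a link of mbit_per_s, ignoring all overhead."""
    return n_bytes * 8 / (mbit_per_s * 1_000_000)


def request_seconds(files: int, ms_per_request: int, workers: int) -> float:
    """Time spent on per-object overhead when `workers` uploads run in parallel."""
    return files * ms_per_request / 1000 / workers


def estimate_seconds(files: int, kb_per_file: int, mbit_per_s: int = 100,
                     ms_per_request: int = 50, workers: int = 1) -> float:
    """Lower bound: whichever of bandwidth or per-object overhead dominates."""
    return max(
        bandwidth_seconds(total_bytes(files, kb_per_file), mbit_per_s),
        request_seconds(files, ms_per_request, workers),
    )
```

With these made-up defaults, the 10 KB case is bound by per-object overhead rather than by bandwidth, and running uploads in parallel is what cuts that part down. That is the main reason to prefer a tool that parallelizes for this many small files.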

cheers,
DamianS

 

Thanks for your response. A little follow-up though!

1) Is it right that Cloud->Cloud (same bucket but different folder) is slower than OnPrem->Cloud? I was told that C->C would download the data from the source and then upload it to the destination folder, which is 2x the work. I guess that's not the case.
I would expect a faster transfer in the C->C case than in Prem->Cloud.

2) gcloud storage rsync vs. gsutil rsync: which is better? Any idea?

Ad 1) As far as I remember, when you move data around within the cloud, the bucket does the following:
1. Decrypts the data at rest in bucket1
2. Copies / rsyncs the data to bucket2, AND
3. Encrypts the data at rest in bucket2
All of this happens within storage, so it might be true that OnPrem to bucket is faster: for data coming from OnPrem, encryption at rest only happens once, when the data is transferred by the gcloud or gsutil command (or via the UI). So as I see it, there are fewer steps to perform than in the bucket-to-bucket case.

Ad 2) Both are great. gsutil is more advanced, as it has parallel synchronization and more granular control over behavior. So if you use gcloud commands on a daily basis, use gcloud storage rsync, but if you prefer a dedicated command for buckets, use gsutil 🙂
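
If you would rather measure than guess, a small timing harness settles both questions on a sample of your files. This is a minimal sketch, assuming both CLIs are installed and authenticated; the helper names are hypothetical, and nothing is executed until you call timed_run yourself.

```python
import subprocess
import time
from typing import List


def rsync_command(tool: str, src: str, dst: str) -> List[str]:
    """Build a recursive rsync command line for either CLI."""
    if tool == "gcloud":
        return ["gcloud", "storage", "rsync", "--recursive", src, dst]
    if tool == "gsutil":
        # -m asks gsutil to run the operation in parallel.
        return ["gsutil", "-m", "rsync", "-r", src, dst]
    raise ValueError(f"unknown tool: {tool}")


def timed_run(cmd: List[str]) -> float:
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start
```

For example, timed_run(rsync_command("gcloud", "./sample", "gs://my-test-bucket/from-local")) times a local upload (the directory and bucket names here are placeholders). Passing a gs:// source instead times the bucket-to-bucket case. Use a fresh destination prefix for every run, otherwise rsync finds nothing left to copy and the numbers are meaningless.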

cheers,
DamianS

Thanks for the answers. This resolves all my queries.

Clarification:
I just noticed that the gsutil rsync page has an Important note recommending gcloud storage over gsutil. So I will stick to gcloud storage rsync, but, accepting your explanation, I will upload from OnPrem -> Cloud.
