
Using the multiprocessing library hangs indefinitely on a Dataproc cluster

RC1
Bronze 4
I have the following code for moving a blob from one folder to another:
 

 

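from google.cloud import storage

def move_blob(input):
    # input is a (bucket_name, source_path, destination_path) tuple
    try:
        client = storage.Client()
        bucket = client.get_bucket(input[0])
        blob = bucket.blob(input[1])
        # Copy the blob to the destination path, then delete the source
        out = bucket.copy_blob(blob, bucket, input[2])
        blob.delete()
        # Return (result, error) in the same shape as the failure case
        return out.name, None
    except Exception as e:
        return None, str(e)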

 

 
Here input[0] contains the bucket name, input[1] the source file path, and input[2] the destination file path. I am running this in parallel with the multiprocessing library using:
 
 

 

import multiprocessing

with multiprocessing.Pool(processes=60) as pool:
    outputs = pool.map(move_blob, inputs)

 

 
Here inputs is a list of 1000 tuples, each of which contains (bucket_name, src_path, destination_path). The problem is that this multiprocessing loop runs indefinitely.
 
Any suggestions?
1 ACCEPTED SOLUTION

I found that your question has been answered at this link:
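Separately from the linked answer, and without knowing the actual cause of the hang, one thing that may be worth trying is a thread pool instead of a process pool. Moving blobs is network-bound work, so threads are usually sufficient, and they avoid forking worker processes altogether. The sketch below is only an illustration: `move_all` and `fake_move` are hypothetical names that do not appear in the original post, and `fake_move` is a stand-in so the example runs without Cloud Storage credentials.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def move_all(inputs, move_fn, max_workers=16):
    """Run move_fn over inputs in a thread pool (hypothetical helper).

    Returns a list of (input, result, error) tuples in completion order.
    """
    results = []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each submitted future back to the input that produced it
        futures = {executor.submit(move_fn, item): item for item in inputs}
        for future in as_completed(futures):
            item = futures[future]
            try:
                results.append((item, future.result(), None))
            except Exception as e:
                # Record the failure instead of aborting the whole batch
                results.append((item, None, str(e)))
    return results


def fake_move(item):
    # Stand-in for move_blob from the question (assumption, for demo only):
    # pretends the move succeeded and returns the destination path
    bucket_name, src_path, dst_path = item
    return dst_path


if __name__ == "__main__":
    demo_inputs = [("my-bucket", "in/a.txt", "out/a.txt"),
                   ("my-bucket", "in/b.txt", "out/b.txt")]
    for item, result, error in move_all(demo_inputs, fake_move, max_workers=2):
        print(item, result, error)
```

With the code from the question, the call would be `outputs = move_all(inputs, move_blob)`. Because each failure is recorded per item, it is easier to see which inputs, if any, never complete.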
