Hi everyone, I’m trying to create a pipeline from tasks without uploading the data into ClearML server because

Answered

Hi everyone,

I’m trying to create a pipeline from tasks without uploading the data to the ClearML server, because the dataset is large and I run into memory issues. Instead, I want to use TensorFlow prefetch to load small batches while training. Is this possible? If so, is there an example available that uses tf prefetch?

  				
Posted one year ago by PunyShrimp95


Answers 6

Thank you. The data is stored in a GCP bucket and consists of about 4k images of 640x640. I’m also using the hosted ClearML service.

  				
Posted one year ago by PunyShrimp95

Thank you

  				
Posted one year ago by PunyShrimp95

Yes, it works with GCP too.

  				
Posted one year ago by EnthusiasticShrimp49

Can’t I use GCP?

  				
Posted one year ago by PunyShrimp95

Hey Yasir, to use TensorFlow prefetch your data needs to be (1) chunked and (2) stored on some server/bucket/network-attached FS. Unless both conditions are satisfied, TF prefetch won't help you.

How large is the dataset we're talking about?
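For reference, here is a minimal sketch of what those two conditions can look like in practice. It is only an illustration, not something taken from a ClearML example: it assumes the images were already written as TFRecord shards under a `gs://` prefix, and that each record holds a JPEG-encoded `image` and an integer `label`. The helper names, the shard layout, and the feature names are all hypothetical.

```python
def gcs_shard_pattern(bucket: str, prefix: str) -> str:
    """Build a glob pattern for TFRecord shards under gs://<bucket>/<prefix>/."""
    return f"gs://{bucket}/{prefix.strip('/')}/*.tfrecord"


def make_dataset(file_pattern: str, batch_size: int = 32, image_size: int = 640):
    """Stream batches from sharded TFRecords instead of loading everything up front."""
    import tensorflow as tf  # imported here so the path helper works without TensorFlow

    # Assumed record layout: one JPEG-encoded image and one integer label per example.
    feature_spec = {
        "image": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    }

    def parse(serialized):
        example = tf.io.parse_single_example(serialized, feature_spec)
        image = tf.io.decode_jpeg(example["image"], channels=3)
        image = tf.image.resize(image, [image_size, image_size]) / 255.0
        return image, example["label"]

    # Condition (1): the data is chunked, so shards can be read in parallel.
    files = tf.data.Dataset.list_files(file_pattern, shuffle=True)
    dataset = files.interleave(
        tf.data.TFRecordDataset,
        cycle_length=4,
        num_parallel_calls=tf.data.AUTOTUNE,
    )
    # prefetch() prepares the next batches in the background during training.
    return (
        dataset
        .map(parse, num_parallel_calls=tf.data.AUTOTUNE)
        .batch(batch_size)
        .prefetch(tf.data.AUTOTUNE)
    )
```

With something like this, the training task would call `make_dataset(gcs_shard_pattern("my-bucket", "train"))` (bucket and prefix are placeholders) and pass the result to `model.fit`, so only a few batches are held in memory at any time.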

  				
Posted one year ago by EnthusiasticShrimp49

That's not that much. You can use the AWS autoscaler and provision a spot g4dn GPU instance with a bit more disk. This should cost you less than 50 cents an hour.

  				
Posted one year ago by EnthusiasticShrimp49
