Hi everyone,

I’m trying to create a pipeline from tasks without uploading the data to the ClearML server, because the dataset is large and I run into memory issues. Instead, I want to use TensorFlow prefetch to load small batches during training. Is this possible? If so, is there an example that uses tf prefetch?

Posted 3 months ago

Thank you. The data is stored in a GCP bucket and consists of about 4k images of 640x640. I’m also using the hosted ClearML service.

Posted 3 months ago

That's not that much. You can use the AWS autoscaler and provision a spot g4dn GPU instance with a bit more disk. This should cost you less than 50 cents an hour.

Posted 3 months ago

Can’t I use GCP?

Posted 3 months ago

Yes, works with GCP too

Posted 3 months ago

Thank you

Posted 3 months ago

Hey Yasir, to use TensorFlow prefetch, your data needs to be (1) chunked and (2) stored on a server, bucket, or network-attached file system. Unless both conditions are met, TF prefetch won't help you.
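
For reference, here is a minimal sketch of what a prefetching input pipeline could look like with `tf.data`. It assumes the images are JPEGs stored one per file (so the data is already chunked at the file level) and that your TensorFlow install can read the storage location directly. The names `load_image` and `build_dataset` are made up for illustration and are not part of ClearML or TensorFlow.

```python
import tensorflow as tf

IMG_SIZE = 640  # image side length (assumption: all images are resized to this)


def load_image(path):
    """Read one JPEG file and return a float32 tensor scaled to [0, 1]."""
    raw = tf.io.read_file(path)
    img = tf.io.decode_jpeg(raw, channels=3)
    img = tf.image.resize(img, [IMG_SIZE, IMG_SIZE])  # returns float32
    return img / 255.0


def build_dataset(file_pattern, batch_size=16):
    """Stream images matching file_pattern in small batches."""
    # Only file names are listed up front; pixels are read lazily.
    files = tf.data.Dataset.list_files(file_pattern, shuffle=True)
    ds = files.map(load_image, num_parallel_calls=tf.data.AUTOTUNE)
    ds = ds.batch(batch_size)
    # Prepare upcoming batches in the background while the current one trains.
    return ds.prefetch(tf.data.AUTOTUNE)
```

You would then call something like `build_dataset("gs://your-bucket/images/*.jpg")` (a placeholder path) and pass the result to `model.fit`. Only a few batches are held in memory at any time, so the full dataset never needs to be loaded or uploaded at once.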

How large is the dataset we're talking about?

Posted 3 months ago
