So when you say the files are deleted, how can you tell? Where did you look for them?
Sorry. I probably misunderstood you. I just downloaded the clearml-agent package to my machine and ran the agent with the following command: python -m clearml_agent daemon --queue default dinara --docker --detached
ClearML agent will delete all datasets
I'm not sure I understand how you've run the agent...
SuccessfulKoala55
I initialized the task with Python:
from clearml import Task

task = Task.init(project_name=args.project_name, task_name=args.task_name)
and downloaded a set of datasets later in the code:
import clearml

for dataset_name in datasets_list:
    clearml_dataset = clearml.Dataset.get(dataset_project=dataset_project, dataset_name=dataset_name)
    clearml_dataset_path = clearml_dataset.get_local_copy()
Then I go through the resulting directories looking for the files I need and pass their paths to a PyTorch Dataset object. If the run fails somewhere later, I want to keep the downloaded datasets.
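For reference, a minimal sketch of how the loop above could copy each dataset out of the ClearML cache into a folder the agent won't evict. The persistent_datasets folder and the placeholder project/dataset names are illustrations, not from the thread; get_mutable_local_copy() copies the files to a target folder instead of returning a path inside the cache.

import os
import clearml

datasets_list = ["dataset_a", "dataset_b"]   # placeholders for the real dataset names
dataset_project = "my_project"               # placeholder for the real project name
PERSISTENT_ROOT = os.path.expanduser("~/persistent_datasets")  # any folder outside the cache

dataset_paths = {}
for dataset_name in datasets_list:
    clearml_dataset = clearml.Dataset.get(dataset_project=dataset_project, dataset_name=dataset_name)
    target = os.path.join(PERSISTENT_ROOT, dataset_name)
    if not os.path.isdir(target):
        # Copy the files out of the cache into a folder a later run cannot delete
        clearml_dataset.get_mutable_local_copy(target_folder=target)
    dataset_paths[dataset_name] = target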
Hi ExcitedSeaurchin87, I think the files are being downloaded to the cache, and the cache simply overwrites older files. How are you running the agent exactly?
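If it is the SDK cache evicting older entries, one knob to check is the cache-manager size in clearml.conf. The key name below is assumed from the default config template, so verify it against your SDK version:

sdk {
    storage {
        cache {
            default_base_dir: "~/.clearml/cache"
            # default is ~100 cached entries; raising it keeps older dataset copies around longer
            default_cache_manager_size: 500
        }
    }
}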
From an efficiency perspective, we should be pulling data as we feed it into training. That said, it's always a good idea to uncompress large zip files and store them as smaller ones that you can batch-pull for training (roughly like the sketch below).
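A minimal sketch of the "pull as you feed" idea on the training side. The *.pt file pattern and torch.load format are assumptions for illustration; the point is that only paths are listed up front and each file is read when the DataLoader asks for it.

import glob
import os

import torch
from torch.utils.data import Dataset


class LazyFileDataset(Dataset):
    """Loads one file per item, so data is pulled only as training consumes it."""

    def __init__(self, root_dirs):
        # Collect file paths up front; file contents are read lazily in __getitem__
        self.files = []
        for root in root_dirs:
            self.files.extend(sorted(glob.glob(os.path.join(root, "**", "*.pt"), recursive=True)))

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        # Each sample is read from disk only when requested
        return torch.load(self.files[idx])


# Usage: pass the local dataset copies returned by get_local_copy(), e.g.
# loader = torch.utils.data.DataLoader(LazyFileDataset(dataset_paths), batch_size=32, num_workers=4)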
Yes, that's correct. I don't want to re-download datasets because of their large size.
ExcitedSeaurchin87, Hi 🙂
I think it's correct behavior - you wouldn't want leftover files flooding your computer.
Regarding preserving the datasets - I'm guessing you're doing the pre-processing & training in the same task, so if the training fails you don't want to re-download the data?