Hi all! I am currently using a self-hosted ClearML server and was looking to integrate the ClearML Agent to make better usage of our HPC resources with GPU autoscaling. I am aware that ClearML already supports AWS autoscaler (in the Pro tier), but my tea

Hi HighCoyote66

However, we need to allocate resources to ourselves manually, using an srun command or sbatch

Long story short, there is a full Slurm integration: you push a job into the ClearML queue and it produces a Slurm job that uses the agent to set up the venv/container and run your Task. However, this is only part of the enterprise version 😞
You can, however, do the following (note that this is pseudocode; I probably have a typo in the srun command):

Clone your Task in the UI
Copy the new Task ID
srun clearml-agent execute --id <task-id-here>

This will use Slurm to allocate the job and clearml-agent to set up the environment automatically and run your code (with the ability to override arguments from the UI, as you would regularly). The missing part is, of course, the integration with the queue system and the automation (which unfortunately is not part of the open source).
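If you want to script the last step instead of typing it by hand, something like the following might work as a starting point. It is only a rough sketch: the helper names (build_srun_command, run_task) are made up for illustration, and the --gres=gpu:N flag assumes your cluster exposes GPUs through Slurm GRES, which may not match your site configuration. The only part taken from the steps above is the clearml-agent execute --id call.

```python
import shlex
import subprocess


def build_srun_command(task_id, gpus=1, extra_srun_args=()):
    """Build the argv list that runs one ClearML Task under srun.

    build_srun_command is a hypothetical helper, not part of ClearML or Slurm.
    """
    if not task_id or task_id.startswith("<"):
        raise ValueError("pass a real Task ID copied from the UI")
    return [
        "srun",
        # Assumption: GPUs are requested through GRES on this cluster.
        f"--gres=gpu:{gpus}",
        # Any site-specific srun options (partition, time limit, ...).
        *extra_srun_args,
        # The command from the steps above.
        "clearml-agent", "execute", "--id", task_id,
    ]


def run_task(task_id, dry_run=True, **kwargs):
    """Print the command (dry run) or hand it to Slurm and wait for it."""
    cmd = build_srun_command(task_id, **kwargs)
    if dry_run:
        print(shlex.join(cmd))
        return None
    # check=True raises CalledProcessError if srun or the agent fails.
    return subprocess.run(cmd, check=True).returncode


if __name__ == "__main__":
    # Placeholder ID for illustration only; use the ID of your cloned Task.
    run_task("0123456789abcdef", dry_run=True)
```

Run with dry_run=True first to check the generated command against what you would normally type, then switch to dry_run=False. This still does not pull anything from a ClearML queue; it only wraps the manual srun call.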

Posted one year ago

AgitatedDove14
