Hi UnsightlyLion90
from my understanding, the agent does the job of SLURM,
That is kind of correct (they overlap in some ways :) )
Any guide on how to integrate both of them?
The easiest way is to just add the "Task.init()" call to your code and use SLURM to schedule the job. This will make sure all jobs are fully logged (this can also include automatically uploading models, artifact support, etc.).
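To make that concrete, here is a minimal sketch of what the top of a training script might look like when SLURM does the scheduling and ClearML only does the logging. The project and task names and the helper names (`slurm_context`, `start_logging`) are made up for illustration, and recording the SLURM environment variables on the task is an optional extra, not something ClearML requires.

```python
import os


def slurm_context(env=None):
    """Collect a few standard SLURM environment variables, if they are set."""
    env = os.environ if env is None else env
    keys = ("SLURM_JOB_ID", "SLURM_JOB_NAME", "SLURM_JOB_NODELIST")
    return {key.lower(): env[key] for key in keys if key in env}


def start_logging(project_name="slurm-examples", task_name="train"):
    """Create a ClearML task for this run (names here are placeholders)."""
    # Imported lazily so the helper above works even without clearml installed.
    from clearml import Task

    # Task.init() registers the run and starts the automatic logging.
    task = Task.init(project_name=project_name, task_name=task_name)

    # Optional assumption: attach the SLURM job info as a parameter section.
    slurm = slurm_context()
    if slurm:
        task.connect(slurm, name="slurm")
    return task
```

Calling `start_logging()` once at the start of the script submitted with `sbatch` should be enough; the rest of the training code stays as it is.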
Full SLURM support (i.e. similar to the k8s glue support) is currently out of scope, but I'm pretty sure the enterprise version includes support for it.
wdyt?
Hi Martin, thank you for your reply! Although I have not dug into the docs, I can imagine three ways to run the jobs:
1. slurm (get access to a compute node) -> clearml (bash script, to make a copy of the source code and build the virtual environment) -> python script
2. clearml (bash script) -> slurm -> python scripts
3. slurm -> python script with clearml API
May I say you suggested the third way? If so, would I get the benefit of clearml taking care of my project (logging the git commit and a copy of the source code)?
I think so. When you say "clearml (bash script..." you basically mean "take my code + packages and run it", correct?
Yes, for now I have a bash script that makes a snapshot of the source code and all the config files at the time I submit a slurm job; when the job runs some time later, it uses that snapshot. I hope clearml can do this for me.
That should work :)
BTW, you might play around with "clearml-agent execute --id <task_id_here>"
This will basically clone the code, create a venv with the python packages, apply uncommitted changes, and run the actual code. This could be a replacement for your bash script. (Notice it means that you need to clone the Task in the UI, then you can change parameters, then run the agent manually in SLURM, and it will take the params from the UI.)
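As a rough sketch of the "run the agent manually in SLURM" step, the snippet below builds a small batch script around that command and pipes it to `sbatch`. The helper names and the job name are hypothetical, the task ID is whatever the cloned Task shows in the UI, and it assumes `clearml-agent` is installed and configured on the compute node.

```python
import shlex
import subprocess


def build_sbatch_script(task_id, job_name="clearml-task"):
    """Return a batch script that runs one ClearML task via the agent."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        # The agent clones the code, builds the venv and runs the task.
        f"clearml-agent execute --id {shlex.quote(task_id)}",
    ]
    return "\n".join(lines) + "\n"


def submit(task_id):
    """Submit the script; sbatch reads it from stdin when no file is given."""
    script = build_sbatch_script(task_id)
    result = subprocess.run(
        ["sbatch"], input=script, text=True, check=True, capture_output=True
    )
    return result.stdout
```

For example, `print(build_sbatch_script("<task_id_here>"))` shows the script without submitting anything, which is handy for checking it before calling `submit(...)`.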
I see, so that way I do not use clearml's queue; instead I ask clearml to run the code immediately when the slurm job begins. Correct?