What setting do you have in this section of your clearml.conf?
None
2024-02-08 11:23:52,150 - clearml.storage - ERROR - Failed creating storage object
Reason: Missing key and secret for S3 storage access (
)
This looks unrelated to the hotfix. It looks like you misconfigured something and are therefore failing to write to S3.
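For reference, the "Missing key and secret" error usually points at the `sdk.aws.s3` section of clearml.conf. Here is a minimal sketch of the kind of check I mean, assuming you have already loaded that section into a plain dict. The helper name `missing_s3_fields` and the sample values are hypothetical, and it only looks at the global `key` and `secret` fields, not per-bucket entries or any other credential source you may have configured.

```python
def missing_s3_fields(s3_section):
    """Return the credential fields that are absent or empty in an sdk.aws.s3-style dict."""
    required = ("key", "secret")
    return [name for name in required if not s3_section.get(name)]


# Hypothetical values standing in for a parsed sdk.aws.s3 section
print(missing_s3_fields({"key": "", "secret": "", "region": "us-east-1"}))
```

If this returns a non-empty list for your config, that would be consistent with the error in the log.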
I'm afraid that would be the best method. You could probably hack something into the ClearML SDK yourself, since it's open source.
Hi @<1708653001188577280:profile|QuaintOwl32> , you can set some default image to use. My default for most jobs is nvcr.io/nvidia/pytorch:23.03-py3
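One way to set that default is to pass the image when starting the agent in docker mode. A rough sketch that only builds the command line and does not run anything: the helper name `agent_command` and the queue name `default` are assumptions for illustration, so adjust them to your setup.

```python
def agent_command(queue, image):
    """Build the argv for starting a clearml-agent daemon in docker mode with a default image."""
    return ["clearml-agent", "daemon", "--queue", queue, "--docker", image]


# "default" is an assumed queue name; the image is the one mentioned above
cmd = agent_command("default", "nvcr.io/nvidia/pytorch:23.03-py3")
print(" ".join(cmd))
```

Tasks that specify their own image should still use it; the image passed here is only the fallback.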
I think for this you would need to report this manually. You can extract all of this data using the API and then create custom plots/scalars that you can push into reports for custom dashboards 🙂
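A minimal sketch of what that could look like with the Python SDK, assuming the scalars come back as a nested dict of the form `{title: {series: {"x": [...], "y": [...]}}}`. The helper names `last_values` and `push_summary` are hypothetical, and `push_summary` needs `clearml` installed and configured, so treat it as a starting point rather than a tested recipe.

```python
def last_values(scalars):
    """Collapse {title: {series: {"x": [...], "y": [...]}}} into {(title, series): last y}."""
    summary = {}
    for title, series_map in scalars.items():
        for series, points in series_map.items():
            ys = points.get("y", [])
            if ys:  # skip series with no reported points
                summary[(title, series)] = ys[-1]
    return summary


def push_summary(source_task_id, report_task):
    """Copy the last value of each scalar series from one task into another task's scalars.

    Assumes clearml is installed and configured; imported lazily so the
    pure helper above can be used without it.
    """
    from clearml import Task

    scalars = Task.get_task(task_id=source_task_id).get_reported_scalars()
    logger = report_task.get_logger()
    for (title, series), value in last_values(scalars).items():
        logger.report_scalar(title=title, series=series, value=value, iteration=0)
```

The scalars reported on the summary task can then be embedded in a report as a custom dashboard.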
Can you add a screenshot of how you see them currently?
Hi @<1569133676640342016:profile|MammothPigeon75> , I believe the SLURM integration you described is supported in the ClearML Scale/Enterprise versions
I'm not sure I understand. Can you give a specific example of what you have VS what you'd like it to be?
Hi @<1649946171692552192:profile|EnchantingDolphin84> , it's not a must but it would be the suggested approach 🙂
You need to follow the instructions here - None
Hi! Hmmm, good question. I think it's asynchronous since most of the uploading processes are usually async. Is there a specific use case you're thinking of?
From the screenshots provided, you ticked 'cpu' mode, and I think the machine you're using, n1-standard-1, is a CPU-only machine, if I'm not mistaken.