I'm not sure. Maybe @<1523701087100473344:profile|SuccessfulKoala55> can help 🙂
Hi @<1531807732334596096:profile|ObliviousClams17> , I think for your specific use case it would be easiest to use the API - fetch a task, clone it as many times as needed and enqueue it into the relevant queues.
Fetch a task - None
Clone a task - None
Enqueue a task (or many) - [None](https://clear.ml/docs/latest/docs/references/api/ta...
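For reference, the same fetch / clone / enqueue flow can also be sketched with the Python SDK instead of raw REST calls. This is only a sketch under assumptions: `clone_and_enqueue`, the clone naming scheme and the queue names are made up for illustration, and the `Task` class is passed in as a parameter (rather than imported at the top) so the snippet can be read and tried without a server.

```python
def clone_and_enqueue(task_cls, template_task_id, queue_names):
    """Clone one template task once per queue and enqueue each clone.

    task_cls is expected to behave like clearml.Task, i.e. to expose
    get_task(), clone() and enqueue().
    """
    # Fetch the template task by its ID
    template = task_cls.get_task(task_id=template_task_id)

    clones = []
    for index, queue_name in enumerate(queue_names):
        # Clone the template; the naming scheme here is just an illustration
        cloned = task_cls.clone(
            source_task=template,
            name=f"{template.name} clone {index}",
        )
        # Send the clone to the queue an agent is listening on
        task_cls.enqueue(task=cloned, queue_name=queue_name)
        clones.append(cloned)
    return clones


# Hypothetical usage (needs the clearml package and a configured server):
#   from clearml import Task
#   clone_and_enqueue(Task, "<template task id>", ["queue_a", "queue_b"])
```

Passing the same queue name several times should give several clones in that one queue.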
SubstantialElk6 , you can find some neat examples here:
https://github.com/allegroai/clearml/tree/master/examples/pipeline
What is the exact python version you're trying to run on?
Hi SubstantialElk6 ,
From a quick glance I don't see any abilities not covered. Is there some specific capability you're looking for?
VexedCat68 , you can iterate through all 'running' tasks in a project and abort them through the api. The endpoint is tasks.stop
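Something along these lines might work as a starting point. It is a sketch, not a tested recipe: `stop_running_tasks` is a hypothetical helper, the client is assumed to behave like the SDK's `APIClient`, and 'running' is assumed to map to the `in_progress` status. Results from `tasks.get_all` may be paged, so a large project could need more than one call.

```python
def stop_running_tasks(client, project_id):
    """Stop every task in the project that is currently in progress.

    client is expected to behave like clearml's APIClient, i.e. to expose
    client.tasks.get_all() and client.tasks.stop().
    """
    # Query only tasks of this project that are currently running
    running = client.tasks.get_all(project=[project_id], status=["in_progress"])

    stopped_ids = []
    for task in running:
        # tasks.stop is the endpoint mentioned above
        client.tasks.stop(task=task.id)
        stopped_ids.append(task.id)
    return stopped_ids


# Hypothetical usage (needs the clearml package and a configured server):
#   from clearml.backend_api.session.client import APIClient
#   stop_running_tasks(APIClient(), "<project id>")
```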
Hi @<1523721697604145152:profile|YummyWhale40> , what if you specify the output_uri through the code in Task.init()?
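To make that concrete, setting it in code would look roughly like this. The project name, task name and bucket path are placeholders, `init_task_with_output_uri` is a hypothetical wrapper, and the `Task` class is passed in only so the snippet can be tried without a server.

```python
def init_task_with_output_uri(task_cls, output_uri):
    """Initialise a task and point its outputs at output_uri.

    task_cls is expected to behave like clearml.Task.
    """
    return task_cls.init(
        project_name="examples",        # placeholder project name
        task_name="output_uri check",   # placeholder task name
        output_uri=output_uri,          # destination for models/artifacts
    )


# Hypothetical usage (needs the clearml package and a configured server):
#   from clearml import Task
#   task = init_task_with_output_uri(Task, "s3://my-bucket/clearml")
```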
Hi @<1569133683275730944:profile|CrabbyDove13> , the PyCharm plugin is for working with remote environments. I don't think you need it with VSCode, since this capability is covered by clearml-session
Hi @<1695969549783928832:profile|ObedientTurkey46> , do you have a code snippet that reproduces this behaviour?
Is this all happening when you're running locally? How many GPUs do you have/try to run on? Also, can you provide an example code snippet that runs something basic and produces a similar failure? I think I have a machine with multiple GPUs that I can try playing on 🙂
Cool, thanks for the info! I'll try to play with it as well 🙂
Hi @<1610445887681597440:profile|WittyBadger59> , how are you reporting the plots?
I would suggest taking a look here and running all the different examples to see the reporting capabilities:
None
JitteryCoyote63 , if you go to a completed experiment, do you only see the package installation stage in the log?
What OS/ClearML-Agent are you running?
I might not be able to get to that but if you create an issue I'd be happy to link or post what I came up with, wdyt?
Taking a look at your snippet, I wouldn't mind submitting a PR for such a cool feature 🙂
Then just use export
Hi @<1523701717097517056:profile|ScantMoth28> , what version of ClearML are you using? Are you using a self hosted server or the community one?
Hi @<1523701491863392256:profile|VastShells9> , I would suggest the following form to contact ClearML - None
@<1529271085315395584:profile|AmusedCat74> , what happens if you try to run it with clearml 1.8.0?
TartSeagull57 , I couldn't make the sample you gave me work 😞
Can you please provide a self contained example that would reproduce the issue?
How did you set the output URI?
Do try with the port through
Hi @<1665891247245496320:profile|TimelyOtter30> , not sure I follow. It looks like a misconfiguration. I think you need to see the correct settings here: None , also note the direct reference to minio 🙂
Hi @<1618418423996354560:profile|JealousMole49> , I'm afraid there is no such capability at the moment. Basically metrics mean any metadata that was saved (scalars, logs, plots etc). You can delete some log/metric heavy experiments/tasks/datasets to free up some space. Makes sense?
What is the setup that they do the training on?
Can you provide a self-contained snippet that reproduces this behavior?
WackyRabbit7 , I am noticing that the files are saved locally. Is there any chance that the files are overwritten during the run, or get deleted at some point and then replaced?
Also, is there a reason the files are being saved locally and not at the fileserver?
I couldn't manage to reproduce it on my end. But also in my cases it always saves the files to the fileserver. So I'm curious what's making it save locally in your case
If it works on two computers and one computer is having problems, then I'd suspect some issue with that computer itself. Maybe permissions or network issues