SuccessfulKoala55 Am I doing/saying something wrong regarding the problem of flushing every 5 secs? (See my previous message)
Never mind, the nvidia-smi command fails in that instance; the problem lies somewhere else
Setting redis from version 6.2 to 6.2.11 fixed it, but I have new issues now 😄
then print(Task.get_project_object().default_output_destination) still shows the old value
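To be concrete, this is roughly the sequence I mean (an untested sketch mirroring the calls from my messages; project/task names are placeholders):
```
from clearml import Task

# Placeholder project/task names, just to get a task context
task = Task.init(project_name="examples", task_name="output destination check")

# Clear the project-level default output destination (as in my other messages)
Task.get_project_object().default_output_destination = None

# ...but reading it back still shows the old value
print(Task.get_project_object().default_output_destination)
```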
Disclaimer: I didn’t check that this reproduces the bug, but those are all the components that should reproduce it: a for loop creating figures and clearml logging them
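Something along these lines (untested; project/task names are placeholders) is what I have in mind:
```
import matplotlib.pyplot as plt
from clearml import Task

task = Task.init(project_name="examples", task_name="figure loop repro")  # placeholder names
logger = task.get_logger()

for i in range(20):
    fig, ax = plt.subplots()
    ax.plot(range(10), [x * i for x in range(10)])
    # Report the figure explicitly; clearml can also auto-capture plt.show()
    logger.report_matplotlib_figure(title="loop figure", series="series", figure=fig, iteration=i)
    plt.close(fig)
```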
Ok, in that case it probably doesn’t work, because if the default value is 10 secs, it doesn’t match what I see in the experiment logs: tqdm adds a new line every second
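For reference, this is how I’d try pinning the flush period explicitly to check (set_flush_period is my guess at the relevant call; project/task names are placeholders):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="flush period check")  # placeholder names
logger = task.get_logger()

# If I read the docs right, this sets how often buffered reports/console output are flushed (seconds)
logger.set_flush_period(10.0)
```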
Thanks for sharing the issue UnevenDolphin73, I’ll comment on it!
I’ve set dynamic: “strict” in the template of the logs index and I was able to keep the same mapping after doing the reindex
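Roughly what I did, as a sketch with the Python ES client (assuming the 7.x elasticsearch client; the template name, index pattern and mappings here are placeholders from memory, not the exact ones):
```
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Update the template of the logs index so new fields are not added dynamically;
# "events_log" and the index pattern are my guesses at the names, and the real
# field definitions from the existing template go under "properties"
es.indices.put_template(
    name="events_log",
    body={
        "index_patterns": ["events-log-*"],
        "mappings": {
            "dynamic": "strict",
            "properties": {},  # keep the existing field definitions here
        },
    },
)
```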
Hi CostlyOstrich36, this weekend I took a look at the diffs with the previous version ( https://github.com/allegroai/clearml-server/compare/1.1.1...1.2.0# ) and I saw several changes related to the scrolling/logging:
apiserver/bll/event/log_events_iterator.py
apiserver/bll/event/events_iterator.py
apiserver/config/default/services/_mongo.conf
apiserver/database/model/base.py
apiserver/services/events.py
I suspect that one of these changes might be responsible ...
SuccessfulKoala55 Thanks to that I was able to identify the most expensive experiments. How can I count the number of documents for a specific series? I.e. I suspect that the loss, which is logged every iteration, is responsible for most of the logged documents, and I want to make sure of that
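Concretely, I imagine something like this against the ES backend (assuming the 7.x elasticsearch client; the index pattern and field names are my guesses at the events schema, and the task id is a placeholder):
```
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Count scalar events for one task and one metric; "task" and "metric"
# as field names are assumptions about the events index mapping
result = es.count(
    index="events-training_stats_scalar-*",
    body={
        "query": {
            "bool": {
                "must": [
                    {"term": {"task": "<task_id>"}},
                    {"term": {"metric": "loss"}},
                ]
            }
        }
    },
)
print(result["count"])
```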
Hi CostlyOstrich36, I am not using Hydra, only OmegaConf, so you mean just calling OmegaConf.load should be enough?
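I.e. something as simple as this (the path is a placeholder):
```
from omegaconf import OmegaConf

# Load the YAML config directly, no Hydra involved
cfg = OmegaConf.load("config.yaml")
print(OmegaConf.to_yaml(cfg))
```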
I could delete the files manually with sudo rm (sudo is required, otherwise I get Permission Denied)
I have the same problem, not only with subprojects but with all projects: I get this blank overview tab as shown in the screenshot. It only worked for one project, which I created one or two weeks ago under 0.17
to pass secrets to each experiment
you mean to run it on the CI machine ?
yes
That should not happen, no? Maybe there is a bug that needs fixing in clearml-agent?
It’s just to test that the logic executed in if not Task.running_locally() is correct
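I.e. the pattern under test looks roughly like this (project/task names are placeholders):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="remote-only logic")  # placeholder names

if not Task.running_locally():
    # Only executed when the task runs under a clearml-agent,
    # e.g. to read secrets injected into the agent's environment
    pass
```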
I’d like to move to a setup where I don’t need these tricks
AgitatedDove14 I think it’s on me to take the pytorch distributed example in the clearml repo and try to reproduce the bug, then pass it over to you 🙂
Mmmh, probably yes. I can’t say for sure (because I don’t remember precisely when I upgraded to 0.17), but it looks like that’s the case
That said, v1.3.1 is already out, with what seems like a fix:
So you mean 1.3.1 should fix this bug?
So get_registered_artifacts() only works for dynamic artifacts, right? I am looking for a download_artifacts() which would allow me to retrieve the static artifacts of a Task
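For context, what I’d like is essentially this (task id and artifact name are placeholders; maybe artifacts[...].get_local_copy() already covers it):
```
from clearml import Task

# Fetch the task and download one of its static artifacts locally
task = Task.get_task(task_id="<task_id>")
local_path = task.artifacts["my_artifact"].get_local_copy()
print(local_path)
```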
It failed as well
I assume you’re using a self-hosted server?
Yes
Task.get_project_object().default_output_destination = None
The task I cloned from is not the one I thought