I restarted it after I got the errors, because as everyone knows, turning it off and on usually works 🙂
Python 3.8.8, clearml 1.0.2
Thanks a lot. But even as a user, I cannot set a default for all projects, right?
Do you mean venv_update ?
It could be that the clearml-server behaves badly while cleanup is ongoing, or even after it finishes.
If you compare the two outputs I posted at the top of this thread (one from local execution, the other from remote execution), it looks like the command is different and wrong on the remote side.
The default behavior mimics Python's assert statement: validation is on by default, but is disabled if Python is run in optimized mode (via python -O). Validation may be expensive, so you may want to disable it once a model is working.
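A minimal sketch of that on/off behavior. The function names here are hypothetical, not from any library; the point is that `assert` statements are compiled out when Python runs with `-O`, and a library can mirror that via the built-in `__debug__` flag:

```python
# __debug__ is True in a normal interpreter and False under `python -O`;
# assert statements are removed entirely in the latter case.

def validate(x):
    # This check runs in normal mode but is skipped under `python -O`.
    assert x >= 0, "x must be non-negative"
    return x

# A library can implement the same switch explicitly instead of relying
# on bare asserts (hypothetical names):
VALIDATION_ENABLED = __debug__  # False when run with python -O

def check(cond, msg):
    # Only raises while validation is enabled.
    if VALIDATION_ENABLED and not cond:
        raise ValueError(msg)
```

So the same script silently skips these checks when launched as `python -O train.py`.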
As in if it was not empty it would work?
Well, after restarting the agent (to set it into --detached mode), it set cleanup_task.py into service mode, but my monitoring tasks are just executed on the agent itself (no new service clearml-agent is started) and then they are aborted right after starting.
Works with 1.4. Sorry for not checking versions myself!
When you say it is an SDK parameter, does that mean I only have to specify it on the computer where I start the task from? So a clearml-agent would read this parameter from the task itself.
This is my environment, installed from an env file. Training works just fine here:
Oh, you are right. I did not think this through... Implementing this properly gets too enterprisey for me, so I'll just leave it for now :D
btw: I am pretty sure this used to work, but then stopped working some time ago.
Thank you very much!
No reason in particular. How many people work at http://allegro.ai?
I will create a minimal example.
Yea, tensorboardX is using moviepy.
I have a carla.egg file on my local machine and on the worker, which I include with sys.path.append before I can do import carla. It is the same procedure on my local machine and on the clearml-agent worker.
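For context, a minimal sketch of that procedure; the egg path below is an assumption, adjust it to wherever carla.egg actually lives on each machine:

```python
import os
import sys

# Hypothetical location of the egg; the same pattern is used on the
# local machine and on the clearml-agent worker.
egg_path = os.path.expanduser("~/carla/carla.egg")

# Make the egg importable before the import statement runs.
if egg_path not in sys.path:
    sys.path.append(egg_path)

# import carla  # would now resolve against the .egg (if it exists there)
```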
Yea, when the server handles the deletes, everything's fine, and imo that is how it should always have been.
I don't think it is a viable option. You are looking at the best case, but I think you should expect the worst from the users 🙂 Also, I would rather know there is a problem and have some clutter than hide it and never be able to fix it, because I cannot identify which artifacts are still in use without spending a lot of time comparing artifact IDs.
Specific step in the pipeline. The main step (the experiment) is currently just a file with a `Task.init` and then the experiment code. I am wondering how to modify this code so that it can run in the pipeline or standalone.
Nvm, I think it's my mistake. I will investigate.
I see. Thank you very much. For my current problem, giving priority according to queue priority would mostly solve it. For experimentation I will sometimes enqueue a task and then later enqueue another one of a different kind, and even though this could be trivially scheduled, I have to wait for the first one to finish. I guess this is only a problem for people with small "clusters" where SLURM does not make sense, but no scheduling at all is also suboptimal.
However, I...
But I do not have anything linked correctly, since I rely on conda installing cuda/cudnn for me.
Maybe if you have time you can take a look at the log I posted in the beginning. I think I have the same extra_index_url and the nightly flag activated 🙂