I have them in two different places, once under Hyperparameters -> General
Hahahah thanks for the help SuccessfulKoala55 & CostlyOstrich36
I really do feel it would be nice to be able to easily configure the Cleanup Service to clean up only specific projects / tasks, as it's a common use case to have a project dedicated to debugging and the like
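A rough sketch of the kind of selection such an option might apply, assuming task records carry a project name and a last-update timestamp. `select_for_cleanup`, the record fields, and the 30-day threshold are all hypothetical and not part of the Cleanup Service itself:

```python
from datetime import datetime, timedelta, timezone


def select_for_cleanup(tasks, projects, max_age_days, now=None):
    """Return ids of tasks in `projects` last updated more than `max_age_days` ago."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        t["id"]
        for t in tasks
        if t["project"] in projects and t["last_update"] < cutoff
    ]


# Hypothetical task records; in practice these would be fetched from the server.
now = datetime(2021, 10, 11, tzinfo=timezone.utc)
tasks = [
    {"id": "a1", "project": "debug", "last_update": now - timedelta(days=40)},
    {"id": "b2", "project": "debug", "last_update": now - timedelta(days=2)},
    {"id": "c3", "project": "production", "last_update": now - timedelta(days=90)},
]

# Only old tasks from the dedicated debugging project are selected.
stale = select_for_cleanup(tasks, projects={"debug"}, max_age_days=30, now=now)
print(stale)
```

Keeping the selection separate from the actual deletion makes it easy to dry-run the filter before anything is removed.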
First of all I wasn't aware that was an option - but I think it's preferable to be able to do it through the command line. Because I'm developing the pipeline to be executed remotely, but for debugging I run it locally.
Using what you showed I can obviously write it, and delete it once it is ready, and rewrite it when I'm debugging or adding features - but I think DX-wise it would be nicer to be able to trigger this functionality through the command line
I'm saying that because in the task under "INSTALLED PACKAGES" this is what appears
2021-10-11 10:07:19 ClearML results page:
2021-10-11 10:07:20
Traceback (most recent call last):
  File "tasks/hpo_n_best_evaluation.py", line 256, in <module>
    main(args, task)
  File "tasks/hpo_n_best_evaluation.py", line 164, in main
    trained_models = get_models_from_task(task=hpo_task)
  File "tasks/hpo_n_best_evaluation.py", line 72, in get_models_from_task
    with open(pickle_path, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/elior/.clearml/c...
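For what it's worth, a guard around the `open(pickle_path, 'rb')` call could make this failure easier to diagnose. This is only a sketch: `load_pickle` is a hypothetical helper, not the code from `hpo_n_best_evaluation.py`, and it assumes the path is expected to point at a local file:

```python
import pickle
import tempfile
from pathlib import Path


def load_pickle(pickle_path):
    """Load a pickled object, failing early with a clearer message if the file is gone."""
    path = Path(pickle_path).expanduser()
    if not path.is_file():
        raise FileNotFoundError(
            f"Expected a pickled artifact at {path}; "
            "the local copy may have been removed or never downloaded."
        )
    with path.open("rb") as f:
        return pickle.load(f)


# Small demonstration using a temporary directory instead of a real cache folder.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / "models.pkl"
    good.write_bytes(pickle.dumps({"model": "demo"}))
    loaded = load_pickle(good)

    try:
        load_pickle(Path(tmp) / "missing.pkl")
        missing_error = None
    except FileNotFoundError as exc:
        missing_error = str(exc)
```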
🤔 is the "installed packages" part editable? good to know
Isn't it a bit risky to manually change a package version? What if it isn't compatible with the rest?
Maybe something similar to Docker, where I could name each one of my trains agents and then refer to them by name, something like
trains-agent daemon --name agent_1 ...
Then trains-agent stop/start
I've dealt with this earlier today because I set up 2 agents, one for each GPU on a machine, and after editing configurations I wanted to restart only one of them (because the other was working) and then I noticed I don't know which one to kill
I guess the AMI auto updated
Well done to you!
Cool - what kind of objects are returned by .artifacts.__getitem__? I want to check their docs
Let's take a step back. Let's remove the clearml-services from the docker compose for a second, and run it manually (then you can control everything). Once you have it running manually, let's try to replicate the setup back to the docker compose, make sense ?
I'd prefer not to docker-compose down, as researchers are actively working on it. What do you say I manually kill the services agent and launch one myself?
That is not very informative
I'll just exclude .cfg files from the deletion. My question is how to recover: must I recreate the agents, or is there another way?
this is the full one TimelyPenguin76
Yep, the trains server is basically a docker-compose based service.
All you have to do is change the ports in the docker-compose.yml file.
If you followed the instructions in the docs you should find that file in /opt/trains/docker-compose.yml and then you will see that there are multiple services (apiserver, elasticsearch, redis etc.) and in each there might be a section called ports which then states the mapping of the ports.
The number on the left, is ...
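As a small illustration of editing such a mapping: Docker Compose's short port syntax is written HOST:CONTAINER, so changing the externally exposed port means changing only the first number. The helper below is a hypothetical sketch that handles just that two-part form, and the port values are illustrative rather than copied from the actual file:

```python
def remap_host_port(mapping, new_host_port):
    """Return a 'HOST:CONTAINER' mapping with the host side replaced.

    Only the plain two-part form is handled; anything else raises ValueError.
    """
    parts = mapping.split(":")
    if len(parts) != 2:
        raise ValueError(f"expected 'HOST:CONTAINER', got {mapping!r}")
    _old_host, container = parts
    return f"{new_host_port}:{container}"


# Illustrative value: expose the same container port on host port 9008 instead.
remapped = remap_host_port("8008:8008", 9008)
print(remapped)
```

The container side stays untouched, so the services inside the compose network keep talking to each other on their original ports.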
the path to the JSON file
checking and will let you know
I'm quite confused... The package is not missing, it is in my environment and executing tasks normally (python my_script.py....) works
Wait, suddenly the UI changed to 0.16.1, seems like I was shown a cached page
Good, so if I'm templating something using clearml-task (without a queue, so the task is in draft mode) it will use this task? Even though it was never executed?