So once I enqueue it, it is up? The docs say I can configure, inside the auto scale task, the queues that the autoscaler listens to in order to spin up instances. I wanted to make sure that this config has nothing to do with where the auto scale task itself was enqueued
Manual model registration?
This is what I meant should be documented - the permissions...
First of all I wasn't aware that was an option - but I think it's preferable to be able to do it through the command line. Because I'm developing the pipeline to be executed remotely, but for debugging I run it locally.
Using what you showed I can obviously write it, and delete it once it is ready, and rewrite it when I'm debugging or adding features - but I think DX-wise it would be nicer to be able to trigger this functionality through the command line
2021-10-11 10:07:19 ClearML results page:
`
2021-10-11 10:07:20
Traceback (most recent call last):
  File "tasks/hpo_n_best_evaluation.py", line 256, in <module>
    main(args, task)
  File "tasks/hpo_n_best_evaluation.py", line 164, in main
    trained_models = get_models_from_task(task=hpo_task)
  File "tasks/hpo_n_best_evaluation.py", line 72, in get_models_from_task
    with open(pickle_path, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/home/elior/.clearml/c...
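For what it's worth, here is a minimal sketch of a guard around that `open` call, assuming the pickle is expected at a local path. `load_pickle` is a hypothetical helper, not part of ClearML or of the script in the traceback. It does not fix the missing file, it only makes the failure say whether the parent directory is there at all:

```python
import pickle
from pathlib import Path


def load_pickle(pickle_path):
    """Load a pickled object, failing with a more descriptive message.

    Hypothetical helper for illustration; not part of ClearML.
    """
    path = Path(pickle_path)
    if not path.is_file():
        # Say whether the parent directory exists, which helps tell a
        # missing download apart from a wrong file name.
        parent_state = "exists" if path.parent.is_dir() else "is missing"
        raise FileNotFoundError(
            f"No pickle at {path} (parent directory {parent_state})"
        )
    with path.open("rb") as f:
        return pickle.load(f)
```

If the parent directory is missing, the local copy was probably never downloaded on this machine; if it exists, the file name is the more likely suspect.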
Maybe something similar to Docker, so that I could name each one of my trains agents and then refer to them by name, something like
trains-agent daemon --name agent_1 ...
Then trains-agent stop/start
I've dealt with this earlier today because I set up 2 agents, one for each GPU on a machine, and after editing configurations I wanted to restart only one of them (because the other was working) and then I noticed I don't know which one to kill
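Until something like that exists, a rough sketch of how I could tell the two daemons apart by their `--gpus` value. This assumes each daemon's command line looks the way it was typed (e.g. `trains-agent daemon --gpus 0 ...`), which may not hold on every setup; `find_agent_pids` is a hypothetical helper and the PIDs below are made up:

```python
import shlex


def find_agent_pids(processes, gpus):
    """Return PIDs of agent daemons started with `--gpus <gpus>`.

    `processes` is an iterable of (pid, command_line) pairs, for example
    collected by hand from `ps -eo pid,args`. Hypothetical helper.
    """
    matches = []
    for pid, cmdline in processes:
        args = shlex.split(cmdline)
        # Only consider command lines that look like an agent daemon.
        is_agent = any("trains-agent" in arg for arg in args)
        if not is_agent or "daemon" not in args or "--gpus" not in args:
            continue
        idx = args.index("--gpus")
        if idx + 1 < len(args) and args[idx + 1] == str(gpus):
            matches.append(pid)
    return matches


# Made-up process list, for illustration only.
processes = [
    (1201, "trains-agent daemon --gpus 0 --queue default"),
    (1202, "trains-agent daemon --gpus 1 --queue default"),
    (1300, "python train.py --gpus 1"),
]
print(find_agent_pids(processes, 1))
```

That gives me the PID of the agent bound to GPU 1 only, so the other one can keep running.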
I guess the AMI auto updated
Well done to you!
Cool - what kind of objects are returned by .artifacts.__getitem__? I want to check their docs
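In the meantime, a generic way to find out which class to look up, sketched with a plain dict as a stand-in (not a real task object). `describe` is a hypothetical helper:

```python
def describe(obj):
    """Return the fully qualified class name of obj, e.g. 'builtins.dict'."""
    cls = type(obj)
    return f"{cls.__module__}.{cls.__qualname__}"


# Stand-in for whatever .artifacts["some_name"] returns; illustration only.
stand_in = {"some_name": [1, 2, 3]}
print(describe(stand_in["some_name"]))
```

Calling `help()` on the returned object then shows its docstrings directly.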
That is not very informative
I'll just exclude .cfg files from the deletion. My question is how to recover: must I recreate the agents, or is there another way?
this is the full one TimelyPenguin76
Yep, the trains server is basically a docker-compose based service.
All you have to do is change the ports in the docker-compose.yml file.
If you followed the instructions in the docs you should find that file in /opt/trains/docker-compose.yml and then you will see that there are multiple services (apiserver, elasticsearch, redis etc.) and in each there might be a section called ports which then states the mapping of the ports.
The number on the left, is ...
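A small sketch of that edit, assuming the usual docker-compose short syntax for a ports entry (HOST:CONTAINER, optionally prefixed with a host IP). `remap_host_port` is a hypothetical helper and the port numbers are only example values:

```python
def remap_host_port(mapping, new_host_port):
    """Replace the host-side port in a docker-compose short `ports` entry.

    Accepts "HOST:CONTAINER" or "IP:HOST:CONTAINER"; the container side
    is left untouched. Hypothetical helper for illustration.
    """
    if ":" not in mapping:
        raise ValueError(f"expected HOST:CONTAINER, got {mapping!r}")
    host_part, container_port = mapping.rsplit(":", 1)
    if ":" in host_part:
        # An entry such as 127.0.0.1:8080:8080 also carries a host IP.
        host_ip, _old_port = host_part.rsplit(":", 1)
        return f"{host_ip}:{new_host_port}:{container_port}"
    return f"{new_host_port}:{container_port}"


print(remap_host_port("8008:8008", 9008))
print(remap_host_port("127.0.0.1:8080:8080", 9080))
```

Only the host side changes, so the services inside the compose network keep talking to each other on the same container ports.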
the path to the JSON file
checking and will let you know
Wait, suddenly the UI changed to 0.16.1, seems like I was shown a cached page