
Yes, I have a metric I want to monitor so that I can sort my experiments by it. It is logged like this:
logger.report_scalar(title='Mean Top 4 Accuracy', series=ARGS.model, iteration=0, value=results['top_4_acc'].mean())
When looking at my dashboard this is how it looks
Could be. My point is that, in general, the ability to attach a named scalar (without an iteration/series dimension) to an experiment is valuable and basic when you want to track a metric across different experiments
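To make the request concrete, here is a minimal sketch in plain Python of what a per-experiment named scalar would allow. It makes no trains calls: the `Experiment` class, its `summary` field, `sort_by_metric`, and the accuracy values are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Experiment:
    """Hypothetical stand-in for an experiment record; not a trains class."""
    name: str
    # One value per metric name, with no iteration or series dimension.
    summary: Dict[str, float] = field(default_factory=dict)


def sort_by_metric(experiments: List[Experiment], metric: str) -> List[Experiment]:
    """Return the experiments that report `metric`, highest value first."""
    reported = [e for e in experiments if metric in e.summary]
    return sorted(reported, key=lambda e: e.summary[metric], reverse=True)


# Made-up values, purely for illustration.
experiments = [
    Experiment("resnet18", {"Mean Top 4 Accuracy": 0.81}),
    Experiment("resnet50", {"Mean Top 4 Accuracy": 0.88}),
    Experiment("baseline"),  # never reported the metric
]
ranked = sort_by_metric(experiments, "Mean Top 4 Accuracy")
print([e.name for e in ranked])
```

With one value per metric name, ranking experiments is a single sort, and experiments that never reported the metric are simply left out.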
That is not very informative
Thanks very much
Now something else is failing, but I'm pretty sure it's on my side now... So have a good day and see you in the next question 😄
I'm asking because the DSes we have work on multiple projects and have only one trains.conf file. I wouldn't want them to edit it each time they switch projects
TimelyPenguin76 I think our problem is that the agent is not using this environment, and I'm not sure which one it does use... Is there a way to hard-code the agent environment?
I assume we are talking about the IP I would find here right?
https://www.whatismyip.com/
let me try to docker-compose down --rmi all
whatttt? I looked at config_obj and didn't find any set method
I have a single IAM, my question is what kind of permissions I should associate with the IAM so that the autoscaler task will work
It's kind of random, it works sometimes and sometimes it doesn't
Actually, I was thinking about models that weren't trained using clearml, like pretrained models etc.
Maybe something similar to Docker, so that I could name each one of my trains agents and then refer to them by name, something like
trains-agent daemon --name agent_1 ...
Then trains-agent stop/start
I've dealt with this earlier today because I set up 2 agents, one for each GPU on a machine, and after editing configurations I wanted to restart only one of them (because the other was working) and then I noticed I don't know which one to kill
it will return a Config object, right?
even though I apply append
No I don't have trains anywhere in my code
the Task object has a method called Task.execute_remotely
Look it up here:
https://allegro.ai/docs/task.html#trains.task.Task.execute_remotely