I see 3 agents in the "Workers" tab
Adding back clearml logging with matplotlib.use('agg'), uses more RAM but not that suspicious
TimelyPenguin76, no, I've only set the sdk.aws.s3.region = eu-central-1 param
Yea, I really need that feature, I need to move away from keys/secrets to IAM roles
SuccessfulKoala55, this is not the exact corresponding request (I refreshed the tab since then), but the request is an events.get_task_logs, with the following content:
Ping CostlyOstrich36 AgitatedDove14 SuccessfulKoala55 Just making sure this wasn't missed
It could be, yes, but the difference between now and last_report_time doesn't match the difference I observe
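For reference, a minimal sketch of how the gap between now and last_report_time could be computed by hand, to compare against what is observed. It assumes last_report_time is available as an ISO-8601 string and that a value without an offset is in UTC; the helper name seconds_since_report is hypothetical, not part of any SDK.

```python
from datetime import datetime, timezone
from typing import Optional


def seconds_since_report(last_report_time: str, now: Optional[datetime] = None) -> float:
    """Return the seconds elapsed between last_report_time and now."""
    reported = datetime.fromisoformat(last_report_time)
    if reported.tzinfo is None:
        # Assumption: a timestamp without an offset is already in UTC.
        reported = reported.replace(tzinfo=timezone.utc)
    if now is None:
        # Use a timezone-aware "now" so the subtraction is well defined.
        now = datetime.now(timezone.utc)
    return (now - reported).total_seconds()
```

Passing an explicit, timezone-aware now makes the result reproducible; mixing a naive now with an aware timestamp would raise a TypeError.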
SuccessfulKoala55 Am I doing/saying something wrong regarding the problem of flushing every 5 secs (see my previous message)?
I tried removing type=str but I got the same problem
In the UI the value is the correct one (not empty, a string)
Ok, so after updating to trains==0.16.2rc0, my problem is different: when I clone a task, update its script and enqueue it, it does not have any Hyper-parameters/argv section in the UI
The cloning is done in another task, which has the argv parameters I want the cloned task to inherit from
AgitatedDove14 So what you are saying is that since I have trains-server 0.16.1, I should use trains>=0.16.1? And what about trains-agent? Only version 0.16 is released atm, this is the one I use
Hi TimelyPenguin76 ,
trains-server: 0.16.1-320
trains: 0.15.1
trains-agent: 0.16
Thanks for the explanations,
Yes, that was the case. This is also what I would think, although I double-checked yesterday:
I create a task on my local machine with trains 0.16.2rc0
This task calls task.execute_remotely()
The task is sent to an agent running with 0.16
The agent installs trains 0.16.2rc0
The agent runs the task, clones it and enqueues the cloned task
The cloned task fails because it has no hyper-parameters/args section (I can see that in the UI)
When I clone the task manually usin...
I mean that I have a taskA (controller) that is in charge of creating a taskB with the same argv parameters (I just change the entry point of taskB)
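A minimal sketch of the taskA/taskB flow described above, assuming the trains SDK calls Task.clone, Task.enqueue, get_parameters and set_parameters behave as documented. The helper name clone_with_params is hypothetical, the entry point change is left out, and this only illustrates copying the parameters explicitly; it is not a confirmed fix for the missing section.

```python
def clone_with_params(source_task, new_name, queue_name):
    """Clone source_task, copy its parameters onto the clone, then enqueue the clone."""
    # Imported lazily so the sketch can be loaded without trains installed.
    from trains import Task

    cloned = Task.clone(source_task=source_task, name=new_name)
    # Copy the parameters explicitly instead of relying on the clone to carry them.
    cloned.set_parameters(source_task.get_parameters())
    Task.enqueue(cloned, queue_name=queue_name)
    return cloned
```

From inside the controller this could be called as clone_with_params(Task.current_task(), "taskB", "default"), where the task name and queue name are placeholders.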
This is how I start the agent that is running the two experiments in parallel:
python3 -m trains_agent --config-file "~/trains.conf" daemon --queue default --log-level DEBUG --detached
Ok, what is the 3.8 release? A server release? How does this number relate to the numbers above?
when can we expect the next self hosted release btw?
I hit F12 to check projects.get_all_ex but nothing is fired, I guess the web UI is just frozen in some weird state
btw CostlyOstrich36, I can see in Profile > Version: 1.1.1-135 • 1.1.1 • 2.14. What do these numbers correspond to?
Thanks! (Maybe this could be added to the docs?)
Yea, that's what I thought, I do have trains server 0.15
AgitatedDove14 Is it fixed with trains-server 0.15.1?