Here is the console with some errors
Yes, I set:
auth {
    cookies {
        httponly: true
        secure: true
        domain: ".clearml.xyz.com"
        max_age: 99999999999
    }
}
It always worked for me this way
SuccessfulKoala55 I found the issue thanks to you: I changed the domain a bit but didn't update the apiserver.auth.cookies.domain setting. I updated it, restarted, and now it works. Thanks!
It worked like a charm. Awesome, thanks AgitatedDove14!
I can also access these files directly if I enter the url in the browser
with what I shared above, I now get: docker: Error response from daemon: network 'host' not found.
mmh it looks like what I was looking for, I will give it a try
Hi TimelyPenguin76 , I guess it tries to spin them down a second time, hence the double print
no, at least not in clearml-server version 1.1.1-135 • 1.1.1 • 2.14
with the CLI, on a conda env located in /data
I could delete the files manually with sudo rm (sudo is required, otherwise I get Permission Denied)
Ok, but that means this cleanup code should live somewhere else than inside the task itself right? Otherwise it won't be executed since the task will be killed
Sure, I opened an issue: https://github.com/allegroai/clearml/issues/288. Unfortunately I don't have time to open a PR.
Hi AgitatedDove14, I don't see any in https://pytorch.org/ignite/_modules/ignite/handlers/early_stopping.html#EarlyStopping, but I guess I could override it and add one?
The clean up service is awesome, but it would require having another agent running in services mode on the same machine, which I would rather avoid
The parent task is a data_processing task, therefore I retrieve it so that I can then do: data_processed = parent_task.artifacts["data_processed"]
I am looking for a way to gracefully stop the task (clean up artifacts, shutdown backend service) on the agent
GrumpyPenguin23 yes, it is the latest
AgitatedDove14 , what I was looking for was: parent_task = Task.get_task(task.parent)
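For reference, a minimal sketch of that lookup wrapped in a helper. The helper name load_parent_artifact and its get_task parameter are only for illustration; with ClearML I would pass Task.get_task as get_task, and I am assuming the artifact object exposes .get() to load the stored value.

```python
def load_parent_artifact(task, get_task, name="data_processed"):
    """Fetch the named artifact from the parent of `task`.

    `get_task` is expected to behave like clearml's Task.get_task
    (called as get_task(task_id=...)); it is a parameter here so the
    helper can be tried out without a ClearML server.
    """
    parent_id = task.parent
    if not parent_id:
        raise ValueError("task has no parent task")
    parent_task = get_task(task_id=parent_id)
    # Assumption: artifacts behave like a dict of objects with a .get() loader.
    return parent_task.artifacts[name].get()
```

With ClearML this would be called as load_parent_artifact(Task.current_task(), Task.get_task).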
(I use trains-agent 0.16.1 and trains 0.16.2)
I am now trying with agent.extra_docker_arguments: ["--network='host'", ] instead of what I shared above
AgitatedDove14 I finally solved it: the problem was that --network='host' should be --network=host
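My understanding of why the quotes matter, as a small sketch: a shell strips the quotes before docker sees the argument, but an argument handed over as a list item keeps them, so docker looks for a network literally named 'host'. That the agent passes these arguments without a shell is my assumption, and network_value is just a helper for the illustration.

```python
import shlex


def network_value(argument):
    """Return the text after '=' exactly as the receiving program sees it."""
    return argument.split("=", 1)[1]


quoted = "--network='host'"

# Passed as a list item (no shell involved): the quotes stay in the value.
as_list_item = [quoted]

# Parsed the way a POSIX shell would: the quotes are stripped.
via_shell = shlex.split(quoted)

print(network_value(as_list_item[0]))  # 'host'
print(network_value(via_shell[0]))     # host
```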
I am doing:
try:
    score = get_score_for_task(subtask)
except:
    score = pd.NA
finally:
    df_scores = df_scores.append(dict(task=subtask.id, score=score), ignore_index=True)
task.upload_artifact("metric_summary", df_scores)
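A variant of the same idea that collects the rows in a list and builds the DataFrame once at the end, since DataFrame.append was removed in pandas 2.0. The names collect_scores and to_frame are hypothetical helpers, and get_score stands in for my get_score_for_task.

```python
def collect_scores(subtasks, get_score, missing=None):
    """Return one {"task", "score"} row per subtask.

    A subtask whose score cannot be computed gets `missing` instead
    of aborting the whole summary.
    """
    rows = []
    for subtask in subtasks:
        try:
            score = get_score(subtask)
        except Exception:
            score = missing
        rows.append({"task": subtask.id, "score": score})
    return rows


def to_frame(rows):
    """Build the summary DataFrame in one go (imports pandas lazily)."""
    import pandas as pd

    return pd.DataFrame(rows, columns=["task", "score"])
```

The result can then be uploaded as before, e.g. task.upload_artifact("metric_summary", to_frame(rows)), passing missing=pd.NA to keep the original placeholder.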
SuccessfulKoala55 They do have the right filepath, e.g.: https://***.com:8081/my-project-name/experiment_name.b1fd9df5f4d7488f96d928e9a3ab7ad4/metrics/metric_name/predictions/sample_00000001.png
Also maybe we are not on the same page - by clean up, I mean kill a detached subprocess on the machine executing the agent
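To make it concrete, this is roughly the kind of cleanup I have in mind, as a sketch only. It assumes the task process receives a catchable SIGTERM (or exits normally) when it is stopped, which I have not confirmed for the agent; stop_process and install_cleanup are hypothetical names.

```python
import atexit
import signal
import subprocess
import sys


def stop_process(proc, timeout=5.0):
    """Terminate `proc` politely, then kill it if it does not exit in time."""
    if proc.poll() is not None:
        return proc.returncode  # already finished
    proc.terminate()
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()
        return proc.wait()


def install_cleanup(proc):
    """Stop `proc` on normal interpreter exit and on SIGTERM.

    Must be called from the main thread, because signal.signal
    only works there.
    """
    atexit.register(stop_process, proc)

    def _on_sigterm(signum, frame):
        stop_process(proc)
        sys.exit(0)

    signal.signal(signal.SIGTERM, _on_sigterm)
```

If the task is killed with SIGKILL instead, none of this runs, which is why I was asking whether the cleanup has to live outside the task.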
So I changed ebs_device_name = "/dev/sda1", and now I correctly get the 100 GB EBS volume mounted on /. All good.
Interesting! Something like that would be cool, yes! I just realized that custom plugins in Mattermost are written in Go, so it could be a good hackday for me to learn Go.