Here is the console with some errors
Yes, I set:
auth {
  cookies {
    httponly: true
    secure: true
    domain: ".clearml.xyz.com"
    max_age: 99999999999
  }
}
It always worked for me this way
mmmh, there is no closing of the task happening at that point. Note that just before the task.upload_artifact, I call task.logger.report_table("Metric summary", "Metric summary", 0, df_scores), if that matters
SuccessfulKoala55 I found the issue thanks to you: I changed the domain a bit but didn't update the apiserver.auth.cookies.domain setting - I did that, restarted, and now it works. Thanks!
It worked like a charm! Awesome, thanks AgitatedDove14!
I can also access these files directly if I enter the url in the browser
with what I shared above, I now get:
docker: Error response from daemon: network 'host' not found.
mmh, it looks like what I was looking for, I will give it a try
Hi TimelyPenguin76 , I guess it tries to spin them down a second time, hence the double print
no, at least not in clearml-server version 1.1.1-135 • 1.1.1 • 2.14
Will it freeze/crash/break/stop the ongoing experiments?
with the CLI, on a conda env located in /data
I could delete the files manually with sudo rm
(sudo is required, otherwise I get Permission Denied)
Ok, but that means this cleanup code should live somewhere else than inside the task itself right? Otherwise it won't be executed since the task will be killed
Sure, I opened an issue https://github.com/allegroai/clearml/issues/288 - unfortunately I don't have time to open a PR
So I want to be able to visualise it quickly as a table in the UI and be able to download it as a dataframe, which of report_media or artifact is better?
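You can actually do both on the same DataFrame - report_table for the quick look in the UI, upload_artifact for the downloadable dataframe. A minimal sketch (the import guard and the sample data are only so the snippet runs standalone; it assumes an already-initialised ClearML task):

```python
import pandas as pd

try:
    from clearml import Task
    task = Task.current_task()  # None when no task is running
except ImportError:
    task = None

# Hypothetical metric summary
df_scores = pd.DataFrame({"task": ["abc123", "def456"], "score": [0.91, 0.87]})

if task is not None:
    # Renders as an interactive table in the task's Plots section
    task.logger.report_table("Metric summary", "Metric summary", 0, df_scores)
    # Stores the DataFrame itself, downloadable from the Artifacts tab
    task.upload_artifact("metric_summary", df_scores)
```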
Hi AgitatedDove14, I don't see any in the https://pytorch.org/ignite/_modules/ignite/handlers/early_stopping.html#EarlyStopping but I guess I could override it and add one?
The clean up service is awesome, but it would require to have another agent running in services mode in the same machine, which I would rather avoid
The parent task is a data_processing task, therefore I retrieve it so that I can then do data_processed = parent_task.artifacts["data_processed"]
I am looking for a way to gracefully stop the task (clean up artifacts, shutdown backend service) on the agent
GrumpyPenguin23 yes, it is the latest
AgitatedDove14 , what I was looking for was: parent_task = Task.get_task(task.parent)
/data/shared/miniconda3/bin/python /data/shared/miniconda3/bin/clearml-agent daemon --services-mode --detached --queue services --create-queue --docker ubuntu:18.04 --cpu-only
(I use trains-agent 0.16.1 and trains 0.16.2)
I am now trying with agent.extra_docker_arguments: ["--network='host'", ]
instead of what I shared above
AgitatedDove14 I finally solved it: the problem was --network='host', it should be --network=host
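For reference, the working form in the agent's clearml.conf would then look like this (a sketch; the agent section name is assumed from the config convention shown above):

```
agent {
    extra_docker_arguments: ["--network=host"]
}
```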
I am doing:
try:
    score = get_score_for_task(subtask)
except Exception:
    score = pd.NA
finally:
    df_scores = df_scores.append(
        dict(task=subtask.id, score=score), ignore_index=True
    )
task.upload_artifact("metric_summary", df_scores)
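Side note: in pandas 2.0+ DataFrame.append was removed, so the same accumulation pattern needs pd.concat instead. A sketch, with a hypothetical get_score_for_task stand-in and made-up subtask ids:

```python
import pandas as pd

def get_score_for_task(subtask_id):
    # Hypothetical stand-in for the real scoring helper
    return 0.5

df_scores = pd.DataFrame(columns=["task", "score"])
for subtask_id in ["task-1", "task-2"]:
    try:
        score = get_score_for_task(subtask_id)
    except Exception:
        score = pd.NA
    finally:
        # Append one row per subtask, pd.NA marking failed lookups
        row = pd.DataFrame([{"task": subtask_id, "score": score}])
        df_scores = pd.concat([df_scores, row], ignore_index=True)
```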