This should be a good start; googling a bit more on how ssh works will point you in the right direction 🙂
If the process is dead it will be removed from the UI after some time
RipeAnt6 , you have to manage your storage on the NAS yourself. We delete data only on the fileserver.
However, you could try mounting the NAS to the fileserver docker as a volume and then deletion should also handle files on the NAS 🙂
From the looks of it, yes. But give it a try to see how it behaves without
Basically the Agent automates the docker run command with everything that you need (this can become rather complex). You can see this in the third line of the console log:
` Executing: ['docker', 'run', '-t', '--gpus', 'all', '-l', 'clearml-worker-id=ip-172-31-28-179:0', '-l', 'clearml-parent-worker-id=ip-172-31-28-179:0', '-e', 'CLEARML_WORKER_ID=ip-172-31-28-179:0', '-e', 'CLEARML_DOCKER_IMAGE=nvidia/cuda:10.2-cudnn7-runtime-ubuntu18.04', '-e', 'CLEARML_TASK_ID=cdbfa9cda5ab4d86b012a87...
Hi @<1567321746447536128:profile|EmaciatedCentipede72> , I would suggest checking which API calls the UI sends when you work with reports in the UI. You can see them in the network section of the developer tools (F12); filter by XHR to make them easier to read 🙂
Under add_step there is task_overrides, where you can find this section:
reset requirements (the agent will use the "requirements.txt" inside the repo): task_overrides={'script.requirements.pip': ""}
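For reference, here is a minimal sketch of how that override could be passed to add_step. The helper name and the step, project, and task names are hypothetical, and `pipe` is assumed to be a clearml PipelineController that you have already created:

```python
def add_step_with_repo_requirements(pipe, step_name, project, task_name):
    """Add a pipeline step whose pip requirements are reset.

    `pipe` is assumed to be a clearml PipelineController (hypothetical helper,
    not part of the clearml SDK).
    """
    pipe.add_step(
        name=step_name,
        base_task_project=project,
        base_task_name=task_name,
        # An empty string resets the recorded pip requirements, so the agent
        # falls back to the "requirements.txt" inside the repo.
        task_overrides={"script.requirements.pip": ""},
    )


# Possible usage (assumes clearml is installed and a base task exists):
# from clearml import PipelineController
# pipe = PipelineController(name="demo pipeline", project="examples", version="1.0.0")
# add_step_with_repo_requirements(pipe, "train", "examples", "train model")
```

Taking the controller as an argument keeps the sketch independent of how the pipeline itself is set up.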
Hi @<1523702786867335168:profile|AdventurousButterfly15> , are the models logged in the artifacts section?
No 🙂
Just remember to follow the upgrade instructions
you configure the worker name in clearml.conf
and I think you'll need to re-run it
EnviousPanda91 , which framework isn't being logged? Can you provide a small code snippet?
Hi NaughtyFish36 ,
"Execute a local training run (from within a docker container) which registers a task on our clearml server"
When you do this, does ClearML detect the docker image that you're running on?
Initially there was the issue of no access via ssh, but that seemed to be fixed through mounting the local .ssh directory onto the docker container root. The subsequent error is the one above i.e. the reference is not a tree. However I can happily checkout that commit hash myself, yet t...
ChubbyOwl99 , are you trying to access http://app.clear.ml ?
Hi @<1615881718445641728:profile|EnchantingSeaturtle2> , what version of clearml are you using? Are you running the server yourself or using the community server?
Hi @<1539417873305309184:profile|DangerousMole43> , I'm afraid this is not configurable currently. What is your use case?
How would you use the user properties as part of an experiment?
I'm guessing it's to get the properties, but this really depends on your needs / use-case
It appears the ValueError is happening because there is no queue called services
You can disable automatic model logging using auto_connect_frameworks in Task.init()
https://clear.ml/docs/latest/docs/references/sdk/task#taskinit
This, however, will also disable automatic reporting of scalars. You can also manually force the upload of the final model with OutputModel:
https://clear.ml/docs/latest/docs/references/sdk/model_outputmodel#class-outputmodel
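A rough sketch of how the two pieces could fit together, assuming a PyTorch model. The function name, project name, task name, and weights path are placeholders. As far as I know, passing a per-framework dict (rather than False) leaves the other integrations switched on, but do check the Task.init reference for your version:

```python
def upload_final_model(weights_path, project="examples", task_name="manual model upload"):
    """Disable automatic PyTorch model logging, then upload the weights by hand.

    Hypothetical helper; project and task names are placeholders.
    """
    # Imported lazily so this module can be loaded without clearml installed.
    from clearml import OutputModel, Task

    task = Task.init(
        project_name=project,
        task_name=task_name,
        # Assumption: only the PyTorch binding is turned off here;
        # use auto_connect_frameworks=False to turn everything off.
        auto_connect_frameworks={"pytorch": False},
    )

    # Register an output model on the task and upload the weights file.
    output_model = OutputModel(task=task, framework="PyTorch")
    output_model.update_weights(weights_filename=weights_path)
    return output_model


# Possible usage, after training has saved the weights locally:
# upload_final_model("model.pt")
```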
It looks as if it is running, did the status of the experiment change?
I can think of two solutions:
1. Fix local python environments and begin using virtual environments ( https://github.com/pyenv/pyenv for example).
2. Use the agent in --docker mode. You won't need to worry about python versions but you will need to install Docker on that machine.
So how do you attach the pytorch requirement?
Which mode? It is indeed vague 🙂
Hi @<1570220844972511232:profile|ObnoxiousBluewhale25> , I don't think there is such a capability currently. It is possible via the autoscaler, where a script can run on the machine before the container is executed.
Maybe open a GitHub feature request to follow up on this?
Regarding 1, can you name them yourself somehow and thus get the wanted result?
What version of clearml-agent are you using? Can you add the full log here?
Hi BroadSeaturtle49 , can you please elaborate on what the issue is?