I didn't know that, from the client side, you can specify storage somewhere other than the ClearML server. Good to know!
But I would still like to know whether it is possible to use blob storage by default, configured on the ClearML server, so that each client doesn't need to do that ...
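For reference, the client-side override mentioned above might look something like this. It is only a sketch: the account, container, project and task names are placeholders, and the azure:// layout is the one I understand ClearML to expect for Azure Blob Storage.

```python
from typing import Any


def azure_output_uri(account_name: str, container_name: str, prefix: str = "") -> str:
    """Build an azure:// URI (placeholder account/container names are the caller's)."""
    base = f"azure://{account_name}.blob.core.windows.net/{container_name}"
    prefix = prefix.strip("/")
    return f"{base}/{prefix}" if prefix else base


def start_task_with_remote_storage(project: str, name: str, output_uri: str) -> Any:
    """Create a task whose models and artifacts are uploaded to output_uri."""
    # Imported lazily so the URI helper above works without clearml installed.
    from clearml import Task

    return Task.init(project_name=project, task_name=name, output_uri=output_uri)


if __name__ == "__main__":
    # Hypothetical values, matching the placeholder credentials in this thread.
    print(azure_output_uri("account", "clearml", "experiments"))
```

Calling start_task_with_remote_storage(...) needs a reachable ClearML server and Azure credentials in clearml.conf, so it is left out of the demo at the bottom.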
The weird thing is that GPU 0 seems to be in use, as reported by nvtop on the host. But it is 50% slower than when running directly rather than through clearml-agent ...
I don't think agents are aware of each other. This means you can have as many agents as you want and, depending on what your tasks use, they will be fighting for CPU and GPU ...
What about migrating the existing experiments in the on-prem server?
Not sure how that works with Docker and a machine that is not set up with an SSH public key ... We will go down that path sometime in the future, so I am quite interested too in how people do it without an SSH public key.
Found it: None
And credentials are set with:
sdk {
    azure.storage {
        containers: [
            {
                account_name: "account"
                account_key: "xxxx"
                container_name: "clearml"
            }
        ]
    }
}
Is it because Azure is "whitelisted" in our network, and thus needs a different certificate? And how do I provide 2 different certificates? Is bundling them as simple as concatenating the 2 PEM files?
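On the bundling part: a CA bundle is, in general, just PEM certificates placed one after another in a single file, so a rough sketch of the concatenation could be the following. The file names are hypothetical, and whether this resolves the Azure certificate error in this particular network is not something the sketch can confirm.

```python
from pathlib import Path
from typing import Iterable

PEM_HEADER = "-----BEGIN CERTIFICATE-----"


def bundle_pem_files(pem_paths: Iterable[Path], bundle_path: Path) -> Path:
    """Concatenate PEM certificate files, in order, into one bundle file."""
    parts = []
    for path in pem_paths:
        text = Path(path).read_text().strip()
        if PEM_HEADER not in text:
            raise ValueError(f"{path} does not look like a PEM certificate")
        parts.append(text)
    # One certificate block after another, separated by a newline.
    Path(bundle_path).write_text("\n".join(parts) + "\n")
    return Path(bundle_path)


if __name__ == "__main__":
    # Hypothetical file names for the two certificates mentioned above.
    bundle_pem_files([Path("corporate.pem"), Path("azure.pem")], Path("bundle.pem"))
```

The requests library reads the REQUESTS_CA_BUNDLE environment variable, so pointing it at the bundle is one thing to try; I am assuming here that the storage calls go through requests.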
On-prem: user management is not "live", as you need to reboot, and passwords are hardcoded ... No permission distinction, as everyone is admin ...
We don't have a file server. The clearml.conf has: sdk.development.default_output_uri="None"
Do I need to make changes to clearml.conf so that it doesn't ask for my credentials, or is there another way around?
You have 2 options:
- set the credentials inside clearml.conf: I am not familiar with this and have never tested it.
- or set up passwordless SSH with a public key
(I have never played with the pipeline feature, so I am not really sure that it works as I imagine ...)
Based on this: it feels like S3 is supported
How are you using the function update_output_model?
This looks like the agent running inside your Docker container did not have any username/password to do the git clone, so the default behavior is to wait for keyboard input, which looks like hanging ...
Solved, @NuttyLobster9. In my case:
I need to run from clearml import Task very early in the code (like the first line), before importing argparse,
and not call task.connect(parser).
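Roughly, the layout that worked for me can be sketched like this. The project name, task name and the --lr argument are made up for illustration, and the ImportError fallback is only there so the sketch runs on a machine without clearml.

```python
# Import clearml first, before argparse, as described above.
try:
    from clearml import Task
except ImportError:  # fallback so this sketch still runs without clearml installed
    Task = None

import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; --lr is just an example argument."""
    parser = argparse.ArgumentParser(description="argparse + ClearML sketch")
    parser.add_argument("--lr", type=float, default=0.01)
    return parser


def main(argv: Optional[List[str]] = None) -> argparse.Namespace:
    if Task is not None:
        # Placeholder project/task names. Note: no task.connect(parser) call.
        Task.init(project_name="examples", task_name="argparse demo")
    return build_parser().parse_args(argv)


if __name__ == "__main__":
    print(main())
```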
If you want to replace MLflow with ClearML: do it!! It's like "Should I use sandals or running shoes for my next marathon ..."
Let your users try ClearML, and I am pretty sure all of them will want to swap over!!!
You are forcing SSH with force_git_ssh_protocol: true
Have you set up SSH keys?
If you are using SSH keys, why enable_git_ask_pass: true?
In the web UI, in the queue/worker tab, you should see a service queue and a worker available in that queue. Otherwise the service agent is not running. Refer to John C above.
You should be able to explicitly upload a file of your choice as an artifact using something like this: None
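A minimal sketch of that kind of explicit upload, assuming the task object is the one returned by Task.init (the helper name and the default of using the file stem as the artifact name are my own choices):

```python
from pathlib import Path
from typing import Any, Optional


def upload_file_as_artifact(task: Any, file_path: str, name: Optional[str] = None) -> str:
    """Upload one local file as a named artifact on the given ClearML task."""
    path = Path(file_path)
    if not path.is_file():
        raise FileNotFoundError(f"No such file: {path}")
    # Default the artifact name to the file name without its extension.
    artifact_name = name or path.stem
    # Task.upload_artifact accepts a local file path as artifact_object.
    task.upload_artifact(name=artifact_name, artifact_object=str(path))
    return artifact_name
```

Usage would be along the lines of task = Task.init(project_name="examples", task_name="artifact demo") followed by upload_file_as_artifact(task, "results/metrics.csv"), where both names and the path are placeholders.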
Most of the time, a "user" would expect ClearML to handle the caching by itself.
Are you sure all the files needed are pushed to your git repo?
Go to another folder, git clone that exact branch/commit, and check that the files are there.
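If you want to script that check, something like the following could work. It is a sketch that assumes git is on the PATH; the repository URL, the ref and the list of expected files are whatever applies to your project.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List


def clone_at_ref(repo_url: str, ref: str, dest: Path) -> None:
    """Clone the repo into dest and check out the exact branch or commit."""
    subprocess.run(["git", "clone", repo_url, str(dest)], check=True)
    subprocess.run(["git", "-C", str(dest), "checkout", ref], check=True)


def missing_files(checkout_dir: Path, expected: Iterable[str]) -> List[str]:
    """Return the expected relative paths that are absent from the checkout."""
    root = Path(checkout_dir)
    return [rel for rel in expected if not (root / rel).is_file()]
```

An empty list from missing_files means everything the task needs was actually pushed; anything it returns only exists on your local machine.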
Most people probably won't even know what that does.
What should I put in there? What is the syntax for git package?
--gpus 0,1: I believe this basically says that the code launched by the agent has access to both GPUs, and that is it. It is then up to your code to choose which GPU to use, and how ...
What about having 2 agents, one on each GPU, on the same machine, serving the same queue? That way, when you enqueue, whichever agent (and thus GPU) is available will take the new task.
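To make the two-agent idea concrete, here is a small sketch that only builds and prints the clearml-agent commands, one per GPU, all on the same queue. The queue name "default" and the GPU indices are assumptions; run the printed commands yourself once they look right.

```python
import shlex
from typing import Iterable, List


def agent_command(queue: str, gpu_index: int) -> List[str]:
    """Command line for one detached agent bound to a single GPU."""
    return [
        "clearml-agent", "daemon",
        "--queue", queue,
        "--gpus", str(gpu_index),
        "--detached",
    ]


def commands_for_gpus(queue: str, gpu_indices: Iterable[int]) -> List[List[str]]:
    """One agent command per GPU, all pulling from the same queue."""
    return [agent_command(queue, index) for index in gpu_indices]


if __name__ == "__main__":
    # Assumed queue name and GPU indices; nothing is launched here.
    for command in commands_for_gpus("default", [0, 1]):
        print(shlex.join(command))
```

Since each agent only sees one GPU, the task code does not have to pick a device itself.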
all good. Just wanted to know in case I missed it
I understand that, from the agent's point of view, I just need to update the conf file to use the new credentials and the new server address.
Yup, you have the flexibility and the options; that's what is so nice about ClearML.