
Yes. I am investigating that route now.
So we have 3 Python packages, stored on github.com.
On the dev machine, the data scientist (DS) adds the local SSH key to his GitHub account as an authorized SSH key, at account level.
With that, the DS can run git clone git@github.com:org/repo1
then install that Python package via pip install -e .
Do that for all 3 Python packages, each in its own repo: repo1, repo2 and repo3. All 3 can be cloned using the same key that the DS added to his account.
The DS run a tra...
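For reference, the clone-and-install steps for the three repos could be scripted roughly as below. This is only a sketch: `org`, `repo1`, `repo2` and `repo3` are the placeholders from the message, while `build_commands` and `run_all` are hypothetical helper names, and it assumes `git` and `pip` are available and the SSH key is already registered on the account.

```python
import subprocess
import sys
from pathlib import Path

ORG = "org"                          # placeholder organisation from the message
REPOS = ["repo1", "repo2", "repo3"]  # the three package repos


def build_commands(org, repos, workdir="."):
    """Return the git clone / pip install -e commands for each repo."""
    commands = []
    for repo in repos:
        target = str(Path(workdir) / repo)
        # Clone over SSH, which uses the key added to the GitHub account.
        commands.append(["git", "clone", f"git@github.com:{org}/{repo}", target])
        # Editable install of the freshly cloned package.
        commands.append([sys.executable, "-m", "pip", "install", "-e", target])
    return commands


def run_all(commands):
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Only print the commands here; call run_all(...) to actually execute them.
    for argv in build_commands(ORG, REPOS):
        print(" ".join(argv))
```

Keeping the command list separate from the runner makes it easy to review what will be executed before touching the machine.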
You will need to change more than just REQUESTS_CA_BUNDLE to use a custom certificate. Not all Python libraries honor REQUESTS_CA_BUNDLE.
You also need to add your certificate to your OS trust store.
In conda we have to export SSL_CERT_FILE=~/ca-bundle.crt
etc ...
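A minimal sketch of setting both variables together from Python, assuming a single bundle file serves every library. `ca_env` is a hypothetical helper name and `~/ca-bundle.crt` is the path from the message; other tools may read yet other variables, so this is not a complete list.

```python
import os
import ssl


def ca_env(bundle_path, base_env=None):
    """Return a copy of the environment with the CA bundle variables set."""
    env = dict(os.environ if base_env is None else base_env)
    path = os.path.expanduser(bundle_path)
    # REQUESTS_CA_BUNDLE is read by requests; SSL_CERT_FILE is read by OpenSSL,
    # and therefore by Python's default ssl context.
    for var in ("REQUESTS_CA_BUNDLE", "SSL_CERT_FILE"):
        env[var] = path
    return env


if __name__ == "__main__":
    os.environ.update(ca_env("~/ca-bundle.crt"))
    # Show which CA file Python's ssl module will now pick up by default.
    print(ssl.get_default_verify_paths())
```

The returned dict can also be passed as `env=` to `subprocess.run` so that child processes see the same bundle.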
You are using CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL the wrong way
@<1523701070390366208:profile|CostlyOstrich36> Is there a way to tell ClearML not to try to detect the installed packages?
Just keep in mind that your bottleneck will be the transfer rate. So mounting will not save you anything, as you still need to transfer the whole dataset to your GPU instance sooner or later.
One solution is what Jake suggested. The other is to pre-download the data to your instance using a cheap CPU-only instance type, then restart the instance with a GPU.
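To get a feel for the bottleneck, a back-of-the-envelope estimate can help. The numbers below (500 GB at 100 MB/s) are purely hypothetical and `transfer_time_seconds` is an illustrative helper; plug in your own dataset size and measured throughput.

```python
def transfer_time_seconds(dataset_gb, rate_mb_per_s):
    """Estimate transfer time, taking 1 GB = 1000 MB."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate_mb_per_s must be positive")
    return dataset_gb * 1000 / rate_mb_per_s


if __name__ == "__main__":
    # Hypothetical figures, not taken from the thread.
    seconds = transfer_time_seconds(500, 100)
    print(f"{seconds / 3600:.1f} h")  # prints "1.4 h"
```

That time is paid whether the data is mounted or copied, which is why doing the download on a cheap CPU-only instance first can reduce the GPU bill.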
please share your .service content too, as there are a lot of ways to "spawn" in systemd
so what was the solution/hack then ?
while the other may need to be 1 instead of true
something like this: None ?
you should be able to test your credentials first using something like rclone or azure-cli
once you have manually installed your package inside the docker container, check that your file module_b/templates/my_template.yml is where it should be
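One way to run that check from inside the container is sketched below. It assumes Python 3.9+ and that `module_b` is importable as a package, which the message only implies through the path; `package_file_exists` is a hypothetical helper name.

```python
from importlib import resources


def package_file_exists(package, *parts):
    """Return True if the given file is present inside the installed package."""
    node = resources.files(package)
    for part in parts:
        node = node.joinpath(part)
    return node.is_file()


if __name__ == "__main__":
    try:
        found = package_file_exists("module_b", "templates", "my_template.yml")
        print("template found" if found else "template missing from the install")
    except ModuleNotFoundError:
        print("module_b is not installed in this environment")
```

If the file is missing after a regular (non-editable) install, the packaging configuration is probably not including data files.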
@<1523701070390366208:profile|CostlyOstrich36> I would like to point to Azure Blob Storage. What kind of URL scheme should I use? And where do you configure the credentials for the ClearML server to access Azure Blob as file_server? I couldn't find any documentation on this topic 😞
TIA
Found it: None
And credentials are set with:
sdk {
    azure.storage {
        containers: [
            {
                account_name: "account"
                account_key: "xxxx"
                container_name: "clearml"
            }
        ]
    }
}
if you are on github.com, you can use a fine-grained PAT to limit access to the minimum. Although the token will be tied to an account, it's quite easy to switch to a token from another account.
(wrong tab sorry :P)
following this thread, as it happens every now and then that ClearML misses some packages for some reason ...
or which worker is in a queue ...
I don't think there is a "kill task" code. In principle, on Linux, the ClearML agent launches the training process as its child. When a parent process is terminated, the Linux kernel will, in most cases, kill all child processes, including your training process.
There may be some way to resume a task from the ClearML agent when it restarts, but I don't think that is the default behavior
I think a proper screenshot of the full log with some information redacted is the way to go. Otherwise we are just guessing in the dark
have you tried a different browser?
Can you paste here what is inside "Installed packages" to double check?
@<1558986867771183104:profile|ShakyKangaroo32> If you just want something to run at regular intervals, have you considered TaskScheduler: None
How are you using the function update_output_model?