So we have 3 Python packages, stored on github.com
On the dev machine, the data scientist (DS) adds their local SSH key to their GitHub account as an authorized SSH key, at the account level.
With that, the DS can run git clone git@github.com:org/repo1
then install that Python package via pip install -e .
Do that for all 3 Python packages, each in its own repo: repo1, repo2 and repo3. All 3 can be cloned using the same key that the DS added to their account.
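The per-repo steps above can be sketched as follows (repo names are from the thread; the loop and paths are an assumption about the exact layout):

```shell
# Clone each private repo over SSH (the single account-level key
# covers all three) and install each as an editable package.
for repo in repo1 repo2 repo3; do
    git clone "git@github.com:org/${repo}.git"
    pip install -e "./${repo}"
done
```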
The DS run a tra...
@<1523701070390366208:profile|CostlyOstrich36>
Yes. I am investigating that route now.
If you want to replace MLflow with ClearML: do it !! It's like "Should I use sandals or running shoes for my next marathon ..."
Let your users try ClearML, and I am pretty sure all of them will want to swap over !!!
Nevermind: None
By default, the File Server is not secured even if Web Login Authentication has been configured. Using an object storage solution that has built-in security is recommended.
My bad
I will try it. But it's a bit random when this happens so ... We will see
@<1523701087100473344:profile|SuccessfulKoala55> I can confirm that v1.8.1rc2 fixed the issue in our case. I managed to reproduce it:
- Do a local commit without pushing
- Create a task and queue it
- The queued task fails as expected, since the commit is only local
- Push your local commit
- Requeue the task
- Expect the task to succeed now that the commit is available: but it fails, as the VCS seems to be in a weird state from the previous failure
- With v1.8.1rc2 the issue is solved
@<1523701087100473344:profile|SuccessfulKoala55> Actually it failed now: it failed to talk to our storage in Azure:
ClearML Task: created new task id=c47dd71dea2f421db05647a21d78ed26
2024-01-25 21:45:23,926 - clearml.storage - ERROR - Failed uploading: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)
2024-01-25 21:46:48,877 - clearml.storage - WARNING - Storage helper problem for .clearml.0149daec-7a03-4853-a0cd-a7e2b295...
Is it because Azure is "whitelisted" in our network, and thus needs a different certificate?? And how do I provide 2 different certificates? Is bundling them as simple as concatenating the 2 PEM files?
@<1523701087100473344:profile|SuccessfulKoala55> Thanks. Managed to get it working now with
export REQUESTS_CA_BUNDLE=/etc/ssl/certs/zscaler.crt
(Ubuntu system)
Not sure ... providing the Zscaler certificate seems to allow ClearML to talk to our ClearML server, hosted in Azure: Task.init worked. But then it failed to connect to the storage account (also in Azure) ...
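If two different CAs are involved (corporate proxy vs. Azure), a PEM bundle is indeed just the two files concatenated. A sketch, where the Azure CA path is a hypothetical placeholder, not one named in this thread:

```shell
# Concatenate the two PEM certificates into one bundle and point
# requests (and therefore the clearml SDK) at the bundle.
cat /etc/ssl/certs/zscaler.crt /path/to/azure-ca.pem > "$HOME/ca-bundle.pem"
export REQUESTS_CA_BUNDLE="$HOME/ca-bundle.pem"
```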
On-prem: user management is not "live", as you need to reboot, and passwords are hardcoded ... No permission distinction, as everyone is admin ...
Can you make train1.py use clearml.conf.server1 and train2.py use clearml.conf2 ?? In which case I would be interested @<1523701087100473344:profile|SuccessfulKoala55>
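One possible approach (an assumption on my part, not confirmed in this thread): the clearml SDK honours the CLEARML_CONFIG_FILE environment variable, so each script can be pointed at its own config file (file names taken from the question above):

```shell
# Run each training script against its own clearml config file.
CLEARML_CONFIG_FILE="$HOME/clearml.conf.server1" python train1.py
CLEARML_CONFIG_FILE="$HOME/clearml.conf2" python train2.py
```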
Can you share the agent log, in the console tab, before the error?
Looks like your issue is not that ClearML is not tracking your changes, but rather that your configuration is overwritten.
This often happens to me. The way I debug this is to put a lot of print statements along the code to track when the configuration is overwritten and narrow down why. The print statements will show up in the Console tab.
So I tried:
CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/data/hieu/opt/python-venv/fastai/bin/python3.10
clearml-agent daemon --queue no_venv
Then enqueue a cloned task to no_venv
It is still trying to create a venv (and fails):
[...]
tag =
docker_cmd =
entry_point = debug.py
working_dir = apple_ic
created virtual environment CPython3.10.10.final.0-64 in 140ms
creator CPython3Posix(dest=/data/hieu/deleteme/clearml-agent/venvs-builds/3.10, clear=False, no_vcs_ignore=False, gl...
Oh, looks like I need to empty the Installed Packages section before enqueueing the cloned task
Set that env var in the terminal before running the agent ?
Found a trick to have an empty Installed Packages section: clearml.Task.force_requirements_env_freeze(force=True, requirements_file="/dev/null")
Not sure if this is the right way or not ...
So I now just need to find a way to not populate Installed Packages in the first place
But then how does the agent know which venv it needs to use?
@<1523701070390366208:profile|CostlyOstrich36> Is there a way to tell ClearML not to try to detect the installed packages?
Oh, so you mean CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/path/to/my/venv/bin/python3
??
and just came across this: None
That sounds like what you may be looking for
Is this MongoDB-style filtering?
@<1523701868901961728:profile|ReassuredTiger98> I found that you can set the files_server in your local clearml.conf to your own cloud storage. In our case, we use something like this in our clearml.conf:
api {
files_server: "azure://<account>..../container"
}
All non-artifact models are then stored in our Azure storage. In our self-hosted ClearML setup, we don't even have a file server running at all.
How did you provide credentials to ClearML and git?