What do you mean by a different script ?
If you are using multiple storage places, I don't see any other choice than putting multiple credentials in the conf file ... Free or Paid ClearML Server ...
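Something along these lines in clearml.conf, e.g. for S3 buckets (bucket names and keys below are just placeholders):
sdk {
    aws {
        s3 {
            # one entry per bucket / storage place
            credentials: [
                { bucket: "bucket-a", key: "ACCESS_KEY_A", secret: "SECRET_A" },
                { bucket: "bucket-b", key: "ACCESS_KEY_B", secret: "SECRET_B" },
            ]
        }
    }
}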
I think a proper screenshot of the full log with some information redacted is the way to go. Otherwise we are just guessing in the dark
@<1523701087100473344:profile|SuccessfulKoala55> Should I raise a github issue ?
so it's not supposed to say "illegal output destination ..." ?
may be specific to fastai
as I cannot reproduce it with another training run using yolov5
While creating the autoscaler instance I did provide my git credentials, i.e my username and Personal Access Token.
How exactly did you do that ?
@<1523701087100473344:profile|SuccessfulKoala55> I managed to make this work by:
concatenating the existing OS CA bundle and the Zscaler certificate, and setting REQUESTS_CA_BUNDLE to that bundle file
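Roughly like this (paths and file names are just placeholders):
# concatenate the OS CA bundle with the Zscaler root certificate
cat /etc/ssl/certs/ca-certificates.crt zscaler_root.pem > /opt/certs/combined-ca-bundle.pem
# point python requests (and therefore clearml) at the combined bundle
export REQUESTS_CA_BUNDLE=/opt/certs/combined-ca-bundle.pem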
I don't think there is any "kill task" code. By principle, on Linux, the ClearML agent launches the training process as its child process. When the parent process is terminated, the Linux kernel will, in most cases, kill all child processes, including your training process.
There may be some way to resume a task from the ClearML agent when it restarts, but I don't think that is the default behavior
@<1523701087100473344:profile|SuccessfulKoala55> it is set to "all" as:
NV_LIBCUBLAS_VERSION=12.2.5.6-1
NVIDIA_VISIBLE_DEVICES=all
CLRML_API_SERVER_URL=https://<redacted>
HOSTNAME=1b6a5b546a6b
NVIDIA_REQUIRE_CUDA=cuda>=12.2 brand=tesla,driver>=470,driver<471 brand=unknown,driver>=470,driver<471 brand=nvidia,driver>=470,driver<471 brand=nvidiartx,driver>=470,driver<471 brand=geforce,driver>=470,driver<471 brand=geforcertx,driver>=470,driver<471 brand=quadro,driver>=470,driver<471 brand=qua...
I will try it. But it's a bit random when this happens so ... We will see
If the agent is the one running the experiment, it's very likely that your task will be killed.
And when the agent comes back, immediately or later, probably nothing will happen. It won't resume ...
I can only guess with so little information here. You'd better try to debug with print statements. Is this happening with uncommitted changes in a submodule ?
not sure ... providing the Zscaler certificate seems to allow ClearML to talk to our ClearML server hosted in Azure; Task init worked. But it then failed to connect to the storage account (also in Azure) ...
(wrong tab sorry :P)
never mind, all the database files are in the data folder
there is a whole discussion about it here: None
@<1523701087100473344:profile|SuccessfulKoala55> Actually it failed now: it failed to talk to our storage in Azure:
ClearML Task: created new task id=c47dd71dea2f421db05647a21d78ed26
2024-01-25 21:45:23,926 - clearml.storage - ERROR - Failed uploading: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1007)
2024-01-25 21:46:48,877 - clearml.storage - WARNING - Storage helper problem for .clearml.0149daec-7a03-4853-a0cd-a7e2b295...
So I tried:
CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/data/hieu/opt/python-venv/fastai/bin/python3.10
clearml-agent daemon --queue no_venv
Then enqueued a cloned task to no_venv
It is still trying to create a venv (and failing):
[...]
tag =
docker_cmd =
entry_point = debug.py
working_dir = apple_ic
created virtual environment CPython3.10.10.final.0-64 in 140ms
creator CPython3Posix(dest=/data/hieu/deleteme/clearml-agent/venvs-builds/3.10, clear=False, no_vcs_ignore=False, gl...
CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/path/to/my/venv/bin/python3.12 clearml-agent bla
Set that env var in the terminal before running the agent ?
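I.e. something like this (the venv path and queue name are just placeholders)?
export CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/path/to/my/venv/bin/python3.12
clearml-agent daemon --queue no_venv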
Clear. Thanks @<1523701070390366208:profile|CostlyOstrich36> !
if you are using a self-hosted ClearML server spun up with docker-compose, then you can just mount your NAS to /opt/clearml/fileserver
on the host machine, prior to starting the ClearML server with docker-compose up
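Roughly (assuming NFS; the NAS address, export path and compose file location are just placeholders):
# mount the NAS share over the fileserver data folder on the host
sudo mount -t nfs nas.example.com:/export/clearml /opt/clearml/fileserver
# then bring the server up as usual
docker-compose -f /opt/clearml/docker-compose.yml up -d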
from what I understand, docker mode was designed for apt-based images and for running as root inside the container.
We have containers that are not apt-based and that do not run as root
We also do some "start up" steps that fetch credentials from Key Vault prior to running the agent
no. I set api.files_server to None in both the remote agent's clearml.conf and my local clearml.conf
In which case, whether the code is run locally or remotely, metrics will be stored in cloud storage
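Roughly what I mean in clearml.conf (a sketch, not verbatim from my conf; the Azure URL is just a placeholder):
api {
    # don't use the built-in fileserver as the default destination
    files_server: ""
}
sdk {
    development {
        # artifacts / debug samples go to our own cloud storage instead
        default_output_uri: "azure://<storage-account>.blob.core.windows.net/<container>"
    }
}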
@<1523701087100473344:profile|SuccessfulKoala55> I can confirm that v1.8.1rc2 fixed the issue in our case. I managed to reproduce it (rough commands sketched after the list):
- Do a local commit without pushing
- Create task and queue it
- The queued task fails as expected, as the commit is only local
- Push your local commit
- Requeue the task
- Expecting the task to succeed since the commit is now available: but it fails, as the vcs seems to be in a weird state from the previous failure
- Now with v1.8.1rc2 the issue is solved
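Roughly, in commands (script name is just a placeholder):
git commit -am "wip"   # local commit, not pushed
python train.py        # create the task, then clone + enqueue it -> fails, commit not on the remote
git push               # commit is now available on the remote
# requeue the same task: before v1.8.1rc2 it still failed (vcs in a weird state), with v1.8.1rc2 it succeeds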