JitteryCoyote63

214 Questions, 1021 Answers

Active since 10 January 2023

Last activity 7 months ago

Reputation

Badges 1

979 × Eureka!

Questions 214
Answers 1021

0 Votes

1 Answers

1K Views

0 Votes 1 Answers 1K Views

Hey, Just Wanted To Mention: In Docs, Task.Get_Parameter Does Not Say:

Hey, just wanted to mention: in docs, Task.get_parameter does not say: Different sections with key prefix "section/" , as Task.get_parameters do. Also there ...

clearml

4 years ago

0 Votes

6 Answers

1K Views

0 Votes 6 Answers 1K Views

Hi There, Is It Possible To Configure The Clearml-Agent To Run Some Commands Before Running Each Experiment It Launches? Eg.

Hi there, is it possible to configure the clearml-agent to run some commands before running each experiment it launches? Eg. echo "test" > "test.txt" && <-- ...

clearml

3 years ago

0 Votes

5 Answers

1K Views

0 Votes 5 Answers 1K Views

Hi, From Within An Experiment, How Can I Intercept The Signal That The Experiment Was Aborted And Execute A Cleanup Function? I Tried To Intercept Sigint And Sigterm, Unsuccessfully:

Hi, from within an experiment, how can I intercept the signal that the experiment was aborted and execute a cleanup function? I tried to intercept SIGINT and...

clearml

2 years ago

0 Votes

1 Answers

1K Views

0 Votes 1 Answers 1K Views

Hi, I Think I Found A Small Bug:

Hi, I think I found a small bug: Clone an experiment Enqueue it on a queue with no workers Delete the queue Try to Dequeue the experimentThe last operation w...

clearml

3 years ago

0 Votes

22 Answers

1K Views

0 Votes 22 Answers 1K Views

Hi, I Would Like To Switch From The Elastic-Search Service In The Docker-Compose Of The Clearml-Server To An Externally Managed, Scalable Elastic-Search Cluster. I Have Two Questions:

Hi, I would like to switch from the elastic-search service in the docker-compose of the clearml-server to an externally managed, scalable elastic-search clus...

clearml

3 years ago

0 Votes

1 Answers

1K Views

0 Votes 1 Answers 1K Views

Hi, There Is A Small Bug With Auto-Refreshing In The Debug Samples Tab Of The Web Ui: If It Is On, Then It Will Always Force The First Series To Be Displayed, Regardless Of What Has Been Selected By The User. Eg: If I Have Two Metrics,

Hi, there is a small bug with auto-refreshing in the DEBUG SAMPLES Tab of the Web UI: If it is ON, then it will always force the first series to be displayed...

clearml

3 years ago

0 Votes

5 Answers

1K Views

0 Votes 5 Answers 1K Views

Hi There, It Seems Like There Is A Bug With The Visualization Of Debug Samples On The Ui (Server V1.2.0, Self-Hosted): When Clicking On A Debug Sample Then On The Download Button, If The Sample Is Stored In S3, The Download Button Opens A Blank Page With

Hi there, it seems like there is a bug with the visualization of debug samples on the UI (server v1.2.0, self-hosted): when clicking on a debug sample then o...

clearml

2 years ago

0 Votes

1 Answers

959 Views

0 Votes 1 Answers 959 Views

The Markdown Editor For Documenting Projects Is So Great, I Love It

The Markdown editor for documenting projects is so great, I love it 🤩

clearml

3 years ago

0 Votes

10 Answers

1K Views

0 Votes 10 Answers 1K Views

Hey, What Is The Exact Difference Between

Hey, what is the exact difference between agent.package_manager.system_site_packages and trains-agent --install-globally ?

clearml

4 years ago

0 Votes

6 Answers

1K Views

0 Votes 6 Answers 1K Views

Hi, Is It Possible To Specify The Required Version Of Python For A Task That Is Different From The Python Running The Clearml-Agent? Example: My Clearml-Agent Is Running On Python 3.8 And I Need A Task To Run On Python 3.10. How Can I Do That?

Hi, is it possible to specify the required version of python for a Task that is different from the python running the clearml-agent? Example: my clearml-agen...

clearml

2 years ago

0 Votes

4 Answers

1K Views

0 Votes 4 Answers 1K Views

Hi There, I Am Trying To Start An Agent In Services Mode With Trains-Server Being On Localhost (But Not Started Together With The Docker-Compose!). My Trains.Conf Is The Following:

Hi there, I am trying to start an agent in services mode with trains-server being on localhost (but not started together with the docker-compose!). My trains...

mlops

4 years ago

0 Votes

30 Answers

942 Views

0 Votes 30 Answers 942 Views

Hi, If I Am Starting My Training With The Following Command:

Hi, if I am starting my training with the following command: python -u -m torch.distributed.launch --nproc_per_node=2 --use_env train.py --config configs/tra...

clearml

3 years ago

0 Votes

5 Answers

1K Views

0 Votes 5 Answers 1K Views

Hi Folks, Is It Possible To Use An Aws P3 Instance (Which As Several Gpus) With One Agent Per Gpu, All Controlled Through Clearml Aws Autoscheduler? So Clearml Aws Autoscheduler Would Know In Advance How Much Agents To Start In The Instances (Can Be An Op

Hi folks, Is it possible to use an aws p3 instance (which as several GPUs) with one agent per GPU, all controlled through ClearML AWS AutoScheduler? So Clear...

aws mlops

3 years ago

0 Votes

16 Answers

1K Views

0 Votes 16 Answers 1K Views

Hello, ~3 Months Ago I Created A Trains-Server In A Machine With 30Gb Of Disk Space. Today I Wasn'T Able To Connect To Trains-Server, So I Checked The Server And Found That The Disk Full. I Ran:

Hello, ~3 months ago I created a trains-server in a machine with 30gb of disk space. Today I wasn't able to connect to trains-server, so I checked the server...

clearml

4 years ago

0 Votes

5 Answers

1K Views

0 Votes 5 Answers 1K Views

Hi, There Is Small Bug In The Web Ui When Comparing Two Experiments Scalars: If The Two Tasks Have The Same Name, Then Clicking On The “Maximize Graph” Button On One Scalar Series To Get The Bigger View On That Scalar Series, Then The Color Of Both Series

Hi, there is small bug in the web UI when comparing two experiments scalars: If the two tasks have the same name, then clicking on the “Maximize graph” butto...

clearml

3 years ago

0 Votes

8 Answers

942 Views

0 Votes 8 Answers 942 Views

Hi! I Have A Question Regarding Performances Of The Clearml-Server: Are The Calls From The Agents Made Asynchronously/In A Non Blocking Separate Thread? Is The Connection To The Clearml-Server Expected To Be A Bottleneck If The Clearml-Server Is Far From

Hi! I have a question regarding performances of the clearml-server: are the calls from the agents made asynchronously/in a non blocking separate thread? is t...

clearml

3 years ago

0 Votes

7 Answers

982 Views

0 Votes 7 Answers 982 Views

Hi, I Think There Is A Small Bug In The

Hi, I think there is a small bug in the Experiment running time column of the workers-and-queues/workers page: they do not match the time reported in the exp...

clearml

3 years ago

0 Votes

3 Answers

982 Views

0 Votes 3 Answers 982 Views

Hi There, I Recently Updated Clearml Server To 1.7.0, And Found The Following

⚠️ Hi there, I recently updated clearml server to 1.7.0, and found the following critical regression: When I reset an experiment, it is actually deleted 😵 ,...

clearml

2 years ago

0 Votes

2 Answers

1K Views

0 Votes 2 Answers 1K Views

Is There An Option To Make Trains-Agent Create Experiment Virtualenvs With

Is there an option to make trains-agent create experiment virtualenvs with --system-site-packages parameter?

clearml

4 years ago

0 Votes

1 Answers

973 Views

0 Votes 1 Answers 973 Views

Hi There, Is It Safe To Use Clearml (Trains >= 0.17) With The Trains Ignite Handler? Should We Wait For The Update On Their Side?

Hi there, is it safe to use ClearML (trains >= 0.17) with the trains ignite handler? Should we wait for the update on their side?

clearml

3 years ago

0 Votes

5 Answers

1K Views

0 Votes 5 Answers 1K Views

Hello, Is It Possible For The Clearml-Agent In Docker Mode To Not Pull A Specific Docker Image, But To Build One From The Experiment Repository Using The Dockerfile And .Dockerignore Of The Experiment Repository?

Hello, is it possible for the clearml-agent in docker mode to not pull a specific docker image, but to build one from the experiment repository using the Doc...

clearml

2 years ago

0 Votes

4 Answers

1K Views

0 Votes 4 Answers 1K Views

Btw I Saw A Bug In The Web Ui That Is Rather Frustrating: When I Add Some Metric Columns To A Project Page, If I Refresh The Page Manually With F5, All The Changes I Made On The Columns Are Rolled-Back, As If They Were Not Saved. Same Happens With The Res

Btw I saw a bug in the web UI that is rather frustrating: When I add some metric columns to a project page, if I refresh the page manually with F5, all the c...

clearml

2 years ago

0 Votes

2 Answers

1K Views

0 Votes 2 Answers 1K Views

Congrats On The Clearml-Serving 0.9.0 Release! I’Ll Try It For Sure!

Congrats on the clearml-serving 0.9.0 release! I’ll try it for sure!

clearml

2 years ago

0 Votes

12 Answers

941 Views

0 Votes 12 Answers 941 Views

Hi, I Encounter A Weird Behavior: I Have A Task A That Schedules A Task B. Task B Is Executed On An Agent, But With An Old Commit

Hi, I encounter a weird behavior: I have a task A that schedules a task B. Task B is executed on an agent, but with an old commit 🤔 although the branch is p...

mlops

4 years ago

0 Votes

4 Answers

997 Views

0 Votes 4 Answers 997 Views

Hi, What Happens Exactly When I Execute The Following Command:

Hi, what happens exactly when I execute the following command: trains-agent daemon --gpus 0 --queue default &In my code, how to know which GPU to choose insi...

clearml

4 years ago

0 Votes

3 Answers

931 Views

0 Votes 3 Answers 931 Views

Hi, I Am Getting An Error While Running

Hi, I am getting an error while running task.mark_stopped() , any idea why? (clearml 1.0.2, clearml-agent 1.0.0, python 3.6) File "/home/machine/.clearml/ven...

clearml

3 years ago

0 Votes

12 Answers

1K Views

0 Votes 12 Answers 1K Views

Hi, Where Can I Find The Server Parameter To Control When The Server Is Unregistering An Agent After Not Receiving Updates? Currently It'S Quite Long (30Mins) And This Prevents The Autoscaler From Launching A New Agent

Hi, where can I find the server parameter to control when the server is unregistering an agent after not receiving updates? Currently it's quite long (30mins...

mlops

one year ago

0 Votes

3 Answers

1K Views

0 Votes 3 Answers 1K Views

Hi, I Am Considering Making Automated Backups Of My Clearml-Server Using Amazon Ebs Snapshots. Should I Be Concerned With The Same Problem Described Here >

Hi, I am considering making automated backups of my clearml-server using Amazon EBS snapshots. Should I be concerned with the same problem described here > h...

clearml

3 years ago

0 Votes

22 Answers

1K Views

0 Votes 22 Answers 1K Views

Hi There, I Used

Hi there, I used clearml-task to send a script to be executed remotely. When being executed remotely Task.current_task() returns None, how should I get the c...

clearml

2 years ago

0 Votes

13 Answers

2K Views

0 Votes 13 Answers 2K Views

Hi, I Am Trying To Use The Clearml-Agent In Docker Mode To Run An Experiment, But It Seems To Fail Passing The Clearml.Conf File To The Docker Container:

Hi, I am trying to use the clearml-agent in docker mode to run an experiment, but it seems to fail passing the clearml.conf file to the docker container: Exe...

clearml

one year ago

Show more results

0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

Awesome! Thanks! 🙏

3 years ago

0 Hi There, I Am Running A Clearml-Agent In Services Mode (With Docker) On A Machine With Two Disks: One With The Os (8Go, 91% Space Used) And One For The Data (100Go, 40% Space Used). When Executing The Auto-Scaler Task In This Agent, I Get The Following E

Worked like a charm 👌

3 years ago

0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

I followed https://github.com/NVIDIA/nvidia-docker/issues/1034#issuecomment-520282450 and now it seems to be setting up properly

3 years ago

0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

0.17.1

3 years ago

0 Hi, I Started A Trains-Agent (0.15) In Services Mode (Full Command:

I killed both trains-agent and restarted one to have a clean start. This way it correctly spin up docker containers for services tasks. So probably the bug comes when a bug occurs while setting up a task, it cannot go back to the main task. I would need to do some tests to validate that hypothesis though

4 years ago

0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

But I see in the agent logs:
Executing: ['docker', 'run', '-t', '--gpus', '"device=0"', ...

3 years ago

0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

I got some progress TimelyPenguin76 , Now the task runs and I get the error from docker:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

3 years ago

0 Hi, Together With

(It would be nice to have all the Pypi releases tagged in github btw)

4 years ago

0 Hey Again

Awesome!

4 years ago

0 Hi, Where Can I Find The Server Parameter To Control When The Server Is Unregistering An Agent After Not Receiving Updates? Currently It'S Quite Long (30Mins) And This Prevents The Autoscaler From Launching A New Agent

, causing it to unregister from the server (and thus not remain there).

Do you mean that the agent actively notifies the server that it is going down? or the server infers that the agent is down after a timeout?

one year ago

0 Hi There, I Would Like To Report A Bug With The Resizing Of The Columns In The Projects View: It Doesn’T Work As Expected. Please Look At The Behavior Of The Resizing On The Following Screen Recording

Hi DeterminedCrab71 Version: 1.1.1-135 • 1.1.1 • 2.14

3 years ago

0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

Alright I have a followup question then: I used the param --user-folder “~/projects/my-project”, but any change I do is not reflected in this folder. I guess I am in the docker space, but this folder is not linked to my the folder on the machine. Is it possible to do so?

3 years ago

0 Hi Everyone, Now I Am Evaluating Clearml. I Have A Question About How To Handle Datasets. Does Clearml Provide Any Function To Manage Datasets? Or Do We Need To Manage Them By Ourselves? In Our Usecase, We Update Datasets Little By Little Over Days Or W

This is no coincidence - Any data versioning tool you will find are somehow close to how git works (dvc, etc.) since they aim to solve a similar problem. In the end, datasets are just files.
Where clearml-data stands out imo is the straightfoward CLI combined with the Pythonic API that allows you to register/retrieve datasets very easily

3 years ago

(I am not part of the awesome ClearML team, just a happy user 🙂 )

3 years ago

I will let the team answer you on that one 🙂

3 years ago

Will it freeze/crash/break/stop the ongoing experiments?

3 years ago

0 Hey There, Would It Be Possible To Make Clearml-Agents Support Both Docker Mode And Venv Mode At The Same Time? Ie. Not Requiring To Be Restarted To Switch The Mode. The Mode Should Be Define On The Task Level: I Start An Experiment And Define Whether It

Default would be venv, only use docker if an image is passed. Use case: not have to duplicate all queues to accept both docker and venv agents on the same instances

2 years ago

0 Hi, I Have A Local Package That I Use To Train My Models. To Start Training, I Have A Script That Calls

Hi AgitatedDove14 , initially I was doing this, but then I realised that with the approach you suggest all the packages of the local environment also end up in the “installed packages”, while in reality I only need the dependencies of the local package. That’s why I use _update_requirements , with this approach only the package required will be installed in the agent

2 years ago

0 Hello, I Am Getting `Valueerror: Could Not Get Access Credentials For '

AgitatedDove14 Yes exactly, I tried the fix suggested in the github issue urllib3>=1.25.4 and the ImportError disappeared 🙂

4 years ago

AgitatedDove14 In theory yes there is no downside, in practice running an app inside docker inside a VM might introduce slowdowns. I guess it’s on me to check whether this slowdown is negligible or not

2 years ago

0 Hi, Just Want To Report A Small Bug In The Clearml Dashboard: After Queuing An Experiment, If I Change The Experiment Queue, Then Go Back To The Experiment Info Tab, The Queue Property Still Shows The Previous Queue

SuccessfulKoala55 In the new queue

3 years ago

0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

Yes, it works now! Yay!

3 years ago

0 Hey There

Alright, thanks SuccessfulKoala55 !

3 years ago

I was rather wondering why clearml was taking space while I configured it to use the /data volume. But as you described AgitatedDove14 it looks like an edge case, so I don’t mind 🙂

3 years ago

0 Hi, I Started A Trains-Agent (0.15) In Services Mode (Full Command:

Alright, I had a look in the /tmp/.trains_agent_daemon_outabcdef.txt logs, not many insights from here. For the moment, I simply started a new trains-agent daemon in services mode and I will wait to see what happens.

4 years ago

AgitatedDove14 Is it possible to shut down the server while an experiment is running? I would like to resize the volume and then restart it (should take ~10 mins)

3 years ago

0 Is There An Option To Make Trains-Agent Create Experiment Virtualenvs With

Just found yea, very cool! Thanks!

4 years ago

0 Hi, I Just Updated Clearml Server 1.0 Using

I just checked if something changed in https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_config.html#web-login-authentication

3 years ago

0 Hey Guys, I Am Trying To Plan What I Need To Do In Order To Efficiently Use Clearml With Spot Instances 1) Detecting When Spot Instance Is Down And Experiment Is Aborted 2) Extracting S3 Address Of The Latest Checkpoint From Clearml Api 3) Starting New E

we use task.models[] 🙂

3 years ago

Hi DilapidatedDucks58 , I did that already, but I am reusing the same experiment instead of merging two experiments. Step 4 can be seen as:
Update the experiment status to stopped (if it is failed, you won’t be able to re-enqueue it) Set a parameter of that task to point to the latest checkpoint and load it (you can also infer it directy: I simply add a tag to the task resume , and check at runtime if this tag exists, if yes, I fetch the latest checkpoint of the task) Use https://clea...

3 years ago

Show more results