JitteryCoyote63

214 Questions, 1021 Answers

Active since 10 January 2023

Last activity 7 months ago

Reputation

Badges 1

979 × Eureka!

Questions 214
Answers 1021

0 Votes

1 Answers

1K Views

0 Votes 1 Answers 1K Views

Hi, I Think I Found A Small Bug:

Hi, I think I found a small bug: Clone an experiment Enqueue it on a queue with no workers Delete the queue Try to Dequeue the experimentThe last operation w...

clearml

3 years ago

0 Votes

1 Answers

1K Views

0 Votes 1 Answers 1K Views

Hi, In The "Choose Compared Experiments" View Of The Webui, Would It Be Possible To Add A Toggle To Include Archived Experiments In The Results Of The Search? Also Add The Task Type Field?

Hi, in the "Choose compared experiments" view of the WebUI, would it be possible to add a toggle to include archived experiments in the results of the search...

clearml

2 years ago

0 Votes

11 Answers

1K Views

0 Votes 11 Answers 1K Views

Hey, I Moved My Trains-Server To Another Machine, Zipping The /Opt/Trains/Data Folder As Described In The Docs

Hey, I moved my trains-server to another machine, zipping the /opt/trains/data folder as described in the docs https://allegro.ai/docs/deploying_trains/train...

mlops

4 years ago

0 Votes

4 Answers

1K Views

0 Votes 4 Answers 1K Views

Hi There, I Think There Is A Bug With Clearml Sdk V0.17.5Rc2: When Running A Task Locally, The Dashboard Doesnt Not Shows The Task As Finished Once The Task Is Finished

Hi there, I think there is a bug with clearml sdk v0.17.5rc2: when running a task locally, the dashboard doesnt not shows the task as finished once the task ...

clearml

3 years ago

0 Votes

5 Answers

978 Views

0 Votes 5 Answers 978 Views

Hi There! I Have A Question Regarding S3 Access: I Created A S3 User With Read/Write Access But Not Delete, And Trains Seems To Requires Delete Permissions (See Errors Below). Why Does It Need Delete Permissions?

Hi there! I have a question regarding s3 access: I created a s3 user with read/write access but not delete, and trains seems to requires delete permissions (...

clearml

4 years ago

0 Votes

30 Answers

1K Views

0 Votes 30 Answers 1K Views

Hello, I Am Getting `Valueerror: Could Not Get Access Credentials For '

Hello, I am getting ValueError: Could not get access credentials for ' s3://my-bucket ' , check configuration file ~/trains.conf but I did specify them in my...

clearml

4 years ago

0 Votes

30 Answers

1K Views

0 Votes 30 Answers 1K Views

Hi Guys, With The New Venv Caching Available In Clearml, I Have The Following Problem: I Force My Pip Requirements To Be:

Hi guys, with the new venv caching available in clearml, I have the following problem: I force my pip requirements to be: torch==1.7.1 pytorch-ignite clearml...

clearml

3 years ago

0 Votes

5 Answers

941 Views

0 Votes 5 Answers 941 Views

Hi, I Have A Long Running Experiment That Was Running On Aws Instance That Got Killed After ~4 Days With The Following Reason:

Hi, I have a long running experiment that was running on AWS instance that got killed after ~4 days with the following reason: STATUS REASON: Forced stop (no...

clearml

2 years ago

0 Votes

2 Answers

928 Views

0 Votes 2 Answers 928 Views

Hi, Another Idea For Clearml Web Ui: In The Projects View, If I Have Several Experiments Being Enqueued And I Sort By “Started” Ascending (Newest On Top), I Expect To See Enqueued Experiments At The Very Top, While They Are Shown At The Very Bottom - Woul

Hi, another idea for ClearML web UI: in the projects view, if I have several experiments being enqueued and I sort by “Started” ascending (newest on top), I ...

clearml

3 years ago

0 Votes

1 Answers

1K Views

0 Votes 1 Answers 1K Views

Hi There, Would It Be Possible For The Autoscaler To Support Stopping Instances Instead Of Terminating Them? My Use Case Is The Following: I Am Continuing My Journey With The Clearml-Session Tool, And In Case The Clearml-Session Is Running In A Ec2 Inst

Hi there, would it be possible for the autoscaler to support stopping instances instead of terminating them? My use case is the following: I am continuing my...

mlops remote-ssh

2 years ago

0 Votes

6 Answers

1K Views

0 Votes 6 Answers 1K Views

Hi, How Does

Hi, how does agent.enable_git_ask_pass works? I am using the clearml-agent in docker mode and my experiment is stuck at downloading a private dependency: Clo...

mlops

one year ago

0 Votes

7 Answers

1K Views

0 Votes 7 Answers 1K Views

Hi, I Recently Updated Clearml-Server To 1.7 And I Am Getting A Lot Of The Following Errors Since Today On Any Experiment (I Didn'T Had This Error Before):

Hi, I recently updated clearml-server to 1.7 and I am getting a lot of the following errors since today on any experiment (I didn't had this error before): 1...

clearml

2 years ago

0 Votes

18 Answers

1K Views

0 Votes 18 Answers 1K Views

Hey There, I Would Like To Increase The

Hey there, I would like to increase the ulimit for the number of files opened at the same time in a ec2 instance. According to this https://stackoverflow.com...

clearml

3 years ago

0 Votes

8 Answers

1K Views

0 Votes 8 Answers 1K Views

Hi, I Would Like To Create Backups Of My Trains-Server Periodically. I Was Thinking About Creating A Service Task Under The Devops Project. The Backup Task Would:

Hi, I would like to create backups of my trains-server periodically. I was thinking about creating a service task under the devops project. The backup task w...

clearml

3 years ago

0 Votes

13 Answers

988 Views

0 Votes 13 Answers 988 Views

Hello, In The Following Context:

Hello, in the following context: controller_task = Task.init(...) # This will clone the parent task, enqueue and wait for finished status data_processing_tas...

clearml

4 years ago

0 Votes

13 Answers

1K Views

0 Votes 13 Answers 1K Views

Hey There, Is It Possible For A Clearml Pipeline Step To Log A Folder Instead Of Numpy/Pickle Objects? Looking At The Docs,

Hey there, Is it possible for a clearml pipeline step to log a folder instead of numpy/pickle objects? Looking at the docs, monitor_artifacts could be what I...

clearml

2 years ago

0 Votes

3 Answers

1K Views

0 Votes 3 Answers 1K Views

Hi, In A Subproject, Would It Be Possible To Hide The Parent Project If It Is Empty?

Hi, in a subproject, would it be possible to hide the parent project if it is empty?

clearml

3 years ago

0 Votes

3 Answers

983 Views

0 Votes 3 Answers 983 Views

Hi Guys, Is It Possible To Spin Up Two Agents On One Gpu? Something Like

hi guys, is it possible to spin up two agents on one GPU? Something like trains-agent daemon --gpus 0 --queue default & trains-agent daemon --gpus 0 --queue ...

clearml

3 years ago

0 Votes

4 Answers

985 Views

0 Votes 4 Answers 985 Views

Hi Guys, I Got A Very Unexpected Error Today On In One Of My Agents:

Hi guys, I got a very unexpected error today on in one of my agents: ... Collecting tqdm Using cached tqdm-4.48.2-py2.py3-none-any.whl (68 kB) Processing /ro...

clearml

4 years ago

0 Votes

5 Answers

1K Views

0 Votes 5 Answers 1K Views

Hi There, I Just Updated Clearml-Server To 1.8.0 And I See The Following But In The Comparison Of Scalars: All The Graphs Are Compressed To The Left When The Experiment Name Is Too Long In The Legend. I Will Now Try In 1.7.0 (It Was Not The Case In 1.6.0)

Hi there, I just updated clearml-server to 1.8.0 and I see the following but in the comparison of Scalars: All the graphs are compressed to the left when the...

clearml

2 years ago

0 Votes

17 Answers

1K Views

0 Votes 17 Answers 1K Views

Hi, I Have Another Bug To Report For Clearml-Server 1.2 (Self Hosted) In The Console Logs Of An Experiments, I Cannot See The Latest Logs. Eg My Experiment Is Done, But I Can Only See The Logs Of To The Installation Of The Packages. If I Download The Log

Hi, I have another bug to report for clearml-server 1.2 (self hosted) In the console logs of an experiments, I cannot see the latest logs. Eg my experiment i...

clearml

2 years ago

0 Votes

3 Answers

984 Views

0 Votes 3 Answers 984 Views

Hi There, I Recently Updated Clearml Server To 1.7.0, And Found The Following

⚠️ Hi there, I recently updated clearml server to 1.7.0, and found the following critical regression: When I reset an experiment, it is actually deleted 😵 ,...

clearml

2 years ago

0 Votes

20 Answers

1K Views

0 Votes 20 Answers 1K Views

Is It Possible To Run An Agent, Listen To The Services Queue Without Using Docker?

Is it possible to run an agent, listen to the services queue without using docker?

clearml

4 years ago

0 Votes

7 Answers

981 Views

0 Votes 7 Answers 981 Views

Hi, I Am Currently Using

Hi, I am currently using CLEARML_AGENT_GIT_USER and CLEARML_AGENT_GIT_PASS when starting my clearml-agent and I would like to switch to using a single auth t...

clearml

2 years ago

0 Votes

11 Answers

1K Views

0 Votes 11 Answers 1K Views

Hi, Coming Back With The Venv Caching: With The Following Setting:

Hi, coming back with the venv caching: with the following setting: I call Task._update_requirements(["."]) setup.py has the following install_requires=["my-p...

mlops

3 years ago

0 Votes

30 Answers

2K Views

0 Votes 30 Answers 2K Views

Hi, I Am Giving Another Try To Clearml-Session And I Am Blocked At The Current Error Shown When The Cli Try To Establish The Tunneling:

Hi, I am giving another try to clearml-session and I am blocked at the current error shown when the CLI try to establish the tunneling: Starting SSH tunnel W...

remote-ssh

2 years ago

0 Votes

2 Answers

977 Views

0 Votes 2 Answers 977 Views

Hi, How Can I Search An Old Experiment Based On Its Commit Hash?

Hi, how can I search an old experiment based on its commit hash?

clearml

one year ago

0 Votes

3 Answers

1K Views

0 Votes 3 Answers 1K Views

Hey Guys, Quick Question: Is There A Tool Function To Know If A Task Id Is Valid? Not Verifying That The Task Itself Exists, Just That The Task Id Is The Correct Format

Hey guys, quick question: is there a tool function to know if a task id is valid? Not verifying that the task itself exists, just that the task id is the cor...

clearml

4 years ago

0 Votes

2 Answers

935 Views

0 Votes 2 Answers 935 Views

First Link In Hyperparameter Optimization Page Is Broken >

First link in hyperparameter optimization page is broken > https://allegro.ai/docs/examples/examples_hyperparam_opt/

clearml

4 years ago

0 Votes

23 Answers

946 Views

0 Votes 23 Answers 946 Views

Hi, I Would Like To Bring Awareness

Hi, I would like to bring awareness on this issue , this impacts my work as I cannot install the older version of torch (1.11.0)

clearml

one year ago

Show more results

0 Hey There, Would It Be Possible To Make Clearml-Agents Support Both Docker Mode And Venv Mode At The Same Time? Ie. Not Requiring To Be Restarted To Switch The Mode. The Mode Should Be Define On The Task Level: I Start An Experiment And Define Whether It

AgitatedDove14 In theory yes there is no downside, in practice running an app inside docker inside a VM might introduce slowdowns. I guess it’s on me to check whether this slowdown is negligible or not

2 years ago

0 Hi, Just Want To Report A Small Bug In The Clearml Dashboard: After Queuing An Experiment, If I Change The Experiment Queue, Then Go Back To The Experiment Info Tab, The Queue Property Still Shows The Previous Queue

SuccessfulKoala55 In the new queue

3 years ago

0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

Yes, it works now! Yay!

3 years ago

0 Hey There

Alright, thanks SuccessfulKoala55 !

3 years ago

0 Hi There, I Am Running A Clearml-Agent In Services Mode (With Docker) On A Machine With Two Disks: One With The Os (8Go, 91% Space Used) And One For The Data (100Go, 40% Space Used). When Executing The Auto-Scaler Task In This Agent, I Get The Following E

I was rather wondering why clearml was taking space while I configured it to use the /data volume. But as you described AgitatedDove14 it looks like an edge case, so I don’t mind 🙂

3 years ago

0 Hi, I Started A Trains-Agent (0.15) In Services Mode (Full Command:

Alright, I had a look in the /tmp/.trains_agent_daemon_outabcdef.txt logs, not many insights from here. For the moment, I simply started a new trains-agent daemon in services mode and I will wait to see what happens.

4 years ago

AgitatedDove14 Is it possible to shut down the server while an experiment is running? I would like to resize the volume and then restart it (should take ~10 mins)

3 years ago

0 Hi, I Just Updated Clearml Server 1.0 Using

I just checked if something changed in https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_config.html#web-login-authentication

3 years ago

0 Hey Guys, I Am Trying To Plan What I Need To Do In Order To Efficiently Use Clearml With Spot Instances 1) Detecting When Spot Instance Is Down And Experiment Is Aborted 2) Extracting S3 Address Of The Latest Checkpoint From Clearml Api 3) Starting New E

we use task.models[] 🙂

3 years ago

Hi DilapidatedDucks58 , I did that already, but I am reusing the same experiment instead of merging two experiments. Step 4 can be seen as:
Update the experiment status to stopped (if it is failed, you won’t be able to re-enqueue it) Set a parameter of that task to point to the latest checkpoint and load it (you can also infer it directy: I simply add a tag to the task resume , and check at runtime if this tag exists, if yes, I fetch the latest checkpoint of the task) Use https://clea...

3 years ago

0 Hey Again

Very cool! Run two train-agent daemons, one per GPU on the same machine, with default Nvidia/CUDA Docker This is close to my use case, I just would like to run these two daemons not with docker, would that be possible? I should just remove the --docker nvidia/cuda param right?

4 years ago

0 Hey Again

trains-agent daemon --gpus 0 --queue default & trains-agent daemon --gpus 1 --queue default &

4 years ago

0 Hi, I Updated To Clearml-Server 1.4.0 And I Am Uncomfortable With The New Table/Detail View, Is There A Way To Disable It And Use The Previous One (On Click -> Open Details)?

I guess I’ll get used to it 😄

2 years ago

0 Hi, I Would Like To Follow-Up In This

SuccessfulKoala55 , This is not the exact corresponding request (I refreshed the tab since then), but the request is an events.get_task_logs , with the following content:

2 years ago

0 Hi, I Would Like To Follow-Up In This

Well actually I do see many errors like that in the browser console:

2 years ago

0 Hi, I Would Like To Follow-Up In This

As a quick fix, can you test with auto refresh (see top right button with the pause sign you have on the video)

That doesn’t work unfortunately

2 years ago

0 Hi, I Would Like To Follow-Up In This

So the new EventsIterator is responsible for the bug.
Is there a way for me to easily force the WebUI to always use the previous endpoint (v1.7)? I saw in the diff changes v1.1.0 > v1.2.0 that ES version was bumped to 7.16.2. I am using an external ES cluster, and its version is still 7.6.2. Can it be that the incompatibility comes from here? I’ll update the cluster to make sure it’s not the case

2 years ago

0 Hi, I Would Like To Follow-Up In This

Super! I’ll give it a try and keep you updated here, thanks a lot for your efforts 🙏

2 years ago

0 Hi, I Would Like To Follow-Up In This

Hi SuccessfulKoala55 , AgitatedDove14 ,
I updated to 1.4.0 (Web UI shows: WebApp: 1.5.0-186 • Server: 1.5.0-186 • API: 2.18 )
Unfortunately the bug is still there 😞
I don’t see errors in the console anymore though!

I had another look and modified a events.get_task_logs request with a super old timestamp to try to retrieve all logs, this returned me only the few logs already displayed in the console. So I think the problem doesn’t come from the WebUI, but from the...

2 years ago

0 Hi, I Would Like To Follow-Up In This

I am happy if I can be of any help to fix that 😄

2 years ago

0 Hi, I Would Like To Follow-Up In This

Another error that just popped up:

2 years ago

0 Hi, One More Question: When Creating A Task With Task.Init(), We Can Specify The

correct, you could also use

Task.create

that creates a Task but does not do any automagic.

Yes, I didn't use it so far because I didn't know what to expect since the doc states:
"Create a new, non-reproducible Task (experiment). This is called a sub-task."

4 years ago

0 Hey, What Is The Exact Difference Between

I tested by installing flask in the default env -> which was installed in the ~/.local/lib/python3.6/site-packages folder. Then I created a venv with flag --system-site-packages . I activated the venv and flask was indeed available

4 years ago

0 Hi There, I Used

So I guess the problem is that the following snippet:
from clearml import Task Task.init()Should be added before the if __name__ == "__main__": ?

2 years ago

0 Hi, One More Question: When Creating A Task With Task.Init(), We Can Specify The

I still don't see why you would change the type of the cloned Task, I'm assuming the original Task had the correct type, no?

Because it is easier for me that I create a training task out of the controller task by cloning it (so that parameters are prefilled and I can set the parent task id)

4 years ago

0 Hi, One More Question: When Creating A Task With Task.Init(), We Can Specify The

Thanks for the hack! The use case is the following: I have a controler that creates training/validation/testing tasks by cloning (so that the parent task id is properly set to the controler). Otherwise I could simply create these tasks with Task.init, but then I would need to set manually the parent task for each one of these tasks, probably with a similar hack, right?

4 years ago

0 Hi, I Have A Question Regarding The Aws_Autoscaler: It Usually Takes ~Hours To Get A Gpu Instance Nowadays. I Was Thinking, It Would Be Much More Interesting To Stop The Instances (Clearml-Agents) Instead Of Terminating Them Once They Are Inactive, So Tha

No I agree, it’s probably not worth it

2 years ago

0 Are The Env Variables Passed To Trains-Agent Available In Experiments Run By This Trains-Agent?

Awesome, thanks!

4 years ago

0 Are The Various Task Types Available In 0.15? I Am Getting

AgitatedDove14 Is it fixed with trains-server 0.15.1?

4 years ago

0 Are The Various Task Types Available In 0.15? I Am Getting

awesome, thank you 👍

4 years ago

Show more results