I hit enter too fast ^^
Installing them globally via $ pip install numpy opencv torch will actually install locally, with the warning: Defaulting to user installation because normal site-packages is not writeable. The installation therefore takes place in ~/.local/lib/python3.6/site-packages instead of the default location. Will this still be considered the global site-packages and still be included in the experiments’ envs? From what I tested, it does.
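For anyone checking the same thing, a quick way to see which site-packages a package actually resolves from is the standard library site module (plain Python, nothing ClearML-specific):
```python
# Check where an installed package resolves from and whether the user
# site-packages directory is on the interpreter's path.
import site
import numpy  # assumes numpy was installed as in the command above

print("user site-packages  :", site.getusersitepackages())
print("global site-packages:", site.getsitepackages())
print("numpy resolved from :", numpy.__file__)
print("user site enabled   :", site.ENABLE_USER_SITE)
```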
Yes that’s correct - the weird thing is that the error shows the right detected region
Trying your code now… should take a couple of mins
SuccessfulKoala55 I want to avoid writing creds in plain text in the config file
But clearml does read from env vars as well right? It’s not just delegating resolution to the aws cli, so it should be possible to specify the region to use for the logger, right?
I am using clearml_agent v1.0.0 and clearml 0.17.5 btw
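For reference, this is roughly what I mean by reading from env vars: boto3 resolves credentials and region from the standard AWS_* environment variables, so a sketch of the setup I’m aiming for, without anything in the config file, looks like this (the values are placeholders):
```python
# Sketch of the env-var based setup I have in mind (no creds in clearml.conf).
# boto3 resolves credentials and region from the standard AWS_* environment variables.
import os
import boto3

os.environ["AWS_ACCESS_KEY_ID"] = "..."         # normally exported in the shell instead
os.environ["AWS_SECRET_ACCESS_KEY"] = "..."
os.environ["AWS_DEFAULT_REGION"] = "eu-west-1"  # the region I want the logger to use

session = boto3.session.Session()
print(session.region_name)  # -> "eu-west-1", picked up from the env var
```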
I’m not too fond of many user configurations, it’s confusing.
100% agree, nevertheless, how much is too many? Currently, there are only two settings in the user preferences category, so one more wouldn’t hurt?
however, clearml is open source, nothing stops you from adding the code and sending a PR
I’d be super happy to contribute yes! Nevertheless, I am not sure where to start: clearml-server repo? clearml-web repo?
CostlyOstrich36 How is clearml-session setting the ssh config?
Done! Also I tried to use git credential cache ( https://git-scm.com/docs/git-credential-cache ) as a workaround (hoping that the first time it clones the experiment repo, it caches the creds for the next times), but I then get a different error: fatal: unable to find a suitable socket path; use --socket
Sure, where can I find this file?
Stopping the server
Editing the docker-compose.yml file, adding the logging section to all services
Restarting the server
Docker-compose freed 10 GB of logs
Yes that’s what I did initially, but eventually I decided that it’s too much complexity added for nothing really. I’d rather drop omegaconf and, if one day clearml supports it out of the box, take advantage of it
Thanks TimelyPenguin76 and AgitatedDove14 ! I would like to delete artifacts/models related to the old archived experiments, but they are stored on s3. Would that be possible?
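In case it helps, what I had in mind for the S3 side is plain boto3; a sketch, assuming the artifacts of the archived experiments live under a known prefix (bucket and prefix below are placeholders):
```python
# Hedged sketch: bulk-delete the stored artifacts/models of archived experiments
# directly on S3. Bucket name and prefix are placeholders.
import boto3

bucket = boto3.resource("s3").Bucket("my-bucket")
prefix = "clearml/my-project/old-experiment-id/"

# Deletes every object under the prefix (boto3 batches the delete requests).
bucket.objects.filter(Prefix=prefix).delete()
```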
Hi AgitatedDove14 , sorry somehow this message got lost 😄
clearml version is the latest at the time, 1.7.1
Yes, I always see the "model uploaded completed" for such stuck tasks. I am using Python 3.8.10.
yes but they are in plain text and I would like to avoid that
How exactly is the clearml-agent killing the task?
The task I cloned from is not the one I thought
AgitatedDove14 I see that the default is sample_frequency_per_sec=2., but in the UI I see that there isn’t such resolution (i.e. it logs every ~120 iterations, corresponding to ~30 secs). What is the difference with report_frequency_sec=30.?
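To make sure I’m reading these two settings correctly: my understanding is that samples are taken at sample_frequency_per_sec and only aggregated and sent to the server every report_frequency_sec, roughly like this (a sketch of my interpretation, not the actual clearml implementation; the two helpers are hypothetical):
```python
# Sketch of how I understand the two settings interact (my interpretation only).
import time

sample_frequency_per_sec = 2.0  # how often a machine-stats sample is taken
report_frequency_sec = 30.0     # how often the aggregated value is reported

samples = []
last_report = time.monotonic()
while True:
    samples.append(read_machine_stats())            # hypothetical helper returning a float
    if time.monotonic() - last_report >= report_frequency_sec:
        report_scalar(sum(samples) / len(samples))  # hypothetical helper; ~60 samples averaged
        samples, last_report = [], time.monotonic()
    time.sleep(1.0 / sample_frequency_per_sec)
```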
Nice, thanks!
I found it, the filter actually has to be an iterable: Task.get_tasks(project_name="my-project", task_name="my-task", task_filter=dict(type=["training"]))
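For completeness, the call in context, iterating over what comes back (project/task names are placeholders):
```python
from clearml import Task

# task_filter values need to be iterables (lists), even for a single value.
tasks = Task.get_tasks(
    project_name="my-project",   # placeholder
    task_name="my-task",         # placeholder
    task_filter=dict(type=["training"]),
)
for t in tasks:
    print(t.id, t.name)
```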
The rest of the configuration is set with env variables
After I started clearml-session
might be worth documenting 😄
Hi AgitatedDove14, thanks for the answer! I will try adding multiprocessing_context='forkserver' to the DataLoader. In the issue you linked, nirraviv mentioned that forkserver was slower and shared a link to another issue https://github.com/pytorch/pytorch/issues/15849#issuecomment-573921048 where someone implemented a fast variant of the DataLoader to overcome the speed problem.
Did you experience any drop in performance using forkserver? If yes, did you test the variant suggested i...
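For context, a runnable sketch of the change I’m planning to test (dummy dataset, placeholder batch size and worker count):
```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Dummy dataset just so the snippet runs standalone.
dataset = TensorDataset(torch.randn(1024, 16), torch.randint(0, 2, (1024,)))

loader = DataLoader(
    dataset,
    batch_size=32,
    num_workers=4,
    multiprocessing_context="forkserver",  # worker start method under test
)

for x, y in loader:
    pass  # real training loop would go here
```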