Hi AgitatedDove14 , so I ran 3 experiments:
- One with my current implementation (using "fork")
- One using "forkserver"
- One using "forkserver" + the DataLoader optimization
I sent you the results via DM, here are the outcomes:
- fork -> 101 mins, low RAM usage (5 GB, constant), almost no IO
- forkserver -> 123 mins, high RAM usage (16 GB, fluctuating), high IO
- forkserver + DataLoader optimization -> 105 mins, high RAM usage (from 28 GB down to 16 GB), high IO
CPU/GPU curves are the same for the 3 experiments...
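For context, a minimal sketch of how the start method can be switched between runs (assuming the training script uses torch.multiprocessing; the actual script isn't shown here):
import torch.multiprocessing as mp

# must be called once, early in the entry point, before any worker processes are created;
# swap "forkserver" for "fork" to reproduce the other runs
mp.set_start_method("forkserver", force=True)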
I opened an issue ( https://github.com/pytorch/ignite/issues/2343 ) in ignite’s repo and a PR ( https://github.com/pytorch/ignite/pull/2344 ), could you please have a look? There might be a bug in clearml Task.init in distributed envs
And I am wondering if only the main process (rank=0) should attach the ClearMLLogger or if all the processes within the node should do that
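For illustration, a minimal sketch of the rank-0-only variant (using ignite's distributed helpers; the import path for ClearMLLogger may differ across ignite versions, and project/task names are placeholders):
import ignite.distributed as idist
from ignite.contrib.handlers.clearml_logger import ClearMLLogger

# only the main process (rank 0) creates and attaches the ClearML logger;
# the other processes in the node skip it to avoid duplicate tasks/reports
if idist.get_rank() == 0:
    clearml_logger = ClearMLLogger(project_name="my-project", task_name="my-task")
    # clearml_logger.attach(trainer, ...) and other handlers would go here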
Hi there, yes I was able to make it work with some glue code:
- Save your model, optimizer, and scheduler every epoch
- Have a separate thread that periodically polls the instance metadata and checks whether the instance is marked for termination; in that case, add a custom tag, e.g. TO_RESUME (see the sketch after this list)
- Have a service that periodically pulls failed experiments with the TO_RESUME tag from the queue, force-marks them as stopped instead of failed, and reschedules them with the last checkpoint passed as an extra parameter
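A rough, untested sketch of the second step, assuming an AWS spot instance (the metadata URL is AWS's spot interruption endpoint; the TO_RESUME tag name comes from the steps above, and the polling interval is arbitrary):
import threading
import time

import requests
from clearml import Task

def watch_for_spot_interruption(task, poll_seconds=30):
    # AWS returns HTTP 200 on this endpoint once the spot instance is marked for termination
    url = "http://169.254.169.254/latest/meta-data/spot/instance-action"
    while True:
        try:
            if requests.get(url, timeout=2).status_code == 200:
                task.add_tags(["TO_RESUME"])  # picked up later by the rescheduling service
                return
        except requests.RequestException:
            pass  # metadata endpoint not reachable, keep polling
        time.sleep(poll_seconds)

threading.Thread(
    target=watch_for_spot_interruption, args=(Task.current_task(),), daemon=True
).start()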
Hi SuccessfulKoala55 , will I be able to update all references to the old s3 bucket using this command?
I had this problem before
thanks for clarifying! Maybe this could be made explicit in the agent logs of the experiments, with something like the following?
agent.cuda_driver_version = ...
agent.cuda_runtime_version = ...
AgitatedDove14 In theory yes, there is no downside; in practice, running an app inside Docker inside a VM might introduce slowdowns. I guess it’s on me to check whether this slowdown is negligible or not
I followed https://github.com/NVIDIA/nvidia-docker/issues/1034#issuecomment-520282450 and now it seems to be setting up properly
You already fixed the problem with pyjwt in the newest version of clearml/clearml-agents, so all good 😄
I was able to fix it by applying for a license and registering it
Also, what is the benefit of having index.number_of_shards = 1 by default
for the metrics and the logs indices? Having more would allow scaling and later moving them to separate nodes if needed - with the default heap size being 2 GB, that should be possible, no?
(by console you mean in the dashboard right? or the terminal?)
Hi SuccessfulKoala55 , How can I know if I am logged in in this free access mode? I assume I am, since on the login page I only see a login field, not a password field
Guys the experiments I had running didn't fail, they just waited and reconnected, this is crazy cool
with my hack yes, without, no
I made some progress TimelyPenguin76 , now the task runs and I get this error from docker:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
Hi CumbersomeCormorant74 yes, this is almost the scenario: I have a dozen projects. In one of them, I have ~20 archived experiments, in different states (draft, failed, aborted, completed). I went to this archive, selected all of them and deleted them using the bulk delete operation. I got several failed-delete popups. So I tried again with smaller batches (like 5 experiments at a time) to pinpoint the experiments causing the error. I could delete most of them. At some point, all ...
yes -> but I still don't understand why the post_packages didn't work, could be worth investigating
AgitatedDove14 I was able to redirect the logger by doing so:
import logging
from clearml import Task
from ignite.handlers import EarlyStopping

# route EarlyStopping's log records to the ClearML console
clearml_logger = Task.current_task().get_logger().report_text
early_stopping = EarlyStopping(...)
early_stopping.logger.debug = clearml_logger
early_stopping.logger.info = clearml_logger
early_stopping.logger.setLevel(logging.DEBUG)
it also happens without hitting F5 after some time (~hours)
The simple workaround I imagined (not tested) at the moment is to sleep 2 minutes after closing the task, to keep the clearml-agent busy until the instance is shut down:
self.clearml_task.mark_stopped()
self.clearml_task.close()
time.sleep(120)  # prevent the agent from picking up new tasks
My use case is: on a spot instance marked by AWS for termination within 2 mins, I want to close the task and prevent the clearml-agent from picking up a new task afterwards.
I want the clearml-agent/instance to stop right after the experiment/training is “paused” (experiment marked as stopped + artifacts saved)
as it's also based on pytorch-ignite!
I am not sure I understand, what is the link with pytorch-ignite?
We're in the brainstorming phase on the best approaches for integration; we might pick your brain later on
Awesome, I'd be happy to help!
So the problem comes when I do my_task.output_uri = "s3://my-bucket" , trains in the background checks if it has access to this bucket and is not able to find/read the creds
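For reference, a minimal sketch of the setup being described (the bucket, project and task names are placeholders; I'm assuming credentials are resolved from clearml.conf or from the standard AWS environment variables, as below):
import os
from clearml import Task

# creds must be resolvable before output_uri is set, otherwise the background
# access check against the bucket fails, which is the situation described above
os.environ.setdefault("AWS_ACCESS_KEY_ID", "<key>")
os.environ.setdefault("AWS_SECRET_ACCESS_KEY", "<secret>")

my_task = Task.init(project_name="my-project", task_name="my-task")
my_task.output_uri = "s3://my-bucket"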