[2020-06-09 16:03:19,851] [8] [ERROR] [trains.mongo.initialize] Failed creating fixed user John Doe: 'key'
{
  username: "username"
  password: "password"
  name: "John Doe"
},
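In case it helps, a full fixed-users section in the server's config would look roughly like this — a sketch, with the surrounding auth.fixed_users keys assumed from the standard clearml.conf layout; a field missing from a user entry would explain the 'key' (KeyError) failure above:

```
auth {
  fixed_users {
    enabled: true
    users: [
      {
        username: "username"
        password: "password"
        name: "John Doe"
      }
    ]
  }
}
```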
maybe the db somehow got corrupted or something like that? I'm clueless
nice! exactly what I need, thank you!
yeah, I am aware of trains-agent, we are planning to start using it soon, but still, copying the original training command would be useful
this would definitely be a nice addition. the number of hyperparameters in our models often goes up to 100
not necessarily, command usually stays the same irrespective of the machine
copy-pasting entire training command into command line 😃
problem is solved. I had to replace /opt/trains/data/fileserver with /opt/clearml/data/fileserver in the Agent configuration, and replace trains with clearml in Requirements
we're using the latest version of clearml, clearml agent and clearml server, but we've been using trains/clearml for 2.5 years, so there are some old tasks left, I guess 😃
two more questions about cleanup if you don't mind:
what if for some old tasks I get WARNING:root:Could not delete Task ID=a0908784a2a942c3812f947ec1f32c9f, 'Task' object has no attribute 'delete'? What's the best way of cleaning them up? What is the recommended way of providing S3 credentials to the cleanup task?
what if the cleanup service is launched using the ClearML-Agent Services container (part of the ClearML server)? adding clearml.conf to the home directory doesn't help
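For reference, this is the shape of the S3 credentials section in clearml.conf that the cleanup task would pick up — a sketch assuming the standard sdk.aws.s3 layout; the key/secret values are placeholders:

```
sdk {
  aws {
    s3 {
      # default credentials, used when no bucket-specific entry matches
      key: "AWS_ACCESS_KEY"     # placeholder
      secret: "AWS_SECRET_KEY"  # placeholder
      region: ""
    }
  }
}
```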
oh wow, I didn't see delete_artifacts_and_models option
I guess we'll have to manually find old artifacts that are related to already deleted tasks
we already have cleanup service set up and running, so we should be good from now on
btw, are there any examples of exporting metrics using the Python client? I could only find the last_metrics attribute of the task
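For what it's worth, a minimal sketch of exporting scalar metrics with the Python client, assuming the installed clearml version provides Task.get_reported_scalars(), which returns a nested {title: {series: {"x": [...], "y": [...]}}} dict; the task id in the usage comment is a placeholder:

```python
def scalars_to_rows(scalars):
    """Flatten the nested {title: {series: {"x": [...], "y": [...]}}} dict
    into a flat list of (title, series, iteration, value) rows."""
    rows = []
    for title, series_map in scalars.items():
        for series, points in series_map.items():
            rows.extend(
                (title, series, x, y)
                for x, y in zip(points["x"], points["y"])
            )
    return rows

# Usage (requires a running ClearML setup):
# from clearml import Task
# task = Task.get_task(task_id="your-task-id")  # placeholder id
# rows = scalars_to_rows(task.get_reported_scalars())
```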
some of the "tasks.get_all_ex" POST requests fail, as far as I can see
running docker network prune before starting the containers kind of helped. I still see an error when I'm comparing > 20 experiments, but at least trains works okay after that, and there are no connection pool limit errors in the logs
Error: Failed to get Scalar Charts
nope, the only changes to the config that we made are adding web-auth and the non-responsive tasks watchdog
just in case, this warning disappeared after I followed https://stackoverflow.com/questions/49638699/docker-compose-restart-connection-pool-full
btw, "[2020-09-02 15:15:40,331] [9] [WARNING] [urllib3.connectionpool] Connection pool is full, discarding connection: elasticsearch" warnings are showing up in the apiserver logs again
m5.xlarge EC2 instance (4 vCPUs, 16 GB RAM), 100GB disk