More like collapse/expand, I guess. Or pipelines that you can compose after running experiments, to see how the experiments are connected to each other.
I'll get back to you with the logs when the problem occurs again
Hmm, allegroai/trains:latest, whatever that is.
I decided to restart the containers one more time; this is what I got.
I had to restart the Docker service to remove the containers.
I'm not sure it's related to the domain switch since we upgraded to the newest ClearML server version at the same time
If you click on the experiment name here, you get a 404 because the link looks like this:
https://DOMAIN/projects/PROJECT_ID/EXPERIMENT_ID
when it should look like this:
https://DOMAIN/projects/PROJECT_ID/experiments/EXPERIMENT_ID
Sorry, my bad, after some fiddling I got it to work. I have to manually change HTTP to HTTPS in the config file for the Web and Files servers (not the API server) after initialization, but besides that it works.
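For reference, the relevant section ends up looking roughly like this (a sketch only, assuming this refers to the clearml.conf generated by clearml-init and the default app/api/files subdomain layout; the domain names are placeholders):
```
api {
    # Web and Files servers switched to HTTPS manually after initialization
    web_server: https://app.DOMAIN
    files_server: https://files.DOMAIN
    # API server left on HTTP in this setup
    api_server: http://api.DOMAIN
}
```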
Yeah, it works for new projects and for old projects that already had a description.
I updated S3 credentials, I'll check if they work later
It doesn't explain the inability to delete logged images and text, though.
self-hosted ClearML server 1.2.0
SDK version 1.1.6
Requirement already satisfied (use --upgrade to upgrade): celsusutils==0.0.1
thanks, this one worked after we changed the package version
okay, what do I do if it IS installed?
Isn't this parameter related to communication with the ClearML Server? I'm trying to make sure that the checkpoint will be downloaded from AWS S3 even if there are temporary connection problems.
There's the https://boto3.amazonaws.com/v1/documentation/api/latest/reference/customizations/s3.html#boto3.s3.transfer.TransferConfig parameter in boto3, but I'm not sure there's an easy way to pass it to StorageManager.
I'm not sure, since the names of these parameters don't match the boto3 names, and num_download_attempt is passed as container.config.retries here: https://github.com/allegroai/clearml/blob/3d3a835435cc2f01ff19fe0a58a8d7db10fd2de2/clearml/storage/helper.py#L1439
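For illustration, a direct boto3 download where the retry behaviour is explicit would look roughly like this (a sketch only; the bucket, key, and local path are placeholders, and this bypasses StorageManager entirely):
```python
import boto3
from boto3.s3.transfer import TransferConfig
from botocore.config import Config

# Client-level retries for the API calls, plus TransferConfig's
# num_download_attempts (the parameter discussed above) for the transfer itself
s3 = boto3.client("s3", config=Config(retries={"max_attempts": 10, "mode": "standard"}))
transfer_config = TransferConfig(num_download_attempts=10)

# Placeholder bucket/key/destination
s3.download_file("my-bucket", "models/checkpoint.pt", "/tmp/checkpoint.pt", Config=transfer_config)
```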
It's a pretty standard PyTorch train/eval loop, using the PyTorch DataLoader and https://docs.monai.io/en/stable/_modules/monai/data/dataset.html
We're using the latest ClearML server and client versions (1.2.0).
It might be that there isn't enough space on our SSD; the experiments cache a lot of preprocessed data during the first epoch...
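For context, the caching setup looks roughly like this (a sketch only, assuming the disk-backed PersistentDataset from that module; the file list and cache_dir path are placeholders):
```python
from monai.data import PersistentDataset, DataLoader
from monai.transforms import Compose, LoadImaged, EnsureChannelFirstd

# Placeholder file list; in practice this comes from the experiment's data split
data = [{"image": "case_001.nii.gz"}, {"image": "case_002.nii.gz"}]
transforms = Compose([LoadImaged(keys="image"), EnsureChannelFirstd(keys="image")])

# Preprocessed samples are written to cache_dir during the first epoch and
# re-read afterwards, so cache_dir should point at a volume with enough free space
dataset = PersistentDataset(data=data, transform=transforms, cache_dir="/data/monai_cache")
loader = DataLoader(dataset, batch_size=2, num_workers=4)
```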
example of the failed experiment