great, this helped, thanks! I simply added https://download.pytorch.org/whl/nightly/cu101/torch_nightly.html to trains.conf, and it seems to be working
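for anyone searching later, the bit I added to trains.conf looks roughly like this (going from memory, so double-check the exact key name in your config):
agent {
  package_manager {
    extra_index_url: ["https://download.pytorch.org/whl/nightly/cu101/torch_nightly.html"]
  }
}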
I now have another problem, my code is looking for some additional files in the root folder of the project. I tried adding a Docker layer:
ADD file.pkl /root/.trains/venvs-builds/3.6/task_repository/project.git/extra_data/
but trains probably rewrites the folder when cloning the repo. is there any workaround?
# fetch the existing task, update its parameters, and re-enqueue it
task = Task.get_task(task_id=args.task_id)
task.mark_started()
task.set_parameters_as_dict({"General": {"checkpoint_file": model.url, "restart_optimizer": False}})
task.set_initial_iteration(0)
task.mark_stopped()
Task.enqueue(task=task, queue_name=task.data.execution.queue)
thanks! I need to read all parts of the documentation really carefully =) for some reason, I couldn't find this section
maybe I should use explicit reporting instead of TensorBoard
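a rough sketch of what I mean, assuming the clearml/trains Logger API (the title, series, and variables here are just placeholders):
from clearml import Task
logger = Task.current_task().get_logger()
# report one scalar point per metric per epoch instead of going through the TensorBoard bridge
logger.report_scalar(title="val", series="accuracy", value=val_acc, iteration=epoch)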
the new icons are slick, it would be even better if you could upload custom icons for different projects
our GPUs have 48 GB each, so it's quite wasteful to run only one job per GPU
yeah, I'm aware of that, I would have to make sure they don't fail with the infamous CUDA out-of-memory error, but still
so the max values that I get can be reached at different epochs
perhaps I need to do task.set_initial_iteration(0)?
thanks! we copy S3 URLs quite often. I know that it's better to avoid double spaces in task names, but shit happens =)
we do log a lot of different metrics, maybe this is part of the problem
okay, what do I do if it IS installed?
does this mean that setting initial iteration to 0 should help?
just DMed you a screenshot where you can see a part of the token
ValueError: Task has no hyperparams section defined
the problem is solved. I had to replace /opt/trains/data/fileserver with /opt/clearml/data/fileserver in the Agent configuration, and replace trains with clearml in the Requirements
for me, increasing shm-size usually helps. what does this RC fix?
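(we usually bump it via the docker arguments; I think the agent config key is extra_docker_arguments, but double-check the name in clearml.conf. something like:
agent {
  extra_docker_arguments: ["--shm-size=8g"]
}
where 8g is just an example value that works for our workloads)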
yes, this is the use case, I think we can use something like Redis for this communication
thanks, this one worked after we changed the package version
on a side note, is there any way to automatically give more meaningful names to the running docker containers?
not necessarily, there are rare cases when a container keeps running after the experiment is stopped or aborted
will do!
we have a bare-metal server with ClearML agents, and sometimes there are hanging containers or containers that consume too much RAM. unless I explicitly add a container name in the container arguments, it gets a random name, which is not very convenient. it would be great if we could set a default container name for each experiment (e.g., the experiment id)
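(when I do set it manually, it's just the plain docker flag in the container arguments, e.g. --name exp_1234, with exp_1234 being whatever label I pick by hand)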
standalone-mode gives me "Could not freeze installed packages"
that was tough, but I finally managed to make it work! thanks a lot for your help, I definitely wouldn't have been able to do it without you =)
the only problem that I still encounter is that sometimes there are random errors at the beginning of the runs, especially when I enqueue multiple experiments at the same time (I have 4 workers for 4 GPUs).
for example, this
from torch.utils.tensorboard import SummaryWriter
writer = SummaryWriter()
sometimes randomly leads to FileNotFoundError: [Errno...
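as a possible workaround on my side, I'm considering creating the log dir explicitly before constructing the writer, roughly like this (the path is just an example):
import os
from torch.utils.tensorboard import SummaryWriter
log_dir = "runs/my_experiment"  # example path
os.makedirs(log_dir, exist_ok=True)  # make sure the directory exists before SummaryWriter touches it
writer = SummaryWriter(log_dir=log_dir)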
well okay, it's probably not that weird considering that the worker just runs the container
after the very first click, there is a popup requesting credentials. nothing happens after that
we often do ablation studies with more than 50 experiments, and it was very convenient to compare their dynamics at different epochs