DilapidatedParrot58
Moderator
42 Questions, 205 Answers
  Active since 10 January 2023
  Last activity one year ago

Reputation

0

Badges 1

186 × Eureka!
0 Votes
8 Answers
1K Views
3 years ago
0 Votes
6 Answers
1K Views
hey guys, I keep getting "Failed parsing task parameter" warning for the arguments such as this one: parser.add_argument( "--dataset_mean", type = float, nar...
3 years ago
0 Votes
29 Answers
991 Views
3 years ago
0 Votes
20 Answers
1K Views
4 years ago
0 Votes
11 Answers
1K Views
hey guys, do you have any plans to add functionality to export training config with all hyperparameters to the different formats, such as training command li...
4 years ago
0 Votes
11 Answers
1K Views
hey guys, is there a ready script that can delete all models from S3 (or other storage) that are related to deleted or archived experiments?
3 years ago
0 Votes
2 Answers
976 Views
one year ago
0 Votes
11 Answers
1K Views
3 years ago
0 Votes
27 Answers
1K Views
hey guys, I keep getting trains_agent: ERROR: Connection Error: it seems *api_server* is misconfigured. Is this the TRAINS API server http://apiserver:8008 ?...
3 years ago
0 Votes
25 Answers
1K Views
I'm probably stupid, but how do I specify worker name? usecase - I want to create two workers using the same GPU, and new worker just overwrites the old one
4 years ago
0 Votes
30 Answers
1K Views
4 years ago
0 Votes
7 Answers
1K Views
any chance StorageManager could re-download files only if their size is different from the file in the cache (as an option)?
3 years ago
0 What Is The Right Way To Increase Number Of Retries When Using

isn't this parameter related to communication with ClearML Server? I'm trying to make sure that checkpoint will be downloaded from AWS S3 even if there are temporary connection problems

there's a parameter for this in boto3's TransferConfig (https://boto3.amazonaws.com/v1/documentation/api/latest/reference/customizations/s3.html#boto3.s3.transfer.TransferConfig), but I'm not sure if there's an easy way to pass this parameter to StorageManager
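
Until there's a clear way to pass retry settings through StorageManager, one fallback is to retry the download call yourself. Below is a minimal sketch of that idea, not ClearML's or boto3's own retry mechanism. The names `download_with_retries`, `download`, `attempts`, `base_delay` and `sleep` are hypothetical, and it assumes the wrapped call raises ConnectionError or TimeoutError on a temporary failure (a real downloader may signal failure differently, e.g. by returning None).

```python
import time


def download_with_retries(download, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call download() until it succeeds or the attempts run out.

    `download` is a hypothetical zero-argument callable that fetches the
    checkpoint and raises ConnectionError/TimeoutError on temporary failures.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return download()
        except (ConnectionError, TimeoutError) as error:
            last_error = error
            if attempt < attempts - 1:
                # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
                sleep(base_delay * 2 ** attempt)
    # Every attempt failed, so surface the last error to the caller.
    raise last_error
```

The `sleep` argument is only there so the waiting can be swapped out when testing; in normal use the default `time.sleep` is fine.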

2 years ago
0 What Is The Right Way To Increase Number Of Retries When Using

I'm not sure, since the names of these parameters do not match the boto3 names, and num_download_attempt is passed as container.config.retries (https://github.com/allegroai/clearml/blob/3d3a835435cc2f01ff19fe0a58a8d7db10fd2de2/clearml/storage/helper.py#L1439)

2 years ago
2 years ago
0 When We Train The Models, We Often Choose Checkpoint Based On The Validation Accuracy, But Test Set Accuracy (Or Specific Class Validation Accuracy) Is Not Necessarily The Best For This Checkpoint. Right Now There Are Options To Add Columns With Max And L

I guess this could overcomplicate the UI, and I don't see a good solution yet.

as a quick hack, we can just use a separate name (e.g. "best_val_roc_auc") for all metric values of the current best checkpoint. then we can just add columns with the last value of each of these metrics
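
A rough sketch of that hack, under a few assumptions: the class `BestCheckpointMetrics`, the `report` callable and the "best_" prefix are hypothetical names chosen for illustration, and a higher value of the selection metric is assumed to be better. In ClearML, `report` could be a thin wrapper around `Logger.report_scalar`, but that wiring is not shown here.

```python
class BestCheckpointMetrics:
    """Re-report every metric of the best checkpoint under a 'best_' prefix."""

    def __init__(self, selection_metric, report):
        self.selection_metric = selection_metric  # e.g. "val_roc_auc"
        self.report = report  # hypothetical callable: report(name, value, iteration)
        self.best_value = None

    def update(self, metrics, iteration):
        """Return True if this iteration produced a new best checkpoint."""
        value = metrics[self.selection_metric]
        if self.best_value is not None and value <= self.best_value:
            return False
        self.best_value = value
        # The last reported value of each "best_*" series now belongs
        # to the current best checkpoint.
        for name, metric_value in metrics.items():
            self.report("best_" + name, metric_value, iteration)
        return True
```

Since the "best_*" series are only updated when the selection metric improves, a "last value" column on e.g. best_test_acc shows the test accuracy of the chosen checkpoint rather than the maximum over all epochs.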

3 years ago
0 Yo Guys, I'm Getting

I get "The connection has timed out" when I try to reach port 8081

4 years ago
0 Is It Possible To Pass Custom

ah, I see, I still keep it in agent.extra_docker_arguments

2 years ago
0 I'm Probably Stupid, But How Do I Specify Worker Name? Usecase - I Want To Create Two Workers Using The Same GPU, And New Worker Just Overwrites The Old One

thanks! I need to read all parts of documentation really carefully =) for some reason, couldn't find this section

4 years ago
0 I'm Probably Stupid, But How Do I Specify Worker Name? Usecase - I Want To Create Two Workers Using The Same GPU, And New Worker Just Overwrites The Old One

the weird part is that the old job continues running when I recreate the worker and enqueue the new job

4 years ago
0 I'm Probably Stupid, But How Do I Specify Worker Name? Usecase - I Want To Create Two Workers Using The Same GPU, And New Worker Just Overwrites The Old One

our GPUs are 48GB, so it's quite wasteful to only run one job per GPU
yeah, I'm aware of that, I would have to make sure they don't fail with the infamous CUDA out of memory error, but still

4 years ago
0 I'm Probably Stupid, But How Do I Specify Worker Name? Usecase - I Want To Create Two Workers Using The Same GPU, And New Worker Just Overwrites The Old One

another stupid question - what is the proper way to delete a worker? so far I've been using pgrep to find the relevant PID 😃

4 years ago
0 I'm Probably Stupid, But How Do I Specify Worker Name? Usecase - I Want To Create Two Workers Using The Same GPU, And New Worker Just Overwrites The Old One

that's right, I have 4 GPUs and 4 workers. but what if I want to run two jobs simultaneously on the same GPU?

4 years ago
4 years ago
0 I'm Using Tensorboard SummaryWriter To Add Scalar Metrics For The Experiment. If Experiment Crashed, And I Want To Continue It From Checkpoint, For Some Reason It Plots Metrics In A Really Weird Way. Even Though I Pass global_step=epoch To The Summarywrit

task = Task.get_task(task_id=args.task_id)
task.mark_started()
task.set_parameters_as_dict(
    {
        "General": {
            "checkpoint_file": model.url,
            "restart_optimizer": False,
        }
    }
)
task.set_initial_iteration(0)
task.mark_stopped()
Task.enqueue(task=task, queue_name=task.data.execution.queue)

3 years ago