Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Profile picture
AgitatedDove14
Moderator
49 Questions, 8126 Answers
  Active since 10 January 2023
  Last activity one year ago

Reputation

0

Badges 1

25 × Eureka!
0 Hello! Does Anyone Know How To Do

Hi @<1603198134261911552:profile|ColossalReindeer77>

Hello! does anyone know how to do

HPO

when your parameters are in a

Hydra

Basically hydra parameters are overridden with "Hydra/param"
(this is equivalent to the "override" option of hydra in CLI)

2 years ago
0 Hi, Does Anyone Use Mlflow / Weight & Biases /

Could you disable the windows anti-virus firewall and test?

5 years ago
0 Hello I'M Running A Local Agent . While Its Running The Task I Get This Error. Any Suggestion? Uccessfully Installed Numpy-1.24.4 Found Pytorch Version Torch==2.0.1 Matching Cuda Version 0 Found Pytorch Version Torchaudio==2.0.2 Matching Cuda Version 0 Er

You should manually remove the cudatoolkit from the installed packages section in the UI, then try to send it to the agent and see if it works. The question is how it ended there in the first place

2 years ago
0 I'M Using Tensorboard Summarywriter To Add Scalar Metrics For The Experiment. If Experiment Crashed, And I Want To Continue It From Checkpoint, For Some Reason It Plots Metrics In A Really Weird Way. Even Though I Pass Global_Step=Epoch To The Summarywrit

maybe I should use explicit reporting instead of Tensorboard

It will do just the same 😞

there is no method for settingΒ 

last iteration

, which is used for reporting when continuing the same task. maybe I could somehow change this value for the task?

Let me double check that...

overwriting this value is not ideal though, because for :monitor:gpu and :monitor:machine ...

That is a very good point

but for the metrics, I explicitly pass th...

4 years ago
0 What Would Be The Best Way To Approach This Flow?

Because by definition the Task already exists

3 years ago
0 Hi All, I Am Trying To Execute Somewhat Custom Hpo Scheme With Clearml. I Would Want That A Single Running Python Script Will Be Able To Sample The Optimizer, Init A Task And Report The Result Multiple Times. I Didn'T Find Anything Similar In The Docs Or

we have some other parts, and for some cases we get initialization time can be about 10 times the experiment time

Before I dive into some agent in agent hacking, I would consider "caching" this preprocessing on an auxiliary Task as an artifact. Basically add another argument for the auxiliary Task, and fetch the data from it (obviously you will need to run it once before the optimizer launches the first experiment).
Now that is out of the way (which really would be the preferred engin...

4 years ago
0 Hello Regarding The Clearml-Agent Daemon, Is It Possible To Set Up The

Hi UnsightlyHorse88
Hmm, try adding to your clearml.conf file:
agent.cpu_only = trueif that does not work try adding to the OS environment
export CLEARML_CPU_ONLY=1

3 years ago
0 Autoscaler Parallelization Issue: I Have An Aws Autoscaler Set Up With A Resource That Has A Max Of 3 Instances Assigned To The

How did you define the decorator of "train_image_classifier_component" ?
Did you define:
@PipelineDecorator.component(return_values=['run_model_path', 'run_tb_path'], ...Notice two return values

3 years ago
0 Autoscaler Parallelization Issue: I Have An Aws Autoscaler Set Up With A Resource That Has A Max Of 3 Instances Assigned To The

I located the issue, I'm assuming the fix will be in the next RC πŸ™‚
(probably tomorrow or before the weekend)

3 years ago
0 Autoscaler Parallelization Issue: I Have An Aws Autoscaler Set Up With A Resource That Has A Max Of 3 Instances Assigned To The

Hi PanickyMoth78
So the current implantation of the pipeline parallelization is exactly like python async function calls:
for dataset_conf in dataset_configs: dataset = make_dataset_component(dataset_conf) for training_conf in training_configs: model_path = train_image_classifier_component(training_conf) eval_result_path = eval_model_component(model_path)Specifically here since you are passing the output of one function to another, image what happens is a wait operation, hence it ...

3 years ago
0 Autoscaler Parallelization Issue: I Have An Aws Autoscaler Set Up With A Resource That Has A Max Of 3 Instances Assigned To The

Come to think about it, maybe we should have "parallel_for" as a utility for the pipeline since this is so useful

3 years ago
0 Hi, I'Ve Recently Upgraded To 0.15.1 From 0.14.2, And For Some Reason A Code That Previously Worked In Which I'M Getting The Tags Of A Model Using

PompousBeetle71 you can also use ModelOutput.update_weights_package to store multiple files at once (they will all be packaged into a single zip, and unpacked when you get them back via ModelInput). Would that help?

5 years ago
0 Autoscaler Parallelization Issue: I Have An Aws Autoscaler Set Up With A Resource That Has A Max Of 3 Instances Assigned To The

These paths are

pathlib.Path

. Would that be a problem?

No need to worry, it should work (i'm assuming "/src/clearml_evaluation/" actually exists on the remote machine, otherwise useless πŸ™‚

3 years ago
0 1St: Is It Possible To Make A Pipeline Component Call Another Pipeline Component (As A Substep)? Or Only The Controller Can Do It? 2Nd: I Am Trying To Call A Function Defined In The Same Script, But Unable To Import It. I Passing The Repo Parameter To The

1st: is it possible to make a pipeline component call another pipeline component (as a substep)

Should work as long as they are in the same file, you can however launch and wait any Task (see pipelines from tasks)

2nd: I am trying to call a function defined in the same script, but unable to import it. I passing the repo parameter to the component decorator, but no change, it always comes back with "No module named <module>" after my

from module import function

c...

3 years ago
0 Autoscaler Parallelization Issue: I Have An Aws Autoscaler Set Up With A Resource That Has A Max Of 3 Instances Assigned To The

I assume now it downloads "more" data as this is running in parallel (and yes I assume that before it deleted the files it did not need)
But actually, at east from a first glance, I do not think it should download it at all...
Could it be that the "run_model_path" is a "complex" object of a sort, and it needs to test the values inside ?

3 years ago
0 Anyone Knows Why This Happens?

Yes the "epoch_loss" is the training epoch loss (as expected I assume).

thought that was just the loss reported at the end of the train epoch via tf

It is, isn't that what you are seeing ?

3 years ago
0 Task Struck At

Thanks @<1523701713440083968:profile|PanickyMoth78> for pining, let me check if I can find something in the commit log, I think there was a fix there...

2 years ago
0 Hi, I Recently Started Evaluating Trains. Given That Tensorboard Is Much More Mature, And Our Team Is Used To It, I Think It Is Likely We Won’T Want To Stop Using Tensorboard Completely And Just Switch To Trains. But I Am Thinking It Could Be Pretty Use

Hi LivelyLion31 I missed your S3 question, apologies. What did you guys end up doing?
BTW you could always upload the entire TB log folder as artifact, it's simple task.upload_artifact('tensorboard', './tblogsfolder')

5 years ago
0 Hi, I Have A Question About

Hi SoggyFrog26
Yes, it is stored at ~/.clearml_data.json
Notice you can always change it by passing --id dataset_id

4 years ago
0 Hi, Is There Any Way To Get Experiment Debug Images Programmatically?

Basically try with the latest RC πŸ™‚
pip install trains 0.15.2rc0

5 years ago
0 Hi All, I Am Getting A Bunch Of This Kind Of Log Messages "Clearml.Storage - Info - Starting Upload: /Tmp/.Clearml.Upload_Model_6Ou50Pb1.Tmp =>" I Am Pretty Sure They Happen As A Part Of The Model Initialization About 10 Of Those, My Guess Is That Every T

RipeGoose2 models are automatically registered
i.e. added to the models artifactory, but it only points to where the files are stored
Only if you are passing the output_uri argument to the Task.init, they will be actually uploaded.
If you want to disable this behavior you can pass
Task.init(..., auto_connect_frameworks={'pytorch': False})

4 years ago
Show more results compactanswers