AgitatedDove14
Moderator
48 Questions, 8049 Answers
  Active since 10 January 2023
  Last activity 5 months ago

Reputation

0

Badges 1

25 × Eureka!
0 Are The Various Task Types Available In 0.15? I Am Getting

Damn, JitteryCoyote63 seems like a bug in the backend, it will not allow you to change the task type to the new types 😞

4 years ago
0 Getting This Error At

Any idea why the Pipeline Controller is Running despite the task passing?

What do you mean by "the task passing"

3 years ago
0 I'M Using Tensorboard Summarywriter To Add Scalar Metrics For The Experiment. If Experiment Crashed, And I Want To Continue It From Checkpoint, For Some Reason It Plots Metrics In A Really Weird Way. Even Though I Pass Global_Step=Epoch To The Summarywrit

maybe I should use explicit reporting instead of Tensorboard

It will do just the same 😞

there is no method for setting last iteration, which is used for reporting when continuing the same task. maybe I could somehow change this value for the task?

Let me double check that...

overwriting this value is not ideal though, because for :monitor:gpu and :monitor:machine ...

That is a very good point

but for the metrics, I explicitly pass th...

2 years ago
0 I Am Hosting Clearml Server And I Faced Issue With Closing Datasets. For Some Reason Closing Datasets Ends Up With The Word "Killed" For Datasets More Than 2.5Gb (See Screenshot) The Question Is What Is The Reason Of The Issue? How To Upload Datasets Size

Hi SmugLizard24

The question is what is the reason of the issue?

That is a good question, could it be out of memory? (trying to compress or send the file in one chunk?)

3 years ago
0 Hello! Since Today I Get

@<1523701868901961728:profile|ReassuredTiger98> what are you getting with:

nvidia-smi

And here:

ls -la /usr/local/
3 years ago
0 "Clearml-Data Sync --Folder ." Doesn'T Work

Hi @<1631102016807768064:profile|ZanySealion18>
sorry missed that one

The cache doesn't work, it attempts to download the dataset every time.

just making sure the dataset itself contains all the files?

Once I used clearml-data add --folder * CLI everything works correctly (though all files recursively ended up in the root, I had luck all were named differently).

Not sure I follow here, is the problem the creation of the dataset or fetching it? is this a single version or multi...

3 months ago
0 Hi All, I'Ve Successfully Run A Task Locally, And Now I'M Trying To Clone It And Send It To A Queue. It Looks Like The Environment Is Built Successfully, But It Hangs Here:

Okay I have an idea, it could be a lock that another agent/user is holding on the cache folder or similar
Let me check something

2 months ago
0 I Have Set

Hi Guys, just curious here, what was the final issue?
Also out of curiosity, what does that mean? "1.12.2 because some bug that make fastai lag 2x" ?

4 months ago
0 Are The Various Task Types Available In 0.15? I Am Getting

Please do, just so it won't be forgotten (it won't, but for the sake of transparency)

4 years ago
0 Hello! Since Today I Get

Sure, let's do that 🙂

3 years ago
0 Hello! Since Today I Get

Damn, okay I'll make sure we fix the order.
Could you verify the ~= works as intended (if the order is correct)?

3 years ago
0 Hi All, I'Ve Successfully Run A Task Locally, And Now I'M Trying To Clone It And Send It To A Queue. It Looks Like The Environment Is Built Successfully, But It Hangs Here:

My understanding is that on remote execution Task.init is supposed to be a no-op right?

Not really a no-op, it would sync Argparser and the like, start background reporting services etc.

This is so odd! literally nothing printed
Can you tell me something about the node "mrl-plswh100:0" ?
Is this like a SageMaker node? We have seen similar cases where Python threads / subprocesses are not supported, and instead of Python crashing it just hangs there

2 months ago
0 Is It Possible To Upload A Hyperdataset? Or Can We Only Upload Datasts

Hi @<1727497172041076736:profile|TightSheep99>
Yes it can, it will upload the metadata as well as the files (it will also de-duplicate and will not upload files that already exist in the dataset, based on the hash of the file content)
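The de-dup idea mentioned above can be illustrated with a minimal sketch. This is not ClearML's actual implementation; the helper names `file_sha256` and `files_to_upload`, the choice of SHA-256, and the in-memory `known_hashes` set are all assumptions made for illustration only.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's content, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_to_upload(folder: Path, known_hashes: set) -> list:
    """Return files under `folder` whose content hash is not already known.

    `known_hashes` stands in for the hashes already stored in the dataset
    (hypothetical); it is updated in place as new content is seen.
    """
    pending = []
    for path in sorted(folder.rglob("*")):
        if not path.is_file():
            continue
        content_hash = file_sha256(path)
        if content_hash not in known_hashes:
            known_hashes.add(content_hash)
            pending.append(path)
    return pending
```

Two files with identical content produce the same digest, so only the first one encountered is queued, regardless of file name.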

2 months ago
0 Is It Possible To Upload A Hyperdataset? Or Can We Only Upload Datasts

and they don't know how to write code, is this still possible?

well this means there is some standard for the data, right? what is that standard? unfortunately in our space there is no standard for data, it's just too generic, so everyone always ends up with custom parsing of some sort.
Does that make sense ?

2 months ago
0 Is The App/Ui/Backend Customizable? Any Tutorials For That?

CleanWhale17 per your request :)

An automated ML Pipeline 👍
Automated Data Source Integration 👍
Data Pooling and Web Interface for Manual Annotation of Images (Seg. / Classif) [Allegro Enterprise] or users integrate with open-source
Storage of Annotation output files (versioned JSON) 👍
Online-Training Support (for Dataset Shifts) [Not Sure what you mean]
Data Pre-processing (filter/augment) [Allegro Enterprise] or users integrate with open-source
Data-set visualization (stats...

4 years ago
0 Can I Run A Random Task From A Queue? Like This

I'm running hyperparameter optimization on an LSF cluster where every task is an LSF job running without clearml-agent

WOW this is so cool! 🎊

2 years ago
0 Hi, Is There Any Documentation For Setting Up And Using Ssl Certs With The Clearml Server And Agent?

Hi @<1687643893996195840:profile|RoundCat60>
Are you running on AWS ?

3 years ago
0 Hi, Is There Any Documentation For Setting Up And Using Ssl Certs With The Clearml Server And Agent?

So assuming they are all on the same LB IP, you should do:
LB 8080 (https) -> instance 8080
LB 8008 (https) -> instance 8008
LB 8081 (https) -> instance 8081

It might also work with:
LB 443 (https) -> instance 8080
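
A small sketch of the first mapping above, showing the HTTPS URLs a client would end up using. The hostname and the helper `service_urls` are hypothetical, and the service labels assume the default ClearML server port layout (8080 web UI, 8008 API, 8081 file server).

```python
# Load balancer port -> instance port, as listed in the answer above.
LB_TO_INSTANCE = {8080: 8080, 8008: 8008, 8081: 8081}

# Assumption: default ClearML server port layout.
SERVICE_BY_PORT = {8080: "web", 8008: "api", 8081: "files"}


def service_urls(lb_host: str) -> dict:
    """Build the public HTTPS URL for each service behind the load balancer."""
    return {
        SERVICE_BY_PORT[lb_port]: f"https://{lb_host}:{lb_port}"
        for lb_port in sorted(LB_TO_INSTANCE)
    }
```

For example, `service_urls("clearml.example.com")` maps `"api"` to `https://clearml.example.com:8008`; these are the kind of values a client configuration would point at once TLS terminates on the load balancer.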

3 years ago
0 Hi All, I'Ve Successfully Run A Task Locally, And Now I'M Trying To Clone It And Send It To A Queue. It Looks Like The Environment Is Built Successfully, But It Hangs Here:

confirmed that the change had been added by

Make sure you see them in the Task log in the UI (the agent print it when it starts)

Any insight on how we can reproduce the issue?

Can this be reproducible using a simple script that we can also run?

2 months ago
3 years ago
0 Hi All, I Am Having Trouble Using The

Notice: dataset_rgb.list_files() will list the content of the dataset, not the local files:
e.g.: /folder/myfile.ext and not /home/user/cache/folder/myfile.ext
So basically I think you are just not passing actual files, you should probably do:
for local_file in Path(folder_rgb).rglob('*'): ...
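
To make the difference concrete, here is a minimal sketch using only `pathlib`. The helpers `local_files` and `relative_entries` are hypothetical names: the first returns real paths on disk (what should be passed on), the second returns folder-relative entries, which is roughly the shape of a dataset content listing.

```python
from pathlib import Path


def local_files(folder: str) -> list:
    """Return every regular file under `folder` as a local Path, sorted."""
    return sorted(p for p in Path(folder).rglob("*") if p.is_file())


def relative_entries(folder: str) -> list:
    """Return the same files as POSIX-style paths relative to `folder`."""
    root = Path(folder)
    return [p.relative_to(root).as_posix() for p in local_files(folder)]
```

Note that `rglob('*')` also yields directories, so the sketch filters with `is_file()` before using the results.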

3 years ago
0 Hi Guys, I Have Been Running The Clearml-Serving For A While Now And I Realize That From Time To Time After A Couple Of Hours The Serving Task (Control Plane) That Is Configured Through The Cli Goes Into Status Abort. This Happens Even Though All The Pods

Hi @<1569858449813016576:profile|JumpyRaven4>
What's the clearml-serving version you are running ?

This happens even though all the pods are healthy and the endpoints are processing correctly.

The serving pods are supposed to ping "I'm alive" and that should verify the serving control plane is alive.
Could it be no requests are being served ?

7 months ago
0 Hello! Since Today I Get

okay, I'll make sure we order it correctly

3 years ago
0 Hi All, I Have A Question Regarding Multi-Node Training Using The Clearml-Agent. What Is The Recommended Setup In This Case? Say I Have 3 Nodes With 3 Agents Running On Them. How Do I Make Sure They All Run The Same Job?

So in a simple "all-or-nothing"

Actually this is the only solution unless preemption is supported, i.e. abort running Task to free-up an agent...
There is no "magic" solution for complex multi-node scheduling, even SLURM will essentially do the same ...
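
The "all-or-nothing" approach can be sketched as follows. This is only an illustration of the idea, not how clearml-agent or SLURM schedules jobs; the function `try_gang_schedule` and the agent names are hypothetical.

```python
def try_gang_schedule(job_nodes: int, idle_agents: set) -> list:
    """Reserve `job_nodes` agents only if that many are idle right now.

    Returns the reserved agent names, or an empty list when there are not
    enough idle agents, in which case nothing is reserved and the job waits.
    """
    if job_nodes <= 0:
        raise ValueError("job_nodes must be positive")
    if len(idle_agents) < job_nodes:
        return []  # nothing: never hold a partial set of agents
    return sorted(idle_agents)[:job_nodes]  # all: reserve the full set at once
```

Without preemption, the only thing the scheduler can do in the "nothing" case is retry later, which is why a large multi-node job can wait a long time behind smaller tasks.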

3 years ago