AgitatedDove14
Moderator
48 Questions, 8051 Answers
  Active since 10 January 2023
  Last activity 7 months ago

Reputation

0

Badges 1

25 × Eureka!
0 Hi I Have An Issue Where Experiments Are All Showing That They Started From Iteration 0. This Is Even True For Experiments Which I Know Used To Show The Correct Iteration, So It Seems To Be Due To An Update Of The Web Interface. Here You Can See That Sup

No, an old experiment changed, nothing was rerun

ohh, that is odd. I think the max iteration value is stored on the DB, which is odd if it changed after an update.
BTW: just making sure, could it be these Tasks were imported ? (i.e. offline execution + import)

2 years ago
0 What Would Be The Best Way To Approach This Flow?

can configuration objects refer to one-another internally in ClearML?

Interesting, please explain?

2 years ago
0 Hello Guys, I Read About Trains Some Days Ago And Think It Is Exectly What I Was Looking For, So I Ran The Docker Image And Started Thinking Of What I Would Like To Do And The Processing Steps I Would Like To Automize Which I Currently Run Manually Trigge

Hi WickedGoat98
This sounds like a great design (obviously you have scale in mind 😉 ). Feel free to ask "stupid" questions; based on what you already wrote, I doubt they will be.
A few questions that come to mind (probably a few others after):
You mentioned FS synchronization, from where? i.e. what is the single source of truth ? K8s (Rancher 2.0 is basically k8s manager) can take care of mounting volumes, so no need to sync, is this a valid solution ?

BTW : (you can drag and drop an i...

4 years ago
0 Hi I Have An Issue Where Experiments Are All Showing That They Started From Iteration 0. This Is Even True For Experiments Which I Know Used To Show The Correct Iteration, So It Seems To Be Due To An Update Of The Web Interface. Here You Can See That Sup

and about a month later for some reason the initial iteration seems to have changed to 0

Hmm, I see your point. Just so I fully understand, you are not saying old experiments were changed, but that new experiments (running the same code-ish) have a totally different max iterations value. Is this correct?

2 years ago
0 Hi, Is There Any Documentation For Setting Up And Using Ssl Certs With The Clearml Server And Agent?

We're not using a load balancer at the moment.

The easiest way is to add an ELB and have Amazon add the HTTPS on top (basically a few clicks in their console)

3 years ago
0 Hello Guys, I Read About Trains Some Days Ago And Think It Is Exectly What I Was Looking For, So I Ran The Docker Image And Started Thinking Of What I Would Like To Do And The Processing Steps I Would Like To Automize Which I Currently Run Manually Trigge

The data I'm syncing by an data provider wich supports only an ftp connection....

Right ... that makes sense :)

No worries WickedGoat98, feel free to post questions when they arise. BTW: we are now improving the k8s glue, so by the time you get there the integration will be even easier 🙂

4 years ago
0 Hi, What Happens Exactly When I Execute The Following Command:

Hi JitteryCoyote63
The NVIDIA_VISIBLE_DEVICES is set automatically for the process the trains-agent spins up, so from your code it is transparent: you can only "see" GPU 0.
(Obviously, when not using docker you can forcefully change the OS environment at runtime, but you should avoid that ;))
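
To check from inside the task which devices the agent exposed, the variable can simply be read back. This is a minimal sketch using only the standard library; the helper name visible_gpu_ids is made up for illustration and is not part of trains. It only reads the environment and does not change it.

```python
import os


def visible_gpu_ids(env=None):
    """Return the entries of NVIDIA_VISIBLE_DEVICES as strings, or None if unset."""
    # `env` is injectable for testing; by default the real process environment is used.
    env = os.environ if env is None else env
    raw = env.get("NVIDIA_VISIBLE_DEVICES")
    if raw is None:
        return None
    # The value is a comma-separated list (it may also hold values such as "all").
    return [part.strip() for part in raw.split(",") if part.strip()]


if __name__ == "__main__":
    print("NVIDIA_VISIBLE_DEVICES entries:", visible_gpu_ids())
```

Whatever physical device is listed there, the code inside the process still addresses the first visible device as GPU 0.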

4 years ago
0 Currently, To Provide Ssh Access To The Docker Images For A Task,

The .ssh is mounted, but the owner is my local user,

sudo -H clearml-agent ...to allow sudo to access home

3 years ago
0 Hi Everyone. I'M New To Trains. I Do Not Have Sudo Access To My Departmental Servers. Can I Still Use Trains Beyond The Demo Server?

ScantWorm7
Tensorboard is automatically captured and sent to the trains server. This is in addition to the local copy of your TB files. Actually in most cases the local copy is redundant

4 years ago
one year ago
0 Hi There

Okay, I think I understand, but I'm missing something. It seems you call get_parameters from the old API; is your code actually calling get_parameters? The trains-agent runs the code externally, so whatever happens inside the agent should have no effect on the code. So who exactly is calling task.get_parameters, and well, why? :)

4 years ago
0 Question About The File Server. Currently, We Have A Machine With Minio Installed, And All File Communication Is Made Using The Minio Sdk Client. [Minio Is Just Like An S3 Bucket, Fully Compliant With S3 Protocol]. In The Examples I'Ve Seen The

To store all the debug samples; it can also store all the models (if you configure output_uri='http://file_server_here:8081').
Yes: instead of the file server, use 's3://<ip_of_minio>:9000/bucket', and make sure you add the credentials for the MinIO in the trains.conf.
Yes, basically once you have the credentials in the trains.conf, you could do StorageManager.get_local_copy('s3://<minio>:9000/bucket/file') (also upload of course 🙂 )
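
A rough sketch of the MinIO access described above. The helpers minio_uri and fetch_from_minio are hypothetical names introduced only for illustration; the only library call used is StorageManager.get_local_copy, as mentioned in the answer. It assumes the MinIO credentials are already present in trains.conf, and that the package is importable as trains (newer releases ship as clearml).

```python
def minio_uri(host, bucket, key="", port=9000):
    """Build an 's3://<host>:<port>/<bucket>/<key>' URI for a MinIO endpoint."""
    base = f"s3://{host}:{port}/{bucket}"
    return f"{base}/{key.lstrip('/')}" if key else base


def fetch_from_minio(host, bucket, key):
    """Download one object into the local cache and return the local path."""
    # Imported lazily so the URI helper can be used without the package installed.
    # Assumption: credentials for this host are configured in trains.conf.
    from trains import StorageManager

    return StorageManager.get_local_copy(minio_uri(host, bucket, key))
```

For example, minio_uri("10.0.0.5", "bucket") gives a string that could also be passed as output_uri, so models land in the same bucket.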

4 years ago
0 Hi Everyone. I'M New To Trains. I Do Not Have Sudo Access To My Departmental Servers. Can I Still Use Trains Beyond The Demo Server?

Hmm, you will have to set up the trains-server on a machine somewhere; it can be any machine: Windows / Mac / Linux

4 years ago
0 Hi, I Have A Pre-Processing Steps Not Been Implemented In Python, But Being A Shell Script Calling Wget To Synchronize Data And Creating Intermediate Sqlite Dbs By A Script Been Implemented In 'R' And Would Like To Ask, If Trains Can Be Used Just To Trigg

Hi WickedGoat98

Will I need to wrap their execution in python by system calls?

That would probably be the easiest solution 🙂

Then you can plug it into your pipeline as a preprocessing Task:

You can check this example:
https://github.com/allegroai/trains/tree/master/examples/pipeline
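
Wrapping the external steps with system calls could look roughly like the sketch below. It uses only the standard library; run_step is a made-up helper name, and the command in the demo is a stand-in for the real wget / R scripts, which are not shown in the thread. In a real preprocessing script, a Task.init(...) call at the top would register the run as a Task.

```python
import subprocess
import sys


def run_step(command):
    """Run one external command, raise if it exits non-zero, and return its stdout."""
    # check=True turns a failing step into a CalledProcessError,
    # so a broken preprocessing step stops the whole script.
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Placeholder command; replace with the actual shell script / Rscript calls.
    output = run_step([sys.executable, "-c", "print('preprocessing done')"])
    print(output.strip())
```

Passing the command as a list avoids shell quoting issues; a shell script can be invoked the same way by listing the interpreter and the script path.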

4 years ago
0 Dear Clearml Community, I Am Looking For A Way To Properly Resume A Training In A Way That Initial Scalars Get Reused And Expanded. Clearml Feature For Reusing The Same Task Works Fine (When Using

Hi @<1663354518726774784:profile|CrookedSeal85>

However, I systematically notice a jump of some number of "ghost iterations" when resuming my trainings...

Try the following:

task = Task.init(..., continue_last_task=0)

from the Task.init docstring (Notice this value can be both boolean and integer)

        :param bool continue_last_task: Continue the execution of a 
...
          - An integer - Specify initial iteration offset (override the auto automatic last_iteratio...
10 months ago
2 years ago
0 I Have A Question About The Clearml Self Hosted Instance, I Notice There Is Elastic Search, Mondodb, And Redis In The Helm Chart Are These Required Or Can We Bring Our Own? I'M Wondering What Happens If I Were To Host The Instance And One Of These Were

I'm wondering what happens if i were to host the instance and one of these were to go down from time to time in production, as the deployments provided by the helm chart are not redundant.

Long story short, it will break the clearml-server. Please do not take them down; if you do need to do that, also take down the clearml-server. The Python clients will wait until it is up again, so no session would be destroyed

2 years ago
0 Hi, I Am Trying To Setup Multi-Node Training With Pytorch Distributeddataparallel. Ddp Requres A Launch Script With A Set Of Parameters To Be Run On Each Node. One Of These Parameters Is Master Node Address. I Am Currently Using The Following Scheme:

preempting lower priority tasks to allow a higher priority task to come in

Well this is usually outside of the scope of "single researcher" / "tiny team"...
This is typically a large-scale problem.
That said, it will be fairly easy to write a service that aborts Tasks, tags them to be "continued", then later (at night?!) pushes them back into a queue... wdyt?

2 years ago
0 Hi, I Am Wondering Why Do I Need To Create Files Before Applying Diff ?

Thanks DefeatedOstrich93
Let me check if I can reproduce it.

3 years ago
0 Question About The Storage Manager. Assuming I Have An Object That Updates Frequently And Always Saved At The Same Path (E.G.

But adding a simple force_download flag to the get_local_copy

That sounds like a good idea

4 years ago
0 Hi There

JitteryCoyote63 do you have an idea on how I can reproduce it?

4 years ago
0 If I Am Using The Demo Servers, Do I Need To Do Something Special To Use

HealthyStarfish45
No, it should work 🙂

4 years ago