AgitatedDove14
Moderator
48 Questions, 8049 Answers
  Active since 10 January 2023
  Last activity 6 months ago

Reputation

0

Badges 1

25 × Eureka!
one year ago
0 I Wanted To Suggest Something. We'Re Creating A Lot Of Projects And It Starts Getting A Bit Difficult To Navigate Through Them. I Think An Option To Have A Hierarchy In The Projects Can Be Very Useful.

Let's call it an applicative project, which has experiments, and an abstract/parent project (or some other name) that groups applicative projects.

That was my way of thinking, the guys argued it will soon "deteriorate" into the first option :)

4 years ago
0 Thanks For Releasing This Awesome Experiment Manager! I Was Logging A Single Training Session On Multiple Gpus (Using Detectron2), And Torch.Mp Is Called For Each Gpu. This Creates A Separate Task In Trains For Each Gpu, And Only One Of The Tasks Has The

Since this fix is all about synchronizing different processes, we wanted to be extra careful with the release. That said I think that what we have now should be quite stable. Plan is to have the RC available right after the weekend.

4 years ago
0 Thanks For Releasing This Awesome Experiment Manager! I Was Logging A Single Training Session On Multiple Gpus (Using Detectron2), And Torch.Mp Is Called For Each Gpu. This Creates A Separate Task In Trains For Each Gpu, And Only One Of The Tasks Has The

Hi VexedKangaroo32 , funny enough this is one of the fixes we will be releasing soon. There is a release scheduled for later this week, right after that I'll put here a link to an RC containing a fix to this exact issue.

4 years ago
0 Dear Clearml Community, I Am Looking For A Way To Properly Resume A Training In A Way That Initial Scalars Get Reused And Expanded. Clearml Feature For Reusing The Same Task Works Fine (When Using

Just one more question, do you have any idea about how I could change the x-axis label from "Iterations" to "Epochs"

Do you mean in the UI (i.e. just the title)? Or are you actually reporting iterations instead of epochs? And if so, is this auto-connected to TensorBoard or is it reported manually?

8 months ago
0 Hello, I Am Currently Trying To Install Unsloth On My Clearml Agent. However After Trying Many Different Approaches, There Seems To Be An Issue With Installing It From Github. The Closest I Come To An Installation Is With The Following Code:

Hi @<1637624975324090368:profile|ElatedBat21>
I think that what you want is:

Task.add_requirements("unsloth", "@ git+
")
task = Task.init(...)

after you do that, what are you seeing in the Task "Installed Packages" ?

3 months ago
0 Hi All. How Do I Read Out The Pipeline Configuration (Attached To The Pipeline Via Connect_Configuration) Inside A Task, Which I Created Via Add_Function_Step() For The Same Pipeline. We Have Tried Use A Pre_Execute_Callback, And Then Acess `Node.Job.Task

Hi @<1543766544847212544:profile|SorePelican79>
You want the pipeline configuration itself, not the pipeline component, correct?

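from clearml import Task

# The pipeline is the parent of the Task currently running as a step
pipeline = Task.get_task(task_id=Task.current_task().parent)
# Configuration object as raw text
conf_text = pipeline.get_configuration_object(name="config name")
# Or the same configuration object parsed into a dict
conf_dict = pipeline.get_configuration_object_as_dict(name="config name")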
11 months ago
0 Hello All , Good Morning ! Can You Help Better Understand The Distinction Of Cleargpt? How Is It Different From Chatgpt And What Gpt Model Are We Using In Clearml ? Thank You In Advance !

still it is a chatgpt interface correct ?

Actually, no. And we will change the wording on the website so it is more intuitive to understand.
The idea is you actually train your own model (not chatgpt/openai) and use that model internally, which means everything is done inside your organisation, from data through training and ending with deployment. Does that make sense ?

11 months ago
0 Crazy Idea:

got to love the ascii art 😍

11 months ago
0 Crazy Idea:

This is awesome man !

11 months ago
0 For Clearml Serving, If I Am Trying To Deploy 100 Models On A Gpu That Can Handle 5 Concurrently, But Each One Will Be Sporadically Used (Fine Tuned Models Trained For Different Customers), Can Clearml-Serving Automatically Load And Unload Models Based Up
  • Triton server does not support saving models off to normal RAM for faster loading/unloading

Correct, the enterprise version also does not support RAM caching

Therefore, currently, we can deploy 100 models when only 5 can be concurrently loaded, but when they are unloaded/loaded (automatically by ClearML), it will take a few seconds because it is being read from the SSD, depending on the size.

Correct, there is also deserializing CPU time (imagine unpickling a 20GB file, this takes ...
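To make the load/unload behaviour above a bit more concrete, here is a minimal sketch of least-recently-used eviction with a fixed number of slots. This is not how clearml-serving or Triton implement it; `ModelSlots`, `load_fn` and the capacity of 5 are hypothetical names and values used only for illustration.

```python
from collections import OrderedDict


class ModelSlots:
    """Keep at most `capacity` models loaded, evicting the least recently used."""

    def __init__(self, capacity, load_fn):
        self.capacity = capacity
        self.load_fn = load_fn        # hypothetical loader, e.g. reads from SSD
        self.loaded = OrderedDict()   # model_id -> model, least recently used first
        self.load_count = 0           # number of (slow) loads performed

    def get(self, model_id):
        if model_id in self.loaded:
            # Already loaded: mark as most recently used, no load cost
            self.loaded.move_to_end(model_id)
            return self.loaded[model_id]
        if len(self.loaded) >= self.capacity:
            # Unload the least recently used model to free a slot
            self.loaded.popitem(last=False)
        self.loaded[model_id] = self.load_fn(model_id)
        self.load_count += 1
        return self.loaded[model_id]


slots = ModelSlots(capacity=5, load_fn=lambda model_id: f"model-{model_id}")
for model_id in [0, 1, 2, 3, 4, 0, 5, 0, 1]:
    slots.get(model_id)
```

In this toy run a frequently requested model (0) stays loaded, while a model that has not been requested for a while (1) gets evicted and has to pay the load cost again on its next request.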

11 months ago
0 Hi

Thanks SarcasticSparrow10 !
I'll reply on the GitHub issue later (for better visibility)
But my initial thoughts:
(1) I think this was suggested, and hopefully we will get to implementing it, I can definitely see the value. Meanwhile you can achieve some of the functionality with the experiment table and custom columns 🙂
(2) "Don't display the performance metric" -> isn't that important? what am I missing?
(3) Hmm you mean just extra columns?
(4) sounds like a bug
(5) is this a plotly issue?...

3 years ago
0 Hi Guys, I Managed To Set Up A Kubernetes Cluster And Install Trains Into It. While Testing My Set-Up I Run The Test_Reporting.Py Example

Hi WickedGoat98

"Failed uploading to //:8081/files_server:"

This seems to be the problem. What do you have defined as files_server in the trains.conf?

3 years ago
0 Hi All - I Have A Question To Ask (And Not Sure If There Is A Channel For Faqs So Sorry For Putting It Here) ... I Am Using Trains In Combination With Pycharm'S Remote Debugging. I Have The Pycharm Plugin Installed. When The Experiment Ends, I Get

Hmm, yes this fits the message, which basically says that it gave up on analyzing the code because it ran out of time. Is the execution very short? Or the repo very large?

4 years ago
0 Hi Everybody. When I Want To Force The Agent To Not Reproduce My Local Pip Environment, I Add

My question is what should be the path to the requirements.txt file?
Is it relative to the repo base?

This is actually in runtime (i.e. when running the code), so relative to the working directory. Make sense ? (you can specify absolute path, probably something I would avoid in the code base though...)

2 years ago
3 years ago
0 Hi There! Some Background Info Before I Put Forward My Question: I'M Writing-Up A Small Script To Help Me Manage My Tasks. Specifically I Often Need To Abort (And Archive) A

Hi StickyMonkey98

a very large number of running and pending tasks, and doing that kind of thing via the web-interface by clicking away one-by-one is not a viable solution.

Bulk operations are now supported, upgrade the clearml-server to 1.0.2 🙂

Is it possible to fetch a list of tasks via Task.get_tasks,

Sure:
Task.get_tasks(project_name='example', task_filter=dict(system_tags=['-archived']))

3 years ago
0 Hi All, I Have A Question Regarding Multi-Node Training Using The Clearml-Agent. What Is The Recommended Setup In This Case? Say I Have 3 Nodes With 3 Agents Running On Them. How Do I Make Sure They All Run The Same Job?

So in a simple "all-or-nothing"

Actually this is the only solution unless preemption is supported, i.e. abort running Task to free-up an agent...
There is no "magic" solution for complex multi-node scheduling, even SLURM will essentially do the same ...
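A toy sketch of the "all-or-nothing" idea, assuming a job simply declares how many agents it needs: it starts only when that many agents are free at the same time, otherwise it takes none and waits. This is not how clearml-agent (or SLURM) schedules anything; `try_start`, the agent names and the job sizes are made up for illustration.

```python
def try_start(nodes_needed, free_agents):
    """Start a job only if enough agents are free at the same time.

    Returns the sorted list of agents assigned to the job, or None if the
    job has to wait. `free_agents` is modified in place when the job starts.
    """
    if len(free_agents) < nodes_needed:
        # Not enough free agents: take none, so we never hold a partial set
        return None
    assigned = [free_agents.pop() for _ in range(nodes_needed)]
    return sorted(assigned)


free = {"agent-1", "agent-2", "agent-3"}

single = try_start(1, free)        # a single-node job takes one agent
multi = try_start(3, free)         # only two agents left: the 3-node job waits
free.update(single)                # the single-node job finishes
multi_retry = try_start(3, free)   # all three are free again: the job starts
```

The waiting part is easy; the catch is that the 3-node job can only start once nothing else is occupying any of the agents it needs.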

3 years ago
0 Hi All, I Have A Question Regarding Multi-Node Training Using The Clearml-Agent. What Is The Recommended Setup In This Case? Say I Have 3 Nodes With 3 Agents Running On Them. How Do I Make Sure They All Run The Same Job?

The problem is not really for the agents to wait (this is easily solved by additional high priority queue) the problem is will you have a "free" agent... you see my point ?

3 years ago
0 Hey, I Have Many Python Files. In The First Python File I Use The Following Line. Parameters = Task.Connect(Input) Now I Change The Hyperparameters On The Graphical Interface. But Now I Need The Hyperparameters In Every Python File. How Do I Have Access T

Hi ProudChicken98
task.connect(input) preserves the types based on the "input" dict types, on the flip side get_parameters returns the string representation (as stored on the clearml-server).
Is there a specific reason for using get_parameters over connect ?
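To illustrate the difference, here is a plain-Python sketch of casting string values back to the types found in a defaults dict. It only mimics the behaviour described above and does not call ClearML; `cast_like` and the sample values are hypothetical.

```python
def cast_like(defaults, stored):
    """Cast the string values in `stored` to the types found in `defaults`."""
    result = dict(defaults)
    for key, text in stored.items():
        if key not in defaults:
            result[key] = text  # unknown key: keep the string as-is
            continue
        kind = type(defaults[key])
        if kind is bool:
            # bool("False") is True, so booleans need explicit parsing
            result[key] = text.strip().lower() in ("1", "true", "yes")
        else:
            result[key] = kind(text)
    return result


# Types come from the defaults dict, values come back as strings
defaults = {"lr": 0.001, "epochs": 10, "augment": True, "name": "baseline"}
stored = {"lr": "0.01", "epochs": "25", "augment": "False", "name": "tuned"}

params = cast_like(defaults, stored)
```

With string-only values, something like `stored["epochs"] + 1` would raise a TypeError, which is the kind of surprise you avoid when the types are preserved.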

3 years ago