AgitatedDove14
Moderator
49 Questions, 8094 Answers
  Active since 10 January 2023
  Last activity 10 months ago

Reputation: 0
Badges: 25 × Eureka!
0 With

So when the agent fires up it gets the hostname, which you can then get from the API.

I think it does something like "getlocalhost", a Python function that is OS agnostic.
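
Presumably the standard-library equivalent; just a sketch of the idea, since the exact call the agent uses is an implementation detail:

    import socket

    # OS-agnostic hostname lookup; behaves the same on Linux, macOS and Windows
    hostname = socket.gethostname()
    print(hostname)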

3 years ago
0 With

So if you set it, then all nodes will be provisioned with the same execution script.

This is okay in a way, since the actual "agent ID" is by default set based on the machine hostname, which I assume is unique?

3 years ago
0 With

Hopefully once things calm down at work I will find more time.

Sounds good πŸ™‚

3 years ago
0 Hi, Just To Check. Does The K8S Glue Install Torch By Default? I'M Getting

SubstantialElk6
Hmm, do you have torch in the "Installed Packages" section of the Task?
(This is what the agent uses to set up the environment inside the docker, running as a pod.)
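
For illustration only (the package names and versions below are made up), that section holds a pip requirements-style listing captured when the Task was created, and the agent installs exactly these packages inside the container:

    clearml==1.9.0
    numpy==1.21.0
    torch==1.10.0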

3 years ago
0 Hi, Just To Check. Does The K8S Glue Install Torch By Default? I'M Getting

SubstantialElk6 "Execution Tab" scroll down you should have "Installed Packages" section, what do you have there?

3 years ago
0 In Order For A New Worker To Come Online In My K8 Cluster, Do I Need To Have An Ec2 Startup Script Init The Agent/Config, And Then Start The Daemon? Do I Have To Do This Manually Is This A Better Way?

So I'd create the queue in the UI, then update the helm yaml as above, and install? How would I add a 3rd queue?

Same process?!

Also, I'd like to create the queues programmatically, is that possible?

Yes, you can. You can also pass an argument for the agent to create the queue if it does not already exist: just add --create-queue to the agent execution command line.
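
A rough sketch of both routes (the queue name is a placeholder; the programmatic call goes through the same REST API the UI uses, so treat the exact client call as an assumption):

    # Option 1: let the agent create the queue on startup (the --create-queue flag mentioned above)
    #   clearml-agent daemon --queue my_new_queue --create-queue

    # Option 2: create the queue programmatically through the API client
    from clearml.backend_api.session.client import APIClient

    client = APIClient()
    client.queues.create(name="my_new_queue")  # assumed: APIClient exposes the queues.create endpoint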

2 years ago
0 Hi, Just To Check. Does The K8S Glue Install Torch By Default? I'M Getting

just to check. Does the k8s glue install torch by default?

SubstantialElk6 what do you mean, the glue installs torch?
The glue will take a Task from the queue and create a k8s job (basically use the same docker image, and inside the docker have the agent execute the requested Task). Where would "torch" come into play?

3 years ago
0 Hello. I'M Interested In Dynamic Gpu Feature. But I Can'T Find Any Information On How It Works. Can You Help Me With It? Is It Possible To Try It Somewhere ?

ItchyJellyfish73
Unfortunately this needs backend support and is only available in the enterprise version. What is your use case for it? (It was designed to allow out-of-the-box bare-metal multi-GPU dynamic allocation; think of a DGX with 8 GPUs where, instead of spinning down agents when you want to change the queue->num-gpu mapping, you can do it on the fly.)

3 years ago
0 Hello, I Am Currently Learning How To Build Pipelines With Clearml. I'Ve Created A Pipeline That Has Five Steps, In Which Each Step Depends On The Previous One (Step 2 Depends On Step 1, Step 3 Depends On Step 2, Etc.). However, My Step 4 Depends On Step

Hi @<1523704757024198656:profile|MysteriousWalrus11>

"parents": [

  "step_two", 
  "step_four" 
], 

Seems like step 5 depends on steps 2+4. How did you create it? What did the console say?
Could it be you're not actually passing any output from step 3? How is it dependent on it?
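
For reference, when a pipeline is built with PipelineController, the "parents" list shown above is exactly what you pass to add_step. A minimal sketch (project and task names are placeholders):

    from clearml import PipelineController

    pipe = PipelineController(name="my_pipeline", project="examples", version="1.0")
    # ... add_step() calls for step_one .. step_four go here ...
    pipe.add_step(
        name="step_five",
        parents=["step_three", "step_four"],   # step_five starts only after these steps complete
        base_task_project="examples",
        base_task_name="step 5 base task",
    )
    pipe.start()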

one year ago
0 Running Into A Strange Issue—

Seems correct.
I'm assuming something is wrong with the key/secret quoting?!
Could you generate another one and test it?
(You can have multiple key/secret pairs on the same user.)
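
For reference, the credentials block in clearml.conf should look like this, with both values as plain quoted strings copied from the UI (the values below are placeholders):

    api {
        credentials {
            # both values must be quoted strings, exactly as generated in the web UI
            access_key: "ABCDEFGH12345678"
            secret_key: "abcdefgh1234567890abcdefgh1234567890"
        }
    }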

3 years ago
0 Hello I'M New Here, I Found This Error When Testing My Tensorflow / Keras Model. I Already Create The Model Endpoint By Running Command 'Clearml-Serving --Id <Service_Id> Model Add --Engine Triton --Endpoint "Model_Name"... '. Also My Tensorflow / Keras M

MoodyCentipede68 from your log

clearml-serving-triton | E0620 03:08:27.822945 41 model_repository_manager.cc:1234] failed to load 'test_model_lstm2' version 1: Invalid argument: unexpected inference output 'dense', allowed outputs are: time_distributed

This seems to be the main issue: Triton is failing to load the model.
Does that make sense to you? How did you configure the endpoint model?
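
If the endpoint was added with clearml-serving model add, the declared --output-name has to match the model's real output layer (here time_distributed, per the log above). A hedged example; the input name, sizes and types are placeholders you would take from your own model:

    clearml-serving --id <service_id> model add --engine triton \
        --endpoint "test_model_lstm2" \
        --input-name "lstm_input" --input-size 1 128 --input-type float32 \
        --output-name "time_distributed" --output-size -1 128 --output-type float32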

2 years ago
0 Hello I'M New Here, I Found This Error When Testing My Tensorflow / Keras Model. I Already Create The Model Endpoint By Running Command 'Clearml-Serving --Id <Service_Id> Model Add --Engine Triton --Endpoint "Model_Name"... '. Also My Tensorflow / Keras M

MoodyCentipede68 can you post the full docker-compose log (from spinning it up until you get the error)?
You can just pipe the output to a file with:
docker-compose ... up > log.txt

2 years ago
0 Looking At The Docs.. I Couldn'T Find A Way To Cleanup The Experiments... Only Archive Them... I Also Noticed

PompousParrot44 obviously you can just archive a task and run the cleanup service; it will actually delete archived tasks older than X days.
https://github.com/allegroai/trains/blob/master/examples/services/cleanup/cleanup_service.py

4 years ago
0 Hi - What Is The Difference Between

TeenyFly97 the TL;DR is:
Task.close() should be called when you previously used Task.init (i.e. in the code that created the task).
Task.mark_stopped() should be called to stop a remotely running Task.
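
A minimal sketch of the two cases (the task ID is a placeholder):

    from clearml import Task

    # Case 1: the process that called Task.init closes its own task
    task = Task.init(project_name="examples", task_name="my experiment")
    # ... training code ...
    task.close()

    # Case 2: stop a Task that is running remotely (e.g. on an agent)
    remote_task = Task.get_task(task_id="<remote_task_id>")
    remote_task.mark_stopped()
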
I hope it helps πŸ™‚

4 years ago
0 My Other Issue Is That If I Want To Compare Two Experiments The Scalar Plots Do Not Load ( Loading Forever ). If I Select To Show Only The Minimum Values That One Loads And Also The Other Menu Points Working In The Comparison Mode Except That.

Hi @<1600299043865497600:profile|MagnificentSeaurchin90>
Any chance you can provide more info on the error?

if I want to compare two experiments the scalar plots do not load (loading forever).

I'm assuming the issue is the Plots tab? Or is it the Scalars? What do you have in the Plots? Can you send an image of the single experiment?

one year ago
0 Hi. Help

at least you did not change the permissions of your K8s etcd folder 😄

2 years ago
0 Hi. Help

Hi PanickyMoth78

I had several pipeline components getting it and uploading files to it concurrently.

Should not be a problem

I've attached its log file, which only mentions skipping one file (a warning)

So what exactly is the error you are getting?

2 years ago
0 Hi, When I Use Task.Get_Logger().Report_Table, I Go The Ui After The Experiment Finishes And I Download The Table (Under Results > Plots), It Gives Me A Json File. How Can I Use It? It Seems To Follow A Structure Specific To Clearml, How Can I For Example

how can I for example convert it back to a pandas dataframe?

You can always report a csv file with report_media as well, or, if this is not for debugging, maybe use an artifact?
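
A rough sketch of the artifact route, which round-trips a pandas DataFrame cleanly (project, task and artifact names are placeholders):

    import pandas as pd
    from clearml import Task

    # in the experiment: upload the table as an artifact (instead of, or next to, report_table)
    task = Task.init(project_name="examples", task_name="table demo")
    df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
    task.upload_artifact(name="my_table", artifact_object=df)

    # later, from anywhere: fetch it back as a DataFrame
    t = Task.get_task(project_name="examples", task_name="table demo")
    df_back = t.artifacts["my_table"].get()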

3 years ago
0 Are There Any Particular System Dependencies Needed To Enable

there is a bug wherein both Task.current_task() and Logger.current_logger() return None.

This is not a bug; it means something broke. The environment variable CLEARML_TASK_ID has to be set inside the agent's process.
How are you running it? (Also send the log 🙂, you can DM it so it is not public here.)
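
A quick debugging sketch to run inside the agent's process, just to see whether the task context is actually there:

    import os
    from clearml import Task

    # when the task is executed by an agent, CLEARML_TASK_ID should be set in the environment
    print("CLEARML_TASK_ID =", os.environ.get("CLEARML_TASK_ID"))
    print("current task    =", Task.current_task())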

10 months ago