AgitatedDove14
Moderator
48 Questions, 8049 Answers
Active since 10 January 2023
Last activity 5 months ago
Reputation: 0
Badges: 25 × Eureka!
0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

If you take a look here, the returned objects are automatically serialized and stored on the files server or object storage, and also deserialized when passed to the next step.
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py

You can of course do the same manually
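For reference, a minimal sketch in the spirit of that example (project/step names and values here are illustrative):

from clearml import PipelineDecorator

@PipelineDecorator.component(return_values=["data"])
def step_one():
    data = {"numbers": [1, 2, 3]}
    return data  # the returned object is serialized and stored automatically

@PipelineDecorator.component(return_values=["total"])
def step_two(data):
    return sum(data["numbers"])  # "data" is deserialized again before this step runs

@PipelineDecorator.pipeline(name="toy pipeline", project="examples", version="0.0.1")
def pipeline_logic():
    data = step_one()
    total = step_two(data)
    print(total)

if __name__ == "__main__":
    PipelineDecorator.run_locally()  # remove this line to launch the steps on agents
    pipeline_logic()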

one year ago
0 Hello! How Can I Use "Report_Scatter2D" In Order To Report Timestamp In The X-Axis?

SweetGiraffe8 Works when I'm using plotly...
Can you please copy-paste the code with plotly? It's probably something I'm missing.
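In the meantime, a minimal sketch of reporting unix timestamps on the x-axis (names and values are illustrative, probably not identical to your plotly version):

import time
from clearml import Task

task = Task.init(project_name="examples", task_name="scatter2d with timestamps")
# each point is an [x, y] pair; here x is a unix timestamp in seconds
scatter = [[time.time() + i * 60, i * 0.1] for i in range(10)]
task.get_logger().report_scatter2d(
    title="metric over time",
    series="series A",
    iteration=0,
    scatter=scatter,
    xaxis="timestamp (s)",
    yaxis="value",
    mode="lines+markers",
)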

3 years ago
0 Hi Everyone. I Have An Issue With The Simple Pipeline - It Runs Two Similar Nn Training Steps (Tf2.3, Windows10, Python 3.7) With Only Difference Is A Batch Size. I'M Running First Separately Each Step To Have Them In Clearml Project Page. Then I Run Pipe

Hi BattyLion34
I might have a solution. In order to make sure the two agents are not sharing the "temp" folder:
Create two copies of ~/clearml.conf, let's call them:
~/clearml_service.conf and ~/clearml_agent.conf
Then in each one select a different venvs_dir, see here:
https://github.com/allegroai/clearml-agent/blob/822984301889327ae1a703ffdc56470ad006a951/docs/clearml.conf#L90
for example:
~/.clearml/venvs-builds1 and ~/.clearml/venvs-builds2
Now start the two agents with:
The service age...
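Putting it together, something along these lines (a sketch; paths and queue names are illustrative, and it assumes each agent picks its configuration file via the CLEARML_CONFIG_FILE environment variable):

# in ~/clearml_service.conf
agent.venvs_dir: ~/.clearml/venvs-builds1
# in ~/clearml_agent.conf
agent.venvs_dir: ~/.clearml/venvs-builds2

# then start each agent with its own configuration file:
CLEARML_CONFIG_FILE=~/clearml_service.conf clearml-agent daemon --services-mode --queue services --detached
CLEARML_CONFIG_FILE=~/clearml_agent.conf clearml-agent daemon --queue default --detached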

3 years ago
0 Dear Clearml Community, I Am Looking For A Way To Properly Resume A Training In A Way That Initial Scalars Get Reused And Expanded. Clearml Feature For Reusing The Same Task Works Fine (When Using

Oh I see, basically a UI feature.
I'm assuming this is not just changing the x-axis in the UI, but somehow storing the x-axis as part of the scalars reported?

8 months ago
0 Hi! Can Someone Show Me An Example Of How

BTW: I think an easy fix could be:
if running_remotely():
    pipeline.start()
else:
    pipeline.create_draft()

2 years ago
0 Hi. Question About Dataset Upload Errors: When Uploading A

My apologies, you are correct, 1.8.1rc0 🙂

one year ago
0 Hi. I Get Some Problem With Clearml Agent. I Start Training On My Local Device, Clone Run, And Start This Run In Docker On Cluster. But, Seems Like Clearml Agent Caches Environment(Package Weels, Python Version, Etc). Can I Config Clearml Agent To Not Cac

Hi StickyBlackbird93
Yes, this agent version is rather old (clearml_agent v1.0.0).
It had a bug where the PyTorch aarch64 wheel broke the agent (by default the agent in docker mode will use the latest stable version, but not in venv mode).
Basically, upgrading to the latest clearml-agent version should solve the issue:
pip3 install -U clearml-agent==1.2.3
BTW for future debugging, this is the interesting part of the log (notice it is looking for the correct pytorch based on the auto de...

2 years ago
0 Hello, I Am Looking For A Way To Increase Number Of Images Saved In Results>Debug Samples. Looks Like There Is A Limit Of 100 Images Per Experiment, And All Images Saved After Are Not Displayed In Web Client. I Like To Have First Batch With Predictions V

Ohh sorry. task_log_buffer_capacity is actually an internal buffer for the console output: how many lines it will store before flushing them to the server.
To be honest, I can't think of a reason to expose / modify it...

3 years ago
0 Hi! I Noticed A Bug Related To Reusing The Same Component In A Pipeline. I Have Prepared A Mock Example So That You Can Reproduce It:

Thanks GiganticTurtle0
So the bug is that "mock_step" is storing the "NUMBER_2" argument value in the second instance?

2 years ago
0 Heyo, After Building Some Custom Pipelining Functionality On Mlflow, I Started Looking For Better Software That Can Beat What I Created - With A Similar Amount Of Effort. Problem Has Been That Up Till Now, All I Found Could Make Things Way Better But Al

I think my question is more about design: is a ModelPipeline class a self-contained pipeline (i.e. containing all the different steps), or is it a single step in a pipeline?

one year ago
0 Base_Template_Keras_Simply.Py

DeliciousBluewhale87 could you send the full log of the Task?

3 years ago
0 Hi All! I Have Methods Inside Notebooks That I Made Available To Clis Using Nbdev
• In a notebook, create a method and decorate it with fastai.script's @call_parse.
Any chance you have a very simple code/notebook to reference (this will really help in fixing the issue)?
one year ago
0 Hi, I Am Wondering Why Do I Need To Create Files Before Applying Diff ?

Thanks DefeatedOstrich93
Let me check if I can reproduce it.

3 years ago
0 Hi, I'Ve Got A Quick Question About

SuperiorPanda77 I have to admit, I'm not sure what would cause the slowness only on GCP... (if anything, I would expect the network infrastructure to be faster)

2 years ago
0 We Are Facing Performance Issues Of Our Self-Hosted Clearml Server Looking At The Cpu Utilization \ Memory \ Networking We Couldn'T Identify A Bottleneck We Are At The Moment Using ~100 Workers For Some Hpo, And The Main Performance Issues We Observe Are

Hi DepressedChimpanzee34
I think the main issue here is slow response time from the API server. I "think" you can increase the number of API server processes, but considering the 16GB, I'm not sure you have the headroom.
At peak usage, how much free RAM do you have on the machine?

2 years ago
0 Hi All, Is There A Way To Schedule The Tasks From The Queue Onto The Gpu Instances Based On Factors Such As Gpu Utilisation, Number Of Cpu Cores Present, Free Memory Or Custom Parameters Such As Priority Of The Task, Estimated Time Etc?

Hi CharmingPuppy6
Basically yes there is.
The way clearml is designed is to have queues abstract different types of resources, for example a queue for single-GPU jobs (let's name it "single_gpu") and a queue for dual-GPU jobs (let's name it "dual_gpu").
Then you spin agents on machines and have the agents pull jobs from specific queues based on the hardware they have. For example, we can have a 4-GPU machine with 3 agents: one agent connected to 2xGPUs and pulling Tasks from the "dual_gpu...
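As an illustration of that 4-GPU machine example (queue names and GPU indices are made up):

# one agent serving two GPUs, pulling from the "dual_gpu" queue
clearml-agent daemon --queue dual_gpu --gpus 0,1 --detached
# two agents serving one GPU each, pulling from the "single_gpu" queue
clearml-agent daemon --queue single_gpu --gpus 2 --detached
clearml-agent daemon --queue single_gpu --gpus 3 --detached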

3 years ago
0 Hi, We'Re Facing An Error When Uploading Model Checkpoints To Clearml During Training (Using Clearml Version 1.9.0 And Pytorch Lightning 1.7.6), Anyone Knows How To Solve? Thanks! The Error: Clearml.Storage - Error - Failed Uploading: Httpsconnectionpool(

Hi TightDog77

HTTPSConnectionPool(host='…', port=443): Max retries exceeded with url: /upload/storage/v1/b/models/o?uploadType=resumable (Caused by SSLError(SSLError(1, '[SSL: DECRYPTION_FAILED_OR_BAD_RECORD_MAC] decryption failed or bad record mac (_ssl.c:2633)')))

This seems like a network error to GCP (basically the GCP python package throws it).
Are you always getting this error? Is this something new?

one year ago
0 Anyone Doing Sagemaker With Clearml - Something Like The K8S Glue But The Tasks Are Pulled Into Sagemaker Training Jobs

Do you have any experience and things to watch out for?

Yes, for testing, start with cheap node instances 🙂
If I remember correctly, everything is preconfigured to support GPU instances (aka nvidia runtime).
You can take one of the templates from here as a starting point:
https://aws.amazon.com/blogs/compute/running-gpu-accelerated-kubernetes-workloads-on-p3-and-p2-ec2-instances-with-amazon-eks/

3 years ago
0 Hi! I Deployed Clearml Server Along With Jupyterhub On Azure K8S (Aks). The Way It Works Is That Every User Is Assigned A New Pod That Is Spawned With A Docker Image Of A Choice (One Of Them With Clearml Sdk Installed). I Managed To Configure Most Of The

GreasyPenguin66 Nice !!!
Very cool setup, and kudos on making it work with multiple users!
Quick question, shouldn't the JUPYTERHUB_API_TOKEN env variable be enough to gain access to the server? Why did you need to add it to the 'nbserver-x.json' as well?

3 years ago
0 Another Question: Is It Possible To Specify In Which Directory To Save All The Files That Clearml-Agent Creates (E.G. Cache Files Or Results Of The Currently Running Experiments)

What do you mean by cache files? Cache is machine specific and is set in the clearml.conf file.
Artifacts / models are uploaded to the files server (or any other object storage solution)
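For reference, the local cache location is controlled from clearml.conf, something like this (the path is illustrative):

sdk {
    storage {
        cache {
            # where downloaded artifacts / datasets are cached locally
            default_base_dir: "~/.clearml/cache"
        }
    }
}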

3 years ago
0 Hello! I'M Using A

Hi SillySealion58

"keep N best checkpoints" logic in my training loop.

If this is the use case, may I suggest overwriting them locally? (the same will happen on the remote storage). This is exactly how the lightning / ignite feature is implemented.
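For illustration, a sketch of that idea (the helper and metric here are made up): always write the current best checkpoint to the same local path, so the auto-uploaded copy is replaced instead of accumulating files.

import torch
from clearml import Task

task = Task.init(project_name="examples", task_name="keep best checkpoint")  # enables auto-logging of torch.save
best_loss = float("inf")

def save_if_best(model, val_loss, path="best_checkpoint.pt"):
    global best_loss
    if val_loss < best_loss:
        best_loss = val_loss
        # same filename every time -> the remote copy is overwritten as well
        torch.save(model.state_dict(), path)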

one year ago
0 Hello, I Am Getting `Valueerror: Could Not Get Access Credentials For '

I will probably just use an absolute path everywhere to be robust against different machine user accounts: /home/user/trains.conf

That sounds like good practice

Other than the wrong trains.conf, I can't think of anything else... Well, maybe if you have AWS environment variables with credentials? They will override the conf file.
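(For reference, these are the standard variables that would take precedence if set; the values are placeholders:)

export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_SESSION_TOKEN=...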

3 years ago
0 + Side Question - Any Plans To Include Native Support For

We should update the readme 🙂

3 years ago