AgitatedDove14
Moderator
48 Questions, 8043 Answers
  Active since 10 January 2023
  Last activity 5 months ago

0 Hi All, I Have A Question Regarding Clearml Task Initialization With Multithreading. I'M Using Python'S Joblib Library And The Parallel Class To Run An Experiment In Multiple Parallel Threads. The Experiment Runs To Completion But I Get Incomplete Mllogge

Hi @<1619867994005966848:profile|HungryTurtle13>

I'm using Python's joblib library and the Parallel class to run an experiment in multiple parallel threads.

I believe joblib creates subprocesses not threads, but yes you are correct,
Basically once Task.init is called, every forked/spawned process will be automatically logged to the main process Task (you can, and probably should, call either Task.init or Task.current_task() from the forked processes, but this is just a detail).
The mai...

8 months ago
0 How Can I Do The Following? (Basically, Filtering By Task Type)

JitteryCoyote63 to filter out 'archived tasks' (i.e. exclude archived tasks)
Task.get_tasks(project_name="my-project", task_name="my-task", task_filter=dict(system_tags=["-archived"]))
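
A minimal sketch of the filter above, for reference. The helper names active_task_query and list_active_tasks are made up for illustration, and the project/task names are the placeholders from the answer; only Task.get_tasks and its task_filter argument come from ClearML.

```python
def active_task_query(project_name, task_name):
    """Build the keyword arguments for Task.get_tasks, excluding archived tasks."""
    return dict(
        project_name=project_name,
        task_name=task_name,
        # A leading "-" on a system tag means "exclude tasks carrying this tag".
        task_filter=dict(system_tags=["-archived"]),
    )


def list_active_tasks(project_name, task_name):
    """Run the query against the server (needs a configured clearml.conf)."""
    from clearml import Task  # imported lazily so the helper above works offline

    return Task.get_tasks(**active_task_query(project_name, task_name))
```

Calling list_active_tasks("my-project", "my-task") should then return only the non-archived tasks matching that name.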

4 years ago
0 I'M Looking At How Triggers Work In Clearml. Is There An Example, Maybe With Clearml Data And A Dataset Being Uploaded Or Some Other Example?

Also could you explain the difference between trigger.start() and trigger.start_remotely()

Start will start the trigger process (the one "watching the changes") locally (this makes sense for debugging etc.)
start_remotely will launch the trigger process on the "services" where it should live forever 🙂
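
A rough sketch of the two launch modes, assuming a dataset trigger built with clearml.automation.TriggerScheduler. The helper names, the trigger name "on-new-dataset" and the 3 minute polling interval are illustrative assumptions, not something from this thread.

```python
def build_dataset_trigger(task_id, queue, project, tags):
    """Create a trigger that enqueues `task_id` when a dataset in `project` gets `tags`."""
    from clearml.automation import TriggerScheduler  # lazy import, needs clearml installed

    trigger = TriggerScheduler(pooling_frequency_minutes=3)  # polling interval is an assumption
    trigger.add_dataset_trigger(
        name="on-new-dataset",      # hypothetical trigger name
        schedule_task_id=task_id,   # task to clone and enqueue
        schedule_queue=queue,       # queue the cloned task is sent to
        trigger_project=project,    # project to watch
        trigger_on_tags=tags,       # only fire on these tags
    )
    return trigger


def launch_trigger(trigger, remote=False, services_queue="services"):
    """Run the watching process locally (debugging) or on the services queue."""
    if remote:
        trigger.start_remotely(queue=services_queue)
    else:
        trigger.start()
```

With remote=False the call blocks in the local process, which is handy for debugging; with remote=True the watcher is handed off to an agent serving the "services" queue.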

Okay so when I add trigger_on_tags, the repetition issue is resolved.

Nice!

This problem occurs when I'm scheduling a task. Copies of the task keep being put on the queue ...

2 years ago
3 years ago
0 Hi! I Noticed A Bug Related To Reusing The Same Component In A Pipeline. I Have Prepared A Mock Example So That You Can Reproduce It:

Building the pipeline in runtime from external configuration is very cool!!
I think nested components is exactly the correct solution, and it is a great use case.

2 years ago
0 Is There A Nicer Way To Program The Color For Report_Scalar? By Default It Use A Color Scheme That Is Very Hard To Compare When I Have Multiple Lines. I Can Change It Manually But I Do Not Want To Repeat It For Every Experiment.

Hi EnviousStarfish54
Color coding for the entire UI is stored per user (I think in your local cookies, but I might be wrong). Anyhow, any title/series combination will have the selected color regardless of the project.
This way you can configure once that loss is red and accuracy is green, etc.

3 years ago
0 I'M Looking At How Triggers Work In Clearml. Is There An Example, Maybe With Clearml Data And A Dataset Being Uploaded Or Some Other Example?

VexedCat68

But what's happening is, that I only publish a dataset once but every time it polls,

this seems wrong (i.e. a bug?!), how do you set up the trigger? Is the Trigger Task constantly running or are you re-launching it?

2 years ago
0 Question About The File Server. Currently, We Have A Machine With Minio Installed, And All File Communication Is Made Using The Minio Sdk Client. [Minio Is Just Like An S3 Bucket, Fully Compliant With S3 Protocol]. In The Examples I'Ve Seen The

To store all the debug samples; it can also store all the models (if you configure output_uri='http://file_server_here:8081').
Yes: instead of the file server, use 's3://<ip_of_minio>:9000/bucket'. Make sure you add the credentials for the MinIO server in trains.conf.
Yes, basically once you have the credentials in trains.conf, you could do StorageManager.get_local_copy('s3://<minio>:9000/bucket/file') (also upload of course 🙂)
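
A small sketch of the download/upload calls mentioned above. It assumes the current clearml package (the renamed trains), with the MinIO credentials already present in the conf file; minio_url, download and upload are hypothetical helper names, and port 9000 is simply the one used in the answer.

```python
def minio_url(host, bucket, key="", port=9000):
    """Compose an s3:// URL that points at a MinIO server instead of AWS."""
    base = f"s3://{host}:{port}/{bucket}"
    return f"{base}/{key}" if key else base


def download(host, bucket, key):
    """Fetch a remote object and return the path of the cached local copy."""
    from clearml import StorageManager  # lazy import, needs clearml installed

    return StorageManager.get_local_copy(remote_url=minio_url(host, bucket, key))


def upload(local_file, host, bucket, key):
    """Upload a local file to the MinIO bucket."""
    from clearml import StorageManager

    return StorageManager.upload_file(
        local_file=local_file, remote_url=minio_url(host, bucket, key)
    )
```

The same URL form can be passed as output_uri so that models land in the bucket instead of the file server.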

3 years ago
0 Hi, Community! For The Test I Logged My New Model To Clearml-Server File Host And Take Models For Clearml-Serving From There. And It Works With Clearml-Serving Model Add, But For Clearml-Serving Model Auto-Update I Do Not Exactly Understand What Happens.

Hi AbruptHedgehog21
can you send the two models info page (i.e. the original and the updated one) ?
do you see the two endpoints ?
BTW: --version would add a version to the model (i.e. create a new endpoint with version "endpoint/{version}")

2 years ago
0 Hi, I Am Trying To Use The Aws Autoscaler To Assign Instance Profiles To New Machines. This Is A Better Way Than Managing Credentials. I Added The Configuration To The Autoscaler Config Like So:

it does appear on the task in the UI, just somehow not repopulated in the remote run if it’s not a part of the default empty dict…

Hmm that is the odd thing... what's the missing field ? Could it be that it is failing to Cast to a specific type because the default value is missing?
(also, is the issue present in the latest clearml RC? It seems like a task.connect issue)

2 years ago
0 I'M Using Tensorboard Summarywriter To Add Scalar Metrics For The Experiment. If Experiment Crashed, And I Want To Continue It From Checkpoint, For Some Reason It Plots Metrics In A Really Weird Way. Even Though I Pass Global_Step=Epoch To The Summarywrit

maybe I should use explicit reporting instead of Tensorboard

It will do just the same 😞

there is no method for setting last iteration, which is used for reporting when continuing the same task. maybe I could somehow change this value for the task?

Let me double check that...

overwriting this value is not ideal though, because for :monitor:gpu and :monitor:machine ...

That is a very good point

but for the metrics, I explicitly pass th...

2 years ago
0 I Have A Bunch Of Python Modules With Clearml Tasks. They Are Using 3Rd-Party Libraries But No Module Uses Code From Another Module. When I Run Such A Task Remotely - Then Clearml Deduces The Dependencies From Imports, Which Works Fine. Now I Decided To T

FiercePenguin76 the git repo should detect only clearml as required python package
Basically the steps are:
Decide if the initial python entry script is a standalone script (i.e. no local imports) in the git repo (in your example "task_with_deps.py").
If this is a "standalone script", only look for imports inside the calling python script, and list those packages under "installed packages".
If this is not a standalone script, go over all the python files inside the repository, look for "i...
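
The decision described in these steps can be sketched with the standard library alone. This is only an illustration of the logic, not ClearML's actual implementation: the function names are invented, "local" is approximated as "a module or package with that name exists at the repository root", and standard-library modules are not filtered out.

```python
import ast
from pathlib import Path


def split_imports(source, repo_root):
    """Return (local, external) top-level module names imported by `source`."""
    root = Path(repo_root)
    local, external = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            if node.level > 0:  # relative import: always local
                local.add(node.module or ".")
                continue
            modules = [node.module]
        else:
            continue
        for module in modules:
            top = module.split(".")[0]
            is_local = (root / f"{top}.py").is_file() or (root / top).is_dir()
            (local if is_local else external).add(top)
    return local, external


def is_standalone(script_path, repo_root):
    """A script is standalone when it imports nothing from the repository."""
    local, _ = split_imports(Path(script_path).read_text(), repo_root)
    return not local


def packages_to_list(script_path, repo_root):
    """Standalone: scan only the entry script. Otherwise: scan every .py file."""
    script = Path(script_path)
    if is_standalone(script, repo_root):
        files = [script]
    else:
        files = sorted(Path(repo_root).rglob("*.py"))
    external = set()
    for path in files:
        external |= split_imports(path.read_text(), repo_root)[1]
    return external
```

So an entry script that imports only third-party packages yields just its own imports, while a single local import makes the scan cover the whole repository.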

2 years ago
0 Hi, I’M Trying To Figure Out What Do The Clearml Agents Use The Webserver Endpoint For And What Would Break If One Didn’T Have Access? For Context: I’M Trying To Have A Self-Hosted Server With Endpoints Accessible Publicly, But Securely. The Webserver En

Hi HollowFish37
I think I have good news for you, the clearml-agent is only communicating with the api endpoint, so as long as this is secure, you should be fine. Do notice that the default files server endpoint should be secure as well, as by default it will allow any upload/download

2 years ago
0 Hi, I Have A Main Task That Creates Additional Tasks In Subprocesses. I Wish To Call Task.Init From Inside Each Child-Task So They Would Be Indifferent To The Main Task (I Wish The Child Process To Behave As If It Was Executed Standalone). I Am Aware Trai

Try removing this magic environment that tells the sub-process there was already an Initialized Task.

import os
env = dict(**os.environ)
env.pop('TRAINS_PROC_MASTER_ID', None)
🙂
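
To show where the cleaned environment goes, here is a sketch that passes it to a child process. The variable name TRAINS_PROC_MASTER_ID is the one from the answer; the helper names and the use of subprocess.run are assumptions about how the child tasks are launched.

```python
import os
import subprocess
import sys


def env_without_master_task(base_env=None):
    """Copy the environment, dropping the marker that a Task was already initialized."""
    env = dict(os.environ if base_env is None else base_env)
    env.pop("TRAINS_PROC_MASTER_ID", None)  # no error if the variable is absent
    return env


def run_child(script_args):
    """Run a child script with the cleaned environment so its Task.init starts fresh."""
    return subprocess.run(
        [sys.executable, *script_args],
        env=env_without_master_task(),
        check=True,
    )
```

For example, run_child(["child_task.py"]) would start a hypothetical child_task.py that behaves as if it had been executed standalone.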

3 years ago
0 Hi There, I'Ve Been Trying To Work With Trains And I Wanted To Save A Folder As The Model Like When Using The "Transformers" Library. They Have This "Save_Pretrained" Method To Their Models. It Saves The Pytorch Model And You Detect It Well, But Only That

Hi PompousBeetle71 , Trains will log all the torch.save call, I'm assuming they do not actually use it for the rest of the files on that folder.
If you'd like to share a code snippet, we could see if we could auto-magically log it.
You could use artifacts and store the entire folder. It will zip it and upload it. Then you can reuse it from other experiments. https://allegro.ai/docs/task.html?highlight=artifact#trains.task.Task.upload_artifact
Example:
` task.upload_artifact('transformer', './my_...

4 years ago
0 Hi, Just Checking.. Does Anyone Know Whether Clearml Enterprise Has Deployment Functionality..

DeliciousBluewhale87 out of curiosity , what do you mean by "deployment functionality" ? is it model serving ?

3 years ago
0 Hello, There'S A Particular Metric (Perplexity) I'D Like To Track, But Clearml Didn'T Seem To Catch It. Specifically, This "Evaluation" Section Of Run_Mlm.Py In The Transformers Repo:

ClearML automatically gets these reported metrics from TB. Since you mentioned you see the scalars, I assume huggingface reports to TB. Could you verify? Is there a quick code sample to reproduce?

3 years ago
0 Hi There, I Am Running A Clearml-Agent In Services Mode (With Docker) On A Machine With Two Disks: One With The Os (8Go, 91% Space Used) And One For The Data (100Go, 40% Space Used). When Executing The Auto-Scaler Task In This Agent, I Get The Following E

Maybe there is setting in docker to move the space used in a different location?

Not that I know of...

I can simply increase the storage of the first disk, no problem with that

probably the easiest 🙂

But as you described it looks like an edge case, so I don’t mind 🙂

3 years ago
0 Looking At Clearml-Serving - Two Questions - 1, What’S The Status Of The Project 2. How Does One Say How A Model Is Loaded And Served Etc? For Example, If I Have A Spacy Ner Model, I Need To Specify Some Custom Code Right?

And other question is clearml-serving ready for serious use?

Define serious use? KFserving support is in the pipeline, if that helps.
Notice that clearml-serving is basically a control plane for the serving engine, not to neglect the importance of it, the heavy lifting is done by Triton 🙂 (or any other backend we will integrate with, maybe Seldon)

3 years ago
0 I'M Trying To Run A Task On An Agent. I'Ve Passed The Requirements File But It Isn'T Able To Install It. The Error Is In The Reply. Help Would Be Appreciated.

Hi VexedCat68
Could it be the python version is not the same? (this is the only reason not to find a specific python package version)

2 years ago
0 Hi, I Am Looking To Upload "Already Trained Models" As Experiments In My Clearml Server. How Should I Go About Doing That? Clearml Picks Up The Tensorboard Automatically While It'S Training And Reports It But How Would I Do This If I Had Everything Alread

SmarmyDolphin68 sadly if this was not executed with trains (i.e. the offline option of trains), this is not really doable (I mean it is, if you write some code and parse the TB 😉 but let's assume this is way too much work)
A few options:
On the next run, use the clearml OFFLINE option (i.e. in your code call Task.set_offline(offline_mode=True), or set the env variable CLEARML_OFFLINE_MODE=1).
You can compress and upload the checkpoint folder manually, by passing the checkpoint folder, see https://github.com...
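
A short sketch of the first option, assuming the clearml package. The helper names are made up; the two switches (the CLEARML_OFFLINE_MODE variable and Task.set_offline) are the ones named above, and either should be applied before Task.init is called.

```python
import os


def enable_offline_mode_via_env(env=os.environ):
    """Option A: set the environment variable before clearml is imported."""
    env["CLEARML_OFFLINE_MODE"] = "1"


def init_offline_task(project_name, task_name):
    """Option B: switch to offline mode in code, then create the task."""
    from clearml import Task  # lazy import, needs clearml installed

    Task.set_offline(offline_mode=True)  # must run before Task.init
    return Task.init(project_name=project_name, task_name=task_name)
```

In offline mode the run is recorded to a local folder instead of being sent to the server, so it can be imported into the server later.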

3 years ago
0 Afaiu By Default Trains Logs All Tensorboard Things, Can This Be Turned Off?

Hi HealthyStarfish45
You can disable the entire TB logging :
Task.init('examples', 'train', auto_connect_frameworks={'tensorflow': False})

3 years ago
0 Hi Folks, A Question Regarding The Clearml-Agent With K8S Glue. In The Agents We Mount An Nfs Volume So That Some Artifacts And Data Would Be Available For Training. I Have Seen That The K8S Glue Runs As Root (I Guess To Be Able To Spawn New Pods?), But

but I was wondering if there's any limitation in creating an image with a non root user to use as the actual worker?

SarcasticSquirrel56 non-root pods (containers) are fully supported,
I would recommend using the latest agent RC (that simplified a few things)
clearml-agent==1.4.0rc3

I see... because the problem it would be with permissions when creating artifacts to store in the "/shared" folder

You mean as output target for artifacts ?

especially for datasets (for th...

2 years ago