AgitatedDove14
Moderator
48 Questions, 8049 Answers
  Active since 10 January 2023
  Last activity 5 months ago

Reputation: 0
Badges: 25 × Eureka!
0 Hey, so I'm trying to upload an artefact to ClearML's fileserver (I have a self-hosted ClearML server running). I've uploaded the file using StorageManager.upload_file(path, url) and giving the URL as "

Hi WickedElephant66

So I'm trying to upload an artefact to ClearML's fileserver (I have a self-hosted ClearML server running),

Are you trying to upload an artifact? If so I would do:
task.upload_artifact('local file', artifact_object="/path/to/file")
Or is it about Model files?
You can also check how to upload artifacts / models here:
https://github.com/allegroai/clearml/blob/master/examples/reporting/artifacts.py
https://github.com/allegroai/clearml/blob/master/examples/reporti...
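For reference, a minimal end-to-end sketch of uploading a file as a task artifact (the project/task names here are illustrative):

from clearml import Task

# Illustrative project/task names
task = Task.init(project_name="examples", task_name="artifact upload")

# Register the local file as an artifact and upload it to the configured files server
task.upload_artifact(name="local file", artifact_object="/path/to/file")

task.close()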

2 years ago
0 Hey, so I'm trying to upload an artefact to ClearML's fileserver (I have a self-hosted ClearML server running). I've uploaded the file using StorageManager.upload_file(path, url) and giving the URL as "

Are kwargs supported in functions decorated as a pipeline component?

They are, but I think the main issue is the casting: without prior knowledge, everything will be a string.
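A minimal sketch of what explicit casting inside a component might look like (the component and argument names are illustrative):

from clearml.automation.controller import PipelineDecorator

# Illustrative component: kwargs arriving through the pipeline may all be strings,
# so cast them explicitly before use
@PipelineDecorator.component(return_values=["total"])
def add_values(**kwargs):
    total = sum(float(v) for v in kwargs.values())
    return total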

2 years ago
0 Hello, I'm running a local agent. While it's running the task I get this error. Any suggestion? Successfully installed numpy-1.24.4 Found pytorch version torch==2.0.1 matching cuda version 0 Found pytorch version torchaudio==2.0.2 matching cuda version 0 Er

I see,
@<1571308003204796416:profile|HollowPeacock58> can you please send the full log?
(The odd thing is it is trying to install the python 3.10 version of torch, when your command line suggest it is running python 3.8)

one year ago
0 Hello, I'm running a local agent. While it's running the task I get this error. Any suggestion? Successfully installed numpy-1.24.4 Found pytorch version torch==2.0.1 matching cuda version 0 Found pytorch version torchaudio==2.0.2 matching cuda version 0 Er

Can you do the following:
Clone the Task whose installed packages you previously sent me, then enqueue the cloned task to the queue served by the agent with conda.
Then send me the full log of the task that the agent ran.

one year ago
0 Hi everyone! I am using clearml-serving, when I am trying to add a new endpoint like this

@<1569496075083976704:profile|SweetShells3> remove these from your pbtxt:

name: "conformer_encoder"
platform: "onnxruntime_onnx"
default_model_filename: "model.bin"

Second, what do you have in your preprocess_encoder.py?
And where are you getting the error? (Is it from the Triton container, or from the REST request?)

one year ago
0 Hi everyone! I am using clearml-serving, when I am trying to add a new endpoint like this
 data["encoded_lengths"]

This makes no sense to me; data is a NumPy array, not a pandas DataFrame...

one year ago
0 Hi everybody, I'm getting errors with automatic model logging on PyTorch (running on a dockered agent).

CrookedWalrus33 I found the issue, this is only failing with Python 3.6.
Let me check something

2 years ago
0 Hello, I'm running an ML training using

Hi FancyWhale93, you can disable the auto model uploading with:
@PipelineDecorator.component(..., auto_connect_frameworks={'pytorch': False})
def step():
    pass
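A slightly fuller sketch of the same idea as a runnable component (the component name and saved file are illustrative):

from clearml.automation.controller import PipelineDecorator

# Illustrative component: PyTorch auto-logging (and hence auto model upload)
# is disabled for this step only
@PipelineDecorator.component(return_values=["ckpt_path"],
                             auto_connect_frameworks={"pytorch": False})
def train_step():
    import torch
    model = torch.nn.Linear(4, 1)
    ckpt_path = "model.pt"
    torch.save(model.state_dict(), ckpt_path)  # not auto-registered as an output model
    return ckpt_path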

2 years ago
0 Hi everyone, I'm using the

Hi AttractiveCockroach17

Many of these experiments appear with status running on ClearML even though they have finished running,

Could it be their process just terminated? (i.e. not properly shut down)?
How are you running these multiple experiments?
BTW: if the server does not see any change in a Task for a while (I think the default is 2 hours), it will automatically mark the Task as aborted.

2 years ago
0 Hi there, how can I set the model metadata using code? The Model object has the

Hi IrritableGiraffe81
You can access the model object with task.models['output'].
To set the model metadata I would recommend making sure you have the latest clearml package; I think this is a relatively new addition.
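A minimal sketch, assuming the set_metadata() method available on model objects in recent clearml releases (the task ID and metadata values are illustrative):

from clearml import Task

task = Task.get_task(task_id="<your task id>")   # illustrative: fetch an existing task
output_model = task.models["output"][-1]         # last output model of the task

# Assumes Model.set_metadata(key, value, v_type) from recent clearml versions
output_model.set_metadata("dataset", "imagenet-subset", "str")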

2 years ago
0 Hi there, I have a pipeline that queries data from a Neo4j database. When I run it using

Hi IrritableGiraffe81
PipelineDecorator.debug_pipeline() runs everything as regular python functions, but PipelineDecorator.run_locally() is actually simulating all the steps on the same local machine (so that it is easier to debug the "real" pipeline running on multiple machines).
What I think is happening is that the casting of the arguments passed to the component fail.
Basically the type hints are currently ignored (we are working on using them for casting in the next version)
but righ...
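For context, a minimal sketch of the two local debugging modes mentioned above (pipeline and step names are illustrative):

from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component(return_values=["doubled"])
def double(x):
    return x * 2

@PipelineDecorator.pipeline(name="debug-example", project="examples", version="0.0.1")
def my_pipeline():
    print(double(21))

if __name__ == "__main__":
    # Run steps as plain Python functions (no argument casting/serialization involved)
    PipelineDecorator.debug_pipeline()
    # Or simulate the full pipeline locally, including argument passing between steps:
    # PipelineDecorator.run_locally()
    my_pipeline()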

2 years ago
0 Is there some built-in way in ClearML to trigger further action on task fail (or pipeline fail)?

I suppose one way to perform this is with a

that kicks

Yes, that was my thinking.

It seems more efficient to support a triggered response to task fail.

Not sure I follow this one; I mean the pipeline logic itself monitors the execution. If I'm not mistaken, try/except will catch a step that fails, and a global try/except will catch the entire pipeline. Am I missing something?
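A minimal sketch of that idea in decorator-based pipeline logic, assuming step failures propagate to the logic function as exceptions, as described above (all names are illustrative):

from clearml.automation.controller import PipelineDecorator

@PipelineDecorator.component()
def flaky_step():
    raise RuntimeError("step failed")

@PipelineDecorator.pipeline(name="failure-handling", project="examples", version="0.0.1")
def logic():
    try:
        flaky_step()                      # catch a single failing step
    except Exception as exc:
        print(f"step failed: {exc}")      # trigger any follow-up action here

if __name__ == "__main__":
    try:
        PipelineDecorator.run_locally()
        logic()                           # a global try/except catches the whole pipeline
    except Exception as exc:
        print(f"pipeline failed: {exc}")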

2 years ago
0 Hello, I have a local install using the docker compose approach. I'm trying to set

it's saved in a lightning_logs folder where I started the script instead.

It should be saved there + it should upload it to your file server
Can you send the Task log? (this is odd)

2 years ago
0 Also, for selecting a subset of experiments to compare, it looks like Neptune currently has a more advanced solution (

You can already sort and filter experiments based on any hyperparameter or metric that the experiment reports, so there is no need for a custom query language. Any filtered/sorted table can also be shared exactly as it is, so you can create leaderboards and share specific filters. You can also use the search bar to filter based on experiment name / comment. Tags will be added soon as well 🙂

Example of custom columns is here (the screen grab is a bit old, now there is als...

4 years ago
0 Hello, community. I hope you are all doing well. I'm seeking information regarding a specific problem, especially in the field of computer vision. Typically, an app in the field of computer vision will have multiple models, each with its own preprocessing,

what is the best approach to update the package if we have frequent updates on this common code?

Since this package has an indirect effect on the model endpoint, I would package it with the preprocess code of the endpoint.
Each server updates its own local copy, and it will make sure it can take it and deploy it hand over hand without breaking its ability to serve these endpoints.
The "wastefulness" of holding multiple copies is negligible compared to a situation where everyone ...

7 months ago
0 Dear ClearML community, I am trying to optimize storage on my ClearML file server when doing a lot of experiments. To achieve this, I already upload only the newest and best checkpoints to the ClearML file server instead of all checkpoints. Another component

However, regarding your recommendation of using the StorageManager class to delete the URL, it seems that this class only contains methods for checking existence of files, downloading files and uploading files, but no method for actually deleting files based on their URL (see the docs).

Yes you are correct 😞 you should use a "deeper" class:

helper = StorageHelper.get(remote_url)
helper.delete(remo...
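A minimal sketch of the full call, assuming StorageHelper.get() / delete() behave as in the snippet above (the import path and URL are illustrative):

from clearml.storage.helper import StorageHelper

# Illustrative URL of a checkpoint previously uploaded to the files server
remote_url = "http://files.example.com:8081/project/task/models/old_ckpt.pt"

# StorageHelper is a lower-level ("deeper") class than StorageManager
helper = StorageHelper.get(remote_url)
helper.delete(remote_url)  # assumes delete() accepts the full remote URL
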
7 months ago
0 Hello, community. I hope you are all doing well. I'm seeking information regarding a specific problem, especially in the field of computer vision. Typically, an app in the field of computer vision will have multiple models, each with its own preprocessing,

but what I really want to achieve is to share this code:

You mean to share the code between them? Unless this is a "preinstalled" package in the container, each endpoint has its own separate set of modules / files.
(This is on purpose, so you could actually change them; just imagine different versions of the same common.py file.)

7 months ago
0 Has anyone tried using ClearML with Ray-based distributed training for computer vision models like ResNet?

Should work out of the box; maybe the only thing to notice is that you will get a Task for every local_rank 0 process.
Does that make sense?

8 months ago
0 Hi, I have a task which uses Hydra for configuration. I want to add this task to a pipeline, and pass the full Hydra config objects to the task. Is there a way to do it? I get "Parameters should be in the form of "`section-name`/parameter", example: "Args

Okay this is a bit tricky (and come to think of it, we should allow a more direct interface):
pipe.add_step(
    name='train',
    parents=['data_pipeline', ],
    base_task_project='xxx',
    base_task_name='yyy',
    task_overrides={
        'configuration.OmegaConf': dict(
            value=yaml.dump(MY_NEW_CONFIG),
            name='OmegaConf',
            type='OmegaConf YAML',
        )
    },
)
Notice that if you had any other configuration on the base task, you should add them as well (basically it overwrites the configurati...

3 years ago
0 Hope everyone's having a nice holiday period. I've been debating between cron and the ClearML TaskScheduler. Cron is the solution I'm currently using but I wanted to understand the advantages of using the TaskScheduler. Right now I'm using the classic cro

I start the TaskScheduler, register a task, and stop the scheduler; how do I restart the TaskScheduler in a way that re-registers the tasks?

if it's aborted, just re-enqueue it?
(it serializes itself and stores its state on the Task object, so when re-launched it will deserialize from the last state)
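A minimal sketch of re-enqueuing an aborted scheduler task (the project, task, and queue names are illustrative):

from clearml import Task

# Illustrative names: locate the aborted TaskScheduler task and re-enqueue it
scheduler_task = Task.get_task(project_name="DevOps", task_name="Scheduler")
Task.enqueue(scheduler_task, queue_name="services")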

9 months ago
0 Hi, I have a small question regarding K8s clearml-serving behavior. I have in my cluster one GPU of 16GB RAM, and another one of 24GB RAM. I have an LLM model fitting the 24GB but not the 16GB GPU. When I call the endpoint, how will I know to which GPU I

Hi @<1556812486840160256:profile|SuccessfulRaven86>
Every clearml-serving session (you can have multiple different "sessions") is assumed to be homogeneous; this means it will serve the same models on as many nodes as possible, supporting multiple models per pod.
In your example I think the easiest is to create two serving sessions one with a node selector for the 24GB node and another for the 16GB node, wdyt?

10 months ago
0 Hi all, is there any way to get the ID of the pipeline using the pipeline name? I need the ID of the pipeline so that I can schedule the pipeline to run via

Hi @<1587615463670550528:profile|DepravedDolphin12>

Is there any way to get the ID of the pipeline using the pipeline name?

In the UI, the top-right "Details" panel should have the Pipeline ID.
Is this what you are looking for?
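If you need it programmatically, a minimal sketch, assuming pipeline runs are stored as regular Tasks under the hidden ".pipelines" sub-project (project and pipeline names are illustrative):

from clearml import Task

# Assumption: pipeline runs live under "<project>/.pipelines/<pipeline name>"
run_ids = Task.query_tasks(project_name="MyProject/.pipelines/MyPipeline")
print(run_ids)  # Task IDs of the pipeline runs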

one year ago
0 Hello everyone, I have hosted a ClearML server and trained a YOLOv8 model to test my installation. The model was trained successfully and I tried to optimize the hyperparameters by using the sample code from ClearML but I'm getting some error in doing so an

btw, I looked deeper into the log:

  File "/tmp/tmpfa8ifmka.py", line 80, in <module>
    model.train(data='coco128.yaml',epochs=20)

I'm assuming this all starts here. I think that the pipeline is not running the code from the same folder, and you are just missing the 'coco128.yaml'; try to pass a full path, wdyt?
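A minimal sketch of passing an absolute path instead (the YOLO import and file location are illustrative):

from pathlib import Path
from ultralytics import YOLO

# Illustrative: resolve the dataset YAML next to this script and pass its absolute path
data_yaml = (Path(__file__).parent / "coco128.yaml").resolve()

model = YOLO("yolov8n.pt")
model.train(data=str(data_yaml), epochs=20)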

9 months ago
0 Hello everyone, I have hosted a ClearML server and trained a YOLOv8 model to test my installation. The model was trained successfully and I tried to optimize the hyperparameters by using the sample code from ClearML but I'm getting some error in doing so an

I think I was not able to fully express my point. Let me try again.
When you are running the pipeline fully locally (both logic and components), the assumption is that this is for debugging purposes.
This means that the code of each component is locally available; could that be the reason?

9 months ago
0 Hello, I am trying to use the SDK function

Hi @<1644147961996775424:profile|HurtStarfish47>

I see "Add image.jpg" being printed for all my data items ...

I assume you forgot to call upload? The sync "marks" files for upload / deletion, but the upload call actually does the work.
Kind of like git add / push, if that makes sense?
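A minimal sketch of the sync → upload → finalize flow (dataset and folder names are illustrative):

from clearml import Dataset

# Illustrative dataset and local folder names
dataset = Dataset.create(dataset_name="images", dataset_project="examples")

# "git add": mark added/removed files against the current dataset state
dataset.sync_folder(local_path="/data/images")

# "git push": actually upload the file contents, then close this dataset version
dataset.upload()
dataset.finalize()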

9 months ago
0 Hi, I am new here, can I ask a question on trains-server also?

Hi CooperativeFox72,
From the backend guys, long story short: upgrade your machine => more CPU cores, more processes, it is that easy 🙂

4 years ago