AgitatedDove14
Moderator
49 Questions, 8124 Answers
  Active since 10 January 2023
  Last activity one year ago

0 Clearml Doesn't Pick Up Model Checkpoints Automatically. Any Idea What Might Be Wrong? (Code Attached In The Thread). Thanks

Hi @<1631102016807768064:profile|ZanySealion18>

ClearML doesn't pick up model checkpoints automatically.

What's the framework you are using?
BTW:

Task.add_requirements("requirements.txt")

If you want to specify just your requirements.txt, do not use add_requirements; use:

Task.force_requirements_env_freeze(requirements_file="requirements.txt")

(add_requirements with a filename does the same thing, but this is more readable)

one year ago
0 Hi, Is There A Concept Of An Agent Taking More Then One Job?

doing some extra "services"

What do you mean by "services"? (From the system perspective, any Task that is executed by an agent running in "services-mode" is a service; there is no actual limitation on what it can do 🙂)

4 years ago
0 Hi All, I Am Running Into Ssl Verification Issues With Trying To Upload Model Artifacts To Minio. We Are Running The Clearml Agent In A Container, Have Mounted A Ca Bundle To The Container And Referenced It On Env Vars So That Aws Cli/Boto And Requests Us

hey, that worked! what library is being used that reads that configuration?

It's passed to boto3, but I guess the Python interface and the aws cli use different configurations, because otherwise it should have worked...

3 years ago
0 Is Anyone Also Experiencing Network Error During Every Clearml Dataset Download? It's Been A While And Almost Every Download Fails...

Hmm maybe we should add a test once the download is done, comparing the expected file size and the actual file size, and if they are different we should redownload ?
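
A minimal sketch of that size check, just to make the idea concrete. It is not how ClearML downloads datasets; `download_with_size_check`, `fetch`, and `max_attempts` are hypothetical names, and the expected size is assumed to be known up front.

```python
import os
from typing import Callable


def download_with_size_check(
    fetch: Callable[[str], None],
    dest: str,
    expected_size: int,
    max_attempts: int = 3,
) -> int:
    """Call fetch(dest) until the file at dest has expected_size bytes.

    fetch is a hypothetical callable that downloads one file to the given path.
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        fetch(dest)
        # Compare the size on disk with the size we expected to receive.
        actual_size = os.path.getsize(dest) if os.path.exists(dest) else -1
        if actual_size == expected_size:
            return attempt
        # Size mismatch: drop the partial file so the next attempt starts clean.
        if os.path.exists(dest):
            os.remove(dest)
    raise OSError(
        f"{dest}: size still differs from {expected_size} bytes "
        f"after {max_attempts} attempts"
    )
```

A size comparison only catches truncated downloads; comparing a hash would also catch corrupted content, at the cost of reading the whole file again.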

3 years ago
0 <image>

Releasing an RC

4 years ago
0 I'm Using Tensorboard Summarywriter To Add Scalar Metrics For The Experiment. If Experiment Crashed, And I Want To Continue It From Checkpoint, For Some Reason It Plots Metrics In A Really Weird Way. Even Though I Pass Global_Step=Epoch To The Summarywrit

maybe I should use explicit reporting instead of Tensorboard

It will do just the same 😞

there is no method for setting last iteration, which is used for reporting when continuing the same task. maybe I could somehow change this value for the task?

Let me double check that...

overwriting this value is not ideal though, because for :monitor:gpu and :monitor:machine ...

That is a very good point

but for the metrics, I explicitly pass th...

3 years ago
0 , This Is A Great Tool For Visualizing All Your Experiments. I Wanted To Know That When I Am Logging Scalar Plots With Title As Train Loss And Test Loss They Are Getting Displayed As Train Loss And Test Loss In The Scalar Tab. I Wanted That The Title Shoul

Create one experiment (I guess in the scheduler)
task = Task.init('test', 'one big experiment')
Then make sure the scheduler creates the "main" process as a subprocess (basically the default behavior).
Then the subprocess can call Task.init and it will get the scheduler Task (i.e. it will not create a new task). Just make sure they all call Task.init with the same task name and the same project name.

5 years ago
one year ago
0 Hi All, I've Successfully Run A Task Locally, And Now I'm Trying To Clone It And Send It To A Queue. It Looks Like The Environment Is Built Successfully, But It Hangs Here:

Retrying (Retry(total=239, connect=240, read=240, redirect=240, status=240)) after connection broken by 'SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1129)'))': /auth.login

Oh, that makes sense. I'm assuming the certificate is installed on your local machine but not on the remote machines / containers.
Add the following to your clearml.conf:

api.verify_certificate: false

[None](https...

one year ago
0 Hey Trains Riders, This Must Be Something Simple I Am Missing, But Still I Couldn't Realize What The Problem Is. I Am Trying To Run Trains-Agent On My Experiments. Setup Of The Server And The Agent Is Fine, But I Am Struggling To Run Real Experiments (Not

Hi ColossalDeer61 ,

Xxx is the module where my main experiment script resides.

So I think there are two options:
1. Assuming you have a similar folder structure:
-main_folder
--package_folder
--script_folder
---script.py
Then if you set the "working directory" in the execution section to "." and the entry point to "script_folder/script.py", then your code could do:
from package_folder import ABC
2. After cloning the original experiment, you can edit the "installed packages", and ad...
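
For the first option, here is a small sketch that rebuilds the same layout in a throwaway directory and runs the entry point from the working directory. It assumes the working directory ends up on the import path; the sketch forces that with PYTHONPATH, which is my assumption about how the run is launched, not something stated above. `run_layout_demo` and the contents of the generated files are made up for illustration.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def run_layout_demo() -> str:
    """Recreate main_folder/{package_folder,script_folder} and run script.py."""
    with tempfile.TemporaryDirectory() as main_folder:
        root = Path(main_folder)
        (root / "package_folder").mkdir()
        (root / "script_folder").mkdir()
        # package_folder exposes a name called ABC (placeholder content).
        (root / "package_folder" / "__init__.py").write_text(
            "ABC = 'imported from package_folder'\n"
        )
        # The entry point imports from the sibling package.
        (root / "script_folder" / "script.py").write_text(
            "from package_folder import ABC\nprint(ABC)\n"
        )
        # Assumption: the working directory (".") is importable, forced here
        # by putting main_folder on PYTHONPATH.
        env = dict(os.environ, PYTHONPATH=str(root))
        result = subprocess.run(
            [sys.executable, "script_folder/script.py"],
            cwd=root,  # working directory "."
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()
```

Without the working directory on the import path, Python only adds script_folder to sys.path and the import of package_folder fails, so this is the part worth checking on the agent side.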

5 years ago
0 Can Anyone Complete This [Demo](

Hi, what is host?

The IP of the machine running the ClearML server

one year ago
0 Hi, One More Question: When Creating A Task With Task.Init(), We Can Specify The

Ohh I see, makes total sense. I'm assuming the code base itself can do both 🙂

5 years ago
0 Hi, One More Question: When Creating A Task With Task.Init(), We Can Specify The

Sorry that was a reply to:

Otherwise I could simply create these tasks with Task.init,

5 years ago
0 Hi, One More Question: When Creating A Task With Task.Init(), We Can Specify The

Hi JitteryCoyote63 , I have to admit, we have not thought of this scenario... what's the exact use case to clone a Task and change the type?

Obviously you can always change the task type, a bit of a hack but should work:
task._edit(type='testing')

5 years ago
0 Can Anyone Complete This [Demo](

I'm assuming the reason it fails is that the docker network is only available within the specific docker compose. This means that when you spin up another docker compose, the two do not share the same host names. Just replace it with the host name or IP and it should work. Notice this has nothing to do with clearml or serving; these are docker network configurations.

one year ago
0 Hello, I Have An Error While Installing Git Dependencies Of Local Package: So Far I Used Task.

JitteryCoyote63 you mean? (notice no brackets)
task.update_requirements(".")
Either pass a text or a list of lines.
The safest would be '\n'.join(all_req_lines)
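
A tiny sketch of the "join the lines" route, so the same value can be built whether you start from a text or from a list of lines. `requirements_as_text` is a hypothetical helper, not part of ClearML, and stripping whitespace and blank lines is my own addition.

```python
from typing import Iterable, Union


def requirements_as_text(requirements: Union[str, Iterable[str]]) -> str:
    """Return one newline-separated requirements string.

    A plain string is returned unchanged; a list of lines is joined with
    '\n' after dropping surrounding whitespace and empty lines.
    """
    if isinstance(requirements, str):
        return requirements
    cleaned = (line.strip() for line in requirements)
    return "\n".join(line for line in cleaned if line)
```

The result is the single text you would then hand to task.update_requirements(...), for example after reading lines from a file with readlines().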

4 years ago
0 Hello, This Is The Following Python Code I Had Saved As Main.Py.

Seems like a credentials error.
Do you have everything set up correctly in your ~/clearml.conf?

4 years ago
0 My Autoscaled Instance Fails When Running "Git Clone" On A Private Repo. I

Actually, dumb question: how do I set the setup script for a task?

When you clone/edit the Task in the UI, under Execution / Container you should have it
After you edit it, just push it into the execution with the autoscaler and wait 🙂

2 years ago
3 years ago
0 Is It Not Possible To Add Artifacts To A Completed Task?

task = Task.get_task('task_id_here')
task.mark_started(force=True)
task.upload_artifact(..., wait_on_upload=True)
task.mark_completed()
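
The same sequence wrapped in a small helper, as a sketch only. `add_artifact_to_completed_task` is a hypothetical name, `task` is assumed to be whatever Task.get_task returns, and the try/finally is my own choice so the task goes back to completed even if the upload raises.

```python
from typing import Any


def add_artifact_to_completed_task(task: Any, name: str, artifact_object: Any) -> None:
    """Reopen a completed task, upload one artifact, then close it again.

    task is expected to offer mark_started, upload_artifact and
    mark_completed, as used in the snippet above.
    """
    # Force the task back into a running state so it accepts new artifacts.
    task.mark_started(force=True)
    try:
        # Block until the upload finishes before the task is closed again.
        task.upload_artifact(name, artifact_object=artifact_object, wait_on_upload=True)
    finally:
        # Restore the completed status whether or not the upload succeeded.
        task.mark_completed()
```

With clearml installed you would call it with the object returned by Task.get_task('task_id_here'); the helper itself does not import clearml, so it can be tried out against a stand-in object first.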

3 years ago
0 When Use Gcp Bucket As Files_Server + Yolov5 Train For Now Its Upload The Model In The End To

Yes, or at least credentials and API...
Maybe inside your code you can later copy the model into a fixed location?
This way you have the model in the model repository and a copy in a fixed location (StorageManager can upload to a specific bucket/folder with the same credentials you already have)
Would that work?

2 years ago