AgitatedDove14

48 Questions, 8049 Answers

Active since 10 January 2023

Last activity 6 months ago

Reputation

Badges 1

25 × Eureka!

Answers 8049

0 Hi All—First Off, Thanks For Being Such A Helpful And Thorough Group Of People. I Learn A Ton Just Searching Through The Channel For Problems. I’M Seeing A Weird Issue. I Have A Conda Env On My Linux Machine, And I Can Successfully Run A Training Script

I can't seem to find a difference between the two, why would matplotlib get listed and pandas does not... Any other package that is missing?
BTW: as an immediate "hack" , before your Task.init call add the following:
Task.add_requirements("pandas")

3 years ago

0 Hello, I Am Looking For A Way To Increase Number Of Images Saved In Results>Debug Samples. Looks Like There Is A Limit Of 100 Images Per Experiment, And All Images Saved After Are Not Displayed In Web Client. I Like To Have First Batch With Predictions V

You mean I can do Epoch001/ and Epoch002/ to split them into groups and make 100 limit per group?

yes then the 100 limit is per "Epoch001" and another 100 limit for "Epoch002" etc. 🙂
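A minimal sketch of that grouping, assuming the group is determined by the `title` passed to `Logger.report_image`. The helper name `report_epoch_samples`, the `series` label, and the image paths are hypothetical. The logger is passed in as an argument so the sketch does not need a running server.

```python
def report_epoch_samples(logger, epoch, image_paths, series="predictions"):
    """Report debug images under a per-epoch title such as 'Epoch001'.

    Assumption: the debug-sample history limit applies per title, so each
    epoch title keeps its own set of images.
    """
    title = f"Epoch{epoch:03d}"
    for index, path in enumerate(image_paths):
        # One call per image; the iteration index separates images in a group.
        logger.report_image(
            title=title,
            series=series,
            iteration=index,
            local_path=path,
        )
    return title
```

With ClearML installed, `logger` would typically come from `Task.current_task().get_logger()`, and the call would look like `report_epoch_samples(logger, 1, ["pred_000.png", "pred_001.png"])`.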

3 years ago

0 Hi All, I Have Deployed A Clearml Server With Docker To One Of Our Local Machine. I Had Set Up The Filesserver Folder As Mount Point To The Cloud. How Easy Is It To Migrate Our Existing Experiments Later On To A Clearml Server That We Deploy In The Cloud

Oh, I was assuming you are passing the entire DB backups to the cloud.
Are you saying you just want the file server on the cloud ? if this is the case, I would just use S3

one year ago

0 When I Run An Experiment (Self Hosted), I Only See Scalars For Gpu And System Performance. How Do I See Additional Scalars? I Have

Thank you!!!

one year ago

0 Hi, Trying To Spin Up A Clearml Agent And Getting This Error:

In the installed packages section it includes

pywin32 == 303

even though that is not in my requirements.txt.

So for some reason it is being detected (meaning your code base actually imports it in code)
But you can just remove it, either by manually editing the cloned Task (right click, reset, then you can edit the section), or via code
Task.ignore_requirements("pywin32")
task = Task.init(...)
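A small sketch of the ordering that matters here: requirement overrides go first, `Task.init` comes last. The wrapper `init_task` and its parameters are hypothetical. The task class is passed in as an argument so the snippet can be read and run without a configured server.

```python
def init_task(task_cls, project_name, task_name, ignore=(), add=()):
    """Adjust the auto-detected requirements, then create the task.

    `task_cls` is expected to be `clearml.Task`. Both requirement calls
    have to happen before `init`, which is why they come first here.
    """
    for package in ignore:
        task_cls.ignore_requirements(package)
    for package in add:
        task_cls.add_requirements(package)
    return task_cls.init(project_name=project_name, task_name=task_name)
```

Usage would be along the lines of `from clearml import Task` followed by `task = init_task(Task, "examples", "train", ignore=["pywin32"])`, where the project and task names are placeholders.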

2 years ago

0 For Those Using Clearml For Model Storage - Do You Use It Just For Storing Checkpoints During Training, Or Do You Also Use It As A Canonical Storage Location For Fully Trained Models? Like For Services Using These Models That Are Deployed To Production, D

Hi ShallowArcticwolf27
First of all:

If the answer to number 2 is no, I'd loveee to write a plugin.

Always appreciated ❤

Now actually answering the Q:
Any torch.save (or any other framework save) will either register or automatically upload the file (or folder) in the system. If it is a folder it will be zipped and uploaded; if it is a file it is just uploaded to the assigned storage output (the clearml-server, any object storage service, or a shared folder). I'm not actually sure I...

3 years ago

0 Trying To Create A Data Pipeline On My Own. Wanted To Ask, For Each Batch Of Data, Do I Have To Create A New Dataset Object Or Do I Just Create One Dataset Object And Add Batches To It. If Its The Latter, Then How.

Basically lock the Task (so you cannot reset it or change it). Usually it also marks "ready to use" etc. It also will publish the models the Task created.

3 years ago

0 Hi, I Am Trying To Upload A Model But I Am Getting The Following Error:

SkinnyPanda43 issue verified, this seems to be related to python 3.9 and subprocesses.
Let me check what we can do

3 years ago

0 Hi All. I Am Struggling With Integrating Plots Into My Task. Without The Plotting Code, The Task Never Completes The Execution And Seems To Hang. Also, The Plots Are Not Visible In The Plots Tab. I Am Running A For Loop For Different Models And Attemptin

reproduced with matplotlib 3.1

3 years ago

0 Another Question: How Can I Make Clearml-Agent Use Pre-Installed Version From The Nvidia/Pytorch (

hm ReassuredTiger98 can you send the full log? I think it should have worked (but as you mentioned it might be conda/pip mix?!)

2 years ago

0 Another Question: How Can I Make Clearml-Agent Use Pre-Installed Version From The Nvidia/Pytorch (

ReassuredTiger98 yes this is odd:
also:
Warning, could not locate PyTorch torch==1.12 matching CUDA version 115, best candidate 1.12.0.dev20220407
Seems like it found a matching version and did not use it...
Let me check that

2 years ago

0 Collecting Click Using Cached Click-8.0.1-Py3-None-Any.Whl (97 Kb)

So it makes sense it installs v8.0.1
(maybe originally you provided no version and it installed the latest one)
This is basically pip doing the package version resolution

3 years ago

0 Collecting Click Using Cached Click-8.0.1-Py3-None-Any.Whl (97 Kb)

What do you have under the "installed packages" ?

3 years ago

0 Hi Anyone

Hi AstonishingWorm64
Is this the same ?
https://github.com/allegroai/clearml-serving/issues/1
(I think it was fixed on the later branch, we are releasing 0.3.2 later today with a fix)
Can you try:
pip install git+

3 years ago

0 Hi Anyone

FileNotFoundError: [Errno 2] No such file or directory: 'tritonserver': 'tritonserver'

This is oddd.
Can you retry with the latest from the github ?
pip install git+

3 years ago

0 Hi Anyone

(I'll make sure we reply on the issue as well later)

3 years ago

0 Hi Anyone

Bottom line the driver version in the host machine does not support the CUDA version you have in the docker container

3 years ago

0 Not Able To Resume A Hyper-Parameter Optimization.

Is this reproducible with the hpo example here:
https://github.com/allegroai/clearml/tree/400c6ec103d9f2193694c54d7491bb1a74bbe8e8/examples/optimization/hyper-parameter-optimization

What's your clearml version? (And is it possible you verify with the latest version?)

2 years ago

0 Wondering Why This Is The Case When Deploying The Clearml Server Locally

Open source defaults 😊

2 years ago

0 Hi Anyone

Hi AstonishingWorm64
I think you are correct, there is no external interface to change the docker.
Could you open a GitHub issue so we do not forget to add an interface for that ?
As a temp hack, you can manually clone "triton serving engine" and edit the container image (under the execution Tab).
wdyt?

3 years ago

0 Greetings, Could You Please Clarify If It Is Possible To Reinstall All Packages Every Time? For Example, I Tried To Start The Agent With Docker Options And Got The Following Message:

How so? they are in one place? the creation of the venv is transparent, and the packages that are there are everything you have in the docker, plus the ability to override them from the UI.
What am I missing here ?

3 years ago

0 Hello, I'M Trying To Save A Keras Model As A Task Artifact, And Then Upload It From Another Task. Does Anyone Know The Syntax For That? What I'Ve Seen Is Not Quite Working.

Hi ConfusedPig65
Any keras model will be automatically uploaded if you pass an upload url to the Task init:
task = Task.init('examples', 'keras upload test', output_uri=" ")
(You can also pass s3://bucket/folder to output_uri, or change the default output_uri in the clearml.conf file)
After this line any keras model will be automatically uploaded (you will see it under the Artifacts Tab)
Accessing models from executed tasks:
` trains_task = Task.get_task('task_uid_here')
last_check...

3 years ago

0 Hello, I'M Trying To Save A Keras Model As A Task Artifact, And Then Upload It From Another Task. Does Anyone Know The Syntax For That? What I'Ve Seen Is Not Quite Working.

If you are using the latest RC:
pip install clearml==0.17.5rc5
You can pass True, and it will use the "files_server" as configured in your clearml.conf
I used the http link as a filler to point to the files_server.
Make sense ?

3 years ago

0 Hello, I'M Trying To Save A Keras Model As A Task Artifact, And Then Upload It From Another Task. Does Anyone Know The Syntax For That? What I'Ve Seen Is Not Quite Working.

Then check in the clearml.conf under files_server
And use what you have there (for example http://localhost:8081 )

3 years ago

0 Hello, I'M Trying To Save A Keras Model As A Task Artifact, And Then Upload It From Another Task. Does Anyone Know The Syntax For That? What I'Ve Seen Is Not Quite Working.

You can check the keras example, run it twice, on the second time it will continue from the previous checkpoint and you will have input and output model.
https://github.com/allegroai/clearml/blob/master/examples/frameworks/keras/keras_tensorboard.py

3 years ago

0 Hello, I'M Trying To Save A Keras Model As A Task Artifact, And Then Upload It From Another Task. Does Anyone Know The Syntax For That? What I'Ve Seen Is Not Quite Working.

So I have a task that just loads a model, but I don't see it as an artifact in the UI

You should see it under Artifacts, Input model if you are calling Keras load function (or similar)

3 years ago

0 Hello, I'M Trying To Save A Keras Model As A Task Artifact, And Then Upload It From Another Task. Does Anyone Know The Syntax For That? What I'Ve Seen Is Not Quite Working.

🤞

3 years ago

0 Hi Everyone! Is Anybody Using Log-Scale Parameter Ranges For Hyper-Parameter Optimization? It Seems That There Is A Bug In The Hpbandster Module. I'M Getting Negative Learning Rates..

Hmm GreasyLeopard35 can you specify the range you are passing to the HPO, as well as the type of optimization class ? (grid/random/optuna etc.)

2 years ago

0 Hi Everyone! Is Anybody Using Log-Scale Parameter Ranges For Hyper-Parameter Optimization? It Seems That There Is A Bug In The Hpbandster Module. I'M Getting Negative Learning Rates..

` from clearml.automation.parameters import LogUniformParameterRange
sampler = LogUniformParameterRange(name='test', min_value=-3.0, max_value=1.0, step_size=0.5)
sampler.to_list()

Out[2]:
[{'test': 1.0},
{'test': 3.1622776601683795},
{'test': 10.0},
{'test': 31.622776601683793},
{'test': 100.0},
{'test': 316.22776601683796},
{'test': 1000.0},
{'test': 3162.2776601683795}] `

2 years ago

0 Hi Everyone! Is Anybody Using Log-Scale Parameter Ranges For Hyper-Parameter Optimization? It Seems That There Is A Bug In The Hpbandster Module. I'M Getting Negative Learning Rates..

GreasyLeopard35 I think you are on to something, I think UniformParameterRange just misses a min value:
https://github.com/allegroai/clearml/blob/fcad50b6266f445424a1f1fb361f5a4bc5c7f6a3/clearml/automation/parameters.py#L168
Should be:
[self.min_value + v*step_size for v in range(0, int(steps))]
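A standalone sketch of the suggested fix, not the library's actual code. The function names and the `include_min` switch are hypothetical, and base 10 is an assumption inferred from the sampler output posted earlier in the thread (10 ** 0.5 is about 3.162). With `include_min=False` the grid starts at 0 regardless of `min_value`, which matches that output; with the offset added it starts at `min_value`.

```python
def uniform_steps(min_value, max_value, step_size, include_min=True):
    """Enumerate the grid points of a stepped range.

    include_min=False mimics the reported behaviour (grid starts at 0);
    include_min=True applies the proposed `min_value + v * step_size` fix.
    """
    steps = (max_value - min_value) / step_size
    offset = min_value if include_min else 0.0
    return [offset + v * step_size for v in range(0, int(steps))]


def log_uniform_steps(min_value, max_value, step_size, base=10.0, include_min=True):
    """Treat the grid points as exponents of `base` (assumed to be 10)."""
    exponents = uniform_steps(min_value, max_value, step_size, include_min)
    return [base ** exponent for exponent in exponents]


if __name__ == "__main__":
    # Same range as in the thread: exponents from -3.0 towards 1.0 in steps of 0.5.
    print(log_uniform_steps(-3.0, 1.0, 0.5, include_min=False))
    print(log_uniform_steps(-3.0, 1.0, 0.5, include_min=True))
```

Like the original expression, this sketch stops one step short of `max_value`; whether the upper bound should be included is a separate question that the thread does not settle.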

2 years ago
