Could it be the code is not in a git repository? ClearML supports either a single script or a git repository, but not a collection of standalone files. wdyt?
VexedCat68 are you manually creating the OutputModel object?
I'm sorry, wrong line reference:
I'm assuming the error is due to a missing ulimit:
try adding 16777216 to both the soft and hard ulimit
https://github.com/allegroai/clearml-server/blob/09ab2af34cbf9a38f317e15d17454a2eb4c7efd0/docker/docker-compose.yml#L58
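For reference, a minimal sketch of what the elasticsearch service's ulimits section in docker-compose.yml could look like (assuming the limit in question is the open-files limit; treat the exact values as an assumption):

```yaml
services:
  elasticsearch:
    ulimits:
      memlock:
        soft: -1
        hard: -1
      nofile:
        # open-files limit, raised for both soft and hard
        soft: 16777216
        hard: 16777216
```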
NastyFox63
is there a limit to the search depth for this?
Yes, the Task.init auto package listing covers only the first depth (i.e. directly imported packages);
the reason is that the derivative packages should be resolved by pip when the agent remotely executes that Task.
Then, when the agent is installing the Task, the entire python environment is stored, so that it is always fully reproducible.
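For example (a minimal sketch, assuming a script that directly imports pandas):

```python
from clearml import Task
import pandas as pd  # directly imported -> listed under "installed packages"
# numpy is pulled in by pandas, so it is NOT listed here;
# pip resolves it when the agent reproduces the environment

task = Task.init(project_name="examples", task_name="package depth demo")
```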
Make sense ?
Is the step actually "queued", or is it "queued" only in the pipeline state (i.e. the visualization did not update)?
Hi DilapidatedDucks58
trains-agent tries to resolve the torch package based on the specific cuda version inside the docker (or on the host machine if used in virtual-env mode). It seems to fail finding the specific version "torch==1.6.0.dev20200421+cu101"
I assume this version was automatically detected by trains when running manually. If this version came from a private artifactory you can add it to the trains.conf https://github.com/allegroai/trains-agent/blob/master/docs/trains.conf#L...
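Something along these lines in trains.conf (a sketch; the artifactory URL is hypothetical):

```
agent {
    package_manager: {
        # additional pip repository to search for the private torch build
        extra_index_url: ["https://my.artifactory.example/api/pypi/pypi/simple"]
    }
}
```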
Thanks!
Hmm, from here: None
Could it be you do not have privileges to the resource, or that you did not provide credentials ?
Did that autoscaler work before ?
Thanks for the logs AdorableDeer85
Notice that the log you attached means the preprocessing is executed and the GPU backend is returning an error.
Could you provide the docker compose log? Specifically, the interesting part is the Triton container; I want to verify it loads the model properly
AttributeError: 'NoneType' object has no attribute 'base_url'
can you print the model object ?
(I think the error is a bit cryptic, but generally it might be that the model is missing an actual URL link?)
print(model.id, model.name, model.url)
TroubledHedgehog16 generally speaking you can expect about 10 API calls per minute if you have many reports, and about 3 per minute with few reports. We just optimized the SDK so that lots of consecutive reports are batched together; I would recommend the latest RC
Hi RoundMosquito25
This is a bit old but probably a good start:
https://clear.ml/blog/stacking-up-against-the-competition/
tl;dr
ClearML advantages (at least a few I can think of):
- Scales way better
- Enables out-of-the-box experiment orchestration (i.e. remote execution etc.)
- Data management
- Nicer UI
- Full RestAPI
- Full MLOps platform
- Model serving
- Query-able model repository
- Probably more 🙂
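As a quick illustration of the query-able model repository (a sketch; the project and tag names are made up):

```python
from clearml import Model

# query the model repository for models in a project, filtered by tag
models = Model.query_models(project_name="examples", tags=["production"])
for m in models:
    print(m.id, m.name)
```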
this is very odd, can you post the log?
it fails during the add_step stage for the very first step, because task_overrides contains invalid keys
I see, yes I guess it makes sense to mark the pipeline as Failed 🙂
Could you add a GitHub issue on this behavior, so we do not miss it ?
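For anyone reading along, a minimal sketch of where task_overrides comes into play (project/task names are made up; an invalid key, e.g. a typo like "scrpt.branch", is what would trigger the failure described above):

```python
from clearml import PipelineController

pipe = PipelineController(name="demo pipeline", project="examples", version="1.0.0")
pipe.add_step(
    name="step_one",
    base_task_project="examples",
    base_task_name="base task",
    # task_overrides keys are dot-separated task-field paths;
    # an invalid key here would fail the step
    task_overrides={"script.branch": "main"},
)
```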
Hi Martin, of course not,
Smart!
I was just wondering if it has been patched yet, and if not, what is the expected timeline for patching it
Yes, I believe the target is a patch version 1.15.1 to be released in a couple of weeks. This is not a major issue but it's always better to have it fixed. (btw: the enterprise version never had this issue to begin with, because it is of course authenticated, and it has an additional RBAC layer on top.)
SmarmySeaurchin8
When running in "dev" mode (i.e. writing the code) only packages imported directly are registered under "installed packages". Then, when the agent is executing the experiment, it will update back the entire environment (including derivative packages etc.)
That said you can set detect_with_pip_freeze to true (in trains.conf) and it will basically store the entire pip freeze.
https://github.com/allegroai/trains/blob/f8ba0495fb3af1f99732fdffbbccd2fa992934a4/docs/trains.c...
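i.e. something like this in trains.conf (a sketch):

```
sdk {
    development {
        # store the full `pip freeze` output instead of only directly imported packages
        detect_with_pip_freeze: true
    }
}
```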
GiganticTurtle0 is it in the same repository ?
If it is, it should have detected that it needs to analyze the entire repository (not just the standalone script) and then discover tensorflow
Long story short, the Task requirements analysis is async, so if one sets them after creating the object, it might (at least in theory) be too late.
Make sense ?
Meanwhile check CreateFromFunction(object).create_task_from_function(...)
It might be better suited than execute_remotely for your specific workflow 🙂
FlutteringWorm14 Can you verify that even with the clearml.conf it has no effect?
RoundMosquito25 are you using clearml-agent daemon --stop or are you killing them ?
killing them basically means you lose them in the UI when they time out; the backend does not see them for 10 min so it assumes they died. When you call clearml-agent daemon --stop they will unregister themselves and disappear immediately
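i.e. the graceful way (a sketch):

```bash
# unregisters the daemon from the server, so it disappears from the UI immediately
clearml-agent daemon --stop
```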
how do I set this configuration, and where?
In your clearml.conf on the machine with the agent, just add at the bottom of the file: agent.venvs_cache.path=~/.clearml/venvs-cache
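In full section form it would look something like this (a sketch):

```
agent {
    venvs_cache: {
        # enable virtual environment caching by setting the cache folder
        path: ~/.clearml/venvs-cache
    }
}
```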
Hi SteadyFox10 the way it works is that Trains limits the debug image history by reusing the same file names, so the UI will only present the iterations where the debug images are relevant. With your sample code it looks like it exposes a bug: the generated link should contain the iteration number, but it does not, and so it overwrites the debug images every iteration. Here is the image link: https://demofiles.trains.allegro.ai/Test/test_images.6ed32a2b5a094f2da47e6967bba1ebd0/metrics/Test/te...
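As a side note, the number of unique debug-image file names kept per metric is controlled by a config knob (a sketch, in trains.conf; treat the value as an assumption):

```
sdk {
    metrics {
        # number of history files kept per metric/variant combination
        file_history_size: 100
    }
}
```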
Guys, any chance you can verify the RC solves the issue?
pip install clearml==1.0.2rc0
One suggestion is to make sure all agents have the same configuration. Another is to add pip into the "installed packages" section.
(Notice that in the next release we will specifically include it there, to avoid these kinds of scenarios)
TrickySheep9 is this a conda package or a wheel you are installing manually ?
The other way around: "8011:8008"
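i.e. in docker-compose.yml the mapping is host:container (a sketch):

```yaml
ports:
  - "8011:8008"  # host port 8011 -> container port 8008
```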