Hi ElegantCoyote26
sometimes the agents load an earlier version of one of my libraries.
I'm assuming this is some internal package that is installed from a wheel file, not a direct git repo+commit link?
great 🙂
two things:
1. I'm not sure argparse supports dict as a type (I mean it will take anything, but I'm not sure it will actually parse your arguments as a dict). I know there was an issue with argparsing, but I think it was solved. BTW: the way clearml-agent works, it does not actually pass the arguments on the command line but feeds them directly to the argparser at runtime (see the small sketch after the next point).
2. What happens if you clone the Task (the one with Args showing and without the explicit task.connect(_args)) and send it to the age...
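For reference, a minimal sketch (plain argparse, nothing ClearML-specific) of how you could get a dict parsed, by passing json.loads as the type; the --config argument name is just for illustration:
```
import argparse
import json

parser = argparse.ArgumentParser()
# json.loads turns the string '{"lr": 0.1, "epochs": 3}' into an actual dict
parser.add_argument("--config", type=json.loads, default={})

args = parser.parse_args(['--config', '{"lr": 0.1, "epochs": 3}'])
print(args.config["lr"])  # 0.1
```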
PreciousParrot26 I think this is really a matter of the CI process having very limited resources. Just to be clear, you are correct: the steps themselves are Not executed inside the CI environment, but it seems that even running the pipeline logic is somehow "too much" for the limited resources... Make sense?
OSError: [Errno 28] No space left on device
Hi PreciousParrot26
I think this says it all 🙂 there is no more storage left to run all those subprocesses
btw:
I am curious about why a ThreadPool of 16 threads is gathered,
This is the maximum number of simultaneous jobs it will try to launch (it will launch more after the current launching is done; notice this limits the launching, not the actual execution). It is just a way to throttle it.
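Conceptually something like this (plain Python, just to illustrate what the pool limits; launch() is a made-up stand-in for the launching logic):
```
from multiprocessing.pool import ThreadPool

def launch(job):
    # stand-in for launching a single job (the launching, not the execution)
    print("launching", job)

# at most 16 launches are in flight at any given moment
with ThreadPool(16) as pool:
    pool.map(launch, range(100))
```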
controller_object.start_locally(). Only the pipeline controller should be running locally, right?
Correct. Do notice that if you are using the Pipeline decorator and calling run_locally(), the actual pipeline steps are also executed locally.
Which of the two are you using (Tasks as steps, or functions as steps with the decorator)?
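To make the distinction concrete, a minimal sketch with the decorator (names and project are just for illustration, assuming a configured clearml installation):
```
from clearml import PipelineDecorator

@PipelineDecorator.component(return_values=["result"])
def step_one(x):
    return x + 1

@PipelineDecorator.pipeline(name="demo pipe", project="demo", version="0.1")
def my_pipeline():
    print(step_one(41))

if __name__ == "__main__":
    # run_locally(): the pipeline logic AND all the steps execute on this machine
    PipelineDecorator.run_locally()
    my_pipeline()
```
With Tasks as steps (PipelineController), start_locally() runs only the controller logic locally; the steps themselves are still enqueued for the agents.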
Hi TrickyFox41
Hey since Hydra does not work with clearml-task
It should, shouldn't it? What does not work?
Hi DrabCockroach54
I think the Kubernetes integration (k8s glue) is not part of the open-source features, and is only available as an enterprise feature 🙂
Xeon E3-1240: 4 - 5 hours!
wow... yes, definitely worth upgrading 🙂
This is odd, how are you spinning up clearml-serving?
You can also do it synchronously:
predict_a = self.send_request(endpoint="/test_model_sklearn_a/", version=None, data=data)
predict_b = self.send_request(endpoint="/test_model_sklearn_b/", version=None, data=data)
Hi LovelyHamster1
Could you think of a toy code that reproduces this issue ?
It uses only one CPU core, could I use multiprocessing somehow?
Hi EcstaticMouse10
Hmm, yes it should be multi core:
https://github.com/allegroai/clearml/blob/a9774c3842ea526d222044092172980ae505e24f/clearml/datasets/dataset.py#L1175
wdyt?
Hi @<1566596960691949568:profile|UpsetWalrus59>
All correct, with the exception of "...or 1GB Metric": this is a limit, since metrics (and metadata) are always stored on the clearml-server, so they are metered. There is also an API limit, basically anti-abuse, which of course resets every month, but if you are running tens of experiments at the same time you will hit this limit. Make sense?
Switching to a process Pool might be a bit of overkill here (I think)
wdyt?
The API server by default spins up multiple processes (they might all be busy at a given moment with a huge flood of requests, but this is still multi-process). Let me check if there is an easy way to set more processes.
if I run my own ClearML self-hosted server?
Then you have everything on your end, it will not communicate with the SaaS offering, meaning no limits whatsoever.
(That said some of the cloud auto-scaling and compute features are not part of the open source)
Registering some metadata as a model doesn't feel correct to me.
Yes I'm with you 🙂
BTW, what kind of metadata would need versions during the lifetime of a Task?
SmallBluewhale13 in your code, what are you getting when you print the version:
from clearml import __version__
print(__version__)
Have to get glue setup, which I couldn't understand fully, so that's a different topic
I suggest using the apply-template setup (basically you provide a Job/Service template, and it uses that to set up k8s jobs based on the Tasks coming in from the specific queue)
Hi NonsensicalSeaanemone47
I'm assuming you mean k8s as compute cluster?
If so, then yes, clearml adds priority scheduling on top of your existing k8s cluster. It also allows you to reuse images: k8s spins up the base container image and then, inside the container, the agent sets up the environment of the experiment (clones the code, applies the diff, installs missing python packages, etc.)
It also gives visibility into the executed pods.
Make sense ?
from what I gather there is a lightly documented concept
Yes... 🙂 The reason for it is that actually one could do:
```
@PipelineDecorator.pipeline(...)
def pipeline(i):
    ...

if __name__ == '__main__':
    pipeline(0)
    pipeline(1)
    pipeline(2)
```
Basically rerunning the pipeline 3 times
This support was added as some users found a use case for it, but I think this would be a rare one
Hi RoughTiger69
Interesting question, maybe something like:
```
@PipelineDecorator.component(...)
def process_sub_list(things_to_do=[0, 1, 2]):
    r = []
    for i in things_to_do:
        print("doing", i)
        r.append("done{}".format(i))
    return r

@PipelineDecorator.pipeline(...)
def pipeline():
    # create some stuff to do:
    results = []
    for step in range(10):
        r = process_sub_list(list(range(step*10, (step+1)*10)))
        results.append(r)
    # push into one list with all results, this will ac...
```
1. One reason I don't like using the configuration section is that it makes debugging much much harder.
Debugging? Please explain how it relates to the configuration and presentation (i.e. preview)
2.
Yes in theory, but in your case it will not change things, unless these "configurations" are copied on any Task (which is just storage, otherwise no real harm)
3.
I was thinking of a "zip" file that the Task creates and uploads, and a new configuration type, say "external/zip", and in the c...
I was using clearml == 0.17.5 and I also had this issue
I think it was introduced when we moved to subprocess reporting, with 0.17.5
You can disable it with the following in clearml.conf:
sdk.development.report_use_subprocess = false
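Inside clearml.conf that maps to the nested form, i.e. (a sketch; the flat dotted form above means the same thing):
```
sdk {
    development {
        # report metrics from the main process instead of a subprocess
        report_use_subprocess: false
    }
}
```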
could one also limit the number of CPU cores available?
If you are running in docker mode you can add:
--cpus=<value>
see ref here: https://docs.docker.com/config/containers/resource_constraints/
Just add it to extra_docker_arguments:
https://github.com/allegroai/clearml-agent/blob/2cb452b1c21191f17635bcb6222fa8bfd82afe29/docs/clearml.conf#L142
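For example (a sketch of the agent section in clearml.conf; adjust the value to your needs):
```
agent {
    # arguments passed verbatim to `docker run` when the agent spins up the container
    extra_docker_arguments: ["--cpus=2"]
}
```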
An easier fix for now will probably be some kind of warning to the user that a task is created but not connected
That is a good point. Maybe if you do not have a "main" Task, then we print the warning (with some flag to disable the warning)?
hmm this might help:
https://pip.pypa.io/en/stable/topics/configuration/#environment-variables
basically you might be able to define:
PIP_NO_USE_PEP517=1
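e.g. export it in the shell that launches the agent, so the pip calls inside the environment setup pick it up (a sketch; the queue name is just an example):
```
export PIP_NO_USE_PEP517=1
clearml-agent daemon --queue default
```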
Great!
I'll make sure the agent outputs the proper error 🙂