Hmm apparently it is not passed, but it could be.
Would the object itself be enough to get the values? Wouldn't it make sense to get them from outside somehow? (I'm assuming there is one set of args used at any given moment?)
I can't figure out how to pass my custom clearml.conf
Hi @<1544491301435609088:profile|TeenyElk27>
The easiest is to map it into the container in your docker-compose
(map a host clearml.conf into /root/clearml.conf inside the container)
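For example, a minimal docker-compose sketch (the service name and host path here are assumptions, adjust to your setup):
```yaml
services:
  clearml-agent-services:   # whichever service should see the config
    volumes:
      # mount the host's custom config as the container's clearml.conf
      - /path/on/host/clearml.conf:/root/clearml.conf:ro
```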
OutrageousGrasshopper93 is `--gpus all` working?
I think this is the discussion you are after:
https://clearml.slack.com/archives/C01H5VAUZ8R/p1612452197004900?thread_ts=1612273112.002400&cid=C01H5VAUZ8R
Hi @<1561885941545570304:profile|PunyKangaroo87>
What do you mean by store data locally?
Like clearml-data, i.e. Dataset?
You can always use file:///root/path/folder as destination, this will store everything into the local folder, is that it?
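For instance, a minimal sketch (dataset/project names are assumptions):
```python
from clearml import Dataset

ds = Dataset.create(dataset_name="my_dataset", dataset_project="examples")
ds.add_files("/data/to/version")
# store the dataset contents in a local folder instead of a remote object store
ds.upload(output_url="file:///root/path/folder")
ds.finalize()
```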
Hi DepressedFish57
In my case download each part takes ~5 second, and unzip ~15.
We ran into that, and the new version will employ a multithreaded approach for the unzip (meaning the unzipping will happen in the background)
Yes, actually the first step would be a toggle button for regexp in the search; the second would be even more advanced search.
May I suggest you post it on the UI suggestion issue https://github.com/allegroai/trains/issues/81 ?
The first pipeline step is calling init
GiddyPeacock64 Is this enough to track all the steps?
I guess my main question is: is every step in the pipeline an actual Task/Job, or is it a single small function?
Kubeflow is great for simple DAGs but when you need to build more complex logic it is usually a bit limited
(for example the visibility into what's going on inside each step is missing so you cannot make a decision based on that).
WDYT?
Hi RattySeagull0
I'm trying to execute trains-agent in docker mode with conda as package manager, is it supported?
It should, that said we really do not recommend using conda as a package manager (it is a lot slower than pip, and it can create an environment that is very hard to reproduce due to conda's internal "compatibility matrix", which might change from one conda version to another)
"trains_agent: ERROR: ERROR: package manager "conda" selected, but 'conda' executable...
For future reference, this is indeed a PEP-610 related bug, f
👍
Can we also set the poetry version used?
Actually the agent assumes poetry is preinstalled (so whatever you already have in the docker image) ...
That said, maybe we should install a specific version (after installing pip, we could do that if poetry is selected)
wdyt ?
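For reference, selecting poetry in the agent's clearml.conf looks like the sketch below; pinning the poetry version itself would currently have to happen in the docker image (e.g. a `RUN pip install "poetry==1.8.3"` line, the version number being an arbitrary example):
```
# clearml.conf (agent side): restore the environment with poetry
agent {
    package_manager {
        type: poetry
    }
}
```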
Not intentional! When I launched the AMI it was running an older version
I think this is exactly the reason they decided to change the location 🙂 so you will have to manually upgrade; the reasoning is we changed directory names (and maybe a few more things)
Yes: shut down the current docker-compose, curl the new docker-compose, rename the folder, then spin it up again. Full instructions here:
https://allegro.ai/clearml/docs/docs/deploying_clearml/clearml_server_aws_ec2_ami.html#upgrading
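Roughly, the steps look like this (a sketch based on the linked instructions; paths may differ on your AMI):
```bash
# stop the running server (old "trains" location)
docker-compose -f /opt/trains/docker-compose.yml down

# rename the data folder (old "trains" naming -> new "clearml" naming)
sudo mv /opt/trains /opt/clearml

# fetch the latest docker-compose file
curl https://raw.githubusercontent.com/allegroai/clearml-server/master/docker/docker-compose.yml -o /opt/clearml/docker-compose.yml

# spin it up again
docker-compose -f /opt/clearml/docker-compose.yml up -d
```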
Hi, Is there a way to stop a clearml-agent from within an experiment?
It is possible but only in the paid tier (it needs backend support for that) 😞
My use case is: a spot instance marked for termination after 2 mins by AWS
Basically what you are saying is you want the instance to spin down after the job is completed, correct?
NastyOtter17
Usually the first report will happen after 30 seconds, could that be the difference?
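If timing is the issue, the periodic flush interval can be tuned in clearml.conf (a sketch; the initial ~30s delay for resource monitoring is built in):
```
# clearml.conf (sdk side): how often metrics are flushed to the server
sdk {
    development {
        worker {
            report_period_sec: 2
        }
    }
}
```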
Hi @<1524922424720625664:profile|TartLeopard58>
- Opened container ports for VS Code, JupyterLab, and SSH.
I think that by default it uses the host network so it can take care of that; are you saying you added the k8s integration?
- Added NodePort to the service to directly access via public IP:NodePort (previously only SSH was available, but now NodePort is added for VS Code and JupyterLab as well), allowing direct access without SSH tunneling.
Interesting!
- Considering security vulnerabilitie...
Thanks MuddyCrab47 !!!
I found it!
It turns out the artifact upload will always upload from a stream (i.e. no multi-part upload). I will make sure we fix it in the next RC (I think the plan is to have it out this weekend)
Hi GiganticTurtle0
you should actually get `file:///home/user/local_storage_path`
with the `file://` prefix.
We always store the file:// prefix to note that this is a local path
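For example, a minimal sketch (project/task names are assumptions): setting a local output_uri stores artifacts/models under that folder, recorded with the file:// prefix:
```python
from clearml import Task

task = Task.init(
    project_name="examples",
    task_name="local-output",
    output_uri="file:///home/user/local_storage_path",  # local destination
)
```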
This is exactly what I did here, and it is working 😞
https://demoapp.demo.clear.ml/projects/0e919ea1cc5c499b99e1ab85004b6e97/experiments/887edef09d4549e88b829a34c87d4d5b/output/execution
JitteryCoyote63 This seems like exactly what you are saying, elastic license issue...
for example train.py & eval.py under the same repo
Suppose that I have three models and these models can't be loaded simultaneously in GPU memory
Oh!!!
For now, this is the behavior I observe: Suppose I have two models, A and B. ....
Correct
Yes this is a current limitation of the Triton backend BUT!
we are working on a new version that does exactly what you mentioned (because it is such a common case, where some models are not being used that frequently)
The main caveat is the loading time, re-loading models from dist...
Hi @<1545216070686609408:profile|EnthusiasticCow4>
Many of the dataset we work with are generated by SQL query.
The main question in these scenarios is: are those DBs stable?
By that I mean, generally speaking DBs serve applications, and from time to time they undergo migrations (i.e. changes in schema, more/less data, etc.).
The most stable way is to create a script that runs the SQL query, and creates a clearml dateset from it (that script becomes part of the Dataset, to have full tracta...
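As a sketch (query, connection string, and names are all placeholders), such a script could look like:
```python
import pandas as pd
import sqlalchemy
from clearml import Dataset

QUERY = "SELECT * FROM training_samples"  # placeholder query
engine = sqlalchemy.create_engine("postgresql://user:pass@db-host/mydb")  # placeholder DSN

# snapshot the query result to a file
df = pd.read_sql(QUERY, engine)
df.to_csv("query_result.csv", index=False)

# version the snapshot as a ClearML dataset
ds = Dataset.create(dataset_name="sql_snapshot", dataset_project="examples")
ds.add_files("query_result.csv")
ds.upload()
ds.finalize()
```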
The image is allegroai/clearml:1.0.2-108
Yep, that makes sense, seems like a backwards compatibility issue
Hi @<1541954607595393024:profile|BattyCrocodile47>
see here: None
Try with app.clearml.mlops-club.org
and the rest of them
Hi @<1649221394904387584:profile|RattySparrow90>
Are the models I defined to be served e.g. via the CLI downloaded to the serving pod
Yes, this is done automatically and online (i.e. when you update them using the CLI/API), based on the models/endpoints you set
So that they are physically lying there as a file I can see in the filesystem?
They are, and cached there
Or is it more the case that the pod gets the model when needed/when an API call for this model is incoming?
I...
Hi JitteryCoyote63 a few implementation details on the services-mode, because I'm not certain I understand the issue.
The docker-agent (running in services mode) will pick a Task from the services queue, set up the docker for it, spin it up, and make sure the Task starts running inside the container (once it is running inside the docker you will see the service Task registered as an additional node in the system, until the Task ends); once that happens the trains-agent will try to fetch the...
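For reference, launching an agent in services mode typically looks like this (the queue name here is an assumption, not your exact setup):
```bash
clearml-agent daemon --services-mode --queue services --docker --cpu-only
```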
I see.
You can get the offline folder programmatically, then copy the folder content (it's the same as the zip, and you can also pass a folder instead of a zip to the import function):
task.get_offline_mode_folder()
You can also have a soft link of the offline folder (if you are working on a linux machine):
ln -s myoffline_folder ~/.trains/cache/offline
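Putting it together, a minimal offline-mode sketch (project/task names are assumptions):
```python
from clearml import Task

Task.set_offline(offline_mode=True)        # record everything locally
task = Task.init(project_name="examples", task_name="offline-run")
# ... training / logging code ...
folder = task.get_offline_mode_folder()    # the session data lives here
task.close()

# later, on a machine that can reach the server:
# Task.import_offline_session(str(folder))  # accepts the folder or the zip
```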
LudicrousParrot69
I "think" I have a better handle on what you wish to do.
Is it a kind of generic "serving" solution?
FYI:
Model artifact is, usually, a weights/model file. The idea is that later you will be able to access it and serve it. Now the problem is (and I think this is what you are referring to) that there is usually a specific piece of code tied to that model that knows how to use it (a.k.a. pyfunc)
A few ideas:
These days everyone is trying to build their models with a generic interface, so that scik...
SubstantialElk6 Ohh okay I see.
Let's start with background on how the agent works:
When the agent pulls a job (Task), it will clone the code based on the git credentials available on the host itself, or based on the git_user/git_pass configured in ~/clearml.conf
https://github.com/allegroai/clearml-agent/blob/77d6ff6630e97ec9a322e6d265cd874d0ab00c87/docs/clearml.conf#L18
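As a sketch, the relevant clearml.conf entries on the agent machine (values are placeholders):
```
# clearml.conf (agent side): credentials used when cloning the Task's repository
agent {
    git_user: "my-git-user"    # placeholder
    git_pass: "my-git-token"   # placeholder (a personal access token works here)
}
```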
The agent can work in two modes:
Virtual environment mode, where it will create a new venv for each experiment ba...