Manually I was installing the leap package through python -m pip install . when building the docker container.
NaughtyFish36 what happens if you add /opt/keras-hannd to your "installed packages"? This should translate to "pip install /opt/keras-hannd", which seems like exactly what you want, no ?
without the ClearML Server in-between.
You mean the upload/download is slow? What is the reasoning behind removing the ClearML server ?
ClearML Agent per step
You can use the ClearML agent to build a docker image per Task, so all you need is just to run the docker. Will that help ?
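For example, a hedged sketch (the Task ID and target image name are placeholders):
```
clearml-agent build --id <task-id> --docker --target my-task-image
```
The built image should contain the Task's full environment, so it can be run directly.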
seems like I'm passing in my own docker image which is then used at run time?
You are passing the Default docker image, if the Task does not list a specific docker image it will use the one you passed.
Yes this is "docker mode" (in venv mode no dockers are used, it just creates a new venv per experiment and installs everything inside the venv)
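For reference, this is roughly how the two modes are started (queue name and default image here are just examples):
```
# venv mode: a new virtualenv is created per experiment
clearml-agent daemon --queue default

# docker mode: each Task runs inside a container; the image here is the default/fallback
clearml-agent daemon --queue default --docker nvidia/cuda:11.8.0-runtime-ubuntu22.04
```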
ldconfig from /etc/profile which is put there by the interactive_session_task
LackadaisicalOtter14 are you sure ? maybe this is done as part of the installation the interactive session runs ?
Could that be the issue ?
```
apt-get update && apt-get install -y openssh-server
```
On my to do list, but will have to wait for later this week (feel free to ping on this thread to remind me).
Regarding the issue at hand, let me check the requirements it is using.
BTW:
```
str('\.')
Out[4]: '\\.'
str(('\.', ))
Out[5]: "('\\\\.',)"
```
This is just python str casting
Hi DrabCockroach54
I think the Kubernetes integration (k8s glue) is not part of the open-source features, and is only available as an enterprise feature 😞
BoredHedgehog47 this is basically a wizard explaining the steps, see the 3 tabs 🙂
BTW, you can launch an experiment directly from CLI with clearml-task
https://clear.ml/docs/latest/docs/apps/clearml_task
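For example (the repo URL, script name, and queue are placeholders):
```
clearml-task --project examples --name remote-run \
  --repo https://github.com/<user>/<repo>.git --script train.py \
  --queue default
```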
EnviousStarfish54 you can also run the docker-compose on one of the machines on your local LAN, but then you will not be able to access it from home 🙂
off the top of my head, the self-hosted version is missing the autoscalers (there is an AWS CLI, but no UI or others), and is also missing the HPO UI feature,
but you should just check the detailed table here: None
Yes, actually ensuring pip is there cannot be skipped (I think in the past it caused too many issues, hence the version limit etc.)
Are you saying it takes a lot of time when running? How long is the actual process that the Task is running (just to normalize times here)?
2 and 3 - I want to manage access control over the RestAPI
Long story short, put a load-balancer in front of the entire thing (see the k8s setup), and have the load-balancer verify JWT token as authentication (this is usually the easiest)
1- Exactly, custom code
Yes, we need to add a custom example there (somehow forgotten)
Could you open an Issue for that?
in the meantime:
```
# Preprocess class Must be named "Preprocess"
# No need to inherit or to implement all methods
class P...
```
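For context, a minimal sketch of what such a class usually looks like in clearml-serving (the method bodies here are assumptions, check the clearml-serving examples for the exact interface):
```
class Preprocess(object):  # must be named "Preprocess"
    def __init__(self):
        # called once when the endpoint is loaded, no arguments
        pass

    def preprocess(self, body, state, collect_custom_statistics_fn=None):
        # turn the raw request body into the model input
        return body["data"]  # assumption: the request carries a "data" field

    def postprocess(self, data, state, collect_custom_statistics_fn=None):
        # turn the model output into the response payload
        return {"result": data}
```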
SmarmySeaurchin8
```
updated_tags = task.tags
updated_tags.remove(tag)
task.tags = updated_tags
```
Hi PanickyMoth78
can receive access to a GCP project and use GKE to spin clusters up and workers or would that be on the customer to manage.
It does, and also supports AWS.
That said, only the AWS one is part of the open-source, but both are part of the paid tier (I think Azure is in testing)
Hi HelpfulHare30
I mean situations when training is long and its parts can be parallelized in some way like in Spark or Dask
Yes, that makes sense. In both cases the function we are parallelizing is usually bottlenecked on both data & CPU, and both frameworks try to split & stream the data.
ClearML does not do data split & stream, but what you can do is launch multiple Tasks from a single "controller" and collect the results. I think that one of the main differences is that a ClearML Task is ...
Can I get gpu usage over a time frame via API also?
task.get_reported_scalars()
But this will get you all the scalars. I think the next version of the server supports asking for a specific one as well.
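A minimal sketch (the task ID is a placeholder, and the exact machine-stats title/series names are an assumption, so check what your server actually reports):
```
from clearml import Task

task = Task.get_task(task_id="<task-id>")
scalars = task.get_reported_scalars()
# scalars is a dict: {title: {series: {"x": [...], "y": [...]}}}
# GPU monitoring is usually reported under a ":monitor:gpu" title
gpu = scalars.get(":monitor:gpu", {})
print(list(gpu.keys()))  # e.g. "gpu_0_mem_usage", "gpu_0_mem_used_gb"
```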
How are you implementing the alert monitoring?
Is it a stateless process starting every X min, or is it a stateful process running and monitoring ?
Check on which queue the HPO puts the Tasks, and if the agent is listening to these queues
TroubledHedgehog16 if you have a preinstalled conda env then why would you need to reinstall it from the yml file? Also if this is the default python env, clearml-agent will inherit from it and use it (no real overhead there)
Notice the reason for "inheriting system" python environments is so that the agent could cache the individual Task requirements, meaning next time it will not need to reinstall anything
wdyt?
DeliciousBluewhale87 fyi, the new version of the pipeline (hopefully pushed towards the end of this week) will allow you to more easily write steps as functions (not only as Tasks, as in the current implementation)
Also check the new Trigger and Scheduler both intended to trigger these pipelines:
https://github.com/allegroai/clearml/blob/fe3c481c37e70881c44d67c1cf9bbce00a84747e/clearml/automation/scheduler.py#L457
https://github.com/allegroai/clearml/blob/fe3c481c37e70881c44d67c1cf9bbce00a8...
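For example, a hedged sketch of the scheduler (the task ID, queue, and schedule are assumptions):
```
from clearml.automation import TaskScheduler

scheduler = TaskScheduler()
# clone & enqueue an existing (pipeline) Task every day at 06:00
scheduler.add_task(
    schedule_task_id="<pipeline-task-id>",
    queue="services",
    hour=6,
    minute=0,
)
scheduler.start_remotely(queue="services")
```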
WackyRabbit7 the auto detection will only list packages you directly import (so that we do not end up with bloated venvs)
It seems that the transformers library does not have it as a requirement, otherwise it would have pulled it...
In your code you can always do either:
import torch
or
Task.add_requirements('torch')
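One detail: as far as I recall, add_requirements has to be called before Task.init, e.g. (project/task names here are placeholders):
```
from clearml import Task

# call add_requirements before Task.init so it is picked up
Task.add_requirements("torch")  # optionally pin a version: Task.add_requirements("torch", "2.1.0")
task = Task.init(project_name="examples", task_name="requirements demo")
```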
A few implementation / design details:
When you run code with Trains (and call init) it will record your environment (python packages, git code, uncommitted changes etc). Everything is stored on the Task object in the trains-server. When you clone a task you literally create a copy of the Task object (i.e. a second experiment). On the cloned experiment, you can edit everything (parameters, git, base docker image etc). When you enqueue a Task you add its ID to the execution queue list, a trains-a...
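In code, the clone/edit/enqueue flow looks roughly like this (project/task names, the parameter, and the queue are placeholders; this uses the current clearml package names):
```
from clearml import Task

template = Task.get_task(project_name="examples", task_name="my experiment")
cloned = Task.clone(source_task=template, name="cloned experiment")
cloned.set_parameter("General/learning_rate", 0.001)  # edit anything before enqueueing
Task.enqueue(task=cloned, queue_name="default")       # an agent will pull it from the queue
```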
NaughtyFish36
No module named 'leap.learn.data_tools.merge_data.merge_data'
This seems to be the error but I cannot see leap in the installed packages
Notice that if the Task has an "Installed Packages" section then the agent will use that, Not the "requirements.txt". Only if this section is empty will it revert to the "requirements.txt" in the repo.
How did you create the Task in the first place?
I see that you added "leap" into the initial bashscript, actually you should add i...
Okay, this is more complicated but possible.
The idea is to write a glue layer (service) that pulls from the (i.e. UI) queue,
sets up the slurm job,
and puts it in a pending queue (so you know the job is waiting in the slurm scheduler).
There is a template here:
https://github.com/allegroai/trains-agent/blob/master/trains_agent/glue/k8s.py
I would love to help and set up a slurm glue in a similar manner
what do you think?
That is correct.
Obviously once it is in the system, you can just clone/edit/enqueue it.
Running it once is a means to populate the trains-server.
Make sense ?
What if I register the artifact manually?
task.upload_artifact('local folder', artifact_object='...')
This one should be quite quick, it's updating the experiment
while I'm looking to upload local weights
Oh, so this is not "importing an uploaded (existing) model" but manually creating a Model.
The easiest way to do that is actually to create a Task for the Model uploading, because the model itself will be uploaded to a unique destination path, and this is built on top of the Task.
Does that make sense ?
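A minimal sketch of that approach (project/task names, framework, and the weights filename are assumptions):
```
from clearml import Task, OutputModel

# a dedicated Task whose only job is registering the local weights
task = Task.init(project_name="models", task_name="register local weights")
output_model = OutputModel(task=task, framework="PyTorch")
output_model.update_weights(weights_filename="model.pt")  # uploads to the Task's output destination
```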
Hi DrabCockroach54
Do we know if gpu_0_mem_usage and gpu_0_mem_used_gb, both shows current GPU usage?
the first is the percentage used (memory % used at any specific moment) and the second is the memory used in GiB, both for the video memory
How to know from this how much GPU is reserved for the task if this task is in progress?
What do you mean by how much is reserved ? Are you running with an agent?
Which means you currently save the argument after resolving and I'm looking to save them explicitly so the user will not forget to change some dependencies.
That is correct
I'm looking to save them explicitly so the user will not forget to change some dependencies.
Hmm interesting point. What's the use case for storing the values before the resolving ?
Do we want to store both ?
The main reason for storing the post-resolve values is that you have full visibility into the actual...
BTW: CloudyHamster42 I think this issue was discussed on GitHub, and the final "verdict" was we should have an option to split/combine graphs on the UI side (i.e. similar to the "smoothing" or wall-time axis etc.)