PompousBeetle71 could you check that the "output:destination" is the same for both experiments?
The issue is progress reporting for HTTP uploads (object storage uploads do report progress). Basically the HTTP upload is a POST done with urllib, which does not support upload callbacks for progress reporting. If you have an idea here, we will gladly add it (as you mentioned, it can be quite annoying to have to open a network manager just to verify the upload is progressing).
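As a rough sketch of one possible approach (this is not ClearML code, and all names below are made up): wrap the file object so that every read() issued by urllib while streaming the POST body reports progress.
```python
# Hedged sketch, not the actual ClearML implementation: urllib accepts a
# file-like object as the request body, so a wrapper that counts the bytes
# handed out by read() can drive a progress callback during the upload.
import os
import urllib.request


class ProgressFileReader:
    """File-like wrapper that invokes a callback as chunks are read for upload."""

    def __init__(self, path, callback):
        self._f = open(path, "rb")
        self._total = os.path.getsize(path)
        self._sent = 0
        self._callback = callback

    def read(self, size=-1):
        data = self._f.read(size)
        self._sent += len(data)
        self._callback(self._sent, self._total)
        return data

    def close(self):
        self._f.close()


def upload_with_progress(url, path):
    # Content-Length must be set explicitly when streaming a file-like body
    reader = ProgressFileReader(
        path, lambda sent, total: print(f"uploaded {sent}/{total} bytes")
    )
    req = urllib.request.Request(
        url, data=reader, headers={"Content-Length": str(os.path.getsize(path))}
    )
    try:
        return urllib.request.urlopen(req)
    finally:
        reader.close()
```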
GrievingTurkey78 I see,
Basically the arguments after the -m src.train in the remote execution should be ignored (they are not needed).
Change the m entry in the Args section under the configuration. Let me know if that solves it.
Just making sure I understand: you want to upload your models with clearml to the Yandex-compatible S3 storage?
HurtWoodpecker30
The agent uses the requirements.txt
What do you mean by that? Aren't the packages listed in the "Installed packages" section of the Task?
(Or is it empty when starting, i.e. it uses the requirements.txt from the github repo, and then the agent lists them back into the Task?)
Oh my bad, post 0.17.5 😞
RC will be out soon; in the meantime you can install directly from github:
pip install git+
ProudMosquito87 Just a few pointers on how we convert the TB histograms to awesome (but less accurate) 3D surfaces.
First I have to admit, I almost never use these histograms, maybe to detect a plateau or if something goes really wrong...
The 3D surface is basically grouping all the histograms and then bucketing them (I think the default is 50 buckets) so that you get a general feel for what's going on, not necessarily a detailed view. Bottom line, you are correct, the TB is the source of truth...
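To illustrate the general idea only (this is not the actual ClearML code, and the bucket count is just the assumed default mentioned above), re-bucketing a series of reported histograms into a fixed grid is roughly:
```python
# Rough sketch of grouping + bucketing histograms into a surface
# (iterations x buckets), purely to illustrate the idea described above.
import numpy as np

NUM_BUCKETS = 50  # assumed default


def histograms_to_surface(histograms):
    """histograms: list of (bin_edges, counts) tuples, one per reported iteration."""
    lo = min(edges[0] for edges, _ in histograms)
    hi = max(edges[-1] for edges, _ in histograms)
    buckets = np.linspace(lo, hi, NUM_BUCKETS + 1)
    rows = []
    for edges, counts in histograms:
        centers = (np.asarray(edges[:-1]) + np.asarray(edges[1:])) / 2.0
        # accumulate each original bin's count into the coarse bucket it falls in
        row, _ = np.histogram(centers, bins=buckets, weights=counts)
        rows.append(row)
    return np.vstack(rows)  # render this matrix as the 3D surface
```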
Sure thing, and I agree it seems unlikely to be an issue 🙂
Ohh sorry, you will also need to fix the _patched_task_function definition.
The parameter order is important as the partial call relies on it.
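As a generic illustration of why the order matters (the function and values here are made up, not the actual ClearML internals):
```python
# functools.partial binds values positionally, so reordering the wrapped
# function's parameters silently shifts which parameter receives which value.
from functools import partial


def patched_task_function(task_id, queue_name, *args, **kwargs):
    print(f"task={task_id} queue={queue_name} args={args} kwargs={kwargs}")


# "abc123" -> task_id and "services" -> queue_name purely by position
bound = partial(patched_task_function, "abc123", "services")
bound(1, 2, flag=True)

# If the signature were (queue_name, task_id, ...), the exact same partial()
# call would swap the two values without raising any error.
```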
Hi HealthyStarfish45
Funny just today I had a similar discussion on slurm:
https://allegroai-trains.slack.com/archives/CTK20V944/p1603794531453000
Anyhow, when you say "[scale up agents]" are you referring to a machine constantly running an agent pulling jobs from the queue, where the machine itself (aka the resource) is managed as a slurm job?
Hi JumpyDragonfly13
Let's assume we have two machines, one we call remote, one we call laptop (at least for this discussion)
On the Remote machine we need to run (notice we must have docker preinstalled on the remote machine; it can work without docker, let me know if that is the case for you):
clearml-agent daemon --queue interactive --create-queue --docker
On the Laptop we run:
clearml-session --docker nvidia/cuda:10.1-cudnn7-runtime-ubuntu18.04
What clearml-session will do is crea...
Curious what advantage it would be to use the StorageManager
Basically if you set the clearml cache folder to the EFS, users can always do:
from clearml import StorageManager
local_file = StorageManager.get_local_copy("...")
where local_file is stored on the persistent cache (EFS) and the cache is automatically cleaned based on the last accessed file
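For reference, a hedged example of pointing the cache at an EFS mount via clearml.conf (the mount path is an assumption, and please double check the key name against your own clearml.conf):
```
sdk {
    storage {
        cache {
            # assumed EFS mount point; keep the default if unsure
            default_base_dir: "/mnt/efs/clearml-cache"
        }
    }
}
```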
GrievingTurkey78 notice that when enqueuing an aborted Task, the agent will not delete the previously reported metrics/logs
in which I can just spawn an ad-hoc worker
Can you elaborate on what you would do with it? Like an OS environment variable that disables the entire setup itself? Will it clone the code base?
GrievingTurkey78
maybe since the package is not directly imported in my code it is possible to get a different version to what I have locally (?).
If these are derivative packages (i.e. imported by other packages) they are not automatically logged when executing the Task manually (in order to keep the "installed packages" as lean as possible on the one hand, but also specify the important packages for you)
That said, when the "trains-agent" executes the task it will store back...
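If it helps, and assuming I'm remembering the API correctly, you can also pin a derivative package explicitly so it is always recorded (the package name and version below are placeholders):
```python
from clearml import Task

# must be called before Task.init(); the name/version here are placeholders
Task.add_requirements("some_indirect_package", "1.2.3")
task = Task.init(project_name="examples", task_name="pin indirect requirement")
```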
I could take a look and figure that out.
This will greatly accelerate integration 😉
but when we try to do a "New Run" from UI, it tries to follow the DAG of previous run (the run with all child nodes skipped) and the new run fails too.
This is odd, is this reproducible? What's the clearml python package version?
Oh I get it now, can you test:
git ls-remote --get-url github
and then:
git ls-remote --get-url
AttractiveCockroach17 can you provide some insight on the pipeline creation?
Is this repo installed on the machine creating the pipeline?
You can also manually add it here: `packages=["link_to_internal_python_package", ]`
GrievingTurkey78 can you send the entire log?
I will take any suggestion 🙂 `git remote -v` could be a good start, but I'm not familiar with the output structure; is there a template for parsing?
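For what it's worth, a quick parsing sketch, assuming the usual `<name>\t<url> (fetch|push)` line format of `git remote -v`:
```python
import subprocess


def get_remotes():
    # each line looks like: "origin\tgit@github.com:user/repo.git (fetch)"
    out = subprocess.check_output(["git", "remote", "-v"], text=True)
    remotes = {}
    for line in out.splitlines():
        name, rest = line.split("\t", 1)
        url, direction = rest.rsplit(" ", 1)
        remotes.setdefault(name, {})[direction.strip("()")] = url
    return remotes


print(get_remotes())
# e.g. {'origin': {'fetch': 'git@github.com:user/repo.git',
#                  'push': 'git@github.com:user/repo.git'}}
```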
(BTW: you can disable the auto-logging feature of joblib with `Task.init(..., auto_connect_frameworks={'scikit': False})`)
based on this one:
https://stackoverflow.com/questions/31436407/git-ls-remote-returns-fatal-no-remote-configured-to-list-refs-from
I think this is a specific issue of the local git repo configuration, can you verify?
(btw: I tested with git 2.17.1, and `git ls-remote --get-url` will return the remote url without an error)
You can try calling `task._update_repository()`
I'm still trying to figure out how to reproduce it...