AgitatedDove14

49 Questions, 8126 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

25 × Eureka!

Answers 8126

0 I’M Having Some Trouble With

👍

2 years ago

0 Hey There! I'M Encountering An Odd Issue - I'M Running My Agents As Python Processes On A Windows Pc Endpoints. I Recently Had A Bug That Forced Me To Delete All Cache And All (Non-Core) Venv-Builds. My Firstly Booted Agent Uses The ''First'' Venv-Build

@<1710827340621156352:profile|HungryFrog27> the venv-build folder is supposed to be deleted after each task is done. How did you end up with leftovers? Could it be Windows was failing to delete it for some reason? That actually connects with your initial issue, no?

one year ago

0 Hello, I'M Trying To Save A Keras Model As A Task Artifact, And Then Upload It From Another Task. Does Anyone Know The Syntax For That? What I'Ve Seen Is Not Quite Working.

Let me check

4 years ago

0 Hi, When Using The Logger.Report_Table() Method (

Hi GreasyPenguin14
Yes, I think you are right, the series name should be next to the title. Let me check it...

4 years ago

0 Hello All, I'M Trying To Queue A Task In Python But I'D Like To Reuse The Prior Task Id. In The Webapp You Can

I'm trying to queue a task in python but I'd like to reuse the prior task ID.

Is it your own Task, i.e. do you want it to enqueue itself? If this is the case, use task.execute_remotely; it will do just that.
If this is another Task and it is aborted, you can just enqueue it; by definition it will continue with the same Task ID.
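A minimal sketch of both cases, assuming the clearml SDK is installed and configured. The helper names and the "default" queue are only for illustration, and the import is kept inside the function so the snippet can be loaded without a server connection:

```python
def enqueue_current_task(task, queue_name="default"):
    """Stop the local run and enqueue this very Task (same ID) for an agent."""
    # clone=False keeps the current Task ID instead of creating a copy
    task.execute_remotely(queue_name=queue_name, clone=False, exit_process=True)


def requeue_existing_task(task_id, queue_name="default"):
    """Enqueue another (e.g. aborted) Task again under its existing ID."""
    from clearml import Task  # imported lazily; needs a configured clearml.conf

    task = Task.get_task(task_id=task_id)
    Task.enqueue(task, queue_name=queue_name)
    return task.id
```

For your own script you would call enqueue_current_task(Task.init(...)) right after Task.init; for a Task you only know by ID, requeue_existing_task("<task id>").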

one year ago

0 So I Bumped Onto This Comparison Shared By Dagshub. It Kinda Placed Clearml Is A Rather Bad Position Compared To Everything Else In The Industry.

SubstantialElk6 feel free to tweet them on their very inaccurate comparison table 🙂

4 years ago

0 I Seem To Be Missing Something ... I'Ve Only Got One Task Running To Train A Segmentation Model On My Local Machine, And In A Few Days It'S Hit Over 1.15M Api Calls. It Looks Like It'S Sending Every Single Console Output ... Are There Settings To Control

Correct

2 years ago

0 Hey, Using K8S With Trains 0.16.1-320, All Of A Sudden The Entire Data (I.E Experiments, Tasks, Api Creds) Is Not Showing In The Ui Anymore. All Logs Seems To Be Fine Afai Can Tell... Any Idea What Went Wrong?

backup?

4 years ago

0 Hello Everyone. I Don'T Understand Why Is My Training Slower With Connected Tensorboard Than Without It. I Have Some Thoughts About It But I Am Not Sure. My Internet Traffic Looks Weird. I Think This Is Because Tensorboard Logs Too Much Data On Each Batch And

https://stackoverflow.com/questions/47085458/why-is-multiprocessing-queue-get-so-slow

3 years ago

0 Was There Ever A Solution To This Request?

Hi @<1730033904972206080:profile|FantasticSeaurchin8>
You mean in the UI , or when reporting on the SDK?

one year ago

0 Question: Has Anyone Done Anything With Ray Or Rllib, And Clearml? Would Clearml Be Able To Integrate With Those Out Of The Box?

save off the "best" model instead of the last

Should be relatively easy to update on the main Task the model with the best performance, no?

4 years ago

0 Hi, I Encountered A Few Problems:

Artifacts and models will be uploaded to the output URI, debug images are uploaded to the default file server. It can be changed via the Logger.
Hmm is this like a configuration file?
You can do:
local_text_file = task.connect_configuration('filenotingit.txt')
Then open 'local_text_file'; it will create a local copy of the data at runtime, and the content will be stored on the Task itself.
This is how the agent installs the python packages, but if the docker already contains th...
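Going back to the connect_configuration call above, a rough sketch of reading the file through the Task. It assumes task is the object returned by Task.init; the helper name and the "text file" section name are made up for the example:

```python
def read_connected_text(task, path, name="text file"):
    """Register a local text file on the Task and read it back.

    connect_configuration returns the path of a local copy; when running
    remotely the content comes from what is stored on the Task.
    """
    local_copy = task.connect_configuration(path, name=name)
    with open(local_copy, "r") as f:
        return f.read()
```

Usage would be something like text = read_connected_text(task, 'filenotingit.txt').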

5 years ago

0 Hi, I Am Getting Following Error While Trying To Checkout A Git Hub Repo. Error: Rpc Failed; Curl 56 Gnutls Recv Error (-54): Error In The Pull Function. Fatal: The Remote End Hung Up Unexpectedly Fatal: Early Eof Fatal: Index-Pack Failed Repository Cloni

Simple git clone on that repo works well

On the machine running the trains-agent ?

5 years ago

0 Currently, To Provide Ssh Access To The Docker Images For A Task,

What exactly do you mean by docker run permissions?

https://docs.docker.com/engine/install/linux-postinstall/

4 years ago

0 Hi, I Encountered An Issue That Might Affect Others As Well: When Using "

IrritableJellyfish76 point taken, suggestions on improving the interface ?

3 years ago

0 Hey, How Can I Add A Private Key In Order To Let The Clearml Agent To Clone From A Private Git Repository?

You need to mount it to ~/clearml.conf (i.e. /root/clearml.conf)

4 years ago

0 Hello There, I Am Trying To Organize The Dl Code Into A Monorepo, The Repo Will Have A Section Of Shared Packages That Will Be Used By Other Packages That Are The Actual Training Projects. Let'S Say That I Install The Shared Libs With Pip In Editable Mod

Hi SkinnyPanda43

Let's say that I install the shared libs with pip in editable mode on my development environment, how will the clearml-agent handle those libraries if I submit a job

So installing packages from local folders with "-e" is in general ill-advised.
But using a full git path should work out of the box. For example, if you run pip install https://github.com/user/repo/repo.git then the agent will be able to install it on the remote machine. The main challenge...

3 years ago

0 Hi All, I Am Starting To Use Clearml-Agent. Run It With

Yes, this is definitely the issue: the agent assumes the docker user is "root".
Let me check something

4 years ago

0 Hi,

Hi FloppyDeer99

What is the meaning of no real scheduling

I think the meaning is that from the moment a k8s job is created, k8s is in charge of actually spinning the container. Since k8s has no real priority/order, the scheduling order is not guaranteed from this point.

The idea of the clearml-k8s-glue is that the glue will launch a job on the k8s cluster only if it is sure there are enough resources to actually spin the job now (as opposed to sometime in the future), this mea...

4 years ago

0 Hello, I Am Trying To Retrieve A Simple Dict Artifact Uploaded In A Previous Task With

JitteryCoyote63 okay... but let me explain a bit so you get a better intuition for next time 🙂
The Task.init call, when running remotely, assumes the Task object already exists in the backend, so it ignores whatever was in the code and uses the data stored on the trains-server, similar to what's happening with Task.connect and the argparser.
This gives you the option of adding/changing the "output_uri" for any Task regardless of the code. In the Execution tab, change the "Output Destina...

5 years ago

0 Very Weird Error, Trying To Run An Experiment Through An Agent In Docker Mode, And I Get This Error

correct

4 years ago

0 Hi, I Am Looking To Upload "Already Trained Models" As Experiments In My Clearml Server. How Should I Go About Doing That? Clearml Picks Up The Tensorboard Automatically While It'S Training And Reports It But How Would I Do This If I Had Everything Alread

Hi SmarmyDolphin68
You have two options:
1. Automatically upload the models when training: pass output_uri to Task.init. For example, output_uri=True will upload to the clearml-server, output_uri='s3://bucket/folder' will upload to S3, etc.
2. Manually upload a model that you have locally: https://github.com/allegroai/clearml/blob/9ff52a8699266fec1cca486b239efa5ff1f681bc/examples/reporting/model_config.py#L37
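Roughly, the two options look like this. This is a sketch assuming a configured clearml SDK; the function names are made up for illustration, and the imports are lazy so nothing connects to the server until you call them:

```python
def init_with_auto_upload(project, name, destination=True):
    """Option 1: let ClearML upload every model the framework saves.

    destination=True targets the clearml-server file server; a string such
    as 's3://bucket/folder' targets that storage instead.
    """
    from clearml import Task  # needs a configured clearml.conf

    return Task.init(project_name=project, task_name=name, output_uri=destination)


def upload_existing_weights(task, weights_path):
    """Option 2: register and upload a weights file you already have locally."""
    from clearml import OutputModel

    model = OutputModel(task=task)
    model.update_weights(weights_filename=weights_path)
    return model
```

For an already-trained model you would create the Task with output_uri set and then call upload_existing_weights(task, "path/to/weights") once, where the path is whatever file you saved.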

4 years ago

0 Hey All, Is There Any Reason The Python Sdk

It only happens in the clearml environment, works fine local.

Hi BoredHedgehog47
what do you mean by "in the clearml environment" ?

3 years ago

0 Hi, I Started A Trains-Agent (0.15) In Services Mode (Full Command:

seems to run properly now

Are you saying the problem disappeared ?

5 years ago

0 Is There A Way To Get A Task'S Docker Container Id/Name? I'M Generally Interested In Resource Profiling Of Each Container, So I Noticed I Can Use

Hi ElegantCoyote26

is there a way to get a Task's docker container id/name?

you mean like Task.get_task("task_id_here").get_base_docker() ?

ow a Task's results page also has a plot for this, but I guess it's at the machine level and not the task level?

This is actually on the container level, meaning checked from inside the container. It should be what you are looking for

3 years ago

0 Hi Community! I Have Difficulty Using Clearml Pipeline. I Am Writing The Code Using The Pipeline Decorator, But The Pipeline Does Not Work With The Following Error When Specifying The Docker Image As A Argument Of The Decorator. How Should I Solve It?

Hmm this is odd, could you provide the pipeline code maybe?

2 years ago

0 Hey, Is There A Shortcut On The Dataset Sdk To Directly Get The Latest Version Of A Dataset ?

currently I'm doing it by fetching the latest dataset, incrementing the version and creating a new dataset version

This seems like a very good approach, how would you improve it?

3 years ago

0 Hey, How Can I Add A Private Key In Order To Let The Clearml Agent To Clone From A Private Git Repository?

If it cannot find the Task ID I'm guessing it is trying to connect to the demo server and not your server (i.e. configuration is missing)

4 years ago

0 Dear Clearml Community, I Am Trying To Optimize Storage On My Clearml File Server When Doing A Lot Of Experiments. To Achieve This, I Already Upload Only The Newest And Best Checkpoints To Clearml File Server Instead Of All Checkpoints. Another Component

Hi @<1663354518726774784:profile|CrookedSeal85>

I am trying to optimize storage on my ClearML file server when doing a lot of experiments.

This is not straightforward, you will need to get a list of all the events via
None
filter on image events
and then delete the URL you are getting via the StorageManager.
But to be honest, why not just direct it to S3 or something like that ?

one year ago

0 <no title>

So “wait” is a better metaphor for me

So I would do something like (I might have a few typos but that's the gist):


from clearml import Model

def post_execute_callback_example(a_pipeline, a_node):
    # type: (PipelineController, PipelineController.Node) -> None
    print('Completed Task id={}'.format(a_node.executed))
    # wait until model is tagged, then pass it as argument
    while True:
        found = Model.query_models(...)  # model filter here, including tag and project
        if found:
         ...
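The "wait until the model is tagged" part of the callback can be sketched as a small polling helper. This is only an illustration: wait_for, its parameters, and the project/tag values mentioned afterwards are assumptions, not part of the clearml API:

```python
import time


def wait_for(query, interval_sec=30.0, timeout_sec=None,
             sleep=time.sleep, clock=time.monotonic):
    """Call query() until it returns something truthy.

    Returns that value, or None if timeout_sec elapses first.
    With timeout_sec=None it polls forever, like the loop above.
    """
    deadline = None if timeout_sec is None else clock() + timeout_sec
    while True:
        found = query()
        if found:
            return found
        if deadline is not None and clock() >= deadline:
            return None
        sleep(interval_sec)  # avoid hammering the server between checks
```

Inside the callback you could then write something like models = wait_for(lambda: Model.query_models(project_name="my_project", tags=["approved"])), with your own project and tag, and pass models[0] on to the next step.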

4 years ago
