web-server seems okay, could you send the logs from the api-server?
Also, if you can, send the console logs from your browser when you get the blank screen. Thanks.
You can run md5 on the file as stored in the remote storage (NFS or S3)
S3 support is implementation specific (i.e. MinIO, Weka, Wasabi etc. might not support it), and I'm actually not sure regarding NFS. I mean, you can run it, but it means you are actually reading the data; that said, NFS by definition should be relatively fast access.
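As a rough sketch of what "running md5 on the file" could look like (assuming the remote file is locally readable, e.g. over an NFS mount, or downloaded first), streaming the file in chunks avoids loading the whole thing into memory:

```python
import hashlib


def file_md5(path, chunk_size=8 * 1024 * 1024):
    """Compute the md5 of a file by streaming it in chunks,
    so even very large files never fully load into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks until read() returns b"" (EOF)
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Note that for S3 the object ETag sometimes equals the md5 (for non-multipart uploads), but as said above, that's implementation specific.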
wdyt?
Please feel free to do so (always better to get it from a user, not the team behind the product 🙂 )
Hi StickyMonkey98
a very large number of running and pending tasks, and doing that kind of thing via the web-interface by clicking away one-by-one is not a viable solution.
Bulk operations are now supported, upgrade the clearml-server to 1.0.2 🙂
Is it possible to fetch a list of tasks via Task.get_tasks,
Sure:
Task.get_tasks(project_name='example', task_filter=dict(system_tags=['-archived']))
Hi Martin, of course not,
Smart!
I was just wondering if it has been patched yet and if not what is the expected timeline for patching it
Yes, I believe the target is a patch version 1.15.1 to be released in a couple of weeks. This is not a major issue, but it's always better to have it fixed. (btw: the enterprise version never had this issue to begin with, because it is of course authenticated, and it has an additional RBAC layer on top.)
What do you see in the console when you start the trains-agent? It should detect the CUDA version.
WobblyCrab70 sure, put a load-balancer in between. AWS has a solution for that; basically use the AMI from the GitHub repo and ask IT to add HTTPS on the 8080/8008/8081 ports.
OHH nice, I thought it was just some kind of job queue on already up-and-running machines
It's much more than that, it's a way of life 🙂
But seriously now, it allows you to use any machine as part of your cluster and send jobs for execution from the web UI: any machine, even just a standalone GPU machine under your desk, or any cloud GPU instance, and even mixing the two together 🙂
Maybe I need to change something here: apiserver.conf
Not sure, I'm still waiting on an answer...
It manages the scheduling process, so there's no need to package your code or worry about building dockers etc. It also has an AWS autoscaler that spins up EC2 instances based on the amount of jobs you have in the execution queue and the limit of your budget (obviously spinning down machines that are idle).
CooperativeFox72 btw, are you guys running those 20 experiments manually or through trains-agent?
CooperativeFox72 yes, 20 experiments in parallel means that you always have at least 20 connections coming from different machines, and then you have the UI adding on top of it. I'm assuming the sluggishness you feel is the requests being delayed.
You can configure the API server to have more process workers, you just need to make sure the machine has enough memory to support it.
Let me check... I think you might need to docker exec
Anyhow, I would start by upgrading the server itself.
Sounds good?
GrievingTurkey78 short answer no 🙂
Long answer, the files are stored as differential sets (think change-sets from the previous version(s)). The collection of files is then compressed and stored as a single zip. The zip itself can be stored on Google, but on their object storage (not GDrive). Notice that the default storage for clearml-data is the clearml-server; that said, you can always mix and match (even between versions).
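The idea of storing a version as the set of changes from its parent, compressed into a single zip, can be sketched in plain Python. This is just an illustration of the concept, not ClearML's actual implementation; the function names here are made up:

```python
import hashlib
import os
import zipfile


def snapshot_hashes(folder):
    """Map each file's relative path to a hash of its content."""
    hashes = {}
    for root, _, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            rel = os.path.relpath(path, folder)
            with open(path, "rb") as f:
                hashes[rel] = hashlib.sha256(f.read()).hexdigest()
    return hashes


def zip_changes(folder, parent_hashes, zip_path):
    """Store only files that are new or changed vs. the parent
    version, compressed into a single zip (the 'change set')."""
    current = snapshot_hashes(folder)
    changed = [rel for rel, h in current.items()
               if parent_hashes.get(rel) != h]
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for rel in changed:
            zf.write(os.path.join(folder, rel), rel)
    return changed
```

Restoring a version then means extracting the parent version(s) first and the change-set zip on top.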
If the first task failed, then the remaining tasks are not scheduled for execution, which is what I expect.
agreed
I'm just surprised that if the first task is aborted by the user instead,
How is that different from failed? The assumption is that if a component depends on another one, it needs its output; if it does not, they can run in parallel. What am I missing?
Hi CooperativeFox72
I think the upload reporting (files over 5mb) was added post 0.17 version, hence the log.
The default upload chunk reporting is 5MB, but it is not configurable; maybe we should add it to the clearml.conf? wdyt?
CooperativeFox72 I would think the easiest would be to configure it globally in the clearml.conf (rather than add more arguments to the already packed Task.init) 🙂
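If it does land in clearml.conf, it could look something like this (note: the key name below is hypothetical, this setting does not exist yet):

```
sdk {
  development {
    # hypothetical setting: report upload progress every N MB
    # (current hard-coded behavior: 5 MB)
    upload_report_chunk_size_mb: 5
  }
}
```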
I'm with you on 60 messages being way too much...
Could you open a GitHub Issue on it, so we do not forget?
The main reason to add the timeout is because the warning was annoying to users 🙂
The secondary reason was that clearml will start reporting based on seconds from start, then when iterations start it will revert back to iterations. But if the iterations are "epochs", the numbers are lower, so you end up with a graph that does not match the expected "iterations" x-axis. Make sense?
This will set more time before the timeout, right?
Correct.
task.freeze_monitor()
download()
task.defrost_monitor()
Currently there isn't, but that's a good idea.
What would be the argument of using it vs increasing the timeout ?
btw: setting the resource timeout to 99999 basically means that it will wait until the first reported iteration, not that it will just sleep for 99999 sec 🙂
Yes, it is reproducible. Do you want a snippet?
Already fixed 🙂 please ping tomorrow, I think an RC with the fix should be out soon
CooperativeFox72 please see if you can send a code snippet to reproduce the issue. I'd be happy to solve it...
Hi CooperativeFox72
But my docker image has all my code and all the packages it needs; I don't understand why the agent needs to install all of those again?
So based on the docker file you previously posted, I think all your python packages are actually installed on the "appuser" and not as system packages.
Basically remove the "add user" part and the --user from the pip install.
For example:
```
FROM nvidia/cuda:10.1-cudnn7-devel
ENV DEBIAN_FRONTEND noninteractive
RUN ...
```
Maybe we should rename it?! It actually creates a Task but will not auto-connect it...
CooperativeFox72
Could you try to run the docker, and then inside the docker try to do:
su root
whoami
Okay we have something π
To your clearml.conf add:
agent.docker_preprocess_bash_script = [
    "su root",
    "cp -f /root/*.conf ~/",
]
Let's see if that works
I am creating this user
Please explain, I think this is the culprit ...
but I think they did it for a reason, no?
Not a very good one, they just installed everything under the user and used --user for the pip.
It really does not matter inside a docker; the only reason one might want to do that is if you are mounting other drives and you want to make sure they are not accessed with the "root" user, but with user id 1000.
Yes, this is definitely the issue: the agent assumes the docker user is "root".
Let me check something