AgitatedDove14

49 Questions, 8126 Answers

Active since 10 January 2023

Last activity one year ago

Reputation

Badges 1

25 × Eureka!

Answers 8126

0 How Come

WackyRabbit7 interesting! Are those "local" pipelines all part of the same code repository? do they need their own environment ?
What would be the easiest pipeline interface to run them locally? (I wonder if we could support this workflow, it seems you are not alone in this approach, and of course you can always use them remotely, i.e. clone the pipeline and launch it on an agent)

4 years ago

0 How Come

I started running it again and it seems to have passed the phase where it failed last time

Yey!

Yes it is a common case....

I have the feeling ShinyLobster84 WackyRabbit7 you are not alone in this one 🙂 let me make sure we change the default value of Yes it is a common case to False, so the code looks cleaner

4 years ago

0 Hi, I Was Some How Able To Get A Project Running Yesturday, However Now I Am Unable To Get It Running, I Keep Getting An Failed Getting Token Error

i keep getting an failed getting token error

MiniatureCrocodile39 what's the server you are using ?

4 years ago

0 Hi All, I Am Trying To Spin Up Some Aws Autoscaler Instances, But I Seem To Have Some Issues With The Instance Creation:

Sure go to the "All Projects" and filter by Task Type, application / service

2 years ago

0 Hi Everyone. I Have An Issue With The Simple Pipeline - It Runs Two Similar Nn Training Steps (Tf2.3, Windows10, Python 3.7) With Only Difference Is A Batch Size. I'M Running First Separately Each Step To Have Them In Clearml Project Page. Then I Run Pipe

Hi BattyLion34
I might have a solution, in order to make sure the two agents are not sharing the "temp" folder:
create two copies of ~/clearml.conf , let's call them:
~/clearml_service.conf
~/clearml_agent.conf
Then in each one select a different venvs_dir, see here:
https://github.com/allegroai/clearml-agent/blob/822984301889327ae1a703ffdc56470ad006a951/docs/clearml.conf#L90
for example:
~/.clearml/venvs-builds1
~/.clearml/venvs-builds2
Now start the two agents with:
The service age...

4 years ago

0 Hi, I Have Been Getting The Following For A While. Is There A More Detailed Log I Can Look Into? This Happens On Both Https And Http.

Hmm that makes sense, I "think" the enterprise offering has a solution for that as well (i.e. full separation over static cluster), but probably the best way to pursue this avenue is to talk to Sales (I'm assuming they'll set up a call to discuss the details)

Going back to the open source, I think that adding the credentials as part of the source code might allow to have "credentials" auto populate as part of the remote execution, wdyt?

4 years ago

0 Is There A Way To Set Precedence On Package Managers? If We Set An Agent To Use

first try the current setup using pip, and if it fails, use poetry if poetry.lock exists

I guess the order here is not clear to me (the agent does the opposite), why would you start with pip if you are using poetry ?

3 years ago

0 Hi There

set a parameter in that task and enqueue it

how do you do that?

5 years ago

0 I Have A Question Regarding Reducing Execution Time Of Pulling Results From The Server With The Python Api. As Part Of Some Pipeline, After Running Hpo I Am Pulling All The Results From My Optimizer Task And Also Pulling All The Scalars Associated With Th

I pull all the parameters, and then manually filter on the HP keys (manually=I have to plug them in, they are not part of optimizer object)

So is this an improvement to optimizer._get_child_tasks_ids(...) interface ?
e.g. return a structure like:
[ { 'id': task_id, 'hp1': value, 'hp2': value, 'hp3': value, 'objective': dict(title='title', series='series', value=42) }, ]
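A minimal sketch of how such a structure could be assembled, written in plain Python so it runs without a server. The function name `summarize_trials` and the shape of the input records (`id`, `params`, `objective`) are assumptions made for illustration; they are not part of the optimizer interface.

```python
from typing import Any, Dict, Iterable, List


def summarize_trials(
    trials: Iterable[Dict[str, Any]],
    hp_keys: Iterable[str],
    title: str,
    series: str,
) -> List[Dict[str, Any]]:
    """Build one flat record per trial, keeping only the requested HP keys.

    `trials` is a hypothetical input: dicts holding a task id, the full
    parameter dict of that task, and the objective value it reached.
    """
    hp_keys = list(hp_keys)
    rows = []
    for trial in trials:
        row = {"id": trial["id"]}
        # Keep only the hyperparameters that were optimized.
        for key in hp_keys:
            row[key] = trial["params"].get(key)
        row["objective"] = dict(title=title, series=series, value=trial["objective"])
        rows.append(row)
    return rows


# Hypothetical data standing in for what was pulled from the child tasks.
trials = [
    {"id": "aaa", "params": {"hp1": 0.1, "hp2": 16, "seed": 1}, "objective": 42},
    {"id": "bbb", "params": {"hp1": 0.2, "hp2": 32, "seed": 1}, "objective": 40},
]
summary = summarize_trials(trials, ["hp1", "hp2"], title="title", series="series")
```

Returning one flat record per task would remove the manual filtering step, since the HP keys are chosen once and applied to every trial.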

4 years ago

0 Hi Guys, I Have Many Questions To Ask, Sorry If This Questions Were Posted Already - If The Answer Exist, Please, Point Me To It. Thank You For Your Help. I'M Training Object Detection Model Using Tf 2.3 Object Detection Api And Use Clearml On Local Serve

This looks strange that only a single scalar is reported.

4 years ago

0 Hi Guys! How Do You Handle Tasks With A Complex Parametrization? For Example, A Script That Trains A Machine Learning Model, Where You Want To Parametrize Model Name, Hyperpars, Preprocessing Steps, Etc. So A Nested Configuration With Many Parameters Do I

Hi @<1691620877822595072:profile|FlutteringMouse14>

Do I have to use Hydra

You can, and then the entire configuration is fully captured by ClearML (automatically) while you can still override values with the manual "key.sub=value" both in the UI and in the CLI

Otherwise you can connect nested dict with task.connect (these will be flattened with / for sub keys).
Or you can connect configuration files ( task.connect_configuration ) and edit them as is in the UI (with override of...
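To make the "flattened with / for sub keys" naming concrete, here is a small plain-Python sketch of that kind of flattening. It only illustrates the resulting key names; the helper `flatten` is an assumption written for this example and is not ClearML's own implementation.

```python
from typing import Any, Dict


def flatten(nested: Dict[str, Any], sep: str = "/", prefix: str = "") -> Dict[str, Any]:
    """Flatten a nested dict into single-level keys joined by `sep`."""
    flat: Dict[str, Any] = {}
    for key, value in nested.items():
        full_key = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dicts, carrying the parent key as prefix.
            flat.update(flatten(value, sep=sep, prefix=full_key))
        else:
            flat[full_key] = value
    return flat


# Hypothetical nested configuration for a training script.
config = {
    "model": {"name": "resnet", "hyperparams": {"lr": 0.001, "batch_size": 32}},
    "preprocessing": {"normalize": True},
}
flat_config = flatten(config)
```

With this naming, a nested value such as the learning rate ends up under the key `model/hyperparams/lr`.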

one year ago

0 Any Ideas Of Using Label Studio With Clearml Datasets - Base Dataset, Load To Label Studio, Annotate, Child Annotated Dataset Is The Kind Of Flow

I assume so 🙂 Datasets are kind of agnostic to the data itself, for the Dataset it's basically a file hierarchy

4 years ago

0 What Is The Suggested Way Of Running Trains-Agent With Slurm? I Was Able To Do A Very Naive Setup: Trains-Agent Runs A Slurm Job. It Has The Disadvantage That This Slurm Job Is Blocking A Gpu Even If The Worker Is Not Running Any Task. Is There An Easy Wa

HealthyStarfish45 We are now working on improving the k8s glue (due to be finished next week), after that we can take a stab at slurm, it should be quite straightforward. Will you be able to help with a bit of testing (setting up a slurm cluster is always a bit of a hassle 🙂 )?

5 years ago

0 Hi, We Have A Bit Old Open Source Clearml Instance. I Want To Create A New Instance On A New Infrastructure. Is There An Easy Way To Migrate Data Between Clearml Instances?

Hi @<1544128915683938304:profile|DepravedBee6>
You mean like backup the entire instance and restore it on another machine? Or are you referring to specific data you want to migrate?

BTW if you are upgrading old versions of the server I would recommend upgrading to every version in the middle (there are some migration scripts that need to be run in a few of them)

2 years ago

0 So, Here'S A Question. Does Clearml Automatically Save Everything Necessary To Continue Training A Pytorch Language Model? Specifically, I'Ve Been Looking At The Checkpoint Folders Created When I'M Training A Huggingface Robertaformaskedlm. I Checked What

If you cannot change the "TrainerState" (i.e. inherit and pass it into the code)
you could also monkey-patch it, something like
` class OurTrainerState(TrainerState):
    def __init__(...):
        ...

    @classmethod
    def load_from_json(cls, json_path: str):
        # load the state as usual, then register it as a Task artifact
        state = super().load_from_json(json_path)
        Task.current_task().upload_artifact(...)
        return state

trainer.state = OurTrainerState(trainer.state) `

4 years ago

0 Hi, When Using

somehow set docker_args and docker_bash_setup_script equivalent??

task.set_base_docker(...)

# somehow setup repo and branch to download to remote instance before running

This is automatically detected based on your local commit/branch as well as uncommitted changes

3 years ago

0 Currently, To Provide Ssh Access To The Docker Images For A Task,

The .ssh is mounted, but the owner is my local user,

sudo -H clearml-agent ...
to allow sudo to access home

4 years ago

0 Hello! I'M Using The Self-Hosted Version Of Clearml. I'M Doing Some Testing And It Seems That The Clearml Isn'T Auto-Logging My Matplotlib Plots. The Versions I'M Using Are Matplotlib==3.6.2 And Clearml==1.6.4. Am I Missing Something?

Hi FrothyShark37
Can you verify with the latest version?
pip install -U clearml

3 years ago

0 Also, Not Sure Where To Ask This Question. I Am Following The Instructions From Here:

Hi @<1603198134261911552:profile|ColossalReindeer77>
I would also check this one: None

2 years ago

0 Hello Guys! I Have A Little Question: My Metrics Quota Has Been Reached, And I Cannot Use The Platform Anymore.. I Have Already Removed Almost All Projects Some Days Ago But It Still Says The Same. Is There Anything I Can Do To Get Some Quota Again?

Hi NastySeahorse61
Did you archive the experiments and then delete them from the archive?
BTW: I think this question belongs to

3 years ago

0 ..

But I have no idea what will be input of step2.

What do you mean by that? the assumption is that somehow the output of step 1 will be passed (a string reference) to step 2, what am I missing ?

3 years ago

0 Hey Clearml Team, We Created An Account, Setup Our Data Pipeline, And Now We Can'T Get Back In. Nothing Is In The Project. Can Someone From Support Reach Out To Help?

For visibility, after close inspection of API calls it turns out there was no work against the saas server, hence no data

2 years ago

0 Follow Up On Execute_Remotely, I See One Can Limit The Available Gpu Resources In A Worker Daemon; Could One Also Limit The Number Of Cpu Cores Available?

You mean for running a worker? (I think plain vanilla python / ubuntu works)
The only change would be pip install clearml / clearml-agent ...

3 years ago

0 Hi Everyone! So, I'M Having A Problem With The Auto Detect Dependencies When Running A Task Remotly. The Problem Is That When I Import Some Function From A File In Another Folder, That Task Doesn'T Catch The Files Depencies. Given A Folder Structure:

GrotesqueOctopus42

The problem is that when I import some function from a file in another folder, that task doesn't catch the files depencies.

Just to be clear, if this is another file, you have to have all the files in the same git repo for the agent to actually be able to fetch them on the remote machine.
If you have a mix of notebooks and code, you have to have the local code in a git repo,
Make sense ?

2 years ago

0 Hey There, Is There Any Way I Can Tell The Task Not To Set A Random Seed? I'M Setting Up Reproducibility Myself But When I Call Task.Init() The Seed Is Changed. Is It Possible To Tell Clearml Not To Initialize Any Rng? It Appears That Task.Set_Random_Seed

Hi TartBear70

I'm setting up reproducibility myself but when I call Task.init() the seed is changed

Correct

. Is it possible to tell clearml not to initialize any rng? It appears that task.set_random_seed() doesn't change anything.

I think this is now fixed (meaning should be part of the post weekend release)

. Is this documented?

Hmm I'm not sure (actually we should write it, maybe in Task.init docstring?)
Specifically the function that is being called is:
https://gi...

3 years ago

0 Hi! Is There Any Reason Why Integer/Float Values Are Casted To String When Connecting Arguments Dictionary To Task And Then Retrieve Them Using

GiganticTurtle0
I think that what you are looking for is:
param_dict = {'key': 1234}
task.connect(param_dict, name='general')
Notice that when this code runs manually (i.e. not by the agent), the dict is stored in the "general" parameter section of the Task.
But when the code is executed by the Agent, the opposite happens and the parameters from the "general" section of the Task are put back into the param_dict; here the casting is done based on the type of the original values.
Generall...
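The "casting based on the type of the original values" idea can be sketched in plain Python as follows. This is only an illustration of the principle; the helper `cast_like` and the assumption that stored values arrive as strings are made up for this example and are not the actual ClearML code path.

```python
from typing import Any, Dict


def cast_like(original: Dict[str, Any], stored: Dict[str, str]) -> Dict[str, Any]:
    """Return a copy of `original` updated from `stored` string values,
    casting each value back to the type it had in `original`."""
    restored = dict(original)
    for key, text in stored.items():
        if key not in original:
            # No original value to infer a type from: keep the string.
            restored[key] = text
            continue
        kind = type(original[key])
        if kind is bool:
            # bool("False") would be True, so parse booleans explicitly.
            restored[key] = text.strip().lower() in ("true", "1", "yes")
        elif kind in (int, float, str):
            restored[key] = kind(text)
        else:
            restored[key] = text
    return restored


# Defaults as written in the code, and hypothetical overrides stored as strings.
param_dict = {"key": 1234, "lr": 0.1, "name": "base", "debug": False}
stored = {"key": "5678", "lr": "0.01", "name": "override", "debug": "true", "extra": "7"}
restored = cast_like(param_dict, stored)
```

A key with no original value (`extra` above) has no type to cast to, so in this sketch it stays a string.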

4 years ago

0 Does Clearml-Session Work In A Kubernetes Environment?

👍

4 years ago

0 Hi Everyone, I'M Trying To Execute Trains-Agent In Docker Mode With Conda As Package Manager, Is It Supported? I Tried To Work With Nvidia/Cuda:10.0-Runtime-Ubuntu18.04 And Got The Error "Trains_Agent: Error: Error: Package Manager "Conda" Selected, But '

Do you have python 3.7 in the docker ?

5 years ago

0 Hello, I Tried The Clearml-Session Cli To Start A Jupyter Instance On An Agent, But An Error With The Password, Here Is The Full Cli Log:

Alright I have a followup question then: I used the param --user-folder "~/projects/my-project", but any change I do is not reflected in this folder. I guess I am in the docker space, but this folder is not linked to the folder on the machine. Is it possible to do so?

Yes you must make sure the docker can mount a persistent folder for you to work on.
Let me check what's the easiest way to do that

4 years ago

0 Hey There, Since A Bit I Often Find Experiments Being Stuck While Training A Model. It Seems To Happen Randomly And I Could Not Find A Reproducible Scenario So Far, But It Happens Often Enough To Be Annoying (I'D Say 1 Out Of 5 Experiments). The Symptoms

Most likely yes, but I don't see how clearml would have an impact here, I am more inclined to think it would be a pytorch dataloader issue, although I don't see why

These are most certainly dataloader processes. But clearml-agent, when killing the process, should also kill all subprocesses, and it might be that something is going on that prevents it from killing the subprocesses ...

Is this easily reproducible ? Can you verify it is still the case with the latest RC of clearml-agent ?

2 years ago
