You mean one machine with multiple clearml-agents ?
(worker is a unique ID of an agent, so you cannot have two agents with the exact same worker name)
Or do you mean two agents pulling from the same queue ? (that is supported)
SmallDeer34 the function Task.get_models() incorrectly returned the input model "name" instead of the object itself. I'll make sure we push a fix.
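For reference, once the fix lands this is roughly what you should get back, a minimal sketch (the task ID is a placeholder):
```python
from clearml import Task

# minimal sketch: fetch the input/output model objects of an existing Task
task = Task.get_task(task_id="<your_task_id>")   # placeholder ID
models = task.get_models()                       # dict-like: {'input': [...], 'output': [...]}
for model in models["output"]:
    print(model.name, model.url)                 # Model objects, not just their names
```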
I found a different solution (hardcoding the parent tasks by hand),
I have to wonder, how does that solve the issue ?
OddAlligator72 FYI you can also import / export an entire Task (basically allowing you to create it from scratch/json, even without calling Task.create):
Task.import_task(...) / Task.export_task(...)
Should not be complicated, it's basically here
https://github.com/allegroai/clearml/blob/1eee271f01a141e41542296ef4649eeead2e7284/clearml/task.py#L2763
wdyt?
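If it helps, a rough sketch of the export/import round trip (the task ID is a placeholder):
```python
from clearml import Task

# minimal sketch: serialize an existing Task and re-create it from the exported data
source = Task.get_task(task_id="<source_task_id>")   # placeholder ID
task_data = source.export_task()                      # plain dict, JSON-serializable
new_task = Task.import_task(task_data)                # creates a new Task from that dict
print(new_task.id)
```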
this sounds like docker build issue on macos M1
https://pythonspeed.com/articles/docker-build-problems-mac/
For example, opening a project or experiment page might take half a minute.
This implies a MongoDB performance issue.
What's the size of the mongo DB?
Oh if this is the case you can probably do
```python
import os
import subprocess
from time import sleep

from clearml import Task
from clearml.backend_api.session.client import APIClient

client = APIClient()
queue_ids = client.queues.get_all(name="queue_name_here")

while True:
    result = client.queues.get_next_task(queue=queue_ids[0].id)
    if not result or not result.entry:
        sleep(5)
        continue
    task_id = result.entry.task
    client.tasks.started(task=task_id)
    env = dict(**os.environ)
    env['CLEARML_TASK_ID'] = task_id
    # ... (snippet truncated here in the original; presumably it launches the
    # task's script with subprocess using this env)
```
Hi @<1688721797135994880:profile|ThoughtfulPeacock83>
the configuration vault parameters of a pipeline step with the add_function_step method?
The configuration vault is set per user/project/company and is applied at execution time.
What would be the value you need to override ? and what is the use case?
Hi GrittyKangaroo27
Maybe check the TriggerScheduler , and have a function trigger something on k8s every time you "publish" a model?
https://github.com/allegroai/clearml/blob/master/examples/scheduler/trigger_example.py
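Something along these lines, a minimal sketch based on the trigger example (the project name and the callback body are placeholders):
```python
from clearml.automation import TriggerScheduler

def on_model_published(model_id):
    # placeholder callback: e.g. kick off your k8s deployment job here
    print("model published:", model_id)

trigger = TriggerScheduler(pooling_frequency_minutes=3)
trigger.add_model_trigger(
    name="deploy-on-publish",
    schedule_function=on_model_published,
    trigger_project="my_project",      # placeholder project
    trigger_on_publish=True,
)
trigger.start()  # blocks and polls; use start_remotely() to run it on an agent
```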
Hi @<1523715429694967808:profile|ThickCrow29>
Is there a way to specify a callback upon an abort action from the user
You mean abort of the entire pipeline?
Just verifying the Pod does get allocated 2 gpus, correct ?
What do you have under the "script path" in the Task?
JitteryCoyote63 not yet 😞
I actually wonder how popular https://github.com/pallets/click is ?
Hi @<1523708901155934208:profile|SubstantialBaldeagle49>
If you report on the same iteration with the same title/series you are essentially overwriting the data (as expected)
Regarding the plotly report size.
Two options:
- Round down the numbers (by default it will store all the digits, and usually anything after the fourth is quite useless; rounding will drastically decrease the plot size)
- Use logger.report_scatter2d , it is more efficient and has a mechanism to subsample extremely large graphs (see the sketch below)
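For example, a quick sketch of the second option (project/plot names are just placeholders):
```python
import numpy as np
from clearml import Task

task = Task.init(project_name="examples", task_name="scatter demo")  # placeholder names
logger = task.get_logger()

# round the values and report them as a 2D scatter;
# report_scatter2d subsamples extremely large graphs for you
points = np.round(np.random.randn(100000, 2), 4)
logger.report_scatter2d(
    title="my plot",
    series="series A",
    scatter=points,
    iteration=0,
    xaxis="x",
    yaxis="y",
    mode="markers",
)
```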
p.s. StraightCoral86 I might be missing something here, please feel free to describe the entire execution scenario and what you are trying to achieve 🙂
My current experience is there is only print out in the console but no training graph
Yes Nvidia TLT needs to actually use tensorboard for clearml to catch it and display it.
I think that in the latest version they added that. TimelyPenguin76 might know more
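For reference, this is the mechanism clearml relies on: once Task.init() is called, anything written through tensorboard is picked up automatically. A minimal sketch assuming PyTorch's tensorboard writer (project/task names are placeholders):
```python
from clearml import Task
from torch.utils.tensorboard import SummaryWriter

task = Task.init(project_name="examples", task_name="tensorboard demo")  # placeholder names

# any scalar written via tensorboard after Task.init() shows up under the Task's Scalars tab
writer = SummaryWriter(log_dir="./tb_logs")
for step in range(10):
    writer.add_scalar("train/loss", 1.0 / (step + 1), step)
writer.close()
```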
My question is what happens if I launch in parallel multiple doit commands that create new Tasks.
Should work out of the box.
I would like to confirm that current_task ...
Correct.
so it would be better just to use the original code files and the same conda env. if possible…
Hmm you can actually run your code in "agent mode", assuming you have everything else set up.
This basically means you set a few environment variables prior to launching the code:
Basically:
```
export CLEARML_TASK_ID=<The_task_id_to_run>
export CLEARML_LOG_TASK_TO_BACKEND=1
export CLEARML_SIMULATE_REMOTE_TASK=1
python my_script_here.py
```
My goal is to automatically run the AWS Autoscaler task on a clearml-agent pod when I deploy
LovelyHamster1 this is very cool!
quick question, if you are running on EKS, why not use the EKS autoscaling instead of the ClearML aws EC2 autoscaling ?
Is trains-agent using docker-mode or virtual-env ?
It can be a different agent.
If inside a docker then:
clearml-agent execute --id <task_id here> --docker
If you need venv do:
clearml-agent execute --id <task_id here>
You can run that on any machine and it will respin and continue your Task
(obviously your code needs to be aware of that and be able to pull its own last model checkpoint from the Task artifacts / models)
Is this what you are after?
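For the "pull its own last checkpoint" part, something along these lines inside your code should do it (a minimal sketch, assuming the checkpoint was registered as an output model on the Task):
```python
from clearml import Task

task = Task.current_task()

# minimal sketch: grab the latest output model registered on this Task and resume from it
output_models = task.models["output"]
if output_models:
    checkpoint_path = output_models[-1].get_local_copy()
    # load the checkpoint here (framework specific), e.g. torch.load(checkpoint_path)
```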
GrumpySeaurchin29 you can pass s3 credential for the autoscaler, but all the tasks will have them. Are you saying two diff sets of credentials is the issue, or is it the visibility?
StickyBlackbird93 the agent is supposed to solve for the correct version of pytorch based on the Cuda in the container. Sounds like for some reason it fails? Can you provide the log of the Task that failed? Are you running the agent in docker-mode , or inside a docker?
@<1562610699555835904:profile|VirtuousHedgehong97>
source_url="s3:...",
This means your data is already on an S3 bucket; it will not "upload" it, it will just register it.
If you want to upload files, they should be local; then, when you call upload, you can specify the target S3 bucket, and the data will be stored in a unique folder in the bucket
Does that make sense ?
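Assuming this is the Dataset interface (the source_url argument suggests add_external_files), a minimal sketch of both flows (bucket/dataset/project names are placeholders):
```python
from clearml import Dataset

# case 1: data already sits in S3 -- just register it, nothing is uploaded
ds = Dataset.create(dataset_project="examples", dataset_name="registered-data")  # placeholder names
ds.add_external_files(source_url="s3://my-bucket/existing/data/")                # placeholder bucket
ds.finalize()

# case 2: local files -- add them and upload to a target S3 bucket
ds2 = Dataset.create(dataset_project="examples", dataset_name="uploaded-data")
ds2.add_files(path="./local_data")                        # local folder
ds2.upload(output_url="s3://my-bucket/clearml-datasets")  # stored in a unique folder there
ds2.finalize()
```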
Hi ElegantCoyote26 , yes I did 🙂
It seems cometml adds their default callback logger for you, that's it.
Thanks ElegantCoyote26 I'll look into it. Seems like someone liked our automagical approach 🙂
Ohh, sure then editing git config will solve it.
btw: why would you need to do that, the agent knows how to do this conversion on the fly