Let me check something
RobustRat47 I think you have to use the latest clearml package for that (1.6.0)
Hi @<1523701868901961728:profile|ReassuredTiger98>
Anyone here with any idea why my service tasks get aborted when going to sleep?
I think I understand the issue, clearml==1.4.0
try running with the latest clearml (1.10.x)
It will keep pinging the backend "I'm alive" so the backend does not think the process is dead (which is what I suspect happened: after 2 hours the backend basically set the Task to aborted because it "thought" the process was killed)
Hi @<1658281093108862976:profile|EncouragingPenguin15>
Should work. I'm assuming multiple nodes are running agents? Or are you saying Ray spins up the jobs and clearml logs them?
DefeatedCrab47 no idea, but you are more than welcome to join the thread here, and point it out:
https://github.com/PyTorchLightning/pytorch-lightning-bolts/issues/249
Thanks GrievingTurkey78 !
It seems that under the hood they use argparse
See here:
https://github.com/google/python-fire/blob/c507c093fa6622ab5efee21709ffbf25974e4cf7/fire/parser.py
Which means it might just work?!
What do you think?
Assuming you are using docker-compose, the console output is a good start
Does it work if I launch the clearml-agent on a docker and pip doesn't know the packages to install
Not sure I follow... the "detect_with_pip_freeze" flag (when set) will tell clearml (at runtime) to create the "installed packages" directly from pip freeze (instead of analyzing the code)
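For reference, a minimal sketch of how that flag could look in clearml.conf (the exact section path is my assumption, taken from the default config template):
sdk {
  development {
    # when set, "installed packages" is taken from `pip freeze` at runtime
    # instead of from static analysis of the script's imports
    detect_with_pip_freeze: true
  }
}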
PricklyRaven28 did you set the IAM role support in the conf?
https://github.com/allegroai/clearml/blob/0397f2b41e41325db2a191070e01b218251bc8b2/docs/clearml.conf#L86
CluelessFlamingo93 I would also fix the pip version requirements to:
pip_version: ["<20.2 ; python_version < '3.10'", "<22.3 ; python_version >= '3.10'"]
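For context, a sketch of where that setting would go on the agent machine (assuming the standard agent.package_manager section of clearml.conf):
agent {
  package_manager {
    # pin pip per python version to avoid resolver regressions
    pip_version: ["<20.2 ; python_version < '3.10'", "<22.3 ; python_version >= '3.10'"]
  }
}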
This is good news, that means the k8s glue created a k8s job and pushed the Task into the "k8s_scheduler" queue, for visibility (i.e. it is now the k8s job to launch the pod).
Can you check on the Task Info tab what is the status/message ? (it should reflect the k8s pod status)
Could it be that clone has to be False? (I assume the reasoning is the cloning feature)
Although I didn't understand why you mentioned
torch
in my case?
Just a guess 🙂 other frameworks do multi-process as well,
I would guess it relates to the parallelization of Task execution by the
HyperParameterOptimizer
class?
Yes, that might be it; it's basically a by-product of using python's "Process" class for multiprocessing. We are working on a fix, not a trivial one unfortunately
Now will these 10 experiments be of different names? How will I know these are part of the 'mnist1' HPO case?
Yes (they will have the specific HP name/value combination).
FYI names are not unique so in theory you could have multiple experiments with the same name.
If you look under the Configuration Tab, you will find all the configuration arguments for the experiment. You can also add specific arguments to the experiment table (click the cogwheel at the right top corner, and select...
Hi OutrageousGiraffe8
Does anybody knows why this is happening and is there any workaround, e.g. how to manually report model?
What exactly is the error you are getting? And which clearml version are you using?
Regarding manual Model reporting:
https://clear.ml/docs/latest/docs/fundamentals/artifacts#manual-model-logging
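As a rough sketch of the manual route described on that page (project/task names, the framework string and the weights file are placeholders):
from clearml import Task, OutputModel

task = Task.init(project_name="examples", task_name="manual model logging")

# register an output model on the current task and upload its weights file
output_model = OutputModel(task=task, framework="PyTorch")
output_model.update_weights(weights_filename="model.pt")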
Eg, i'm creating a task using
clearml.Task.create
, often it doesn't get the git diff correctly,
ShakyJellyfish91 Task.create does not store any "git diff" automatically, is there a reason not to use Task.init
?
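For comparison, a minimal sketch of the two (project/task names, repo URL and script are placeholders); Task.init runs in the current process and is the one that captures the uncommitted changes:
from clearml import Task

# Task.init: attaches to the running process, stores repo, commit and local diff
task = Task.init(project_name="examples", task_name="with git diff")

# Task.create: only registers a task from the given repo/script, no local diff is stored
other = Task.create(
    project_name="examples",
    task_name="no local diff",
    repo="https://github.com/your/repo.git",
    script="train.py",
)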
First let's test whether everything works as expected, since a 405 really feels odd to me here. Can I suggest following one of the examples start to end to test the setup, before adding your model?
Assuming Tensorflow (which would be an entire folder):
local_folder_or_files = model.get_weights_package()
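A hedged sketch of how you might reach that call from a previous task (the task id and picking the last "output" model are assumptions on my side):
from clearml import Task

prev_task = Task.get_task(task_id="<your_task_id>")      # placeholder id
model = prev_task.models["output"][-1]                    # last output model of that task
local_folder_or_files = model.get_weights_package()       # downloads the full weights package/folder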
It actually started executing your code, but it did not capture it correctly:
/root/.clearml/venvs-builds/3.10/bin/python -u /root/.clearml/venvs-builds/3.10/code/colab_kernel_launcher.py
Which I assume means the actual Task had bad code.
What do you have under the Task execution tab in the UI (the one you were launching, i.e. enqueueing)?
I want to schedule bulk tasks to run via agents, so I'm running
create
I see, that makes sense.
especially when dealing with submodules,
BTW: submodule diff should always get stored, can you provide some error logs on fail cases?
Before manually modifying the diff:
If you have local commits (i.e. un-pushed) this might fail the diff apply; in that case you can set the following in your clearml.conf:
store_code_diff_from_remote: true
https://github.com/allegroai/clear...
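A sketch of where that setting would sit (the sdk.development section is my assumption, based on where the other development flags live):
sdk {
  development {
    # take the diff against the remote branch instead of the local (possibly un-pushed) HEAD
    store_code_diff_from_remote: true
  }
}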
Hi DashingHedgehong5
Is the text the labels on the histogram bucket?
Notice the xlabels argument, is this what you are looking for?
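A minimal sketch of what I mean, assuming Logger.report_histogram (title, series and values are placeholders):
from clearml import Task

task = Task.init(project_name="examples", task_name="histogram with xlabels")
logger = task.get_logger()

# xlabels sets the per-bucket text shown on the x axis
logger.report_histogram(
    title="value distribution",
    series="buckets",
    values=[3, 7, 2, 9],
    iteration=0,
    xlabels=["a", "b", "c", "d"],   # one label per bucket
)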
Hi @<1566596960691949568:profile|UpsetWalrus59>
you should call it before initializing the Task
Task.ignore_requirements("pywin32")
task = Task.init(...)
I think I found something,
https://github.com/allegroai/clearml/blob/e3547cd89770c6d73f92d9a05696018957c3fd62/clearml/storage/helper.py#L1442
What's the boto version you have installed?
Anyhow, if the StorageManager.upload was fast, upload_artifact is calling that exact function, so I don't think we actually have an issue here. What do you think?
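To make the comparison concrete, a rough sketch of the two calls I'm referring to (bucket path and file name are placeholders, and I'm assuming StorageManager.upload_file is the call you timed):
from clearml import Task, StorageManager

task = Task.init(project_name="examples", task_name="upload timing")

# direct upload through the storage manager
StorageManager.upload_file(local_file="data.bin", remote_url="s3://my-bucket/data.bin")

# artifact upload; under the hood this goes through the same storage helper
task.upload_artifact(name="data", artifact_object="data.bin")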
That said, the arguments are passed inside the executed code (i.e. monkey-patched into the frameworks). This allows it to log and change all the arguments, including the default ones, and allows you to edit them.
Does that make sense ?
PompousHawk82 unfortunately this is kind of binary, either you have full tracking of load/save operations or you do not.
This warning message will disappear in the next version as we will be able to log multiple models under the same Task :)
Can you put the task.connect line here? (btw: I would assume there is no need for an additional connect if using hydra+fire, no?)