Hi RoughTiger69
A. Yes, makes total sense. Basically you can use Task.export_task / Task.import_task to achieve this (notice we assume the dataset artifact links are accessible from both servers, which is usually the case)
B. The easiest way would be to use Process: one subprocess exports from dev, with the credentials and configuration passed via os environment variables, and the other subprocess imports it into the prod server (again with os environment variables pointing to the prod server). Make sense?
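Something along these lines (a minimal sketch, assuming the standard CLEARML_API_* environment variables; the hosts, credentials, and the two helper scripts are placeholders):
```
import os
import subprocess
import sys

# hypothetical helper scripts:
#   export_side.py - calls Task.export_task(...) and dumps the result to task.json
#   import_side.py - loads task.json and calls Task.import_task(...)

dev_env = dict(
    os.environ,
    CLEARML_API_HOST='https://dev-api.example.com',   # placeholder dev server
    CLEARML_API_ACCESS_KEY='<dev-access-key>',
    CLEARML_API_SECRET_KEY='<dev-secret-key>',
)
prod_env = dict(
    os.environ,
    CLEARML_API_HOST='https://prod-api.example.com',  # placeholder prod server
    CLEARML_API_ACCESS_KEY='<prod-access-key>',
    CLEARML_API_SECRET_KEY='<prod-secret-key>',
)

# one subprocess exports from dev, the other imports into prod
subprocess.check_call([sys.executable, 'export_side.py'], env=dev_env)
subprocess.check_call([sys.executable, 'import_side.py'], env=prod_env)
```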
Thanks SparklingHedgehong28
So I think I'm missing information on what you call "Instance protection" ?
You mean like respinning spot instances? Or is it a way to review the performance of the AWS ASG (i.e. like a watchdog of sorts)?
Hi SparklingHedgehong28
What would be the use of an "end of docker hook"? Is this like an abort callback? Completion?
instance protection
Do you mean like when an instance just died (like spot in AWS)?
SparklingHedgehong28 this is actually quite cool! Still not sure why not just use the built-in autoscaler https://github.com/allegroai/clearml/tree/master/examples/services/aws-autoscaler , but it is a really cool usage of ASG 🤩
As we use a custom CUDA image, we do not want this running on user login, and get ugly error messages about missing symlinks.
You can customize the startup bash script (running inside any container) here:
https://github.com/allegroai/clearml-agent/blob/bf07b7f76d3236c1118b81730c6d9718705a795a/docs/clearml.conf#L145
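e.g. in the agent section of clearml.conf (a sketch; assuming the linked line is the extra_docker_shell_script entry):
```
agent {
    # bash lines executed inside every container on startup, before the task runs
    extra_docker_shell_script: ["apt-get update", "apt-get install -y bindfs"]
}
```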
LackadaisicalOtter14 Would that help?
MelancholyElk85
After I set base docker for pipeline controller task, I cannot clone the repo...
What do you mean by that?
Also, how do you set the PipelineController base_docker_image? (I'm assuming this is needed to run the pipeline logic?! Is that correct?)
Hi TartSeal39
So the thing is, the agent does not support yaml envs for conda. Currently, if the requirements section is empty, the agent will use the repo's requirements.txt. We first need to add support for conda yaml, and then allow you to disable the auto requirements or push the specific yaml. Would that work? Also, is there a reason the automatic package detection is not working?
TartSeal39 please let me know if it works, conda is a strange beast and we do our best to tame it.
Specifically when you execute manually on a conda env we collect (separately) the conda packages & the python packages (so later we can replicate on both conda & pip, or at least do our best)
Are you running both development env and agent with conda ?
Yep 🙂
Basically:
```
from time import sleep
from clearml import Task

task = Task.get_task(task_id='aaaa')
while task.get_status() not in ('completed', 'stopped'):
    # do something ?
    sleep(15)
```
(Notice task.status / task.get_status() will refresh the Task status on every call)
Maybe this one?
https://github.com/allegroai/clearml/issues/448
I think it is already there (i.e. 1.1.1)
Hi ExuberantParrot61
Is the pipeline logic code running from inside the repo?
Hi ExuberantParrot61 the odd thing is this message:
No repository found, storing script code instead
when you are actually running from inside the repo...
is it saying that on a specific step, or is it on the pipeline logic itself?
Also any chance you can share the full console output ?
BTW:
you can manually specify a repo branch for a step:
https://github.com/allegroai/clearml/blob/a492ee50fbf78d5ae07b603445f4983feb9da8df/clearml/automation/controller.py#L2841
Example:
https:/...
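Something like this (a sketch; the repo URL, branch, and step function are placeholders; repo / repo_branch / repo_commit are supported on add_function_step in recent clearml versions):
```
from clearml import PipelineController

def my_step(x=1):
    # placeholder step logic
    return x * 2

pipe = PipelineController(name='my pipeline', project='examples', version='1.0')
pipe.add_function_step(
    name='step_one',
    function=my_step,
    function_return=['result'],
    repo='https://github.com/user/repo.git',  # placeholder repo URL
    repo_branch='main',                       # branch to check out for this step
)
pipe.start(queue='default')
```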
DistressedGoat23 notice the last argument in report_histogram, 'extra_layout'
https://clear.ml/docs/latest/docs/references/sdk/logger#report_histogram
You can then specify the plotly histogram orientation, full details here:
https://plotly.com/javascript/reference/bar/
I'm assuming the one you are after is 'orientation'
https://plotly.com/javascript/reference/bar/#bar-orientation
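Something like (a sketch; assuming extra_layout forwards the plotly bar keys, as the docs link above suggests; project/series names are made up):
```
from clearml import Task

task = Task.init(project_name='examples', task_name='histogram demo')  # placeholder names
logger = task.get_logger()

logger.report_histogram(
    title='class counts',
    series='train',
    values=[10, 7, 3],
    iteration=0,
    xlabels=['cat', 'dog', 'bird'],
    extra_layout={'orientation': 'h'},  # 'h' = horizontal bars, per the plotly bar reference
)
```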
BeefyHippopotamus73 this error seems like it is coming from boto3, are you sure the credentials are properly configured and that you have read permission ?
RipeGoose2 models are automatically registered
i.e. added to the model repository, but the entry only points to where the files are stored
Only if you pass the output_uri argument to Task.init will they actually be uploaded.
If you want to disable this behavior you can pass Task.init(..., auto_connect_frameworks={'pytorch': False})
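For example (a sketch; the destination URI and names are placeholders):
```
from clearml import Task

# with output_uri set, model weight files are actually uploaded,
# not just registered with their local path
task = Task.init(
    project_name='examples',             # placeholder
    task_name='train',                   # placeholder
    output_uri='s3://my-bucket/models',  # placeholder upload destination
)
```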
I still think the issue is getting boto3 credentials
It might be the case
Are you using clearml-agent or are you running it manually ?
This one seem to work
```
from clearml import Task
task = Task.init(...)  # fill in your own project/task names

import matplotlib.pyplot as plt
import numpy as np

plt.style.use('_mpl-gallery')

# make data:
np.random.seed(10)
D = np.random.normal((3, 5, 4), (0.75, 1.00, 0.75), (200, 3))

# plot:
fig, ax = plt.subplots()
vp = ax.violinplot(D, [2, 4, 6], widths=2,
                   showmeans=False, showmedians=False, showextrema=False)

# styling:
for body in vp['bodies']:
    body.set_alpha(0.9)
ax.set(xlim=(0, 8), xticks=np.arange(1, 8),
       ylim=(0, 8), yticks=np.arange(1, 8))

plt.show()  # this is the call that triggers clearml to capture the figure
```
FrothyShark37 what was different in your script ?
Thanks FrothyShark37
I just verified, this would work as well. I suspect what was missing is the plt.show call, which is the actual call that triggers clearml
Can you post here the actual line? seems like we can fix it to also support this scenario (if we could test it)
Fixed in pip install clearml==1.8.1rc0
🙂
can I add user properties to a scheduler configuration?
please expand, what do you mean by a user property and how would one use it?
Hi DrabCockroach54
This seems like a pip issue when trying to install from source. Try upgrading pip (pip install -U pip) before installing numpy, it should solve it 🤞
Hi FierceHamster54
Dataset downloads are already multi-threaded
But yes, get_local_copy() is thread / process safe
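e.g. this kind of concurrent fetch should be fine (a sketch; the dataset IDs are placeholders):
```
from concurrent.futures import ThreadPoolExecutor
from clearml import Dataset

dataset_ids = ['<dataset-id-1>', '<dataset-id-2>']  # placeholders

def fetch(dataset_id):
    # returns the path of a (cached) local copy of the dataset
    return Dataset.get(dataset_id=dataset_id).get_local_copy()

with ThreadPoolExecutor(max_workers=2) as pool:
    local_paths = list(pool.map(fetch, dataset_ids))
```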
Thanks GorgeousMole24
That is a very good point! passing to product guys
What is the Model url? print(model.url)
You could change infrastructure or hosting, and now your data is associated with the wrong URL
Yeah that makes sense, so have it on a specific dns name? (this is usually the case with k8s deployments)
SpicyCrab51 you can change the task to completed, it is just a state change, nothing will actually change other than the status. Task.get_task(pass_dataset_id_here).mark_completed()
Hi SpicyCrab51 ,
Hmm, how exactly is the Dataset opened?
If the Dataset object is alive for 30h it will keep the dataset alive. Why isn't it being closed?
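For reference, the typical flow finalizes (closes) the dataset as soon as the upload is done (a minimal sketch; names and path are placeholders):
```
from clearml import Dataset

ds = Dataset.create(dataset_name='my dataset', dataset_project='examples')  # placeholders
ds.add_files('/path/to/files')  # placeholder path
ds.upload()
ds.finalize()  # closes the dataset, so it is no longer kept "alive"
```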
Okay, so basically set a template for the pod, specifying the docker image. Make sure you pass the correct trains-server configuration (i.e. api/web/file server addresses and credentials), and select the queue name the agent will listen to.
container image / details
https://hub.docker.com/r/allegroai/trains-agent
https://github.com/allegroai/trains-agent/tree/master/docker/agent
Full environment variable list to pass can be found here:
https://github.com/allegroai/trains-server/blob/...