Hi CooperativeFox72 trains 0.16 is out, did it solve this issue? (btw: you can upgrade trains to 0.16 without upgrading the trains-server)
How do I set this configuration, and where does it go?
In your clearml.conf on the machine with the agent, just add the following at the bottom of the file: agent.venvs_cache.path=~/.clearml/venvs-cache
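For reference, the same setting in the nested section style used elsewhere in clearml.conf would look something like this (the path is just the example value from above):
agent {
    venvs_cache {
        # folder where cached virtual environments are stored and reused between runs
        path: ~/.clearml/venvs-cache
    }
}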
SarcasticSquirrel56 when the process dies (i.e. is killed) it does not have time to update the state, so the server watchdog will set the state to aborted after X amount of time of inactivity (default is 2 hours)
Hmm, what's the clearml version? What's the python version, what's the OS? And the pytorch version?
Does it work if you remove the Task.init call?
Hi Martin, of course not,
Smart!
I was just wondering if it has been patched yet, and if not, what the expected timeline for patching it is
Yes, I believe the target is a patch version 1.15.1 to be released in a couple of weeks. This is not a major issue, but it's always better to have it fixed. (btw: the enterprise version never had this issue to begin with, because it is of course authenticated, and it also has an additional RBAC layer on top.)
Hi PompousParrot44
Could you send the "Installed Packages" list?
I think there is a bug in the current trains-agent (there is already a fix, but the RC is still not out),
where "package @ git+http" packages ignore the git+http link.
You can solve it manually by editing the "Installed packages" (when the Task is in draft mode, the section becomes editable): remove the "package @" part and leave the "git+http" link.
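For example (the package name and git URL here are just placeholders), a line like
some_package @ git+https://github.com/example-org/some_package.git
would become just
git+https://github.com/example-org/some_package.git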
Not sure: They also have the feature store (data management), as mentioned, which is pretty MLOps-y
Right, sorry, I was thinking about "Nuclio", my bad.
How would you compare those to ClearML?
At least based on the documentation and git state I would say this is very early stages. In terms of features they "tick all the boxes", but I'll be a bit skeptical about the ability to scale and support these features.
Taking a look at the screenshots from the docs, it also seem...
CooperativeFox72 of course, anything trains related, this is the place 🙂
Fire away
Sounds great! I really like that approach, thanks GrotesqueDog77 !
Hi ItchyHippopotamus18
The iteration reporting is automatically detected if you are using tensorboard, matplotlib, or explicitly with trains.Logger
I'm assuming there were no reports, so the monitoring falls back to reporting every 30 seconds, where the "iterations" are seconds from start (the thing is, this is a time series, so you have to have an X axis...)
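If you want to control the X axis yourself, a minimal sketch of explicit reporting with trains.Logger (project/task names and the loss value are just placeholders) would be:
from trains import Task, Logger

task = Task.init(project_name='examples', task_name='explicit iteration reporting')
for iteration in range(100):
    loss = 1.0 / (iteration + 1)  # dummy value, just for illustration
    # reporting with an explicit iteration gives the scalar a proper X axis
    Logger.current_logger().report_scalar(title='loss', series='train', value=loss, iteration=iteration)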
Make sense ?
Can you post the actual line here? Seems like we can fix it to also support this scenario (if we could test it)
Hi @<1558986821491232768:profile|FunnyAlligator17>
What do you mean by "We are able to set_initial_iteration to 0 but not get_last_iteration"?
Are you saying that if your code looks like:
Task.set_initial_iteration(0)
task = Task.init(...)
and you abort and re-enqueue, you still have a gap in the scalars ?
[Assuming the above is what you are seeing]
What I "think" is happening is that the Pipeline creates it's own Task. When the pipeline completes, it closes it's own Task, basically making any later calls to Tasl.current_task() return None, because there is no active Task. I think this is the reason that when you are calling process_results(...) you end up with None.
For a quick fix, you can do:
pipeline = Pipeline(...)
MedianPredictionCollector.process_results(pipeline._task)
Maybe we should...
Okay Now I get it!
Let me think about it for an hour or two 😄
Thank you ElegantCoyote26 for catching that! 😍
No worries, glad to hear you found it 😄
Let me verify something in the code,
What's the "working dir" ? (where in the repo the script is executed from)
Now I suspect what happened is it stayed on another node, and your k8s never took care of that
In that case, I think it is stuck on a previous Node, I can't think of any other reason.
Do you have something else on the same PV that was lost ? like api server configuration?
When we enqueue the task using the web-ui we have the above error
ShallowGoldfish8 I think I understand the issue,
basically I think the issue is:
task.connect(model_params, 'model_params')
Since this is a nested dict:
model_params = {
    "loss_function": "Logloss",
    "eval_metric": "AUC",
    "class_weights": {0: 1, 1: 60},
    "learning_rate": 0.1
}
The class_weights keys are stored as String keys, but catboost expects int keys, hence it fails.
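A possible workaround (just a sketch, assuming the model_params dict above; project/task names are placeholders) is to cast the keys back to int after connect and before handing the dict to catboost:
from clearml import Task  # `from trains import Task` on older versions

task = Task.init(project_name='examples', task_name='catboost training')

model_params = {
    "loss_function": "Logloss",
    "eval_metric": "AUC",
    "class_weights": {0: 1, 1: 60},
    "learning_rate": 0.1,
}
task.connect(model_params, 'model_params')
# when running remotely, the nested class_weights keys may come back as strings,
# so cast them back to int before passing the dict to catboost
model_params['class_weights'] = {int(k): v for k, v in model_params['class_weights'].items()}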
One op...
@<1671689437261598720:profile|FranticWhale40> this one: None
Meaning the node restarted (or actually moved)
That somehow the PV never worked and it was all local inside the pod
The second problem that I am running into now is that one of the dependencies in the package is actually hosted in a private repo.
Add your private repo to the extra index section in the clearml.conf:
None
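For example (the URL below is just a placeholder for your private index), the relevant section in clearml.conf would look roughly like:
agent {
    package_manager {
        # extra pip index URLs to search in addition to the default PyPI index
        extra_index_url: ["https://my-private-pypi.example.com/simple"]
    }
}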