JitteryCoyote63 what am I missing?
What are the errors you are getting (with / without the envs)?
Hi @<1578555761724755968:profile|GrievingKoala83>
mount s3 as a cache folder
I'm not sure that would be fast enough for cache ...
How to override
/root/.cache/pip
path?
in your clearml.conf file (see the sketch below), then set it to your PV
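For reference, a rough sketch of the relevant clearml.conf entry (assuming the agent's docker_internal_mounts section; the mount path here is a placeholder for wherever your PV lives):

agent {
    docker_internal_mounts {
        # container-side pip cache location; point it at your PV mount instead of the default
        pip_cache: "/mnt/my-pv/pip-cache"
    }
}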
the latter is an ec2 instance
and the agent fails to install on the EC2 machine?
Hi ColossalAnt7, I think we ran into it on a few dockers; I believe the bug was fixed in the latest trains-agent RC. Could you verify, please?
No worries, and I will make sure we output a warning if section names are not used 🙂
So the naming is a by-product of the many TB files created (one per experiment); if you use different naming for the TB files, that is what you'll see in the UI. Makes sense?
VexedCat68 both are valid. In case the step was cached (i.e. already executed) the node.job will be None, so it is probably safer to get the Task based on the "executed" field which stores the Task ID used.
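A minimal sketch of the safer route (assumes node is the pipeline step's node object):

from clearml import Task

# node.job is None when the step was cached, but node.executed still holds
# the ID of the Task that was actually used for the step
step_task = Task.get_task(task_id=node.executed)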
Hmm StrangePelican34
Can you verify you call Task.init before TB is created? (basically at the start of everything)
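i.e. something like this ordering (a sketch, assuming torch's SummaryWriter; the same applies to any TB writer):

from clearml import Task
from torch.utils.tensorboard import SummaryWriter

task = Task.init(project_name='examples', task_name='tb logging')  # must come first
writer = SummaryWriter('runs/exp1')  # created after Task.init, so it gets auto-captured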
where is it persisted? if I have multiple sessions I want to persist, is that possible?
On the file server; yeah, it should support that. You can specify --continue-session to continue a previously used one.
Notice it does delete older "snapshots" (i.e. the previous workspace) when you are continuing a session (use --disable-session-cleanup to disable it)
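For example (a sketch; the session ID is a placeholder):

# resume a previously used session and keep the older workspace snapshots around
clearml-session --continue-session <previous-session-id> --disable-session-cleanup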
I am trying to use the
configuration vault
option but it doesn't seem to apply the variables I am using.
Hi EmbarrassedSpider34 I think this is an enterprise feature...
Managed to get the credentials attached to the configuration when the task is spun up,
I'm assuming env variables?
BTW: I tested the code you previously attached, and it showed the plot in the "Plots" section
(Tested with latest trains from GitHub)
Your code should have worked, i.e. you should see the 'model.h5' in the artifacts tab. What do you have there?
It should look something like this one:
https://demoapp.trains.allegro.ai/projects/531785e122644ca5b85b2e19b0321def/experiments/e185cf31b2634e95abc7f9fbdef60e0f/artifacts/output-model
BTW:
To manually register any model:
from trains import Task, OutputModel

task = Task.init('examples', 'my model')
OutputModel().update_weights('my_best_model.h5')
however, this will also turn off metrics
For the sake of future readers, let me clarify this one: turning it off with auto_connect_frameworks={'pytorch': False}
only affects the auto-logging of torch.save/load
(side note: the reason is that PyTorch does not have built-in metric reporting, i.e. it is usually done manually, these days most probably with TensorBoard; for example, Lightning / Ignite use TensorBoard as the default metric reporting).
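To make that concrete, a minimal sketch (project/task names are placeholders):

from clearml import Task

# turns off only the torch.save/load auto-logging; TensorBoard scalars are still captured
task = Task.init(
    project_name='examples',
    task_name='pytorch without autolog',
    auto_connect_frameworks={'pytorch': False},
)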
BTW: what's the OS and Python version?
AbruptWorm50 can you send the full image? (the X axis is missing from the graph)
Hi @<1547028116780617728:profile|TimelyRabbit96>
It should process the new request A (this is a multi-threading / async implementation)
Is this consistent with what you are seeing?
So I might be a bit out of sync, but I think there should be Triton serving and OpenVino serving built into it (or at least in progress).
Hi CleanPigeon16
can I make the steps in the pipeline use the latest commit in the branch?
Yes:
manually clone the step's Task (in the UI), edit the Execution section, change it to "last commit on branch" and specify the branch name; or do the same programmatically (clone + edit, see the sketch below)
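Roughly like this (a sketch; the task ID and branch name are placeholders, and the empty version_num is what tells the agent to pull the branch head):

from clearml import Task

cloned = Task.clone(source_task='<step-task-id>', name='step @ latest commit')
cloned.update_task(task_data={
    'script': {'branch': 'my-branch', 'version_num': ''}  # empty commit => last commit on branch
})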
ValueError: Could not parse reference '${run_experiment.models.output.-1.url}', step run_experiment could not be found
Seems like the "run_experiment" step is not defined. Could that be ...
Hi SuperiorDucks36
you have such a great and clear GUI
🙂
I personally would love to do it with a CLI
Actually a lot of stuff is harder to get from the UI (like the current state of your local repository, etc.), but I think your point stands 🙂 We will start with the CLI because it is faster to deploy/iterate; then, when you guys say this is a winner, we will add a wizard in the UI.
What do you think?
Fixed in pip install clearml==1.8.1rc0
🙂
HealthyStarfish45 you mean like replacing the debug image viewer with a custom widget?
For the images themselves, you can get their URLs, then embed them in your static HTML.
You could also have your HTML talk directly to the server REST API (see the sketch below).
What did you have in mind?
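If you go the REST route, a very rough sketch (host, credentials and task ID are placeholders, and the exact request/response fields should be checked against the server API reference):

import requests

api = 'http://localhost:8008'  # your clearml api server
# exchange app credentials for a token
token = requests.post(api + '/auth.login', auth=('<access-key>', '<secret-key>')).json()['data']['token']
# ask for the latest debug images reported by a task
resp = requests.post(
    api + '/events.debug_images',
    headers={'Authorization': 'Bearer ' + token},
    json={'metrics': [{'task': '<task-id>'}], 'iters': 1},
)
print(resp.json())  # each returned event carries a 'url' you can drop into an <img> tag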
ResponsiveCamel97
BTW: any reason not to allow this flexibility?
ElegantKangaroo44 I tried to reproduce the "services mode" issue with no success. If it happens again, let me know; maybe we will better understand how it happened (i.e. the "master" trains-agent getting stuck for some reason)
I'll try to find the link...
If I have access to the logs, python env and git commits, is there an API to log those to the experiments too?
Sure: task.update_task
see here:
https://clear.ml/docs/latest/docs/references/sdk/task#update_task
example:
task.update_task(task_data={'script': {'branch': 'new_branch', 'repository': 'new_repo'}})
The easiest way to get all the different sections (they should be relatively self-explanatory) is to call task.export_task(), which returns a dict with all the fields yo...
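For the export side, a quick sketch (the task ID is a placeholder):

from clearml import Task

task = Task.get_task(task_id='<task-id>')
fields = task.export_task()  # dict with every section: 'script', 'execution', etc.
print(fields['script'])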
single task in the DAG is an entire ClearML pipeline.
Just making sure details are not lost, "entire ClearML pipeline": the pipeline logic is process A running on machine AA.
Every step of that pipeline can be (1) a subprocess, but that means the exact same environment is used for everything, or (2) the DEFAULT behavior, where each step B runs on a different machine BB.
The non-ClearML steps would orchestrate putting messages into a queue, doing retry logic, and tr...
Hi SourSwallow36
What do you mean by "Log each experiment separately"? How would you differentiate between them?
You can install it, and after the wizard is done, uninstall it if you want to keep using trains from the git clone.