Ooops 🙂
task.get_tags()
task.set_tags()
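e.g. a quick sketch (the task ID is a placeholder):
```
from clearml import Task

task = Task.get_task(task_id="<your_task_id>")  # placeholder task ID
tags = task.get_tags()              # current tags on the task
task.set_tags(tags + ["reviewed"])  # replace the tag list with an updated one
```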
JitteryCoyote63 that makes total sense!!
The reporting subprocess is not being updated with the new value! Let me check how we can pass it along...
The other order (with the custom decorator above the pipeline decorator) fails - just for your info.
This is on purpose - the pipeline decorator has to be the top decorator.
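For example, a minimal sketch (my_custom_decorator is a hypothetical user decorator):
```
import functools
from clearml import PipelineDecorator

def my_custom_decorator(func):
    # hypothetical custom decorator, just wraps the pipeline logic
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper

# the pipeline decorator must be the topmost (outermost) decorator
@PipelineDecorator.pipeline(name="example pipeline", project="examples", version="0.1")
@my_custom_decorator
def pipeline_logic():
    pass
```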
Glad it works!
The second problem that I am running into now, is that one of the dependencies in the package is actually hosted in a private repo.
Add your private repo to the extra index section in the clearml.conf:
None
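Something along these lines (the index URL and credentials are placeholders):
```
agent {
    package_manager {
        # extra pip index URLs the agent will use when installing packages
        extra_index_url: ["https://<user>:<token>@my-private-pypi.example.com/simple"]
    }
}
```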
Hmm could you try to upload to your files server (not the S3)
Maybe some credentials error ?
Actually this should be a flag
Yes this is Triton failing to load the actual model file
If you want each "main" process as a single experiment, just don't call Task.init in the scheduler
(Also can you share the clearml.conf, without actual creds 🙂 )
Do you think this is better ? (the API documentation is coming directly from the python doc-string, so the code will always have the latest documentation)
https://github.com/allegroai/clearml/blob/c58e8a4c6a1294f8acec6ed9cba81c3b91aa2abd/clearml/datasets/dataset.py#L633
Yes, could you send the full log? A screen grab?
Hi @<1590152178218045440:profile|HarebrainedToad56>
Yes you are correct, all TB logs are stored in the ELK in the clearml backend. This scales really well and rarely has issues, as long of course as the clearml-server is running on a strong enough machine. How much RAM / HD do you have on the clearml-server?
I would just add git+
None to your requirements (either in the requirements.txt or even better as part of the pipeline/component where you also specify the repo to be used)
The agent will automatically push the credentials when it installs the repo as a wheel.
wdyt?
btw: you might also get away with adding -e .
into the requirements.txt (but you will need to test that one)
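e.g. a requirements.txt along these lines (the repo URL is a placeholder for your private package):
```
# requirements.txt (sketch)
git+https://github.com/your-org/your-private-package.git
# or, as mentioned above, possibly just (needs testing):
# -e .
```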
@<1657918706052763648:profile|SillyRobin38> out of curiosity did you compare performance of tensorrt-llm vs vllm ?
(the jury is still out on that, just wondered if you had a chance)
I see... We could definitely add an argument to control it. I'll update here once there is an RC
Hi @<1661542579272945664:profile|SaltySpider22>
question 1: are parallel writes to a dataset with the same version possible?
When you say parallel, what do you mean? From multiple machines?
What's the recommended way to append to the dataset in a future version?
Once a dataset was finalized the only way to add files is to add another version that inherits from the previous one (i.e. the finalized version becomes the parent of the new version)
If you are worried about multip...
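For the "add files in a new version" part, a minimal sketch (project/dataset names and paths are placeholders):
```
from clearml import Dataset

# get the finalized (parent) dataset
parent = Dataset.get(dataset_project="examples", dataset_name="my_dataset")

# create a new version that inherits from it
child = Dataset.create(
    dataset_project="examples",
    dataset_name="my_dataset",
    parent_datasets=[parent.id],
)
child.add_files("/path/to/new/files")
child.upload()
child.finalize()
```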
Hi CynicalBee90
Always great to have people joining the conversation, especially if they are the decision makers a.k.a can amend mistakes 🙂
If I can summarize a few points here (and feel free to fill in / edit any mistake or leftovers)
Open-Source license: This is basically the MongoDB license, which is as open as possible while still offering some protection against giants like Amazon taking the APIs (as they did with both MongoDB and Elasticsearch) Platform & language agno...
Hi @<1601386194774528000:profile|AmusedPanda8>
I think the project name is ./model_training/trained_models/yolov8n-TEST_OKTODELETE/
and for some reason you have "." as a project?
(notice nested projects are automatically created based on the project name, with '/' as separator)
Nice SubstantialElk6 !
BTW: you can configure your clearml client to store the changes from the latest pushed commit (and not the default, which is the latest local commit)
see store_code_diff_from_remote:
in clearml.conf:
https://github.com/allegroai/clearml/blob/9b962bae4b1ccc448e1807e1688fe193454c1da1/docs/clearml.conf#L150
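i.e. something along these lines in clearml.conf (following the linked example):
```
sdk {
    development {
        # take the code diff against the remotely pushed commit
        # instead of the latest local commit
        store_code_diff_from_remote: true
    }
}
```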
Would this be best if it were executed in the Triton execution environment?
It seems the issue is unrelated to the Triton ...
Could I use the clearml-agent build command and the Triton serving engine task ID to create a docker container that I could then use interactively to run these tests?
Yep, that should do it 🙂
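Roughly something like this (the task ID is a placeholder, check clearml-agent build --help for the full set of flags):
```
# build a docker image from the serving task's environment
clearml-agent build --id <serving_task_id> --docker
```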
I would start simple, no need to build the docker container itself - it seems like a clearml credentials issue?!
If you passed the correct path it should work (if it fails it would have failed right at the beginning).
BTW: I think it is clearml-agent --config-file <file here> daemon ...
and the clearml server version ?
So can you verify it can download the model ?
Hi @<1691620877822595072:profile|FlutteringMouse14>
Do I have to use Hydra
You can, and then the entire configuration is fully captured by ClearML (automatically) while you can still override values with the manual "key.sub=value" both in the UI and in the CLI
Otherwise you can connect a nested dict with task.connect (these will be flattened with '/' for sub keys).
Or you can connect configuration files ( task.connect_configuration ) and edit them as is in the UI (with override of...
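A quick sketch of both options (project name and values are placeholders):
```
from clearml import Task

task = Task.init(project_name="examples", task_name="config demo")

# nested dict - keys are flattened with '/' (e.g. "model/lr") and editable in the UI
params = {"model": {"lr": 0.001, "layers": 4}, "data": {"batch_size": 32}}
params = task.connect(params)

# full configuration file - shown (and editable) as-is in the UI
task.connect_configuration("config.yaml", name="training config")
```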
Hi JitteryCoyote63
Signal 9 is the kill signal (SIGKILL), could it be someone killed the process? Do you have other logs to share? Is this reproducible?
Yes, I think the API is probably the easiest:
from clearml.backend_api.session.client import APIClient
client = APIClient()
project_list = client.projects.get_all()
print(project_list)
I can read them programmatically using tensorboard and then log them using the clearml logger,
StaleButterfly40 this will be a great script to put somewhere (I'm sure you are not the only one with this problem). Maybe put it as a GitHub issue ? wdyt ?
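A rough sketch of such a script (the log dir, project and task names are placeholders):
```
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
from clearml import Task

task = Task.init(project_name="examples", task_name="re-log TB scalars")
logger = task.get_logger()

# load the existing TensorBoard event files
ea = EventAccumulator("/path/to/tb/logdir")
ea.Reload()

# re-report every scalar series through the ClearML logger
for tag in ea.Tags().get("scalars", []):
    for event in ea.Scalars(tag):
        logger.report_scalar(title=tag, series=tag, value=event.value, iteration=event.step)
```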
The class documentation itself is also there under "References" -> "Trains Python Package"
Notice that due to a bug in the documentation (we are working on a fix) the reference part is not searchable in the main search bar
So maybe the path is related to the fact I have venv caching on?
hmmm could be...
Can you quickly disable the caching and try ?