BTW: if you are running the local code with conda, you can set the agent to use conda as well (notice that if you are running locally with pip, the agent's conda env will use pip to install the packages to avoid version mismatch)
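For reference, a minimal clearml.conf snippet selecting conda for the agent (key names per the reference clearml.conf):

```
agent {
    package_manager {
        # create the task environment with conda instead of pip
        type: conda
    }
}
```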
Okay, I think I know what's going on (there is a race that for some reason behaves differently on Colab).
As a quick hack you can do the following:
    Task._report_subprocess_enabled = False
    task = Task.init(...)
    task.set_initial_iteration(0)
But I do not know how it can help me :(
In your code itself, after the Task.init call, add:
    task.set_initial_iteration(0)
See reply here:
https://github.com/allegroai/clearml/issues/496#issuecomment-980037382
I can't think of any actual difference in flow ...
Can you try the following?
    task._setup_reporter()
    task.set_initial_iteration(0)
Hmm, yes, this fits the message, which basically says that it gave up on analyzing the code because it ran out of time. Is the execution very short? Or the repo very large?
@<1585078763312386048:profile|ArrogantButterfly10> could it be that in the "base task" of the pipeline step, you do not have any hyper-parameter ? (I mean the Task that the pipeline clones and is supposed to set new hyperparameters for...)
Hi RipeGoose2
Are you continuing the Task, i.e. passing Task.init(..., continue_last_task=True) ?
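For reference, continuing a task looks roughly like this (project/task names are placeholders):

```python
from clearml import Task

# resume reporting into the previous task instead of creating a new one
task = Task.init(
    project_name="examples",
    task_name="my experiment",
    continue_last_task=True,
)
# reset the iteration counter so reports do not continue from the old offset
task.set_initial_iteration(0)
```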
Yes, that's the reason. Basically there is a background thread analyzing the code; at the end of the execution, if it is still running (hence the question regarding execution time), we give it an extra 10 seconds to come up with answers, otherwise we terminate it so the code won't get stuck. Makes sense to you?
SmarmyDolphin68 sadly if this was not executed with trains (i.e. the offline option of trains), this is not really doable (I mean it is, if you write some code and parse the TB 🙂 but let's assume this is way too much work)
A few options:
- On the next run, use the clearml offline option (i.e. in your code call Task.set_offline() , or set the env variable CLEARML_OFFLINE_MODE=1 ).
- You can compress and upload the checkpoint folder manually, by passing the checkpoint folder, see https://github.com...
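A minimal sketch of the offline flow (project/task names are placeholders):

```python
from clearml import Task

# enable offline mode before Task.init; everything is recorded locally
Task.set_offline(offline_mode=True)

task = Task.init(project_name="examples", task_name="offline run")
# ... training / checkpointing code ...

# at the end a zip of the session is written locally; it can later be
# imported into the server with Task.import_offline_session("<session zip>")
```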
Is this reproducible with the hpo example here:
https://github.com/allegroai/clearml/tree/400c6ec103d9f2193694c54d7491bb1a74bbe8e8/examples/optimization/hyper-parameter-optimization
What's your clearml version? (And is it possible you verify with the latest version?)
Anyway, in the docs, there is a function called task.register_artifact()
Yes, this is rather deprecated... The idea is that it will monitor an object and auto-sync it (i.e. serialize and upload it).
That said, it is just so much easier to do task.upload_artifact
and you can always update/overwrite by passing the same name, so I cannot see the actual use case. Does that make sense? What are you using it for ?
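For reference, a minimal sketch of the upload_artifact flow (project/task names and the dict payload are just examples):

```python
from clearml import Task

task = Task.init(project_name="examples", task_name="artifacts demo")

# upload once...
task.upload_artifact(name="summary", artifact_object={"epoch": 1, "acc": 0.71})
# ...and overwrite later simply by re-uploading under the same name
task.upload_artifact(name="summary", artifact_object={"epoch": 2, "acc": 0.78})
```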
shared "warm" folder without having to download the dataset locally.
This is already supported 🙂
Configure sdk.storage.cache.default_base_dir in your clearml.conf to point to a shared (mounted) folder:
https://github.com/allegroai/clearml-agent/blob/21c4857795e6392a848b296ceb5480aca5f98e4b/docs/clearml.conf#L205
That's it 🙂
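For example (the path is a placeholder for your shared mount):

```
sdk {
    storage {
        cache {
            # shared (mounted) folder used as the download cache
            default_base_dir: "/mnt/shared/clearml-cache"
        }
    }
}
```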
ok, yes, but this will install the package of the branch specified there.
Correct
So if I'm working on my own branch and want to run an experiment, I would have to manually put my current branch name in the git path.
When you say your own branch you mean local (i.e. not pushed to remote git repo) ?
Is Task.current_task() creating a task?
Hmm it should not, it should return a Task instance if one was already created.
That said, I remember there was a bug (not sure if it was in a released version or an RC) that caused it to create a new Task if there isn't an existing one. Could that be the case ?
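To illustrate the expected behavior (names are placeholders):

```python
from clearml import Task

# should return the Task created earlier in this process by Task.init(),
# or None if no task exists yet -- it should not create a new one
task = Task.current_task()
if task is None:
    task = Task.init(project_name="examples", task_name="my run")
```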
ZanyPig66 is this reproducible? This sounds like a bug. What's the TB version and OS you are using?
Is this example working for you (i.e. do you see debug images)?
https://github.com/allegroai/clearml/blob/master/examples/frameworks/pytorch/pytorch_tensorboard.py
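If the linked example is not handy, a minimal repro along the same lines (assumes torch and tensorboard are installed; names are placeholders):

```python
import numpy as np
from torch.utils.tensorboard import SummaryWriter
from clearml import Task

task = Task.init(project_name="examples", task_name="tb debug images")
writer = SummaryWriter()
# a random (C, H, W) image; it should appear under the task's "Debug Samples"
writer.add_image("random_image", np.random.rand(3, 64, 64), global_step=0)
writer.close()
```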
SmilingFrog76
there is no internal scheduler in Trains
So obviously there is a scheduler built into Trains, this is the queues (order / priority)
What is missing from it is multi node connection, e.g. I need two agents running the exact same job working together.
(as opposed to, I have two jobs, execute them separately when a resource is available)
Actually my suggestion was to add a SLURM integration, like we did with k8s (I'm not suggesting Kubernetes as a solution for you, the op...
Hi OutrageousGrasshopper93
When the Task is executed on a worker, the presence of spaces breaks the URLs, and from the UI I cannot access the resources on the bucket
You are saying the URLs generated in a remote execution are "broken" and on local execution are working, even though it is the same project/task name ?
You can already sort and filter experiments based on any hyperparameter or metric that the experiment reports; there is no need for any custom query language. Also, any filtered/sorted table can be shared exactly as it is, so you can create leaderboards and share specific filters. You can also use the search bar to filter based on experiment name / comment. Tags will be added soon as well 🙂
Example of custom columns is here (the screen grab is a bit old, now there is als...
model_path/run_2022_07_20T22_11_15.209_0.zip , err: [Errno 28] No space left on device
Where was it running?
I take it that these files are also brought onto the pipeline task's local disk?
Unless you changed the object, then no, they should not be downloaded (the "link" is passed)
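In other words, something along these lines (the task id and artifact name are placeholders; url / get_local_copy are the standard Artifact accessors):

```python
from clearml import Task

producer = Task.get_task(task_id="<id of the step that created the artifact>")
artifact = producer.artifacts["dataset"]

print(artifact.url)                      # just the remote link, nothing downloaded
local_path = artifact.get_local_copy()   # only this call actually fetches the file
```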
Hi PerplexedGoat65
it appears, in a practical sense, this means to mount the second drive, and then bind them in ClearML's configuration
Yes, the entire data folder (reason is, if you lose it, you lose all the server storage / artifacts)
Also, thinking about Docker and slower access speed for Docker mounts and such,
If the host OS is linux, you have nothing to worry about, speed will be the same.
ReassuredTiger98 when you look for task "dca2e3ded7fc4c28b342f912395ab9bc" there are no artifacts ?
Could you add some prints? This should have worked...
Hi @<1523702307240284160:profile|TeenyBeetle18>
and the url of the model refers to a local file, not to the remote storage.
Do you mean that in the Model tab when you look into the model details the URL points to a local location (e.g. file:///mnt/something/model) ?
And your goal is to get a copy of that model (file) from your code, is that correct ?
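Roughly, getting that copy from code would look like this (the model id is a placeholder):

```python
from clearml import InputModel

model = InputModel(model_id="<model id from the UI>")
print(model.url)                      # the registered URL (here a file:// one)
local_path = model.get_local_copy()   # downloads (and caches) the weights file
```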
The easiest would be as an artifact (I think).
Let's assume you put it into a csv file (with pandas or manually)
To upload (from the pipeline Task itself):
    task.upload_artifact(name='summary', artifact_object='~/my/summary.csv')
Then if you want to grab it from anywhere else:
    task = Task.get_task(task_id='HPO controller Task id here')
    my_csv = task.artifacts['summary'].get_local_copy()
If you want to store it as a dict it might be even easier:
    task.upload_artifact(name='summary', artifa...