Hi FierceHamster54
This is already supported; unfortunately, the open-source version only supports static allocation (i.e. you can spin up multiple agents and connect each one to a specific number of GPUs). The dynamic option (where a single agent allocates jobs to multiple GPUs / slices) is only part of the enterprise edition
(there is the hidden assumption there that if you spent so much on a DGX you are probably not a small team 🙂)
I want in my CI tests to reproduce a run in an agent
you mean to run it on the CI machine ?
because the env changes and some things break in agents and not locally
That should not happen, no? Maybe there is a bug that needs fixing on clearml-agent ?
Since I'm assuming there is no actual task to run, and you do not need to setup the environment (is that correct?)
you can do:
```
$ CLEARML_OFFLINE_MODE=1 python3 my_main.py
```
wdyt?
It's the same but done from outside, you want the same and "offline" as well right?
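For reference, the same switch can also be flipped from inside the code instead of the environment variable; a rough sketch (the project/task names are just placeholders):
```
from clearml import Task

# Equivalent to running with CLEARML_OFFLINE_MODE=1: nothing is sent to the
# server, everything is recorded locally so the CI run stays self-contained.
# Note: this must be called before Task.init()
Task.set_offline(offline_mode=True)

task = Task.init(project_name="ci-tests", task_name="reproduce-run")
```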
Hi ShinyRabbit94
system_site_packages: true
This is set automatically when running in "docker mode", no need to worry 🙂
What is exactly the error you are getting ?
Could it be the container itself has the python packages installed in a venv not as "system packages" ?
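For context, this is the flag being discussed as it sits in clearml.conf on the agent machine (just a sketch of the relevant keys, not a full file):
```
agent {
    package_manager: {
        # reuse the python packages already installed in the container / system python
        system_site_packages: true,
    }
}
```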
GentleSwallow91 what you are looking for is here 🙂
https://github.com/allegroai/clearml-agent/blob/178af0dee84e22becb9eec8f81f343b9f2022630/docs/clearml.conf#L149
Oh if this is the case, then by all means push it into your Task's docker_setup_bash_script
It does not seem to have to be done after the git clone; the only part that I can see is setting the PYTHONPATH to the additional repo you are pulling, and that should work.
The main hurdle might be passing credentials to git, but if you are using SSH it should be transparent
wdyt?
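Something along these lines (a rough sketch: the repo URL, image and paths are placeholders, and depending on the agent version you may prefer to set PYTHONPATH on the task's environment rather than exporting it in the setup script):
```
from clearml import Task

task = Task.init(project_name="example", task_name="extra-repo")

# The setup script runs inside the container before the task starts; with SSH
# keys mounted into the container, the extra git clone should be transparent.
task.set_base_docker(
    "python:3.9-bullseye",
    docker_setup_bash_script=[
        "git clone git@github.com:my-org/extra-repo.git /opt/extra-repo",
        "export PYTHONPATH=/opt/extra-repo:$PYTHONPATH",
    ],
)
```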
can the ClearML File server be configured to any kind of storage ? Example hdfs or even a database etc..
DeliciousBluewhale87 long story short, no 🙂 the file server will just store/retrieve/delete files from a local/mounted folder
Is there any way we can scale this file server when our data volume explodes? Maybe it wouldn't be an issue in the K8s environment anyway. Or can it also be configured such that all data is stored in the hdfs (which helps with scalability)? I would su...
Hi @<1566596960691949568:profile|UpsetWalrus59>
just wondering - shouldn't the job still work if I didn't push the commit yet
How would that work? It does not know which commit to take, and it would also fail on the git diff apply, no?
OSError: [Errno 28] No space left on device
Hi PreciousParrot26
I think this says it all 🙂 there is no more storage left to run all those subprocesses
btw:
I am curious about why a ThreadPool of 16 threads is gathered,
This is the maximum number of simultaneous jobs it will try to launch (it will launch more after the launching is done; notice this is only the launching, not the actual execution), but this is just a way to limit it.
Hi VivaciousWalrus21
After restarting training huge gaps appear in iteration axis (see the screenshot).
The Task.init actually tries to understand what the last reported iteration was and continue from that iteration. I'm assuming that what happens is that your code does that as well, which creates a "double shift" that you see as the jump. I think the next version will try to be "smarter" about it and detect this double gap.
In the meantime, you can do:
```
task = Task.init(...)
...
```
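The snippet above is cut off in the log; assuming the workaround is to zero the reported-iteration offset with Task.set_initial_iteration, it would look roughly like this (project/task names are placeholders):
```
from clearml import Task

task = Task.init(project_name="example", task_name="resume-training")
# Assumption: reset the iteration offset so that when your own training code
# also resumes from its checkpointed iteration, the reports are not shifted twice.
task.set_initial_iteration(0)
```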
PreciousParrot26 I think this is really a matter of the CI process having very limited resources. Just to be clear, you are correct and the steps themselves are not executed inside the CI environment, but it seems that even running the pipeline logic is somehow "too much" for the limited resources... Make sense?
The bug was fixed 🙂
Hmm I think this is not doable ... 🙂
(the underlying data is stored in DBs and changing it is not really possible without messing about with the DB)
Hi @<1551376687504035840:profile|StraightSealion9>
AWS Autoscaler to create a new instance when you enqueue a task to the relevant queue.
Does that mean that you were able to enqueue a Task and have it launch on the remote EC2 machine ?
Ok.. so I should generally avoid connecting complex objects? I guess I would create a 'mini dictionary' with a subset of params, and connect this instead.
In theory it should always work, but this specific one fails on a very pythonic paradigm (see below)
```
from copy import copy
an_object = copy(object)
```
A good rule of thumb is to connect any object/dict that you want to track or change later
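So the 'mini dictionary' approach would look something like this (parameter names and values are just placeholders):
```
from clearml import Task

task = Task.init(project_name="example", task_name="connect-params")

# Connect only the plain, serializable parameters you want to track/override,
# instead of the complex object that fails on copy().
params = {"learning_rate": 0.001, "batch_size": 32, "epochs": 10}
params = task.connect(params)  # the returned dict reflects any UI overrides
```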
Is there any references (vlog/blog) on deploying real-time model and do the continuous training pipeline in clear-ml?
Something along the lines of this one ?
https://clear.ml/blog/creating-a-fully-automatic-retraining-loop-using-clearml-data/
Or this one?
https://www.youtube.com/watch?v=uNB6FKIi8Wg
Curious what advantage it would be to use the StorageManager
Basically if you set the clearml cache folder to the EFS, users can always do:
```
from clearml import StorageManager
local_file = StorageManager.get_local_copy("<remote file url>")
```
where local_file is stored on the persistent cache (EFS) and the cache is automatically cleaned based on the last accessed file
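Pointing the cache at the EFS mount is a clearml.conf setting on the user machines; a sketch (the mount path here is just an example):
```
sdk {
    storage {
        cache {
            # shared EFS mount used as the ClearML download cache
            default_base_dir: "/mnt/efs/clearml-cache"
        }
    }
}
```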
Apparently the error comes when I try to access from get_model_and_features the pipeline component load_model. If it is not set as a pipeline component and only as a helper function it works (provided it is declared before the component that calls it; I already understood that and fixed it, different from the code I sent above).
ShallowGoldfish8 so now I'm a bit confused, are you saying that now it works as expected ?
but can it NOT use /tmp for this? I'm merging about 100GB
You mean to configure your Temp folder for when squashing ?
you can do the following hack:
```
import tempfile
tempfile.tempdir = "/my/new/temp"
# Dataset squash goes here
tempfile.tempdir = None
```
But regardless, I think this is worth a GitHub issue with a feature request to set the temp folder
VexedCat68 both are valid. In case the step was cached (i.e. already executed) the node.job will be None, so it is probably safer to get the Task based on the "executed" field which stores the Task ID used.
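In code it would look roughly like this; get_step_task is a hypothetical helper, and node is assumed to be the pipeline step node you already hold (e.g. the one passed to the controller's pre/post execution callbacks):
```
from clearml import Task

def get_step_task(node):
    """Return the Task behind a pipeline step node, cached or not.

    If the step was cached, node.job is None, but node.executed still holds
    the ID of the Task that produced the (cached) result.
    """
    return Task.get_task(task_id=node.executed)
```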
It takes 20mins to build the venv environment needed by the clearml-agent
You are joking?! 🙂
it does apt-get install python3-pip and pip install clearml-agent, how is that 20min?
When looking at the worker details, it says "No queues currently assigned to this worker"
Yes, I think we should have better information there. The "AWS service" is not directly pulling jobs from any specific queue, hence nothing is listed there. It is "listening" to queues and launching machines, and those machines will be listening to the queue. I wonder if it is just easier to also make sure it is listed as "assigned" to those queues. wdyt?
Why does my task execution freeze after pip installation (running agent in foreground mode)?
Hi AdventurousButterfly15
Are you running in agent docker mode or venv mode ?
What do you mean freeze? Do you see anything in the Task console log in the UI? What's the host OS?
how to put or handle this configuration and where?
In your clearml.conf on the machine with the agent, just add at the bottom of the file:
```
agent.venvs_cache.path=~/.clearml/venvs-cache
```
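If you prefer the full section over the single dotted key, the reference clearml.conf has it roughly like this (values mirror the shipped defaults, adjust as needed):
```
agent {
    venvs_cache: {
        # maximum number of cached venvs to keep around
        max_entries: 10
        # minimum free space (GB) to keep on the cache drive
        free_space_threshold_gb: 2.0
        path: ~/.clearml/venvs-cache
    }
}
```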