AgitatedDove14

48 Questions, 8049 Answers

Active since 10 January 2023

Last activity 5 months ago

Reputation

Badges 1

25 × Eureka!

Questions 48
Answers 8049

0 Votes

0 Answers

969 Views

0 Votes 0 Answers 969 Views

Gals, Guys & :robot_face: If you want to get some inspiration on building DL Continuous Integration pipelines, I suggest this post (obviously built on top of Trains :smile_cat: ) <https://twitter.com/PyTorch/status/1272919483980500999>

Gals, Guys & :robot_face: If you want to get some inspiration on building DL Continuous Integration pipelines, I suggest this post (obviously built on top of...

clearml

4 years ago

0 Votes

0 Answers

968 Views

0 Votes 0 Answers 968 Views

@PunySquid88 I'm not very familiar with what they do, but it seems that although it has a backend server as an option, it will mostly target single users with what seems like an easy to use single app. From the Reddit thread it seems that it is still not

PunySquid88 I'm not very familiar with what they do, but it seems that although it has a backend server as an option, it will mostly target single users with...

clearml

4 years ago

0 Votes

0 Answers

1K Views

0 Votes 0 Answers 1K Views

Apparently Everyone Can ...

apparently everyone can ...

clearml

4 years ago

0 Votes

3 Answers

972 Views

0 Votes 3 Answers 972 Views

Hi , v0.15 is out, 🎉 🚀 Your feedback had a major influence on the features we added 🙂 thank you! A selected list of features: Column resizing / ordering /...

clearml

4 years ago

0 Votes

0 Answers

1K Views

0 Votes 0 Answers 1K Views

@YummyWhale40 awesome thanks!

YummyWhale40 awesome thanks!

clearml

4 years ago

0 Votes

0 Answers

941 Views

0 Votes 0 Answers 941 Views

<!here> Gals/Guys/:robot_face: If you have ideas on improving the Slack Monitoring service, please add them on the dedicated Github Issue : <https://github.com/allegroai/trains/issues/161> For example: generate an alert if my experiment reaches a certain

Gals/Guys/ :robot_face: If you have ideas on improving the Slack Monitoring service, please add them on the dedicated Github Issue : https://github.com/alleg...

clearml

4 years ago

0 Votes

0 Answers

1K Views

0 Votes 0 Answers 1K Views

YEY!!!! *Download as CSV* :exploding_head:

YEY!!!! Download as CSV 🤯

clearml

2 years ago

0 Votes

0 Answers

1K Views

0 Votes 0 Answers 1K Views

Well To Be Honest, We Kind Of Thought It'S Redundant. Basically Storing Artifacts In Experiments And Having Them Retrieved Quickly From The Code Itself Was Way More Convenient For Us Then To Manually Have To Do Clone/Pull Of The Data... Example: Create Da

Well to be honest, we kind of thought it's redundant. Basically storing artifacts in experiments and having them retrieved quickly from the code itself was w...

clearml

4 years ago

0 Votes

0 Answers

987 Views

0 Votes 0 Answers 987 Views

Hey <!here> Just a heads up, starting *Jan 25th*, the default <http://demoapp.demo.clear.ml/|ClearML demo server> will move to a *daily* reset cycle (replacing the current weekly cycle). Anybody needing more than 24h data retention is welcome to use our <

Hey Just a heads up, starting Jan 25th , the default http://demoapp.demo.clear.ml/ will move to a daily reset cycle (replacing the current weekly cycle). Any...

clearml

3 years ago

0 Votes

0 Answers

1K Views

0 Votes 0 Answers 1K Views

Is It A One Time Thing? Or Recurring?

Is it a one time thing? or recurring?

clearml

4 years ago

0 Votes

2 Answers

949 Views

0 Votes 2 Answers 949 Views

Hi ! trains 0.16.2 is finally out with the new pipelines interface! Check out the new example https://github.com/allegroai/trains/blob/master/examples/pipeli...

clearml

3 years ago

0 Votes

0 Answers

966 Views

0 Votes 0 Answers 966 Views

<!channel> *important notice* : it seems Nvidia broke some of their PPA's security :confused: , causing `apt-get updates` to fail inside containers. This in term will cause `clearml-agent` to fail with specific Nvidia containers. _If you are seeing simila

important notice : it seems Nvidia broke some of their PPA's security 😕 , causing apt-get updates to fail inside containers. This in term will cause clearml...

clearml

2 years ago

0 Votes

0 Answers

1K Views

0 Votes 0 Answers 1K Views

I Would Guess Connectivity Issues, The Tls Is Probably Python Inaccurate Response (I Mean In A Way, It Is Also A Tls Error, But I Would Imagine This Has More To Do With The Actual Network Connection)

I would guess connectivity issues, the TLS is probably python inaccurate response (I mean in a way, it is also a TLS error, but I would imagine this has more...

clearml

4 years ago

0 Votes

1 Answers

1K Views

0 Votes 1 Answers 1K Views

This Is Usually Due To Enterprise Level Issued Https Certificates Not Part Of The Local Installation (Basically Any Python Generated Ssl Request Will Fail)

This is usually due to enterprise level issued https certificates not part of the local installation (basically any python generated SSL request will fail)

clearml

4 years ago

0 Votes

0 Answers

1K Views

0 Votes 0 Answers 1K Views

New Rc For Trains-Agent Is Out

New RC for trains-agent is out pip install trains-agent==0.13.2rc1

clearml

4 years ago

0 Votes

0 Answers

1K Views

0 Votes 0 Answers 1K Views

you set it :slightly_smiling_face:

you set it 🙂

clearml

4 years ago

0 Votes

6 Answers

401 Views

0 Votes 6 Answers 401 Views

Hi :robot_face: , humans We have the new documentation site up and running 🎉 None 🎊 This is still a work in progress, so we keep the previous version alive...

clearml

3 years ago

0 Votes

3 Answers

370 Views

0 Votes 3 Answers 370 Views

These Are Xgboost Internal Metrics That Are Automatically Picked By Clearml

@<1523703325881536512:profile|ConvolutedSealion94> these are xgboost internal metrics that are automatically picked by clearml

xgboost

one year ago

Show more results

0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

GrotesqueDog77 one issue with this design, in order to run a sub-component, the call must be done from the parent component, does that make sense?

` def step_one(data):
return data

def step_two(path):
return model

def both_steps()
path = step_one("stuff")
return step_two(path)

def pipeline():
both_steps() Which would make both_steps ` a component and step_one and step_two sub-components
wdyt?

one year ago

0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

If you take a look here, the returned objects are automatically serialized and stored on the files server or object storage, and also deserialized when passed to the next step.
https://github.com/allegroai/clearml/blob/master/examples/pipeline/pipeline_from_decorator.py

You can of course do the same manually

one year ago

0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

But every agent is a different pod so I do not know how properly share the folder with images.

Can I conclude Kubernetes running the agents ?

one year ago

0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

Yes, but I'm not sure that they need to have separate task

Hmm okay I need to check if this can be easily done
(BTW, the downside of that, you can only cache a component, not a sub-component)

one year ago

0 Heyo, After Building Some Custom Pipelining Functionality On Mlflow, I Started Looking For Better Software That Can Beat What I Created - With A Similar Amount Of Effort. Problem Has Been That Up Till Now, All I Found Could Make Things Way Better But Al

ContemplativePuppy11

yes, nice move. my question was to make sure that the steps are not run in parallel because each one builds upon the previous one

if they are "calling" one another (or passing data) then the pipeline logic will deduce they cannot run in parallel 🙂 basically it is automatic

so my takeaway is that if the funcs are class methods the decorators wont break, right?

In theory, but the idea of the decorator is that it tracks the return value so it "knows" how t...

one year ago

0 Hi All, I Was Trying To Use Clearml-Task To Run A Custom Docker(With Poetry To Install All The Python Dependencies And Activated The Environment) Using Clearml Gpu, But It Seems Like Clearml Always Create A Virtual Environment And Run The Python Script Fr

you should have a gpu argument there, set it to true

one year ago

0 Hi Folks, I Have A Question Related To The Storage Of Artifacts, As It Is Not Entirely Clear To Me Where To Configure It. If I Read The Documentation

when I run it on my laptop...

Then yes, you need to set the default_output_uri on Your laptop's clearml.conf (just like you set it on the k8s glue)
Make sense ?

2 years ago

0 Hello Guys, I Have A Strange Situation With A Pipeline Controller I'M Testing Atm. If I Run The Controller Directly In My Pycharm On Notebook It Connects Correctly To The K8S Cluster With Trains Installed. After This, If I Go Directly In The Ui, I Reset T

No worries, just found it. Thanks!
I'll make sure to followup on the GitHub issue for better visibility 🙂

3 years ago

0 Is There Any Reason Why Doing The Following Is Not Possible? Am I Doing It Right? I Want To Run A Pipeline With Different Parameters But I Get The Following Error?

but does that mean I have to unpack all the dictionary values as parameters of the pipeline function?

I was just suggesting a hack 🙂 the fix itself is transparent (I'm expecting it to be pushed tomorrow), basically it will make sure the sample pipeline will work as expected.
regardless and out of curiosity, if you only have one dict passed to the pipeline function, why not use named arguments ?

2 years ago

0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

"sub nodes" inside pipeline, in my opinion, makes them much more useful, in sense that all the steps are visible.

Yeah I really like this idea... continuing this thread, would it also make sense to have a Task object per "sub-node" and run the sub-nodes as subprocess of the parent Node? I'm thinking this sounds like a combination of both local pipeline execution and remote pipeline execution.
wdyt?

one year ago

0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

GrotesqueDog77 when you say "the second issue" , do you mean the fact that both step 1 and step 2 should have access to the same filesystem?

one year ago

0 Hey Guys, I Believe

how would I get an agent to launch in the same instance of my clearml server

Actually that is my point, you do not have to spin the agent on the clearml-server instance. We added the services agent as part of the docker-compose for easier deployment, that said you can always manually SSH to the server, or run on any other machine, like you would spin any other clearml-agent .
Does that make sense ?

2 years ago

0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

Sounds good to me, adding it to the to do list, probably should not be very complicated to add 🙂

one year ago

0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

GrotesqueDog77 this should just work, decorate the functions with @PipelineDecorator.component and call the functions one after the other
paths = step_one() step_two(paths)ClearML will make sure it serializes the strings and pass them to step two (of course step two should actually run on a machine with access to the same folder, but this is another issue 🙂 )

one year ago

0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

If this is the case, then you have to set a shared PV for the pods, this way they can actually have a persistent cache, which would also be shared.
BTW: a single function call might not be a perfect match for a pipeline component , the overhead of starting a node might not be negligible as it needs to install required python packages bring the code etc.

one year ago

0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

Ohh, clearml is designed so that you should not worry about that, download_dataset = StorageManger.get_local_copy() this is cashed, meaning the machine that runs that like the second time will not re download the path.
This means step 1 is redundant, no?
Usually when data is passed between components it is automatically uploaded as artifact to the Task (stored on the files server or object storage etc.) then downloaded and passed to the next steps.
How large is the data that you are wo...

one year ago

0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

because step can be constructed with multiple

sub-components

but not all of them might be added to the UI graph

Just to make sure I fully understand when we decorate with @sub_node we want that to also appear in the UI graph (and have it's own Task / metrics etc)
correct?

one year ago

0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

Makes total sense!
Interesting, you are defining the sub-component inside the function, I like that, this makes the code closer to how this is executed!

one year ago

0 Hi, I Went Through This Slack'S History And The Problem Already Popped Up A Couple Of Times But Doesn'T Look Like Solved. On My Machine I Currently Have 4 Gpus, No Problems If I Want To Allocate All 4 Or Just 1 Using

BTW:

Error response from daemon: cannot set both Count and DeviceIDs on device request.

Googling it points to a docker issue (which makes sense considering):
https://github.com/NVIDIA/nvidia-docker/issues/1026
What is the host OS?

3 years ago

0 Hi All! I Have A Question About Pipelines. My Pipeline Consists Of Several Steps:

Hi GrotesqueDog77
What do you mean by share resources? Do you mean compute or storage?

one year ago

0 Hi. When Using Sklearn'S

DistressedGoat23 check this example:
https://github.com/allegroai/clearml/blob/master/examples/optimization/hyper-parameter-optimization/hyper_parameter_optimizer.py
aSearchStrategy = RandomSearchIt will collect everything on the main Task

This is a curial point for using clearml HPO since comparing dozens of experiments in the UI and searching for the best is just not manageable.

You can of course do that (notice you can actually order them by scalars they report, and even do ...

one year ago

0 I Seem To Be Missing Something ... I'Ve Only Got One Task Running To Train A Segmentation Model On My Local Machine, And In A Few Days It'S Hit Over 1.15M Api Calls. It Looks Like It'S Sending Every Single Console Output ... Are There Settings To Control

Hi @<1572395184505753600:profile|GleamingSeagull15>
Try adjusting:
None
to 30 sec
It will reduce the number of log reports (i.e. API calls)

one year ago

0 Hi, I Am Running An Optimization Task With Optimizeroptuna (Using Your Doc

AbruptWorm50 my apologies I think I mislead you you, yes you can pass geenric arguments to the optimizer class, but specifically for optuna, this is disabled (not sure why)
Specifically to your case, the way it works is:
your code logs to tensorboard, clearml catches the data and moves it to the Task (on clearml-server), optuna optimization is running on another machine, trail valies are maanually updated (i.e. the clearml optimization pulls the Task reported metric from the server and updat...

2 years ago

0 I Have Set

there is almost zero overhead if your docker container alreadyt has everything (including the agent) preinstalled and you set it with CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
it then should basically just run the code.

4 months ago

0 Hi! I Have Local Minio Setup, Via Minio Browser I Can Upload 50-100 Mb Per Second As Its Local. But When I Try To Use Task.Upload_Artifact It Uploads 500 Kb Per Second. Does Anyone Have An Idea About This?

There is some overhead, but it should be negligible.

4 years ago

0 Hi, I Expect There Is A Limitation In Time The Free Service

I think the limit is a few GB, I'm not sure, I'll have to check
And yes the oldest experiments will be deleted first (with the exception of published experiments, they will be deleted last)

3 years ago

0 Hi Guys, I Have Been Running The Clearml-Serving For A While Now And I Realize That From Time To Time After A Couple Of Hours The Serving Task (Control Plane) That Is Configured Through The Cli Goes Into Status Abort. This Happens Even Though All The Pods

Okay we have located the issue, thanks guys! We will push a patch release hopefully later today

7 months ago

0 When I Run Experiments I Set

Hi IntriguedRat44
Sorry, I missed this message...
I'm assuming you are running in manual mode (i.e. not through the agent), in that case we do not change the CUDA_VISIBLE_DEVICES.
What do you see in the resource monitoring? Is it a single GPU or multiple GPUs?
(Check the :monitor:gpu in the Scalar tab under results,)
Also what's the Trains/ClearML version you are suing and the OS ?

3 years ago

0 Hello People

you can also just create a venv and run the tests there (with the latest python package) ?

2 years ago

0 Getting This Error At

Ohh then YES!
the Task will be closed by the process, and since the process is inside the Jupyter and the notebook kernel is running, it is still running

3 years ago

Show more results