Hi ObedientDolphin41
I keep bumping against the ModuleNotFoundError: No module named ... exception.
Import the package inside the component function (the one you decorated); ClearML will then automatically list it in the component's requirements.
You can also set it manually by passing it as the "packages" argument of the decorator:
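A minimal sketch of both options; the package names and the function itself are just placeholders:
```python
from clearml.automation.controller import PipelineDecorator

# explicitly listing packages on the component decorator
# ("pandas" / "scikit-learn" here are just example package names)
@PipelineDecorator.component(packages=["pandas>=1.5", "scikit-learn"])
def preprocess_step(data_path: str):
    # importing inside the function body also lets ClearML auto-detect the requirement
    import pandas as pd
    return pd.read_csv(data_path)
```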
Do note that the needed module is just a local folder with scripts.
Oh, that is the issue. Is it in the git repo?
Hmm, that makes sense. BTW, the PYTHONPATH set by the agent would be the working dir listed under the Task. But if you set agent.force_git_root_python_path, the agent will also add the root of the git repo to the python path.
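For reference, a sketch of the relevant snippet in the agent's clearml.conf (assuming it sits under the agent section, as in the default config):
```
agent {
    # also add the root of the git repository to PYTHONPATH
    # (in addition to the Task's working directory)
    force_git_root_python_path: true
}
```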
Calling the script without PipelineDecorator.run_locally(), i.e. running the pipeline remotely, still gives the ModuleNotFoundError: No module named ...
Do you have the needed module listed on the pipeline controller Task? (press the details link, then go to the Execution tab / "Installed Packages")
it means it should work in ~/clearml.conf, no?
Yes exactly
I was hoping to be able to set the default server-wide
I think this type of server-wide default is not supported in the open-source version.
But in most cases, setting it up on the clearml-agents is probably the important thing. BTW: you can also set it with the OS environment variable CLEARML_DEFAULT_OUTPUT_URI
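A quick sketch of the two client-side options; the bucket path and project/task names are just examples:
```python
import os
from clearml import Task

# option 1: set the environment variable before the Task is created
os.environ["CLEARML_DEFAULT_OUTPUT_URI"] = "s3://my-bucket/clearml"  # example bucket

# option 2: pass the destination explicitly per Task
task = Task.init(
    project_name="examples",
    task_name="training",
    output_uri="s3://my-bucket/clearml",  # example bucket
)
```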
I want to be able to compare scalars of more than 10 experiments, otherwise there is no strong need yet
Makes sense. In the next version (not the one that will be released next week, the one after, with reports; shhh, don't tell anyone 🙂), they tell me this is solved 🙂
Hi @<1541954607595393024:profile|BattyCrocodile47> and @<1523701225533476864:profile|ObedientDolphin41>
"we're already on AWS, why not use SageMaker?"
TBH, I've never gone through the ML workflow with SageMaker.
LOL I'm assuming this is why you are asking 🙂
- First, you can use SageMaker and still log everything to ClearML (2-line integration; see the sketch after this list). At least you will have visibility into everything that is running/failing 🙂
- SageMaker job is a container, which means for ...
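The 2-line integration from the first point is essentially adding this at the top of the SageMaker training script (project/task names are placeholders):
```python
from clearml import Task

# these two lines are the whole integration; framework outputs
# (PyTorch, TensorFlow, etc.) are then captured automatically
task = Task.init(project_name="sagemaker-experiments", task_name="training-run")
```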
MistakenDragonfly51 just making sure I understand: on Your machine (the one running the pytorch example), you have set CLEARML_DEFAULT_OUTPUT_URI / configured the clearml.conf file with default_output_uri, yet the model checkpoint was Not uploaded?
Hi MistakenDragonfly51
I'm trying to set default_output_uri in ...
This should be set either on your client side, or on the worker machine (running the clearml-agent).
Make sense?
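For reference, a minimal sketch of the clearml.conf entry (the bucket path is just an example):
```
sdk {
    development {
        # default destination for uploaded models / artifacts
        default_output_uri: "s3://my-bucket/clearml"
    }
}
```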
Hi @<1690896098534625280:profile|NarrowWoodpecker99>
Once a model is loaded into GPU memory for the first time, does it stay loaded across subsequent requests,
yes it does.
Are there configuration options available that allow us to control this behavior?
I'm assuming you're thinking of dynamic loading/unloading of models from memory based on requests?
I wish Triton added that 🙂 this is not trivial, and in reality, to be fast enough the model has to live in RAM and then be moved to the GPU (...
Hi @<1713001673095385088:profile|EmbarrassedWalrus44>
So Triton has load/unload model, but these are slowwww, meaning you cannot use them inside a request (you'll just hit the request timeout every time it tries to load the model)
as you can see this is classified as "wish-list", this is not trivial to implement and requires large CPU RAM to store the entire model, so "loading" becomes moving from CPU to GPU memory (which is also not the fastest, but the best you can do). As far as I understand ...
That speed depends on model sizes, right?
in general yes
Hope that makes sense. This would not work under heavy loads, but eg we have models used once a week only. They would just stay unloaded until use - and could be offloaded afterwards.
but then you still might encounter a timeout the first time you access them, no?
Do you have any advice for this step, (monitoring)? I feel like it's not very well documented.
Yeah I think it is complicated.
I would start with the example here: None
Basically what it does is create a histogram over time of the values the REST API gets. Then Grafana visualizes those values.
Notice that the request latency / frequency are automatically logged ...
Also, what's the additional p doing at the last line of the screenshot?
Something like the TYPE_STRING that Triton accepts.
I saw the GitHub issue, this is so odd, look at the Triton python package:
https://github.com/triton-inference-server/client/blob/4297c6f5131d540b032cb280f1e[…]1fe2a0744f8e1/src/python/library/tritonclient/utils/__init__.py
can we use a currently setup virtualenv by any chance?
You mean, does the clearml-agent need to set up a new venv each time? Are you running in docker mode?
(by default it is caching the venv so the second time it is using a precached full venv, installing nothing)
Hi @<1547028116780617728:profile|TimelyRabbit96>
Trying to do model inference on a video, so the first step in the Preprocess class is to extract frames.
Basically this depends on the RestAPI; usually you would be sending a link to the data to be processed and returned synchronously.
What you should have is a custom endpoint doing the extraction, sending the raw data into another endpoint doing the model inference; basically, think "pipeline" endpoints:
[None](https://github.com/allegro...
, but are you suggesting sending the requests to Triton frame-by-frame?
yes! The Triton backend will do the auto-batching, and in an enterprise deployment the gRPC load balancer will split it across multiple GPU nodes 🙂
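A rough sketch of what the extraction endpoint could look like as a clearml-serving Preprocess class; the send_request helper mirrors the pipeline example in the clearml-serving repo, and the endpoint name, payload layout and OpenCV usage are assumptions:
```python
from typing import Any

import cv2  # assumes opencv-python is available to the serving container


# the class must be named "Preprocess" for clearml-serving custom endpoints
class Preprocess(object):
    def process(self, data: Any, state: dict, collect_custom_statistics_fn=None) -> Any:
        # "data" is assumed to carry a link to the video to be processed
        capture = cv2.VideoCapture(data["video_url"])
        results = []
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            # forward each extracted frame to the model-inference endpoint;
            # "/frame_model/" is a hypothetical endpoint name
            results.append(
                self.send_request(endpoint="/frame_model/", version=None, data=frame.tolist())
            )
        capture.release()
        return results

    def send_request(self, endpoint, version, data):
        # mock signature, replaced at runtime by clearml-serving (as in the pipeline example)
        pass
```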
One issue that I see is that the Dockerfile inside the agent container
Not sure I follow, these are settings for the default container to be used when the agent spins a Task for you.
How are you running the agent itself ?
and of course if your docker has packages preinstalled they are automatically used (not reinstalled)
Hi @<1547028116780617728:profile|TimelyRabbit96>
You are absolutely correct, we need to allow overriding the configuration
The code you want to change is here:
None
You can try:
channel = self._ext_grpc.aio.insecure_channel(triton_server_address, options=[('grpc.max_send_message_length', 512 * 1024 * 1024), ('grpc.max_receive_message_length', 512 * 1024 * 1024)])
We are working on 1.3.0 so this is right in time
I set up the alert rule on this metric by defining a threshold to trigger the alert. Did I understand correctly?
Yes exactly!
Or the new metric should...
basically combining the two, yes looks good.
Can you send the full log? This is odd, by default it will use the python executable that it (the agent) is running with.
Regardless you can specify the python executable to be used here:
https://github.com/allegroai/clearml-agent/blob/bd411a19843fbb1e063b131e830a4515233bdf04/docs/clearml.conf#L44
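The matching clearml.conf snippet would look roughly like this (the interpreter path is just an example):
```
agent {
    # force the agent to use this python interpreter when building task environments
    python_binary: "/usr/bin/python3.10"
}
```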
So actually while we're at it, we also need to return back a string from the model, which would be where the results are uploaded to (S3).
Is this being returned from your Triton Model? or the pre/post processing code?
Yes, I think you are correct, verified on Firefox & Chrome. I'll make sure to pass it along.
Thanks SteadyFox10 !
I see what you mean.

from clearml.automation import HyperParameterOptimizer

# aSearchStrategy is whichever search-strategy class was chosen earlier
an_optimizer = HyperParameterOptimizer(
    base_task_id='39d2c27baa8145929b2e21f686a17046',
    hyper_parameters=[],
    objective_metric_title='epoch_accuracy',
    objective_metric_series='epoch_accuracy',
    objective_metric_sign='max',
    optimizer_class=aSearchStrategy,
    max_iteration_per_job=0,
    total_max_jobs=0,
    auto_connect_task=False,
)
print(an_optimizer.get_top_experiments(top_k=5))