ok great, I'll check what other changes we missed yesterday
was allow_archived removed from Task.query_tasks?
alright, so actually we noticed that the problem disappears if we use only sync requests. Meaning, if I create a sleep endpoint that is async we get the 502, but if it's sync we don't
we have tried both and got the same issue (gunicorn vs uvicorn).
No, I meant creating an endpoint like this:
import time

from fastapi import APIRouter, status
from pydantic import BaseModel

router = APIRouter()

class TestResponse(BaseModel):  # minimal stand-in for the real response model
    status: str

@router.post(
    "/sleep",
    tags=["temp"],
    response_description="Return HTTP Status Code 200 (OK)",
    status_code=status.HTTP_200_OK,
    response_model=TestResponse,
)
# def here instead of async def
def post_sleep(time_sleep: float) -> TestResponse:
    """Block the worker for time_sleep seconds, then return OK."""
    time.sleep(time_sleep)
    return TestResponse(status="OK")
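For reference, the async variant we compared against looked roughly like this (a sketch; the "/sleep-async" path name is just illustrative):
import asyncio

@router.post(
    "/sleep-async",  # illustrative path, not necessarily the real one
    tags=["temp"],
    status_code=status.HTTP_200_OK,
    response_model=TestResponse,
)
async def post_sleep_async(time_sleep: float) -> TestResponse:
    """Yield the event loop for time_sleep seconds, then return OK."""
    await asyncio.sleep(time_sleep)
    return TestResponse(status="OK")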
that's a fair point. Actually we have switched away from siege, because we believe it was causing the issues, and are using Locust now instead. We have been running for days at the same rate and don't see any errors being reported...
no requests are being served, as in there is indeed no traffic
so they ping the web server?
what is actually setting the task status to Aborted?
my understanding was that the daemon thread was deserializing the task from the control plane every 300 seconds by default
We put back the additional changes and so far it seems that this has solved our issue. Thanks a lot for the quick turnaround on this.
Hi @SuccessfulKoala55,
I'm running into almost the same error (see below), but I want to connect to the free ClearML server version at None, so I have set up the corresponding env variables in example.env:
CLEARML_WEB_HOST="
"
CLEARML_API_HOST="
"
CLEARML_FILES_HOST="
"
CLEARML_API_ACCESS_KEY="---"
CLEARML_API_SECRET_KEY="---"
CLEARML_SERVING_TASK_ID="---"
I have set up the right values from...
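(For reference, assuming the hosted free tier is the target, those host values typically point at ClearML's hosted endpoints; the keys stay account-specific:)
CLEARML_WEB_HOST="https://app.clear.ml"
CLEARML_API_HOST="https://api.clear.ml"
CLEARML_FILES_HOST="https://files.clear.ml"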
Hi Alex,
thanks for your answer. I'm curious about your third point about using OutputModel. I could not figure out from the documentation how you actually use it. I constructed the OutputModel object as such:
out = OutputModel(task, name="my_model", framework="xgboost")
However, I could not find any method in the docs that would allow me to pass the model object to that instance; or, put differently, I can't understand how to use that OutputModel to register my model, which would be stored in a...
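(A minimal sketch of what I'd expect to work, assuming the model is first serialized to a local file and then registered via OutputModel.update_weights; project/task/file names and the toy training data are illustrative:)
import numpy as np
import xgboost as xgb
from clearml import Task, OutputModel

task = Task.init(project_name="examples", task_name="register-model")  # illustrative names
out = OutputModel(task=task, name="my_model", framework="xgboost")

# Train a trivial model just so there is something to save.
dtrain = xgb.DMatrix(np.random.rand(10, 3), label=np.random.randint(0, 2, 10))
booster = xgb.train({"objective": "binary:logistic"}, dtrain, num_boost_round=2)

# Save the model locally, then register the file as this task's output model.
booster.save_model("my_model.xgb")
out.update_weights(weights_filename="my_model.xgb")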
I'm not sure what to do with that info, I must say, since serve_model is async for good reasons, I guess
I have tested with an endpoint that basically adds two numbers and never managed to trigger the 502. I'm starting to wonder if we are not just running too many workers. I had it wrong that 2 vCPUs meant 5 workers would be fine; it should probably be closer to 2, but I'm not sure why that would lead to requests being dropped
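(For context, the 5-workers-on-2-vCPUs figure matches gunicorn's usual rule of thumb for sync workers, (2 x num_cores) + 1; a sketch of expressing that in a config file, file name illustrative:)
# gunicorn_conf.py (illustrative) -- size workers from the CPU count
import multiprocessing

# gunicorn's suggested starting point for sync workers; async
# (uvicorn) workers can typically be fewer per core.
workers = multiprocessing.cpu_count() * 2 + 1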
Actually the requests are never registered by the gunicorn app, and the ALB logs show that there is no response from the target ("-").
Hi Martin,
- Actually we are using an ALB with a 30-second timeout
- we do not have GPU instances
- docker version 1.3.0
Geez, I have been looking for this for a while, thanks for saving my day...again.
I'm assuming that task.data.script.requirements is not the right way to do this...
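(If it helps, a hedged sketch of the route I'd try instead, assuming a clearml SDK where Task.add_requirements is available; the package pin and names are illustrative:)
from clearml import Task

# Declare an explicit requirement before Task.init instead of
# mutating task.data.script.requirements directly.
Task.add_requirements("xgboost", "1.7.5")  # illustrative pin
task = Task.init(project_name="examples", task_name="pinned-reqs")  # illustrative names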
I can't be sure of the version, I can't check at the moment; 1.3.0 off the top of my head, but I could be way off
yeah, I don't know, I think we are probably just trying to fit too high a throughput for that box, but it's weird that the packets just get dropped; I would have assumed the response time would degrade and requests would be queued.
Hey, thanks a lot Alex, that's exactly what I was looking for. Cheers
we are actually building from our fork of the code into our own images and helm charts
hey Martin, real quick actually, on your update to the requirements.txt file, isn't that constraint on fastapi inconsistent?
how can it be >= 0.109.1 and lower than 0.96?
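(Spelling out the inconsistency: an upper bound below the lower bound leaves an empty range; the lines below are illustrative, not the actual file:)
# unsatisfiable: no fastapi release can be both >= 0.109.1 and < 0.96
# fastapi>=0.109.1,<0.96
# ordered bounds would be satisfiable instead, e.g.:
fastapi>=0.109.1,<1.0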
Hey Martin, I will, but it's a bit more tricky because we have modifications in the code that I have to merge on our side
Hi Martin, thanks a lot for looking into this so quickly. Will you let me know the version number once it's pushed? Thanks!
ok, so I haven't looked at the latest changes after the sync this morning, but the ones we put in yesterday seem to have fixed the issue; the service is still running this morning at least.