
We have tried both and got the same issue (gunicorn vs uvicorn).
No I meant creating a
@router.post(
    "/sleep",
    tags=["temp"],
    response_description="Return HTTP Status Code 200 (OK)",
    status_code=status.HTTP_200_OK,
    response_model=TestResponse,
)
# def here instead of async def
def post_sleep(time_sleep: float) -> TestResponse:
    """ """
    time.sleep(time_sleep)
    return TestResponse(status="OK")
Geez, I have been looking for this for a while, thanks for saving my day...again.
Was allow_archived removed from Task.query_tasks?
that's a fair point. Actually we have switched from using siege because we believe it is causing the issues and are using Locust now instead. We have been running for days at the same rate and don't see any errors being reported...
My understanding was that the daemon thread was deserializing the task of the control plane every 300 seconds by default.
What is actually setting the task status to Aborted?
Hi @<1523701087100473344:profile|SuccessfulKoala55> ,
I'm running into almost the same error (see below), but I want to connect to the free ClearML server version, so I have set up the corresponding env variables in example.env:
CLEARML_WEB_HOST=""
CLEARML_API_HOST=""
CLEARML_FILES_HOST=""
CLEARML_API_ACCESS_KEY="---"
CLEARML_API_SECRET_KEY="---"
CLEARML_SERVING_TASK_ID="---"
I have set up the right values from...
Hey Martin, I will, but it's a bit more tricky because we have modifications in the code that I have to merge on our side
I'm assuming that task.data.script.requirements is not the right way to do this...
tx that's what I was doing more or less 😆
alright, so actually we noticed that the problem disappears if we use only sync requests. Meaning if I create a sleep endpoint that is async we get the 502 but if it's sync we don't
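For reference, a minimal sketch of why the two endpoint styles can behave differently. It uses only asyncio, not the actual serving code: a blocking time.sleep inside a coroutine stalls the event loop for every other request, whereas FastAPI runs plain def endpoints in a worker thread pool. The handler and helper names below are hypothetical, and this is not a confirmed explanation of the 502s.

```python
import asyncio
import time


async def blocking_handler(delay: float) -> str:
    # Blocking call inside a coroutine: the event loop cannot serve
    # any other request until time.sleep returns.
    time.sleep(delay)
    return "OK"


async def threaded_handler(delay: float) -> str:
    # Offload the blocking call to a worker thread, which is roughly
    # what happens to a plain `def` endpoint in FastAPI.
    await asyncio.to_thread(time.sleep, delay)
    return "OK"


async def run_concurrently(handler, n: int, delay: float) -> float:
    # Fire n "requests" at once and return the total wall-clock time.
    start = time.perf_counter()
    await asyncio.gather(*(handler(delay) for _ in range(n)))
    return time.perf_counter() - start


def measure(handler, n: int = 4, delay: float = 0.2) -> float:
    # Hypothetical helper: runs the experiment in a fresh event loop.
    return asyncio.run(run_concurrently(handler, n, delay))
```

Calling measure(blocking_handler) takes about n * delay because the calls run one after another, while measure(threaded_handler) takes about one delay. If a load balancer health check has to wait behind blocked requests like this, it could time out, but that link is an assumption here.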
Hi Martin, thanks a lot for looking into this so quickly. Will you let me know the version number once it's pushed? Thanks!
How can you be >= 0.109.1 and lower than 0.96?
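To make the point concrete, here is a minimal sketch using plain tuple comparison. The helper names are hypothetical, and it only handles purely numeric dotted versions, so it is no substitute for a real dependency resolver.

```python
def parse_version(version: str) -> tuple:
    # "0.109.1" -> (0, 109, 1); only numeric dotted versions are supported.
    return tuple(int(part) for part in version.split("."))


def range_is_satisfiable(lower_inclusive: str, upper_exclusive: str) -> bool:
    # A range ">= lower, < upper" can only match something if lower < upper.
    return parse_version(lower_inclusive) < parse_version(upper_exclusive)


# A lower bound of 0.109.1 combined with an upper bound of 0.96
# leaves no version that satisfies both.
conflicting = range_is_satisfiable("0.109.1", "0.96")
```

Here conflicting is False because 109 is greater than 96 in the second version component.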
We put back the additional changes and so far it seems that this has solved our issue. Thanks a lot for the quick turnaround on this.
we are actually building from our fork of the code into our own images and helm charts
So they ping the web server?
Hi Alex,
Thanks for your answer. I'm curious about your third point, using OutputModel. I could not figure out from the documentation how you actually use it. I constructed the OutputModel object as such:
out = OutputModel(task, name="my_model", framework="xgboost")
However, I could not find any method in the docs that would allow me to pass the model object to that instance. Put differently, I can't understand how to use that OutputModel to register my model which would be stored in a...
OK, so I haven't looked at the latest changes after the sync this morning, but the ones we put in yesterday seem to have fixed the issue; the service is still running this morning at least.
OK great, I'll check what other changes we missed yesterday.
no requests are being served as in there is no traffic indeed
Hey, thanks a lot Alex, that's exactly what I was looking for. Cheers.
I can't be sure of the version as I can't check at the moment. I have 1.3.0 off the top of my head, but I could be way off.
so i still can't figure out what sets the task status to aborted
That being said, I'm now running into another issue: this seems to be "erasing" all the packages that had been set in the base task I'm cloning from. I can't find a method that would return these packages so that I could add to them.
Hey Martin, real quick actually: in your update to the requirements.txt file, isn't that constraint on fastapi inconsistent?
Actually, the requests are never registered by the gunicorn app, and the ALB logs show that there is no response from the target ("-").
Hi Martin,
- Actually, we are using an ALB with a 30-second timeout
- We do not have GPU instances
- docker version 1.3.0