
Hi SuccessfulKoala55 I gave up after 20 mins and also got a notification from Firefox: "This page is slowing down Firefox. To speed up your browser, stop this page". I'm heading out soon so I could leave it on. Also, had the same behaviour in Chrome.
` # dataset_class.py
from PIL import Image
from torch.utils.data import Dataset as BaseDataset

class Dataset(BaseDataset):
    def __init__(
        self,
        images_fps,
        masks_fps,
        augmentation=None,
    ):
        self.augmentation = augmentation
        self.images_fps = images_fps
        self.masks_fps = masks_fps
        self.ids = len(images_fps)

    def __len__(self):
        return self.ids

    def __getitem__(self, i):
        # read the image/mask pair from disk
        img = Image.open(self.images_fps[i])
        mask = Image.open(self.masks_fps[i])
        # assumed: augmentation takes and returns both image and mask
        if self.augmentation is not None:
            img, mask = self.augmentation(img, mask)
        return img, mask `
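For reference, a quick smoke test of the class above (the file paths are placeholders, not from my actual setup):
` # smoke test (paths are placeholders)
images_fps = ["data/img_0.png", "data/img_1.png"]
masks_fps = ["data/mask_0.png", "data/mask_1.png"]

dataset = Dataset(images_fps, masks_fps)
img, mask = dataset[0]  # PIL images unless an augmentation converts them
print(len(dataset), img.size, mask.size) `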
we normally do something like that - not sure why it's freezing for you without more info
Can you try going into 'Settings' -> 'Configuration' and verify that you have 'Show Hidden Projects' enabled?
Hi yes all sorted ! 🙂
Going for something like this:
` >>> queue = QueueMetrics(queue='queueid')
>>> queue.avg_waiting_times `
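Under the hood this could lean on the server's queues.get_queue_metrics endpoint (the one backing the UI's queue charts). A rough sketch of the idea - the QueueMetrics wrapper is my proposal above, not an existing ClearML class, and the endpoint parameter names here are from memory, so treat them as assumptions:
` from time import time
from clearml.backend_api.session.client import APIClient

class QueueMetrics:
    # hypothetical wrapper - sketch only, not an existing ClearML class
    def __init__(self, queue):
        self.queue = queue
        self.client = APIClient()

    @property
    def avg_waiting_times(self):
        # queues.get_queue_metrics parameters assumed from memory:
        # last 24h, bucketed hourly
        res = self.client.queues.get_queue_metrics(
            queue_ids=[self.queue],
            from_date=time() - 24 * 60 * 60,
            to_date=time(),
            interval=60 * 60,
        )
        return res.queues[0].avg_waiting_times `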
thank you guys 😄 😄
` echo -e $(aws ssm --region=eu-west-2 get-parameter --name 'my-param' --with-decryption --query "Parameter.Value") | tr -d '"' > .env
set -a
source .env
set +a
git clone https://${PAT}@github.com/myrepo/toolbox.git
mv .env toolbox/
cd toolbox/
docker-compose up -d --build
docker exec -it $(docker-compose ps -q) clearml-agent daemon --detached --gpus 0 --queue default `
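(If it helps, the SSM step could also be done in Python with boto3 - a rough equivalent of the first line above; the parameter name and region are taken from that command:)
` import boto3

# fetch the decrypted parameter and write it out as .env
ssm = boto3.client("ssm", region_name="eu-west-2")
value = ssm.get_parameter(Name="my-param", WithDecryption=True)["Parameter"]["Value"]
with open(".env", "w") as f:
    f.write(value.strip('"') + "\n") `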
When I run in the UI I get the following response:
Error: Can not start new instance, An error occurred (InvalidParameterValue) when calling the RunInstances operation: Invalid availability zone: [eu-west-2]
When I run it programmatically it just stalls and I don't get any readout
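For what it's worth, the error suggests a region name was supplied where RunInstances expects a zone (region plus a letter, e.g. eu-west-2a). In the clearml AWS autoscaler's resource configuration that would look roughly like this - field names follow the aws_autoscaler example, values are placeholders:
` resource_configurations = {
    "gpu_machine": {
        "instance_type": "g4dn.xlarge",
        "is_spot": False,
        # must be a zone (region + letter), not the bare region "eu-west-2"
        "availability_zone": "eu-west-2a",
        "ami_id": "ami-xxxxxxxxxxxxxxxxx",
    }
} `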
Okay great thanks SuccessfulKoala55
Nope, AWS aren't approving the increased vCPU request. I've explained the use case several times and they've not approved it
This was the response from AWS:
"Thank you for for sharing the requested details with us. As we discussed, I'd like to share that our internal service team is currently unable to support any G type vCPU increase request for limit increase.
The issue is we are currently facing capacity scarcity to accommodate P and G instances. Our engineers are working towards fixing this issue. However, until then, we are unable to expand the capacity and process limit increase."
AgitatedDove14 is anyone working on a GCP or Azure autoscaler at the moment?
Okay thanks for the update 🙂 the account manager got involved and the limit has been approved 🚀
The latest commit to the repo is 22.02-py3
( https://github.com/allegroai/clearml-serving/blob/d15bfcade54c7bdd8f3765408adc480d5ceb4b45/clearml_serving/engines/triton/Dockerfile#L2 ) I will have a look at versions now 🙂
Okay just for clarity...
Originally, my Nvidia drivers were running on an incompatible version for the Triton server:
This container was built for NVIDIA Driver Release 510.39 or later, but version 470.103.01 was detected and compatibility mode is UNAVAILABLE.
To fix this issue I updated the drivers on my base OS, i.e.
` sudo apt install nvidia-driver-510 -y
sudo reboot `
Then it worked. The docker-compose logs from the clearml-serving-triton container did not make this clear (i.e. by r...
It might only be a requirement for the docker/docker-compose-triton-gpu.yml file but I'd need to check
` $ curl -X 'POST' '
' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{ "url": "
" }'
{"digit":5} `
I'll add a more detailed response once it's working
I can raise this as an issue on the repo if that is useful?
Still debugging... That fixed the issue with the nvcr.io/nvidia/tritonserver:22.02-py3 container, which now returns:
` =============================
== Triton Inference Server ==
NVIDIA Release 22.02 (build 32400308)
Copyright (c) 2018-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
Various files include modifications (c) NVIDIA CORPORATION. All rights reserved.
This container image and its contents are governed by the NVIDIA Deep Learning Co...
Okay, so this could be a Python script that generates the clearml.conf in the working dir in the container?
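Something along these lines - a minimal sketch that reads the credentials from the standard ClearML env vars and writes clearml.conf into the working directory (server URLs and keys are whatever your deployment uses):
` # generate_clearml_conf.py - minimal sketch
import os

CONF_TEMPLATE = """\
api {{
    web_server: {web}
    api_server: {api}
    files_server: {files}
    credentials {{
        access_key: "{access}"
        secret_key: "{secret}"
    }}
}}
"""

with open("clearml.conf", "w") as f:
    f.write(CONF_TEMPLATE.format(
        web=os.environ["CLEARML_WEB_HOST"],
        api=os.environ["CLEARML_API_HOST"],
        files=os.environ["CLEARML_FILES_HOST"],
        access=os.environ["CLEARML_API_ACCESS_KEY"],
        secret=os.environ["CLEARML_API_SECRET_KEY"],
    )) `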
Great, thank you, it's working. Just wanted to check before adding all env vars 🙂