The agent is for replicating what you run locally elsewhere, i.e. on a remote GPU machine
I'd like to get the Run Time via the task object... I think I need to calculate it manually
i.e.
task = clearml.Task.get_task(id)
time = task.data.last_update - task.data.started
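A minimal sketch of that manual calculation, assuming `task.data.started` and `task.data.last_update` are `datetime` objects as in the snippet above. The helper name `run_time` and the stand-in object are hypothetical and only there so the example runs without a ClearML server. Note that `last_update` is the last time the task was updated, which is not necessarily the moment it finished.

```python
from datetime import datetime, timedelta
from types import SimpleNamespace


def run_time(task_data) -> timedelta:
    """Elapsed time between the task's 'started' and 'last_update' timestamps."""
    if task_data.started is None or task_data.last_update is None:
        raise ValueError("task has no 'started' or 'last_update' timestamp yet")
    return task_data.last_update - task_data.started


# Hypothetical stand-in for `task.data`; with ClearML this would be
# clearml.Task.get_task(task_id=...).data
example = SimpleNamespace(
    started=datetime(2022, 1, 1, 12, 0, 0),
    last_update=datetime(2022, 1, 1, 12, 30, 15),
)
elapsed = run_time(example)
print(elapsed.total_seconds())
```

With a real task the call would be `run_time(task.data)`.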
lmk if I can expand on this more 🙂
Can you try to go into 'Settings' -> 'Configuration' and verify that you have 'Show Hidden Projects' enabled?
I also noticed that my queue stats haven't been updated since 7/1/2022 @ 12:41am
In short, we clone the repo, build the docker container, and run the agent in the container. The reason we do it this way, rather than provide a docker image to the clearml-agent, is twofold:
We actively develop our custom networks and architectures within a containerised env to make it easy for engineers to have a quick dev cycle for new models. (same repo is cloned and we build the docker container to work inside)
We use the same repo to serve models on our backend (in a slightly different contain...
I'm sure it used to be in task.artifacts but that's returning an empty dict
prev_task.artifacts {}
AgitatedDove14 is anyone working on a GCP or Azure autoscaler at the moment?
Error: Can not start new instance, Could not connect to the endpoint URL: " "
okay so this could be a python script that generates the clearml.conf in the working dir in the container?
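A rough sketch of what such a script could look like, assuming the server URLs and credentials are already available as environment variables inside the container. The function names `render_conf` and `write_conf` are hypothetical, and only the `api` section is written; a real setup may need more sections. ClearML looks for `~/clearml.conf` by default, so a file in the working dir would presumably need `CLEARML_CONFIG_FILE` pointing at it.

```python
import os
from pathlib import Path

# Environment variables assumed to be set in the container.
REQUIRED = (
    "CLEARML_WEB_HOST",
    "CLEARML_API_HOST",
    "CLEARML_FILES_HOST",
    "CLEARML_API_ACCESS_KEY",
    "CLEARML_API_SECRET_KEY",
)

# Minimal `api` section in the HOCON style used by clearml.conf.
TEMPLATE = """api {{
    web_server: "{CLEARML_WEB_HOST}"
    api_server: "{CLEARML_API_HOST}"
    files_server: "{CLEARML_FILES_HOST}"
    credentials {{
        "access_key": "{CLEARML_API_ACCESS_KEY}"
        "secret_key": "{CLEARML_API_SECRET_KEY}"
    }}
}}
"""


def render_conf(env) -> str:
    """Fill the template from a mapping of environment variables."""
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise KeyError(f"missing environment variables: {', '.join(missing)}")
    return TEMPLATE.format(**{key: env[key] for key in REQUIRED})


def write_conf(path="clearml.conf", env=None) -> Path:
    """Write the rendered config to `path` (working dir by default)."""
    env = os.environ if env is None else env
    target = Path(path)
    target.write_text(render_conf(env))
    return target
```

Calling `write_conf()` from the container entrypoint, before the agent starts, would produce the file in the current working dir.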
This was the error I was getting from uploads using the old SDK:
has been rejected for invalid domain. heap-2443312637.js:2:108655
Referrer Policy: Ignoring the less restricted referrer policy "no-referrer-when-downgrade" for the cross-site request:
I make 2x in eu-west-2 on the AWS console but still no luck
For ClearML UI
2021-10-19 14:24:13
ClearML results page:
Spinning new instance type=aws4gpu
ClearML Monitor: GPU monitoring failed getting GPU reading, switching off GPU monitoring
2021-10-19 14:24:18
Error: Can not start new instance, Could not connect to the endpoint URL: " "
Spinning new instance type=aws4gpu
2021-10-19 14:24:28
Error: Can not start new instance, Could not connect to the endpoint URL: " "
Spinning new instance type=aws4gpu
2021-10-19 14:24:38
Error: Can no...
Spin up instance using AWS auto-scaler and use the init script to:
Get key-value pairs from AWS ssm and write to .env file
Clone private git repo
Build docker-image locally and use .env file during docker-compose
Enter container and spin up clearml-agent
echo -e $(aws ssm --region=eu-west-2 get-parameter --name 'my-param' --with-decryption --query "Parameter.Value") | tr -d '"' > .env
set -a
source .env
set +a
git clone https://${PAT}@github.com/myrepo/toolbox.git
mv .env toolbox/
cd toolbox/
docker-compose up -d --build
docker exec -it $(docker-compose ps -q) clearml-agent daemon --detached --gpus 0 --queue default
Hi AgitatedDove14 ,
I noticed that ClearML parses clearml.automation.UniformParameterRange to a configuration space to be used with BOHB. When I've used BOHB previously, I could use UniformFloatHyperparameter from the configuration space package, which allows me to set a parameter in log space. That is, the range is defined by something like numpy.logspace rather than numpy.linspace
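To make the linear vs. log-space distinction concrete, here is a small pure-Python sketch (no numpy, ConfigSpace or ClearML needed). The helper names are hypothetical and only illustrate the idea of spacing or sampling in the exponent rather than in the value itself; this is not how either library implements it.

```python
import math
import random


def linear_grid(low, high, n):
    """n evenly spaced values between low and high (like numpy.linspace)."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]


def log_grid(low, high, n):
    """n values evenly spaced in log10 between low and high."""
    exponents = linear_grid(math.log10(low), math.log10(high), n)
    return [10 ** e for e in exponents]


def sample_log_uniform(low, high, rng=random):
    """Draw one value whose log10 is uniform between log10(low) and log10(high)."""
    return 10 ** rng.uniform(math.log10(low), math.log10(high))


# A learning-rate style range: the linear grid puts three of its four points
# above 0.03, while the log grid gives one point per decade.
print(linear_grid(1e-4, 1e-1, 4))
print(log_grid(1e-4, 1e-1, 4))
print(sample_log_uniform(1e-4, 1e-1))
```

For parameters such as learning rates that span several orders of magnitude, sampling in log space gives each decade roughly equal attention, which is why the log option matters here.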
Okay thanks for the update 🙂 the account manager got involved and the limit has been approved 🚀
I can run clearml.OutputModel(task, framework='pytorch') to get the model from a previous task, but how can I get the pytorch model ( torch.nn.Module ) from the output model object?
Hi SuccessfulKoala55, thanks, I didn't know it was possible to use it in place of the pw. So in the .conf I can just add the git PAT instead of the pw?
git_user: ${GITHUB_USER}
git_pass: ${GITHUB_PAT}
I was having an issue with availability zone. I was using 'eu-west-2' instead of 'eu-west-2c'
# dataset_class.py
from PIL import Image
from torch.utils.data import Dataset as BaseDataset


class Dataset(BaseDataset):
    def __init__(
        self,
        images_fps,
        masks_fps,
        augmentation=None,
    ):
        self.augmentation = augmentation
        self.images_fps = images_fps
        self.masks_fps = masks_fps
        self.ids = len(images_fps)

    def __getitem__(self, i):
        # read data
        img = Image.open(self.images_fps[i])
        mask = Image...
Thanks JitteryCoyote63, I'll double check the permissions of the key/secrets and if no luck I'll check with the team
Hey having a few issues with this