Reputation
Badges 1
89 × Eureka!(deepmirror) ryan@ryan:~$ python -c "import clearml print(clearml.__version__)" 1.1.4
Same with new version(deepmirror) ryan@ryan:~/GitHub/deepmirror/ml-toolbox$ python -c "import clearml; print(clearml.__version__)" 1.6.1
Generating SHA2 hash for 1 files 100%|███████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 2548.18it/s] Hash generation completed Uploading dataset changes (1 files compressed to 130 B) to BUCKET File compression and upload completed: total size 130 B, 1 chunked stored (average size 130 B)
Hi SuccessfulKoala55 yes I can see the one upload using 1.6.1 but all old datasets have now been remove. I guess you want people to start moving over?
This was the error I was getting from uploads using the old SDKhas been rejected for invalid domain. heap-2443312637.js:2:108655 Referrer Policy: Ignoring the less restricted referrer policy "no-referrer-when-downgrade" for the cross-site request:
`
import os
import glob
from clearml import Dataset
DATASET_NAME = "Bug"
DATASET_PROJECT = "ProjectFolder"
TARGET_FOLDER = "clearml_bug"
S3_BUCKET = os.getenv('S3_BUCKET')
if not os.path.exists(TARGET_FOLDER):
os.makedirs(TARGET_FOLDER)
with open(f'{TARGET_FOLDER}/data.txt', 'w') as f:
f.writelines('Hello, ClearML')
target_files = glob.glob(TARGET_FOLDER + "/**/*", recursive=True)
# upload dataset
dataset = Dataset.create(dataset_name=DATASET_NAME, dataset_project=DATASET_PR...
gdn4.xlarge (the best price for 16GB of GPU ram). Not so surprising they would want a switch
We use albumentations with scripts that execute remotely and have no issues. Good question from CostlyOstrich36
we normally do something like that - not sure what why it's freezing for you without more info
okay so this could be a python script that generates the clearml.conf in the working dir in the container?
great thank you it's working. Just wanted to check before adding all env vars 🙂
the agent it for replicating what you run locally elsewhere i.e. remote GPU machine
nope you'll just need to install clearml
I can run clearml.OutputModel(task, framework='pytorch')
to get the model from a previous task. but how can I get the pytorch model ( torch.nn.Module
) from the output model object
`
2021-10-19 14:19:07
Spinning new instance type=aws4gpu
Error: Can not start new instance, An error occurred (InvalidParameterValue) when calling the RunInstances operation: Invalid availability zone: [eu-west-2]
Spinning new instance type=aws4gpu
ClearML Monitor: GPU monitoring failed getting GPU reading, switching off GPU monitoring
Error: Can not start new instance, An error occurred (InvalidParameterValue) when calling the RunInstances operation: Invalid availability zone: [eu-west-2]
S...
Yep just about to do that. Just annoying to add arg parser etc
For referenceimport subprocess for i in ['1', '2']: command = ['python', 'hyp_op.py', '--testnum', f'{i}'] process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
` # dataset_class.py
from PIL import Image
from torch.utils.data import Dataset as BaseDataset
class Dataset(BaseDataset):
def __init__(
self,
images_fps,
masks_fps,
augmentation=None,
):
self.augmentation = augmentation
self.images_fps = images_fps
self.masks_fps = masks_fps
self.ids = len(images_fps)
def __getitem__(self, i):
# read data
img = Image.open(self.images_fps[i])
mask = Image...
The latest commit to the repo is 22.02-py3
( https://github.com/allegroai/clearml-serving/blob/d15bfcade54c7bdd8f3765408adc480d5ceb4b45/clearml_serving/engines/triton/Dockerfile#L2 ) I will have a look at versions now 🙂
In short we clone the repo, build the docker container, and run agent in the container. The reason we do it this, rather than provide a docker image to the clearml-agent is two fold:
We actively develop our custom networks and architectures within a containerised env to make it easy for engineers to have a quick dev cycle for new models. (same repo is cloned and we build the docker container to work inside) We use the same repo to serve models on our backend (in a slightly different contain...
Yes, it's the dependencies. At the moment I'm doing this as a work around.
` autoscaler = AwsAutoScaler(hyper_params, configurations)
startup_bash_script = [
'...',
]
autoscaler.startup_bash_script = startup_bash_script ` I'd prefer to run it on the Web UI. Also, we seem to have problems when it's executed remotely
Yes on the apps page is the possible to tigger programatically?
remote execution is working now. Internal worker nodes had not spun up the agent correctly 😛
I've got it... i just remembered I can calltask_id
from the cloned tasked and check the status of that 🙂
Looks to be working 🚀 just need to test one more thing. Thank you CostlyOstrich36
Hey having a few issues with this
Okay great thanks SuccessfulKoala55