VivaciousPenguin66
Moderator
17 Questions, 107 Answers
  Active since 10 January 2023
  Last activity 17 days ago

Reputation

0

Badges 1

93 × Eureka!
0 Votes
8 Answers
678 Views
2 years ago
0 Votes
1 Answers
689 Views
Question when using remote storage blobs (e.g. Azure). I am using it as an output_url location, and it is storing both datasets, and also experiment artefacts...
2 years ago
0 Votes
7 Answers
512 Views
///[Please note, all the below was executed on the command line of the compute node, not the server head node]/// I've been following the example on Keras, b...
2 years ago
0 Votes
5 Answers
537 Views
Are there any tips for how to set these boxes in the profile for access to Azure Blob Storage using SAS? I can create a Shared Access Key (SAS) through the A...
2 years ago
0 Votes
1 Answers
604 Views
Does anyone have an example of how to use the services queue to start a load balancer on Azure? Virtual Machine Scale Sets through the Azure Management Pytho...
2 years ago
0 Votes
30 Answers
526 Views
I buried this issue in another thread to do with deployment, but I was wondering if anyone else has had problems using clearml-serving package to serve a PyT...
2 years ago
0 Votes
30 Answers
538 Views
With clearml-serving could someone explain to me what a config.pbtxt file is and its format? When executing a PyTorch model for serving I get an error pasted...
2 years ago
0 Votes
10 Answers
557 Views
This wasn't a big deal, but I noticed when pushing a dataset to the server, with cloud storage, that the upload information looked a bit bonkers in terms of ...
2 years ago
0 Votes
5 Answers
523 Views
I have set up a clearml-server running on an Azure VM instance and have used default parameters when it comes to specifying storage locations for data and arte...
2 years ago
0 Votes
18 Answers
910 Views
2 years ago
0 Votes
1 Answers
548 Views
Silly question alert... Really simple one to start with. If I have more or less the default settings for a clearml-agent on a compute node, so therefo...
2 years ago
0 Votes
10 Answers
609 Views
When I set up my local virtual environment I use a combination of Conda and pip. I use conda as my environment manager, and then use pip for packages that are...
2 years ago
0 Votes
2 Answers
507 Views
I was wondering, if I want to use Task.create() instead of Task.init() to create a new experiment object, I am aware that automatic logging will not be done....
2 years ago
0 Votes
6 Answers
517 Views
I have been successfully deploying and training a PyTorch CNN on a clearml-agent managed compute resource and have been testing some of the capabilities, includ...
2 years ago
0 Votes
2 Answers
604 Views
I have got experiments training PyTorch networks on a remote compute run by clearml-agent . I am using the Ignite framework to train image classification net...
2 years ago
0 Votes
15 Answers
560 Views
2 years ago
0 Votes
4 Answers
554 Views
I have just installed the PYPI version of clearml-serving and I get the following error at the command line. clearml-serving --help clearml-serving - CLI for...
2 years ago
0 I Have Been Successfully Deploying And Training A Pytorch Cnn On A

SuccessfulKoala55 However, this was the first time an experiment with this dataset was executed on this compute node. I have been doing a lot of trial and error with this setup to get the models training, and so on my first compute node, I had the data downloading locally quite early on, so I haven't seen the script have to download a local dataset cache as it was already done.

2 years ago
0 I Have Been Successfully Deploying And Training A Pytorch Cnn On A

` Starting Task Execution:

usage: train_clearml_pytorch_ignite_caltech_birds.py [-h] [--config FILE]
[--opts ...]

PyTorch Image Classification Trainer - Ed Morris (c) 2021

optional arguments:
-h, --help show this help message and exit
--config FILE Path and name of configuration file for training. Should be a
.yaml file.
--opts ... Modify config options using the command-line 'KEY VALUE'
p...

2 years ago
0 I Have Got Experiments Training Pytorch Networks On A Remote Compute Run By

AgitatedDove14 Brilliant!
I will try this, thank you sir!

2 years ago
0 When I Setup My Local Virtual Environment I Use A Combination Of Conda And Pip. I Use Conda As My Environment Manager, And Then Use Pip For Packages That Are Not In The Conda Repositories.

The following code is the training script that was used to setup the experiment. This code has been executed on the server in a separate conda environment and verified to run fine (minus the clearml code).

` from __future__ import print_function, division
import os, pathlib

# Clear ML experiment
from clearml import Task, StorageManager, Dataset

# Local modules
from cub_tools.trainer import Ignite_Trainer
from cub_tools.args import get_parser
from cub_tools.config import get_cfg_defaults

#...

2 years ago
0 Are There Any Tips For How To Set These Boxes In The Profile For Access To

Ah ok, so it's the query string you use with the SAS box. Great.
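
Since the SAS token is just a URL query string, it can be inspected before pasting it into the profile box. Below is a minimal sketch using only Python's standard library. The token is made up for illustration, and `describe_sas` is a hypothetical helper name, not part of clearml or the Azure SDK.

```python
from urllib.parse import parse_qs

# Made-up token for illustration only; a real one comes from the Azure portal
# or Azure Storage Explorer.
EXAMPLE_SAS = "sv=2020-08-04&ss=b&srt=co&sp=rl&se=2021-12-31T00:00:00Z&sig=FAKE"


def describe_sas(token: str) -> dict:
    """Summarise the main fields of a SAS query string (hypothetical helper)."""
    # Accept the token with or without the leading '?'.
    fields = parse_qs(token.lstrip("?"))
    return {
        "version": fields.get("sv", [None])[0],      # signed storage service version
        "permissions": fields.get("sp", [None])[0],  # e.g. 'rl' = read + list
        "expiry": fields.get("se", [None])[0],       # expiry time (UTC)
        "has_signature": "sig" in fields,            # the signature itself is not returned
    }
```

Calling `describe_sas(EXAMPLE_SAS)` gives a quick check that the permissions and expiry are what you expect before relying on the token.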

2 years ago
0 Are There Any Tips For How To Set These Boxes In The Profile For Access To

Thanks CostlyOstrich36 , you can also get access to the keys in the Azure Storage Explorer.
Looking at the Properties section gives the secure keys.

2 years ago
0 I Am Having An Issue Publishing A Completed Model Training. The Model Has Been Deployed On Remote Compute, Using A Docker Image, And The Datasets Have Been Served From An Azure Blob Storage Account. The Model Trains Successfully, And Completes, After The

I checked the apiserver.log file in /opt/clearml/logs and this appears to be the related error when I try to publish an experiment:

` [2021-06-07 13:43:40,239] [9] [ERROR] [clearml.service_repo] ValidationError (Task:8a4a13bad8334d8bb53d7edb61671ba9) (setup_shell_script.StringField only accepts string values: ['container'])
Traceback (most recent call last):
File "/opt/clearml/apiserver/bll/task/task_operations.py", line 325, in publish_task
raise ex
File "/opt/clearml/a...

2 years ago
0 I Am Having An Issue Publishing A Completed Model Training. The Model Has Been Deployed On Remote Compute, Using A Docker Image, And The Datasets Have Been Served From An Azure Blob Storage Account. The Model Trains Successfully, And Completes, After The

Hi SuccessfulKoala55
Thanks for the input.
I was actually about to grab the new docker_compose.yml and pull the new images.
Weirdly it was working before, so what's changed?
I don't believe I've updated the agents or the clearml sdk on the experiment submission vm either.
I will definitely update the server now, and report back.

2 years ago
0 I Am Having An Issue Publishing A Completed Model Training. The Model Has Been Deployed On Remote Compute, Using A Docker Image, And The Datasets Have Been Served From An Azure Blob Storage Account. The Model Trains Successfully, And Completes, After The

SuccessfulKoala55
Good news!
It looks like pulling the new clearml-server version has solved the problem.
I can happily publish models.

Interestingly, I was able to publish models before using this server, so I must have inadvertently updated something that has caused a conflict.

2 years ago
0 Hi, I Have A Question About Clearml-Data. Clearml-Data Probably Does Well On Data Versioning, But When It Comes To Actual Loading Of Data, Are There Examples Of How It Can Make Use Of Advanced Features Such That Those In

As AnxiousSeal95 says, clearml server will version a dataset for you and push it to a unified storage location, as well as making it possible to diff between versions.

I’ve written a workshop on how to train image classifiers for the problem of bird species identification and recently I’ve adapted it to work with clearml.

There is an example workbook on how to upload a dataset to clearml server, in this case a directory of images. See here: https://github.com/ecm200/caltech_birds/blob/master/notebooks/clearml_add...
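
For a rough idea of what uploading a directory of images can look like, here is a minimal sketch using the `clearml` `Dataset` class. This is not the code from the workbook; the function name and its arguments are placeholders I have assumed, and it needs `pip install clearml` plus a configured server to actually run.

```python
def upload_image_folder(folder, name, project, output_url=None):
    """Create, upload and finalize a dataset version from a local folder.

    Hypothetical helper: folder, name, project and output_url are placeholders.
    """
    # Imported here so the sketch can be read and loaded without clearml installed.
    from clearml import Dataset

    dataset = Dataset.create(dataset_name=name, dataset_project=project)
    dataset.add_files(path=folder)          # register every file under the folder
    dataset.upload(output_url=output_url)   # None falls back to the default storage
    dataset.finalize()                      # close this version so it can be consumed
    return dataset.id
```

Passing something like an Azure blob URL as `output_url` should send the files to remote storage instead of the server's default file store.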

2 years ago
0 Hello Clearml Friends. I'M Trying To Setup A Clearml Agent On My Workstation To Queue Jobs On My Gpu.

You need to make sure the user is part of the docker group.
Follow these commands after installing Docker engine, and don't forget to restart the terminal session for the changes to take full effect.

` sudo groupadd docker

sudo usermod -aG docker ${USER} `

Don't install Docker engine with root, your sysadmin will have kittens!
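
To confirm the group change took effect, the membership can be checked from Python. The sketch below uses the standard `grp` and `pwd` modules (Unix only); `user_in_group` is a hypothetical helper name, not something from clearml-agent.

```python
import grp
import pwd


def user_in_group(username: str, group_name: str = "docker") -> bool:
    """Return True if `username` belongs to `group_name` (Unix only)."""
    try:
        group = grp.getgrnam(group_name)
    except KeyError:
        # The group does not exist on this machine.
        return False
    if username in group.gr_mem:
        return True
    try:
        # The group may also be the user's primary group.
        return pwd.getpwnam(username).pw_gid == group.gr_gid
    except KeyError:
        # Unknown user.
        return False
```

For example, `user_in_group("your-username")` should return `True` once the `usermod` command above has been run for that user.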

2 years ago
0 Hello Clearml Friends. I'M Trying To Setup A Clearml Agent On My Workstation To Queue Jobs On My Gpu.

I think perhaps as standard, the group docker is already created.

The bit that isn't done is making your user part of that group.

2 years ago
0 ///[Please Note, All The Below Was Executed On The Command Line Of The Compute Node,

SuccessfulKoala55
I can see the issue you are referring to regarding the execution of the triton docker image, however as far as I am aware, this was not something I explicitly specified. The ServingService.launch_service() method from the ServingService Class from the clearml-serving package would appear to have both specified:

` def launch_engine(self, queue_name, queue_id=None, verbose=True):
    # type: (Optional[str], Optional[str], bool) -> None
    """
    ...

2 years ago
0 ///[Please Note, All The Below Was Executed On The Command Line Of The Compute Node,

I have rerun the serving example with my PyTorch job, but this time I have followed the MNIST Keras example.
I appended a GPU compute resource to the default queue and then executed the service on the default queue.
This resulted in a Triton serving engine container spinning up on the compute resource, however it failed due to the previous issue with port conflicts:

` 2021-06-08 16:28:49
task f2fbb3218e8243be9f6ab37badbb4856 pulled from 2c28e5db27e24f348e1ff06ba93e80c5 by worker ecm-clear...

2 years ago
0 ///[Please Note, All The Below Was Executed On The Command Line Of The Compute Node,

This potentially might be a silly question, but in order to get the inference working, I am assuming that no specific inference script has to be written for handling the model?

This is what the clearml-serving package takes care of, correct?

2 years ago
0 ///[Please Note, All The Below Was Executed On The Command Line Of The Compute Node,

SuccessfulKoala55 I may have made some progress with this bug, but have stumbled onto another issue in getting the Triton service up and running.

See comments in the github issue.

2 years ago
0 Hello Clearml Friends. I'M Trying To Setup A Clearml Agent On My Workstation To Queue Jobs On My Gpu.

I dip in and out of Docker, and that one gets me almost every time!

2 years ago
0 Greetings And Hello

I love the new design of the site.

When is clearml-deploy coming to the open source release?
Or is this a commercial only part?

2 years ago
0 I Buried This Issue In Another Thread To Do With Deployment, But I Was Wondering If Anyone Else Has Had Problems Using

When I run the commands you suggested above on the compute node's host system, inside the conda environment from which I run the agent daemon, I get the same issues we appear to have seen when executing the Triton inference service.

` (py38_clearml_serving_git_dev) edmorris@ecm-clearml-compute-gpu-002:~$ python
Python 3.8.10 (default, May 19 2021, 18:05:58)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.

...

2 years ago
0 I Have Been Successfully Deploying And Training A Pytorch Cnn On A

SuccessfulKoala55 A second queued job which executed on the same node, but didn't this time need to cache the dataset locally as it was done by the previous experiment, hasn't had this issue.

That all being said, apart from the console reporting looking messy, it doesn't appear to have impacted the training, or indeed the metric collection of the first experiment where it occurred.

2 years ago
0 I Buried This Issue In Another Thread To Do With Deployment, But I Was Wondering If Anyone Else Has Had Problems Using

AgitatedDove14

So can you verify it can download the model ?

Unfortunately it's still falling over, but then I got the same result for the credentials using both URI strings, the original, and the modified version, so it points to something else going on.

I note that the StorageHelper.get() method has a call which modifies the URI prior to it being passed to the function which gets the storage account and container name. However, when I run this locally, it doesn't seem to do a...

2 years ago
0 Hi Everyone, Does Anyone Have Any Pointers On How To Make The Clearml-Server Web Service Secure Using Ssl By Setting Up Nginx? I Have Played Around With It A Bit In Relation To Getting A Jupyterhub Setup Working Over Https, However, I Think That Was Mor

SuccessfulKoala55
SUCCESS!!!

This appears to be working.
Set up certificates using sudo certbot --nginx.

Then edit the default configuration file in /etc/nginx/sites-available

` server {
listen 80;
return 301 https://$host$request_uri;
}

server {

listen 443;
server_name your-domain-name;

ssl_certificate           /etc/letsencrypt/live/your-domain-name/fullchain.pem;
ssl_certificate_key       /etc/letsencrypt/live/your-domain-name/privkey.pem;

...

2 years ago
0 Hi Everyone, Does Anyone Have Any Pointers On How To Make The Clearml-Server Web Service Secure Using Ssl By Setting Up Nginx? I Have Played Around With It A Bit In Relation To Getting A Jupyterhub Setup Working Over Https, However, I Think That Was Mor

WearyLeopard29 no I wasn’t able to do that although I didn’t explicitly try.
I was wondering whether this is as high a security risk as the web portal?
Access is controlled by keys, whereas access to the web portal is not.
I admit I’m a data scientist, so any proper IT security person would probably end up a shivering wreck in the corner of the room if they saw some of my common security practices. I do try to be secure, but I am not sure how good I am at it.

2 years ago
0 I Buried This Issue In Another Thread To Do With Deployment, But I Was Wondering If Anyone Else Has Had Problems Using

So moving onto the container name.
Original code has the following calls:

if not f.path.segments:
    raise ValueError(
        "URI {} is missing a container name (expected "
        "[https/azure]://<account-name>.../<container-name>)".format(uri)
    )
container = f.path.segments[0]
Repeating the same commands locally results in the following:

` >>> f_a.path.segments
['artefacts', 'Caltech Birds%2FTraining', 'TRAIN...
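
To make the container-name logic easier to experiment with outside of clearml, here is a minimal stand-alone sketch of the same idea. It uses `urllib.parse` from the standard library as a stand-in for whatever URL object `f` is in the original code, and the account name and path in the usage note are invented for illustration.

```python
from urllib.parse import urlsplit, unquote


def split_azure_uri(uri: str):
    """Split a URI into (account, container, blob_path). Hypothetical helper."""
    parts = urlsplit(uri)
    # The account name is the first label of the host.
    account = parts.netloc.split(".")[0]
    # Non-empty path segments; the first one is the container.
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        raise ValueError(
            "URI {} is missing a container name (expected "
            "[https/azure]://<account-name>.../<container-name>)".format(uri)
        )
    container = segments[0]
    # Percent-encoded characters such as %2F are decoded only in the blob path.
    blob_path = "/".join(unquote(s) for s in segments[1:])
    return account, container, blob_path
```

With an invented URI such as `azure://myaccount.blob.core.windows.net/artefacts/Caltech%20Birds%2FTraining/model.pt`, the container comes out as `artefacts` and the encoded `%2F` in the second segment is decoded into a path separator in the blob path.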

2 years ago