Unanswered
This wasn't a big deal, but I noticed when pushing a dataset to the server, with cloud storage, that the upload information looked a bit bonkers in terms of units:


This was the code:

` import os
import argparse

# ClearML modules
from clearml import Dataset

parser = argparse.ArgumentParser(description='CUB200 2011 ClearML data uploader - Ed Morris (c) 2021')
parser.add_argument(
    '--dataset-basedir',
    dest='dataset_basedir',
    type=str,
    help='The directory to the root of the dataset', 
    default='/home/edmorris/projects/image_classification/caltech_birds/data/images')
parser.add_argument(
    '--clearml-project',
    dest='clearml_project',
    type=str,
    help='The name of the clearml project that the dataset will be stored and published to.', 
    default='Caltech Birds/Datasets')
parser.add_argument(
    '--clearml-dataset-url',
    dest='clearml_dataset_url',
    type=str,
    help='Location of where the dataset files should be stored. Default is Azure Blob Storage. Format is  ', 
    default='')
args = parser.parse_args()

for task_type in ['train','test']:
    print('[INFO] Versioning and uploading {0} dataset for CUB200 2011'.format(task_type))
    dataset = Dataset.create('cub200_2011_{0}_dataset'.format(task_type), dataset_project=args.clearml_project)
    dataset.add_files(path=os.path.join(args.dataset_basedir,task_type), verbose=False)
    dataset.upload(output_url=args.clearml_dataset_url)
    # Report each step only after it has actually completed
    dataset.finalize()
    print('[INFO] {0} Dataset finalized.'.format(task_type))

    dataset.publish()
    print('[INFO] {0} Dataset published.'.format(task_type)) `
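
For anyone wanting to sanity-check the reported upload figures, here is a minimal sketch that measures the local dataset directory independently and prints the size in consistent units. It is not part of the script above and is not how ClearML computes its own numbers. The helper names `dir_size_bytes` and `human_size` are hypothetical, and binary units (1 KiB = 1024 bytes) are an assumption.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes, in bytes, of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip broken symlinks and anything that is not a regular file
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total


def human_size(num_bytes):
    """Format a byte count using binary units (assumption: 1 KiB = 1024 B)."""
    size = float(num_bytes)
    for unit in ('B', 'KiB', 'MiB', 'GiB', 'TiB'):
        # Stop at the first unit that keeps the value below 1024,
        # or at the largest unit we know about
        if size < 1024.0 or unit == 'TiB':
            return '{0:.2f} {1}'.format(size, unit)
        size /= 1024.0
```

Calling `human_size(dir_size_bytes(os.path.join(args.dataset_basedir, task_type)))` inside the loop, just before `dataset.upload(...)`, would give a figure to compare against whatever the upload log reports.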

  
  
Posted 3 years ago
169 Views
0 Answers