Answered
Hello! Is There Any Way To Use Original Files In ClearML Datasets? I Have A Batch Of Tar Archives And Want To Create A Dataset From Them, However ClearML Compresses Them. I Tried To Use

Hello! Is there any way to use the original files in ClearML datasets? I have a batch of tar archives and want to create a dataset from them, but ClearML compresses them. I tried compression=None, but it didn't help:

dataset.upload(verbose=True, output_url='
', compression = None,chunk_size = -1 )
  
  
Posted 13 days ago

Answers 4


Why does it matter how clearml stores datasets? If you get the dataset locally, all files will be unzipped.

  • Compression takes time: 8 archives of 5 GB each take about half an hour.
  • I can stream the archives from the bucket directly to the network for training without downloading them locally, which saves storage space.
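
To illustrate the streaming point in the second bullet, here is a rough sketch using Python's standard tarfile module in stream mode. It is only an assumption about how such a reader might look: the in-memory archive stands in for a real bucket download stream, and iter_tar_members and make_demo_archive are hypothetical helper names, not part of ClearML.

```python
import io
import tarfile
from typing import BinaryIO, Iterator, Tuple


def iter_tar_members(stream: BinaryIO) -> Iterator[Tuple[str, bytes]]:
    """Yield (member name, content) pairs from a tar stream without extracting to disk."""
    # Mode "r|*" reads the archive sequentially, so the stream does not need to be seekable.
    with tarfile.open(fileobj=stream, mode="r|*") as archive:
        for member in archive:
            if not member.isfile():
                continue
            extracted = archive.extractfile(member)
            if extracted is not None:
                # Read the member before advancing; stream mode cannot go back.
                yield member.name, extracted.read()


def make_demo_archive() -> bytes:
    """Build a tiny uncompressed tar archive in memory (stand-in for a remote object)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, payload in [("a.txt", b"first"), ("b.txt", b"second")]:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


if __name__ == "__main__":
    for name, data in iter_tar_members(io.BytesIO(make_demo_archive())):
        print(name, len(data))
```

Because the archive is read strictly front to back, any file-like object that supports read() should work here, including a response body from a storage client.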
  
  
Posted 13 days ago

Hi @TeenyBeetle18, if they are already on GS, you can use add_external_files to log them.
What do you think?
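
A minimal sketch of that suggestion, assuming the archives are already in a bucket. Dataset.create, add_external_files, upload and finalize are ClearML SDK calls as far as I know; the helper names, the "archives" dataset path and any bucket URLs passed in are hypothetical and only there for illustration.

```python
from typing import Iterable, List


def log_external_archives(dataset, archive_urls: Iterable[str],
                          dataset_path: str = "archives") -> List[str]:
    """Register each remote archive as an external link instead of uploading it."""
    registered = []
    for url in archive_urls:
        # add_external_files records a link to the remote object;
        # the archive itself stays where it is in the bucket.
        dataset.add_external_files(source_url=url, dataset_path=dataset_path)
        registered.append(url)
    return registered


def create_linked_dataset(project: str, name: str, archive_urls: Iterable[str]):
    """Create a dataset that only references remote archives (needs a configured ClearML setup)."""
    # Imported here so the helper above can be tried without a ClearML server.
    from clearml import Dataset

    dataset = Dataset.create(dataset_name=name, dataset_project=project)
    log_external_archives(dataset, archive_urls)
    # Assumption: with no local files added, there should be nothing to compress here.
    dataset.upload()
    dataset.finalize()
    return dataset
```

It could then be called as create_linked_dataset("my project", "tar archives", ["gs://example-bucket/a.tar"]), where the project, dataset name and URL are placeholders.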

  
  
Posted 13 days ago

Why does it matter how clearml stores datasets? If you get the dataset locally, all files will be unzipped.

  
  
Posted 13 days ago

It seems that this doesn't let me use ClearML's ability to track and version datasets. I mean, I can't create the next version of a dataset from a dataset that has external files.

  
  
Posted 13 days ago