Answered
Can clearml.Dataset be used from multiple threads or processes?

Hello, can clearml.Dataset be put in multiple threads or processes?

Especially these two:
clearml_dataset.add_files(dst_project_path.absolute())
clearml_dataset.upload()

Our dataset has about 2 million files, and these two calls are way too slow.

  
  
Posted 6 months ago

Answers


Hi @AmiableSeaturtle81! add_files already uses multi-threading internally, so wrapping it in your own threads would not help (see its max_workers argument).
If you are uploading to a cloud provider such as S3, it is worth setting this argument, and also looking for config entries in clearml.conf that can speed up the upload (such as aws.s3.boto3.max_multipart_concurrency).
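
As a rough sketch of what that could look like: the helper below passes max_workers to add_files and, as an assumption, to upload as well (this relies on a ClearML version where both calls accept it). The helper name add_and_upload, the worker count of 16, and the project, dataset and path names in the comments are placeholders for illustration, not something from your setup.

```python
from pathlib import Path


def add_and_upload(dataset, root, max_workers=16):
    """Add every file under `root` to `dataset`, then upload it.

    `dataset` is expected to be a clearml.Dataset. `max_workers` is handed to
    the library so its own thread pool does the parallel work; no extra
    threads or processes are created here.
    """
    root = Path(root).absolute()
    # Assumption: the installed ClearML version accepts max_workers on both calls.
    dataset.add_files(path=str(root), max_workers=max_workers)
    dataset.upload(max_workers=max_workers)


# Hypothetical usage (needs clearml installed and a configured server):
#
#   from clearml import Dataset
#   ds = Dataset.create(dataset_name="my-dataset", dataset_project="my-project")
#   add_and_upload(ds, "/path/to/files", max_workers=32)
#   ds.finalize()
```

The best value for max_workers depends on your storage backend and network, so it is worth timing a few settings on a subset of the 2 million files before running the full upload.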

  
  
Posted 6 months ago