Hello! Is There A Way To Avoid Or Accelerate

Hello! Is there a way to avoid or accelerate Generating SHA2 hash for ... files when uploading datasets?

  
  
Posted 2 years ago

Answers 6


SmugSnake6 I think the latest version (1.8.0) tries to parallelize the hash generation.
You can also control max_workers.
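
To illustrate what parallelizing the hashing with a max_workers setting could look like, here is a minimal standard-library sketch. It is not ClearML's implementation: sha256_of_file and hash_files are hypothetical helper names, and the reply above does not say which ClearML call accepts max_workers (recent clearml releases appear to take it on Dataset.add_files and Dataset.upload, which is worth checking against your installed version).

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash one file in chunks so a large file never sits fully in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_files(paths, max_workers=None):
    """Hash many files concurrently and return {path: hex digest}.

    hashlib releases the GIL while hashing large buffers, so threads
    can keep several CPU cores busy at once.
    """
    paths = [str(p) for p in paths]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(zip(paths, pool.map(sha256_of_file, paths)))
```

Calling hash_files(list_of_paths, max_workers=8) would hash up to eight files at a time. Whether that helps depends on how many cores are free and on how fast the disk can feed them.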

  
  
Posted 2 years ago

With default settings, uploading 2 datasets of 120 GB and 70 GB took more than 6 hours! And that is uploading to the server itself: the upload pipeline runs on the same machine as ClearML.

  
  
Posted 2 years ago

Sooo for the SHA2 generation, I've tested 2 very different CPUs, and it makes a HUGE difference 😅 I probably have to upgrade my server

  
  
Posted 2 years ago

With default settings, to upload 2 datasets of 120 GB and 70 GB it took more than 6 hours!

SmugSnake6 in the end, is this the outcome of limited bandwidth or limited CPU?

  
  
Posted 2 years ago

From what I could see, generating SHA2 took:
i7-10700K: ~10-15 minutes
Xeon E3-1240: 4-5 hours!
Then in both cases it still takes about 1h30 to upload the images to the fileserver, which I also find rather slow, but the ClearML fileserver runs on my old Xeon. I plan to upgrade my server and test again.
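
One way to check the CPU-versus-bandwidth question on a given machine is to measure raw SHA-256 speed and compare it with the disk and network rates. The sketch below is only a rough single-core benchmark with hypothetical helper names (sha256_throughput_mb_s, estimated_hash_minutes). It hashes in-memory data, so it ignores disk reads and is not how ClearML times anything.

```python
import hashlib
import time


def sha256_throughput_mb_s(total_mb=64, chunk_mb=1):
    """Hash total_mb of in-memory zero bytes and return the rate in MB/s."""
    chunk = bytes(chunk_mb * 1024 * 1024)
    digest = hashlib.sha256()
    start = time.perf_counter()
    for _ in range(total_mb // chunk_mb):
        digest.update(chunk)
    elapsed = time.perf_counter() - start
    return total_mb / elapsed


def estimated_hash_minutes(dataset_gb, mb_per_s):
    """Rough single-core hashing time, in minutes, for a dataset of this size."""
    return dataset_gb * 1024 / mb_per_s / 60


if __name__ == "__main__":
    rate = sha256_throughput_mb_s()
    print(f"SHA-256: {rate:.0f} MB/s on one core")
    print(f"120 GB would take about {estimated_hash_minutes(120, rate):.0f} minutes")
```

If the measured rate is far above what the real run achieves, the bottleneck is more likely disk reads or many small files than the CPU itself.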

  
  
Posted 2 years ago

Xeon E3-1240: 4 - 5 hours!

Wow... yes, definitely worth upgrading 🙂

  
  
Posted 2 years ago