Hey There!

Hey there! 🙂

First of all, thank you for creating this Slack Community and giving us the opportunity to work with your wonderful software. I am in need of some help and am wondering if you have any ideas how I could solve a problem.

I am trying to find a good way to handle massive datasets with ClearML. Specifically, I want to work with 300 GB of text on S3 storage, such that

- it is easy for me and my coworkers to stream the contents without needing 300 GB of disk space and/or RAM,
- the code used for this dataset can easily be reused for future datasets, and
- less importantly, but it would be nice, the loading can be parallelized.

I have looked at both StorageManager and Dataset, but neither of them seems to have features that do not rely on the hard disk. Concretely, my question is: is there a way or feature in ClearML to do this, at least partially? If not, do you know of any ClearML-compatible alternatives?

Thank you in advance!

Posted one year ago

Hi SpicyCrab51, thanks for the warm words 😄 Happy you enjoy our product!
As for your needs, I suggest you explore our Hyperdatasets ( https://clear.ml/docs/latest/docs/hyperdatasets/overview ); they were made to solve issues similar to the one you're facing!
You can also watch a talk we gave that covers Hyperdatasets: https://www.youtube.com/watch?v=CcL8NNZfHlY
Note that it is an enterprise feature and is not part of the open-source version.
Contact me if you need more info 🙂
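
For readers on the open-source version, the streaming requirement from the question can be roughly sketched in plain Python, independent of ClearML. This is not a ClearML feature and not how Hyperdatasets work; it only illustrates reading text line by line from a stream of byte chunks while keeping at most one chunk plus one partial line in memory. The helper names `iter_lines` and `shard` are hypothetical, and the chunk source is assumed to be something like the `iter_chunks()` method of the response body that boto3 returns from `get_object`.

```python
from typing import Iterable, Iterator, List, Sequence


def iter_lines(chunks: Iterable[bytes], encoding: str = "utf-8") -> Iterator[str]:
    """Yield decoded lines from a stream of byte chunks.

    Only the current chunk and one unfinished line are held in memory.
    Splitting happens on raw bytes, so a multi-byte UTF-8 character that
    straddles two chunks is decoded only once its line is complete.
    """
    pending = b""
    for chunk in chunks:
        pending += chunk
        # Everything before the last newline is complete; the tail is carried over.
        *complete, pending = pending.split(b"\n")
        for line in complete:
            yield line.rstrip(b"\r").decode(encoding)
    if pending:
        # Final line without a trailing newline.
        yield pending.rstrip(b"\r").decode(encoding)


def shard(keys: Sequence[str], worker_id: int, num_workers: int) -> List[str]:
    """Assign every num_workers-th object key to one worker (round-robin)."""
    return list(keys[worker_id::num_workers])


# Assumed usage with boto3 (bucket and key names are placeholders):
#   body = boto3.client("s3").get_object(Bucket="my-bucket", Key="part-0.txt")["Body"]
#   for line in iter_lines(body.iter_chunks(chunk_size=1 << 20)):
#       ...
```

Because `iter_lines` only needs an iterable of bytes, the same code can be reused for other datasets or storage backends, and `shard` gives each parallel worker its own subset of object keys to stream.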

Posted one year ago