Hi, I'm trying to create a dataset with 186 parent datasets. The process fails due to OOM; the machine has 64 GB of RAM. Does a workaround exist, for example, generating intermediate datasets? Or does the total memory consumed depend on the number of…

I'm trying to create a dataset with 186 parent datasets.
The process fails due to OOM, even though the machine has 64 GB of RAM.

Does a workaround exist, for example, generating intermediate datasets?
Or does the total memory consumed depend on the number of files in all the parent datasets, so that I need to buy more memory?

Posted 10 months ago

Answers 2

Hi @HurtAnt92! Yes, you can create intermediate datasets. Split your parent datasets into batches, create one new child dataset per batch (with that batch as its parents), then create a final dataset whose parents are all of the resulting children.
I'm surprised you get OOM though. We don't load the files into memory, just the name/path of each file plus its size, hash, etc. Could there be some other factor that causes this issue?
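
The batching suggestion above could be sketched roughly as follows. This is a minimal sketch, not a tested fix for the OOM: `batched`, `merge_in_batches`, `create_clearml_child`, the batch size, and the project name `my_project` are all hypothetical names chosen for illustration. The ClearML part assumes `Dataset.create(..., parent_datasets=...)` accepts a list of dataset IDs and that each intermediate dataset has to be finalized before it can be used as a parent.

```python
from typing import Callable, Iterator, List, Sequence


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def merge_in_batches(
    parent_ids: Sequence[str],
    batch_size: int,
    create_child: Callable[[List[str], str], str],
) -> str:
    """Create one intermediate dataset per batch, then a final dataset on top.

    `create_child(parents, name)` must create a dataset with the given
    parents and return its ID.
    """
    children = [
        create_child(batch, f"intermediate-{i}")
        for i, batch in enumerate(batched(parent_ids, batch_size))
    ]
    return create_child(children, "final")


def create_clearml_child(parents: List[str], name: str) -> str:
    """Hypothetical ClearML-backed implementation of `create_child`."""
    # Imported lazily so the helpers above can be used without ClearML installed.
    from clearml import Dataset

    ds = Dataset.create(
        dataset_name=name,
        dataset_project="my_project",  # assumption: replace with your project
        parent_datasets=parents,
    )
    ds.upload()    # no new files are added here; only the parents' entries are inherited
    ds.finalize()  # assumption: a dataset must be finalized before it can serve as a parent
    return ds.id
```

With 186 parent IDs and a batch size of 20, `merge_in_batches(parent_ids, 20, create_clearml_child)` would create 10 intermediate datasets and one final dataset with those 10 as parents. Whether this lowers peak memory depends on where the OOM actually comes from, which the thread has not established.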

Posted 10 months ago

  • Thank you, I will give it a try. I debugged the code to investigate the cause of the failure. It appears that the code fails at line 1312 inside Dataset.create during the execution of the instance._serialize() function. I will further explore the code to identify the precise point of failure.
Posted 10 months ago