Answered
Any Ideas Why This Is Happening? It Was Fine Yesterday

Any ideas why this is happening? It was fine yesterday

  
  
Posted 3 years ago

Answers 14


@TenseOstrich47 This typically indicates insufficient disk space on the server, which causes ES to go into read-only mode or to turn active shards inactive or unassigned (see the FAQ).

The disk watermarks that control ES's free-disk constraints are defined by default as a percentage of the disk space, so it might look to you like you still have plenty of space while ES thinks otherwise. You can configure different ES settings in the docker-compose.yml file (there are 3 settings, and all can be identical).
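
To illustrate why percentage-based watermarks can trip while the disk still looks roomy, here is a rough sketch. It assumes Elasticsearch's documented default watermarks (85% low, 90% high, 95% flood stage); your server may be configured with other values, and the function name `watermark_status` is made up for this example. ES does its own accounting, so treat this as an approximation only.

```python
# Elasticsearch's default disk watermarks, as percent of disk used.
# Assumption: your deployment may override these in docker-compose.yml.
DEFAULT_WATERMARKS = {"low": 85.0, "high": 90.0, "flood_stage": 95.0}


def watermark_status(used_bytes, total_bytes, watermarks=None):
    """Return the highest watermark crossed, or "ok" if none is crossed."""
    if watermarks is None:
        watermarks = DEFAULT_WATERMARKS
    used_pct = 100.0 * used_bytes / total_bytes
    status = "ok"
    # Walk the thresholds in ascending order and keep the last one exceeded.
    for name, limit in sorted(watermarks.items(), key=lambda kv: kv[1]):
        if used_pct >= limit:
            status = name
    return status
```

For example, a 2000 GB disk with 1850 GB used still has 150 GB free, yet `watermark_status(1850, 2000)` reports that the high watermark is crossed. To check a real machine, you could pass in the `used` and `total` fields returned by `shutil.disk_usage()` for the volume holding the ES data.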

If you don't have enough free disk space, clean up files to create more, or resize your partition (or increase your disk size if on a cloud instance).

  
  
Posted 3 years ago

@RoundCat60 Hey Alex, could you take a look at this when you're free later on, please?

  
  
Posted 3 years ago

That should be the case, we have default_output_uri set to an S3 bucket

  
  
Posted 3 years ago

@TenseOstrich47 The storage in question here is what's available on the machine hosting the ClearML server's docker containers (specifically, the ES one).

  
  
Posted 3 years ago

I thought nothing should be stored locally on the agent? Shouldn't all files be logged to the storage rather than the instance itself?

  
  
Posted 3 years ago

TenseOstrich47 this looks like Elasticsearch is out of space...

  
  
Posted 3 years ago

TenseOstrich47 this sounds like a good idea.
When you have a script, please feel free to share, I think it will be useful for other users as well 🙂

  
  
Posted 3 years ago

I literally cannot reset a single task

  
  
Posted 3 years ago

ES can't use S3 for storage, nor can MongoDB

  
  
Posted 3 years ago

After some additional inspection, it seems like the issue is Docker related.
/var/lib/docker/overlay2/ (7.7G) is the directory which is causing the device storage issues.
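
For anyone wanting to reproduce this kind of inspection, below is a minimal sketch that sums file sizes per subdirectory, roughly like `du`. The helper names `dir_size_bytes` and `largest_subdirs` are made up for this example. It reports apparent file sizes rather than allocated disk blocks, so the numbers may differ somewhat from `du`, and reading /var/lib/docker usually requires root.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of regular files under root, without following symlinks."""
    total = 0
    # onerror swallows unreadable directories instead of aborting the scan.
    for dirpath, _dirnames, filenames in os.walk(root, onerror=lambda err: None):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if not os.path.islink(path):
                    total += os.lstat(path).st_size
            except OSError:
                # Files can vanish or be unreadable while we scan; skip them.
                continue
    return total


def largest_subdirs(root, top=5):
    """Return (size_bytes, path) for the largest immediate subdirectories of root."""
    sizes = []
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((dir_size_bytes(entry.path), entry.path))
    return sorted(sizes, reverse=True)[:top]
```

Calling `largest_subdirs("/var/lib/docker")` would then list the biggest directories first.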

  
  
Posted 3 years ago

From what I can tell, Docker has some leakage here. Temp files are not removed correctly, resulting in a build-up of disk usage.
See the following for more details:
https://stackoverflow.com/questions/46672001/is-it-safe-to-clean-docker-overlay2
https://forums.docker.com/t/some-way-to-clean-up-identify-contents-of-var-lib-docker-overlay/30604
https://docs.docker.com/storage/storagedriver/overlayfs-driver/

I'm going to write a cleanup script and add it to cron. I don't believe there is an easy way around this issue, as Docker trades off disk storage for simplicity.
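
A minimal sketch of what such a cleanup script could look like, assuming `docker system prune` with an `until` filter is acceptable for your setup. This is not the script from this thread: the function names and the 24h default are made up for illustration. The prune removes stopped containers, unused networks, dangling images and build cache, so check that nothing you need falls into those categories before running it for real.

```python
import subprocess


def build_prune_command(older_than="24h"):
    """Build a `docker system prune` command line as a list of arguments."""
    # --force skips the interactive confirmation, which cron cannot answer.
    # The until filter limits pruning to objects created before that age.
    return ["docker", "system", "prune", "--force", "--filter", f"until={older_than}"]


def prune(older_than="24h", dry_run=True):
    """Print the command when dry_run is True, otherwise run it and return its output."""
    cmd = build_prune_command(older_than)
    if dry_run:
        print(" ".join(cmd))
        return None
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout
```

Running `prune()` only prints the command. Switching to `prune(dry_run=False)` executes it, and that call is what a cron entry would invoke.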

  
  
Posted 3 years ago

Thanks Jake, I will have a look. Is there a reason a lot of disk space would be used on the server instance? Is there something in the config I can change to ensure that minimal memory is used on that server, and mostly S3 is used for storage?

  
  
Posted 3 years ago