Answered
[Caching of environment and storage when using AWS auto scaler]
First off: we are aiming to set up ClearML for large-scale DL training across multiple projects and are pretty much basing our entire pipeline around it. Having the ClearML auto scaler at all is super great and an impressive tool!

On to my question: we have a fairly complex setup (PyTorch and a bunch of other CV libraries) and at least medium-sized training data (~15 GB for now, potentially increasing to 500 GB). We would therefore benefit greatly if the environment install and the data (loaded via clearml.StorageManager().get_local_copy(…)) were cached on these auto-scaled remote machines.
However, as far as I can see, each new task spins up its own Docker container, installs everything, loads the data, does the training, and then completely deletes the container again. As all data resides within the container, it is lost afterwards.
Am I correct about that? Is there any way to have the data and/or the environment cached?

  
  
Posted one year ago

Answers 4


OK, I re-checked and saw that the data was indeed cached and reloaded from the cache. Maybe I waited a little too long last time and it was already a new instance. Awesome implementation, guys!

  
  
Posted one year ago

I can see that the data is reloaded each time, even if the machine was not shut down in between.

You can verify this by looking at the Task's log: it will contain all the docker arguments, and one of them should be the cache folder mount.
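If the log is long, a small helper can pick the volume mounts out of the docker command for you. This is only a rough sketch: it assumes you have copied the docker invocation out of the Task log as a plain shell-style string (the exact log format is not shown in this thread), and the paths in `sample` are made up for illustration.

```python
import shlex
from typing import List, Tuple


def find_volume_mounts(docker_cmd: str) -> List[Tuple[str, str]]:
    """Return (host_path, container_path) for every -v / --volume argument."""
    tokens = shlex.split(docker_cmd)
    mounts = []
    for i, tok in enumerate(tokens):
        spec = None
        if tok in ("-v", "--volume") and i + 1 < len(tokens):
            spec = tokens[i + 1]            # form: -v host:container
        elif tok.startswith("--volume="):
            spec = tok.split("=", 1)[1]     # form: --volume=host:container
        if spec and ":" in spec:
            host, container = spec.split(":")[:2]  # ignore a trailing ":ro" etc.
            mounts.append((host, container))
    return mounts


def cache_mounts(docker_cmd: str, keyword: str = "cache") -> List[Tuple[str, str]]:
    """Keep only the mounts whose host or container path mentions `keyword`."""
    return [m for m in find_volume_mounts(docker_cmd)
            if keyword in m[0] or keyword in m[1]]


# Hypothetical command line; the paths are illustrative, not taken from a real log.
sample = (
    "docker run -t --gpus all "
    "-v /home/ubuntu/.clearml/cache:/clearml_agent_cache "
    "-v /tmp/ssh:/root/.ssh my_image bash"
)
print(cache_mounts(sample))
```

If the list comes back empty for your real command, the cache folder is probably not being mounted and each container would start with an empty cache.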

  
  
Posted one year ago

Hi ScantChimpanzee51

having the ClearML auto scaler at all is super great and an impressive tool!

Thank you! 😍

As all data resides within the container, it is lost afterwards.

Nothing to fear there: if you are using the StorageManager, the destination is always the cache folder, which the agent automatically mounts from the host machine.
That said, if the EC2 instance is taken down (i.e. when it goes idle), the cache is lost with it.
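For reference, a minimal sketch of the download side, assuming the data sits at a single remote URL. `get_cached_copy` and the bucket path are hypothetical names for illustration; `StorageManager.get_local_copy` is the ClearML call mentioned in the question, and it returns the path of the local copy (or None if the object could not be fetched).

```python
from typing import Callable, Optional


def get_cached_copy(
    remote_url: str,
    fetch: Optional[Callable[[str], Optional[str]]] = None,
) -> str:
    """Return the local path for `remote_url` as reported by the fetcher.

    By default the fetcher is StorageManager.get_local_copy, which places the
    copy in the ClearML cache folder.
    """
    if fetch is None:
        # Imported lazily so the helper can be tried without clearml installed.
        from clearml import StorageManager
        fetch = StorageManager.get_local_copy
    local_path = fetch(remote_url)
    if not local_path:
        raise RuntimeError(f"Could not fetch {remote_url}")
    return local_path


# Usage inside a training script (hypothetical URL):
# data_path = get_cached_copy("s3://my-bucket/datasets/train.zip")
```

As long as the same EC2 instance is still alive, a later task asking for the same URL should be served from the mounted cache folder rather than downloading again.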

Make sense?

  
  
Posted one year ago

So the container itself gets deleted, but everything is still cached because the cache directory is mounted to the host machine in the same place? That makes absolute sense and is what I was hoping for, but I can’t confirm this currently: I can see that the data is reloaded each time, even if the machine was not shut down in between. I’ll check again to find the cached data on the machine.

  
  
Posted one year ago