Answered
Hi There :) Can Anybody Tell Me What The Best Practice Is For Performing A Normalization In The Preprocess.Py Script Used By Clearml-Serving? Currently I Use A Sklearn Minmaxscaler Which Is Loaded And Applied Before And After The Data Is Sent To The Model

Hi there :)
Can anybody tell me what the best practice is for performing a normalization in the preprocess.py script used by clearml-serving? Currently I use a sklearn MinMaxScaler which is loaded and applied before and after the data is sent to the model. I do that because I do not know how to get the pickle file into the docker container (maybe download it in the init?) and load the MinMaxScaler within the script, as the sklearn dependency is missing. Additionally, I am not sure whether using a MinMaxScaler object is the best practice for performing normalization at serving time. I would really appreciate hearing your opinions on that!

  
  
Posted one year ago

Answers 8


Hi @<1526371965655322624:profile|NuttyCamel41>

I do that because I do not know how to get the pickle file into the docker container

What would the pickle file do?

and load the MinMaxScaler within the script, as the sklearn dependency is missing

What do you mean by that? Are you getting an error when loading your model?

  
  
Posted one year ago

Hi @<1523701205467926528:profile|AgitatedDove14> , thanks for your answer, I will check if I can get that working!

  
  
Posted one year ago

Hi @<1523701205467926528:profile|AgitatedDove14> , that is an interesting idea! But wouldn't it be better to load the model in the load() function, so that the model doesn't have to be loaded again with every request? Or is there some kind of internal link, so that when the load() method is implemented, it is expected that a custom model was loaded and is applied in the process() function?

  
  
Posted one year ago

Hi @<1523701205467926528:profile|AgitatedDove14> thanks for your answer! 🙂 I think my case is a bit different. I do not want to load a custom model but I want to load a custom object used for preprocessing. So I think the load method would not fit, as the local_file_name parameter I get in the load function would lead to the model file. And as far as I can see there is no mechanism installed to load other objects than the model file inside the Preprocess class, right?

  
  
Posted one year ago

Hi @<1523701205467926528:profile|AgitatedDove14> , I serialized a sklearn MinMaxScaler object, which I fitted on the training data, using pickle. So when serving the model I would like to load that pickle file in the preprocess script so that I can perform the same normalization as during training. Unless there is a better practice for applying the same normalization at training and serving time.
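One option that avoids needing sklearn inside the serving container is to export only the fitted per-feature minimum and maximum (the `data_min_` and `data_max_` attributes of a fitted MinMaxScaler) and apply the min-max formula in plain Python. The sketch below assumes the default feature range of 0 to 1; the function names and the JSON file layout are hypothetical and not part of clearml-serving.

```python
import json
import os
import tempfile


def save_scaler_params(data_min, data_max, path):
    """Write per-feature min/max to JSON, e.g. from scaler.data_min_.tolist()."""
    with open(path, "w") as f:
        json.dump({"data_min": list(data_min), "data_max": list(data_max)}, f)


def load_scaler_params(path):
    """Read the min/max dictionary written by save_scaler_params."""
    with open(path) as f:
        return json.load(f)


def normalize(row, params):
    """Min-max scale one row to [0, 1]: (x - min) / (max - min) per feature."""
    out = []
    for x, lo, hi in zip(row, params["data_min"], params["data_max"]):
        span = (hi - lo) or 1.0  # constant feature: divide by 1 instead of 0
        out.append((x - lo) / span)
    return out


def denormalize(row, params):
    """Inverse of normalize, e.g. for rescaling model outputs."""
    out = []
    for x, lo, hi in zip(row, params["data_min"], params["data_max"]):
        span = (hi - lo) or 1.0
        out.append(x * span + lo)
    return out


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "scaler_params.json")
    save_scaler_params([0.0, 10.0], [10.0, 30.0], path)
    params = load_scaler_params(path)
    print(normalize([5.0, 20.0], params))
```

The trade-off is that the JSON file has to be regenerated whenever the scaler is refitted, but the serving side then depends only on the standard library.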

  
  
Posted one year ago

And as far as I can see there is no mechanism installed to load other objects than the model file inside the Preprocess class, right?

Well, actually this is possible. Let's assume you have another Model that is part of the preprocessing; then something like this should work:

import joblib
from clearml import Model

def preprocess(self, ...):
    # load the preprocessing model once and cache it on the instance
    if not getattr(self, "_preprocess_model", None):
        self._preprocess_model = joblib.load(Model(model_id).get_weights())
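The lazy-loading pattern can be tried out without ClearML using the minimal sketch below. Here `fetch_scaler_file` is a hypothetical stand-in for `Model(model_id).get_weights()`, pickle stands in for joblib, and the `preprocess` signature is simplified, so treat it as an illustration of the caching idea rather than a drop-in preprocess.py.

```python
import os
import pickle
import tempfile

# Counts calls so the caching behaviour can be observed.
DOWNLOADS = {"count": 0}


def fetch_scaler_file():
    """Hypothetical stand-in for Model(model_id).get_weights().

    Writes a pickled parameter dict to a temp file and returns its path.
    """
    DOWNLOADS["count"] += 1
    fd, path = tempfile.mkstemp(suffix=".pkl")
    with os.fdopen(fd, "wb") as f:
        pickle.dump({"data_min": [0.0], "data_max": [10.0]}, f)
    return path


class Preprocess:
    def __init__(self):
        self._preprocess_model = None

    def _get_preprocess_model(self):
        # Load on first use only; later requests reuse the cached object.
        if self._preprocess_model is None:
            path = fetch_scaler_file()
            with open(path, "rb") as f:
                self._preprocess_model = pickle.load(f)
            os.remove(path)
        return self._preprocess_model

    def preprocess(self, body):
        params = self._get_preprocess_model()
        lo, hi = params["data_min"][0], params["data_max"][0]
        return [(x - lo) / (hi - lo) for x in body["x"]]
```

Calling `preprocess` repeatedly on the same instance triggers only one fetch, which is the point of the `getattr` check in the snippet above.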
  
  
Posted one year ago

I see. If this is the case, use this example: [link missing]
It should cover the entire thing, no?

  
  
Posted one year ago

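Yes! I checked, and it should work: it checks whether you have a load(...) function on the Preprocess class, and if you do, it will use it:

import joblib
from clearml import Model

def load(self, local_file_name):
    # main model file handed over by the serving engine
    self._model = joblib.load(local_file_name)
    # extra preprocessing object, fetched once via a hard-coded model ID
    self._preprocess_model = joblib.load(Model(hard_coded_model_id).get_weights())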
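The "use load() if the class defines it" behaviour described here can be pictured with a small sketch. This is not clearml-serving's actual implementation; `load_model`, `default_loader`, and both classes are hypothetical names used only to show the dispatch idea.

```python
class WithLoad:
    """A preprocess class that provides its own load() hook."""

    def load(self, local_file_name):
        self.loaded_from = local_file_name
        return "custom-model"


class WithoutLoad:
    """A preprocess class that relies on the engine's default loading."""


def default_loader(local_file_name):
    # Placeholder for whatever the serving engine does by default.
    return "default-model"


def load_model(preprocess_obj, local_file_name):
    # Prefer the user's load() hook when the class defines one.
    hook = getattr(preprocess_obj, "load", None)
    if callable(hook):
        return hook(local_file_name)
    return default_loader(local_file_name)
```

With this kind of dispatch, anything set on `self` inside `load()` is loaded once at startup and is available to every later request.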
  
  
Posted one year ago