Answered
Hi Guys, I Have A Question Regarding Model Tracking. I Have Pipelines That Use Xgboost Through The Scikit-Learn Api To Perform:

Hi Guys,

I have a question regarding Model tracking.

I have pipelines that use XGBoost through the scikit-learn API to perform:

  • Feature selection through nested cross-validation --> These models seem to be captured both in the Model page and in the artifacts of the corresponding task
  • Training the final model with the selected features --> I have a feeling that this does not get captured. I say "feeling" because the tens of stored models are hard to tell apart, since only the hash changes from one to the next (see screenshot). Any idea how I can identify the result of the last .fit call?
  • These models are actually wrapped in several layers of classes, so the model object I'd like to upload is not the bare classifier. What does it take to implement my own model type? Should I create my own framework? Can OutputModel be used to do so? I'm trying to avoid upload_artifact because I'd like the models to show up in the model board.
    Thanks a lot for any input.
    image
  
  
Posted one year ago

Answers 4


Hey, thanks a lot Alex, that's exactly what I was looking for. Cheers

  
  
Posted one year ago

Hey @JumpyRaven4, about your first point, what exactly is the question?

About your second point: you can manually save the final model and give it a descriptive file name; that way we will show it in the UI with the name you provided. Make sure to use XGBoost's save_model and not raw pickle.
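As a rough sketch of that suggestion, assuming the final estimator exposes `save_model(path)` (as XGBoost's scikit-learn estimators such as `XGBClassifier` do), the last fit could be saved under a descriptive name. The helper `save_final_model`, the directory, and the default file name are hypothetical, chosen only for illustration:

```python
from pathlib import Path


def save_final_model(model, out_dir, name="final_model_selected_features"):
    """Save a fitted model under a descriptive file name and return the path.

    `model` is assumed to expose `save_model(path)`, as XGBoost's
    scikit-learn estimators do. The default `name` is only an example.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # In recent XGBoost versions a .json suffix selects the JSON format.
    path = out_dir / f"{name}.json"
    # Native XGBoost serialization rather than raw pickle.
    model.save_model(str(path))
    return path


# Hypothetical usage after the last .fit call:
# final_model.fit(X[selected_features], y)
# save_final_model(final_model, "models")
```

A distinct file name per stage (for example one for the cross-validation folds and one for the final fit) is what should make the entries easier to tell apart than hash-only names.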

For your final question, given that your models have customised code, I suggest trying clearml.OutputModel, which registers the file you provide as the serialized model. That means it is your responsibility to decide how to serialize and deserialize the model.

  
  
Posted one year ago

This is the method you're looking for None . But make sure you have a model saved on disk before using it. And if you don't want the model to be deleted from disk after it, make sure to set auto_delete_file=False
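A minimal sketch of how this could fit together, assuming the method meant here is `OutputModel.update_weights` (which accepts `weights_filename` and `auto_delete_file`). The pickle-based serialization and the helper names `serialize_model`, `deserialize_model`, and `register_wrapped_model` are illustrative assumptions; any format you can read back yourself would work just as well:

```python
import pickle
from pathlib import Path


def serialize_model(obj, path):
    """Write the wrapped model to disk (pickle is just one possible choice)."""
    path = Path(path)
    with path.open("wb") as fh:
        pickle.dump(obj, fh)
    return path


def deserialize_model(path):
    """Inverse of serialize_model: load the wrapped model back from disk."""
    with Path(path).open("rb") as fh:
        return pickle.load(fh)


def register_wrapped_model(task, obj, path, name="my_model", framework="xgboost"):
    """Serialize `obj`, then register the resulting file as an output model.

    `task` is assumed to be the ClearML Task returned by Task.init().
    """
    from clearml import OutputModel  # lazy import; requires the clearml package

    # The file must exist on disk before update_weights is called.
    weights_path = serialize_model(obj, path)
    out = OutputModel(task=task, name=name, framework=framework)
    # auto_delete_file=False keeps the local copy after the upload.
    out.update_weights(weights_filename=str(weights_path), auto_delete_file=False)
    return out
```

With this shape, the object held in a variable such as `my_model` is never passed to `OutputModel` directly; only the path of the file it was serialized to is.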

  
  
Posted one year ago

Hi Alex,
thanks for your answer. I'm curious about your third point on using OutputModel. I could not figure out from the documentation how to actually use it. I constructed the OutputModel object like this:

  • out = OutputModel(task, name="my_model", framework="xgboost")
    However, I could not find any method in the docs that would let me pass the model object to that instance. Put differently, I can't see how to use that OutputModel to register my model, which is stored in a variable my_model.
  
  
Posted one year ago