TBH our Preprocess class has an import in it that points to a file that is not part of preprocess.py, so I have no idea how you think this can work.
yeah, so in docker run:
-e TASK_ID='b5f339077b994a8ab97b8e0b4c5724e1' \
-e MODEL_NAME='best_model' \
and then in Preprocess:
self.model = get_model(task_id=os.environ['TASK_ID'], model_name=os.environ['MODEL_NAME'])
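Roughly what I have in mind (a minimal sketch, assuming the serving container gets those env variables and that the custom-engine Preprocess lets me build the model in __init__ and run inference in a process()-style method; model_utils is just a placeholder module name for the get_model() helper pasted further down):

```python
import os

# hypothetical module holding the get_model() helper pasted further down
from model_utils import get_model


class Preprocess:
    def __init__(self):
        # env variables injected via `docker run -e TASK_ID=... -e MODEL_NAME=...`
        self.model = get_model(
            task_id=os.environ['TASK_ID'],
            model_name=os.environ['MODEL_NAME'],
        )

    def process(self, data, state=None, collect_custom_statistics_fn=None):
        # call predict ourselves instead of relying on the serving engine
        return self.model.predict(data)
```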
but is it true that I can have multiple models on the same docker instance with different endpoints?
I put two models on the same endpoint, and then only one was running.
Sorry, I meant to say "service id"
Same service-id but different endpoints
the current diffs
git-nbdiffdriver diff: git-nbdiffdriver: command not found
fatal: external diff died, stopping at ...
Having human-readable ids always helps communication, but programmatically we're definitely going to use the "real" id. I think we're too early into this, though; I'll report back on how it goes.
I would think having a unique slug is a good idea so the team can communicate purely by that single identifier. Maybe we will name tasks slug_yyyymmdd.
ahh, because task_id is the "real" id of a task
also random tasks are popping up in the DevOps project in the UI
I want the model to be stored in a way that clearml-serving can recognise it as a model
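If I do end up registering it manually, I'd expect something along these lines to make the model show up as an output model of the task (a sketch using ClearML's OutputModel; the file name and framework string are my assumptions):

```python
from clearml import Task, OutputModel

task = Task.current_task()  # the training task that produced the model

# save the XGBoost model to disk first (file name is an assumption)
model.save_model('xgb_model.json')

# register it explicitly so clearml-serving can pick it up as a task output model
output_model = OutputModel(task=task, name='best_model', framework='xgboost')
output_model.update_weights(weights_filename='xgb_model.json')
```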
I am running a script
Because we already have these get_artifact() and get_model() functions that the DSes use to pull the data into notebooks and analyse their stuff further, I might as well just use those in a custom preprocess and call predict myself.
while in our own code:
if model_type == 'XGBClassifier':
    model = XGBClassifier()
    model.load_model(filename)
Apparently our devops guy figured out that you need a different port number and a different docker container, given 8080 was already occupied.
git status gives correct information
I know there is an aux cfg with key-value pairs, but how can I use it in the python code?
"auxiliary_cfg": { "TASK_ID": "b5f339077b994a8ab97b8e0b4c5724e1", "V": 132 }
TBH ClearML doesn't seem to be picking the model up so I need to do it manually
nah, it runs about 1 minute of a standard SQL -> dataframes -> xgboost pipeline with some file saving
and it immediately complained about a missing package, which apparently I can't specify when I create the model endpoint; instead I need to re-compose the docker container and pass an env variable to it????
and then have a wrapper that gets the model data and selects how to construct and deserialise the model class.
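Something like this dispatch wrapper is what I mean (a sketch; only the XGBClassifier branch comes from our actual code, the pickle fallback is hypothetical):

```python
import pickle

from xgboost import XGBClassifier


def load_model(model_type, filename):
    """Construct and deserialise the right model class for the stored file."""
    if model_type == 'XGBClassifier':
        model = XGBClassifier()
        model.load_model(filename)
        return model
    # hypothetical fallback: anything else we currently store as a plain pickle
    with open(filename, 'rb') as f:
        return pickle.load(f)
```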
from clearml import Task


def get_model(task_id, model_name):
    task = Task.get_task(task_id)
    try:
        model_data = next(model for model in task.models['output'] if model.name == model_name)
    except StopIteration as ex:
        raise ValueError(f'Model {model_name} not found in: {[model.name for model in task.models["output"]]}') from ex
    filename = model_data.get_local_copy()
    model_type = ...
with open(self.output_filename, 'wb') as f:
    pickle.dump({
        'model': model,
        'X_train': X_train,
        'Y_train': Y_train,
        'X_test': X_test,
        'Y_test': Y_test,
        'impute_values': impute_values,
    }, f)
Why do I need an output_uri for the model saving? The dataset API can figure this out on its own
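For reference, this is what I understand I'd have to add so the auto-logged weights get uploaded somewhere clearml-serving can actually fetch them from (a minimal sketch; project/task names and the URI are made up):

```python
from clearml import Task

# output_uri tells ClearML where to upload the model weights (S3/GS/Azure/fileserver);
# without it only the local path gets registered. Values below are placeholders.
task = Task.init(
    project_name='my_project',
    task_name='xgboost_training',
    output_uri='s3://my-bucket/models',  # or output_uri=True for the default fileserver
)
```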
{"detail":"Error processing request: ('Expecting data to be a DMatrix object, got: ', <class 'pandas.core.frame.DataFrame'>)"}