ahh, because task_id is the "real" id of a task
python='python3' ~/anaconda3/envs/.venv/bin/python3
these are the service instances (basically increased visibility into what's going on inside the serving containers)
But these have: different task ids, same endpoints (from looking through the tabs)
So I am not sure why they are here and not somewhere else
nah, it runs about 1 minute of a standard SQL -> dataframes -> xgboost pipeline with some file saving
I passed an env variable to the docker container so I could figure this out
because we already had these get_artifact(), get_model() functions that the DSes use to get the data into notebooks to further analyse their stuff, I might as well just use those with a custom preprocess and call predict myself.
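Roughly what I have in mind (just a sketch, not tested -- I'm assuming the custom-engine Preprocess layout from the clearml-serving examples, and get_model() here is our own internal helper, not a clearml API):

# preprocess.py for the serving endpoint -- rough sketch
from framework.helpers import get_model  # hypothetical internal module

class Preprocess(object):
    def __init__(self):
        # reuse the same helper the DSes use in notebooks to fetch the trained model
        self.model = get_model(task_id="<task-id>")

    def preprocess(self, body, state, collect_custom_statistics_fn=None):
        # turn the request json into the feature matrix the model expects
        return [body.get("features", [])]

    def process(self, data, state, collect_custom_statistics_fn=None):
        # call predict ourselves instead of letting an engine do it
        return self.model.predict(data).tolist()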
Is there an explicit OutputModel + xgboost example somewhere?
I just disabled all of them with auto_connect_frameworks=False
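i.e. in the Task.init call, which is where I'm passing it:

from clearml import Task
# turn off all framework auto-logging for this run
task = Task.init(project_name="...", task_name="...", auto_connect_frameworks=False)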
I was just looking at the model example. How does OutputModel store the binary, for example for an xgboost model?
pickle.dump({ 'model': model, 'X_train': X_train, 'Y_train': Y_train, 'X_test': X_test, 'Y_test': Y_test, 'impute_values': impute_values }, open(self.output_filename, 'wb'))
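If I go the explicit route instead, something like this is what I'd try (rough sketch from my reading of the OutputModel docs -- names and paths are made up):

from clearml import Task, OutputModel

task = Task.init(project_name="my project", task_name="xgb train")  # made-up names

# ... training ...
model.save_model("xgb_model.json")  # xgboost's own serialization instead of pickling the whole dict

output_model = OutputModel(task=task, framework="xgboost")
# registers the binary on the task and uploads it to the task's output_uri (if one is set)
output_model.update_weights(weights_filename="xgb_model.json")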
Apparently our DevOps guy figured out that you need a different port number and a different docker container, given 8080 was already occupied
I guess so, this was done by our DevOps guy and he said he was following the instructions
Why do I need an output_uri for the model saving? The dataset API can figure this out on its own
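For context, this is the bit I mean (output_uri in Task.init, bucket name made up):

task = Task.init(project_name="...", task_name="...", output_uri="s3://my-bucket/models")  # where model weights get uploaded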
yes, I do, I added an auxiliary_cfg and I saw it immediately, both in the CLI and in the web UI
What I'm trying to do is give the DSes a lightweight base class that is independent of clearml, and have a framework that holds all the clearml-specific code. This will allow them to experiment outside of clearml and only switch to it when they are in an OK state. It will also help not to pollute the clearml spaces with half-baked ideas
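Roughly the split I'm picturing (all names made up, just to show the idea -- the DSes only ever see the base class, and the framework is the only place that imports clearml):

# base.py -- no clearml anywhere in here
class BaseJob:
    def run(self) -> dict:
        # subclasses do their SQL -> dataframes -> xgboost work
        # and return a dictionary of whatever they want saved
        raise NotImplementedError

# clearml_runner.py -- the only clearml-aware piece
import pickle
from clearml import Task

def run_with_clearml(job, project, name):
    task = Task.init(project_name=project, task_name=name, auto_connect_frameworks=False)
    outputs = job.run()
    with open("outputs.pkl", "wb") as f:
        pickle.dump(outputs, f)
    task.upload_artifact("outputs", artifact_object="outputs.pkl")
    return outputs

That way they can call job.run() on its own outside clearml, and the runner only wraps it when we want the run tracked.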
but is it true that I can have multiple models on the same docker instance with different endpoints?
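i.e. something along these lines is what I'd want to run (command shape from my reading of the clearml-serving README, so the exact flags are an assumption on my part):

clearml-serving --id <service-id> model add --engine xgboost --endpoint "model_a" --name "model a" --project "our project"
clearml-serving --id <service-id> model add --engine xgboost --endpoint "model_b" --name "model b" --project "our project"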
if I swap it to Framework then it auto-records both
I didn't realise that pickling is what triggers clearml to pick it up. I am actually saving a dictionary that contains the model as a value (+ training datasets)
but I am one level lower than the top, so:
~/work/repo is the main repo dir
~/work/repo/code/run.py and I am running python run.py in ~/work/repo/code
If I do this it still auto-records the sklearn one
I think I figured this out but now I have a problem:
auto_connect_frameworks={ 'xgboost': False, 'scikitlearn': False }
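For completeness, this is the full call, and I'm starting to wonder whether the scikit-learn key should be 'scikit' rather than 'scikitlearn' (that's my reading of the Task.init docs, not verified):

task = Task.init(
    project_name="...",
    task_name="...",
    auto_connect_frameworks={'xgboost': False, 'scikit': False},  # 'scikit' per the docs, if I read them right
)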
I think this is because of the version of xgboost that serving installs. How can I control that?
but here I can tell them: return a dictionary of what you want to save
git status gives correct information
python run.py param1 param2