Hi @<1657918706052763648:profile|SillyRobin38>
In the preprocess.py files, we will have so many similar lines, which is not good.
Actually, clearml-serving also supports directories, i.e. you can package an entire module as part of the preprocess, which would be easier for your code
Another option is to package your code in a python package and have that installed on the container (there is a special env var that allows you to add those to the serving container)
"...but what I really want to achieve is to share this code:"
You mean sharing the code between them? Unless this is a "preinstalled" package in the container, each endpoint has its own separate set of modules / files
(this is on purpose, so you can actually change them; just imagine different versions of the same common.py file)
@<1523701205467926528:profile|AgitatedDove14> No, actually I can upload a directory for the model thanks to ClearML, but what I really want to achieve is to share this code:
├── common
│   └── common.py
Between these two preprocess.py files:
├── yolo8
│   └── preprocess.py
└── yolo7
    └── preprocess.py
what is the best approach to update the package if we have frequent updates to this common code?
Since this package has an indirect effect on the model endpoint, I would package it with the preprocess code of the endpoint.
Each server updates its own local copy and makes sure it can fetch and deploy it hand over hand, without breaking its ability to serve these endpoints.
The "wastefulness" of holding multiple copies is negligible compared to a situation where everyone shares the exact same copy and an upgrade freezes everyone's ability to serve
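To make the "each endpoint carries its own copy" idea concrete, here is a minimal sketch using only the Python standard library. It assumes the layout from this thread (a `common` folder next to `yolo7` and `yolo8`); the helper name `vendor_common` is hypothetical and is not part of clearml-serving. It only copies files, and uploading each endpoint folder is left to whatever you already use.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def vendor_common(root: Path, endpoints: Iterable[str], common_dir: str = "common") -> List[Path]:
    """Copy root/<common_dir> into every endpoint folder (hypothetical helper).

    Each endpoint ends up with its own private copy, so one endpoint can be
    re-deployed with newer common code without touching the others.
    """
    source = root / common_dir
    copied = []
    for name in endpoints:
        target = root / name / common_dir
        # Drop any stale copy first so removed files do not linger.
        if target.exists():
            shutil.rmtree(target)
        shutil.copytree(source, target)
        copied.append(target)
    return copied
```

Running `vendor_common(Path("."), ["yolo7", "yolo8"])` before packaging each endpoint keeps a single source of truth on disk while still shipping separate copies per endpoint.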
@<1523701205467926528:profile|AgitatedDove14> Thanks for the response. Yeah, each endpoint will have its own modules/files; I just wanted to know if there is a way to share such common code between different endpoints so that the common code gets synced like the preprocessing code.
I do have one question: suppose we have 1000 VM instances running, and I create a package from the common code and install it in the container. What is the best approach to update the package if we have frequent updates to this common code?
@<1523701205467926528:profile|AgitatedDove14> Thanks for the prompt response
@<1523701205467926528:profile|AgitatedDove14> About the proposed ways of fixing this issue: I've got my hands a little dirty with the code, and I think adding another option to the clearml-serving model add command to include some other files would be beneficial here. Suppose I currently have this directory:
├── common
│   └── common.py
├── yolo8
│   ├── 1
│   │   ├── model_NVIDIA_GeForce_RTX_3080.plan
│   │   └── model_Tesla_T4.plan
│   ├── config.pbtxt
│   └── preprocess.py
└── yolo7
    ├── 1
    │   ├── model_NVIDIA_GeForce_RTX_3080.plan
    │   └── model_Tesla_T4.plan
    ├── config.pbtxt
    └── preprocess.py
And now I want to have the same code across these two models. Adding the entire directory here would, as you can see, be complicated, and even if it is possible it might have some flaws. As for the second option, packaging the preprocessing code as a Python package and installing it alongside everything else at build time, I think it might have a syncing issue: that code will change a lot, and reinstalling the package every time would be a problem.
If the common code could be handled the same way the preprocessing code is managed, it would be much cleaner, right? For now I've come up with a workaround: I upload the preprocessing code as an artifact, then go through all of the tasks, find the most recently updated common code, and get a local copy of it. It is working for me. But I think adding such a capability to ClearML itself would be great, and I was wondering what your thoughts are. Is it okay to fork the repo and implement another option that takes a list of source files, uploads them to ClearML storage, and fetches them again the same way the preprocessing code is handled? Should I do that and create a PR, or is it not necessary?
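The "find the most recently updated common code" step of that workaround can be sketched independently of ClearML. The record type `CommonCodeVersion` and both helpers below are hypothetical names for illustration; in practice the records would presumably be built from whatever `Task.get_tasks(...)` returns, and the local copy fetched via the artifact's `get_local_copy()`. The thread does not show that wiring, so it is left out here.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class CommonCodeVersion:
    task_id: str           # hypothetical: id of the task holding the artifact
    uploaded_at: datetime  # hypothetical: when that artifact was uploaded


def pick_latest(versions: Iterable[CommonCodeVersion]) -> Optional[CommonCodeVersion]:
    """Return the most recently uploaded version, or None if there are none."""
    return max(versions, key=lambda v: v.uploaded_at, default=None)


def needs_refresh(local_task_id: Optional[str], latest: Optional[CommonCodeVersion]) -> bool:
    """True when a newer version exists than the one cached locally."""
    return latest is not None and latest.task_id != local_task_id
```

Keeping the selection logic separate from the fetch makes it easy to check on each instance whether a download is needed at all before touching storage.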
You mean to add these two to the model when deploying?
│ ├── model_NVIDIA_GeForce_RTX_3080.plan
│ └── model_Tesla_T4.plan
Notice that preprocess.py is not running on the GPU instance; it is running on a CPU instance (technically not the same machine)