HungryArcticwolf62 the new clearml-serving is almost out (ETA late next week); you can already start playing here:
HungryArcticwolf62 a transformer model is, in the end, a PyTorch/TF model with pre/post-processing.
The PyTorch/TF model inference is done with Triton (probably the most efficient engine today), while clearml runs the pre/post-processing on a different CPU machine (making sure we fully utilize all the HW). Does that answer the question?
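To make the split concrete, here is a rough sketch of the idea, not the actual clearml-serving API. The `Preprocess` class, its method signatures, the toy `VOCAB`/`LABELS`, and `fake_inference` (a stand-in for the model call served by the inference engine) are all assumptions for illustration only:

```python
from typing import Any, Dict, List

# Toy vocabulary and labels; a real transformer would use its own tokenizer (assumption).
VOCAB = {"<unk>": 0, "good": 1, "bad": 2, "movie": 3}
LABELS = ["negative", "positive"]


class Preprocess:
    """CPU-side pre/post-processing wrapped around a remote inference call."""

    def preprocess(self, body: Dict[str, Any]) -> List[int]:
        # Turn the request text into token ids for the model.
        words = body.get("text", "").lower().split()
        return [VOCAB.get(w, VOCAB["<unk>"]) for w in words]

    def postprocess(self, logits: List[float]) -> Dict[str, str]:
        # Map the highest-scoring logit back to a readable label.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return {"label": LABELS[best]}


def fake_inference(token_ids: List[int]) -> List[float]:
    # Hypothetical stand-in for the model inference step.
    positive = float(token_ids.count(VOCAB["good"]))
    negative = float(token_ids.count(VOCAB["bad"]))
    return [negative, positive]


def handle_request(body: Dict[str, Any]) -> Dict[str, str]:
    # Full request path: preprocess on CPU, run the model, postprocess on CPU.
    pre = Preprocess()
    token_ids = pre.preprocess(body)
    logits = fake_inference(token_ids)
    return pre.postprocess(logits)
```

Calling `handle_request({"text": "good movie"})` walks the whole path; in a real deployment only the middle step would run on the GPU machine.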
Latest docs here:
expect a release after the weekend 😉
Hi HungryArcticwolf62 ,
from what I understand you simply want to access models afterwards - correct me if I'm wrong.
What I think would solve your problem is the following:
task = Task.init(...., output_uri=True)
This should upload the model to the server and thus make it accessible to other entities within the system.
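A slightly fuller sketch of that, in case it helps. The function name `train_and_store` and its arguments are made up for illustration, and it assumes you already have a weights file on disk and a configured ClearML server:

```python
def train_and_store(project_name: str, task_name: str, weights_path: str) -> str:
    # Imported here so the sketch only needs clearml when it is actually called.
    from clearml import Task, OutputModel

    # output_uri=True asks ClearML to upload model files to the server's file storage.
    task = Task.init(project_name=project_name, task_name=task_name, output_uri=True)

    # Explicitly register an existing weights file as this task's output model.
    model = OutputModel(task=task)
    model.update_weights(weights_filename=weights_path)

    task.close()
    # The model id can later be used to look the model up again.
    return model.id
```

With the common frameworks, models saved during the run are usually picked up automatically, so the explicit `OutputModel` step is mainly for files that were saved some other way.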
Am I on track?
Actually, this clarifies what I'm trying to achieve. I'm trying to find a way to store the model (I will try the output_uri argument), and also a way to serve models using clearml-serving. Since I don't yet know how clearml-serving works, I wanted to archive the correct files first.
Hi AgitatedDove14 , CostlyOstrich36
Thanks for the links. I see that clearml-serving supports a predefined list of engines, transformers not included. Do you have any documentation on how one would implement an engine and integrate it into the on-prem version?