AgitatedDove14 ClearML server itself and all of its components (API server etc.) are on x.x.x.69 machine.
Agents and serving are on x.x.x.68 worker machine. My model files are also there, just placed in some usual non-shared linux directory.
And I didn't do any specific configurations of the clearml fileserver docker - everything is on its defaults without a single line changed except the IP address of the ClearML server.
I tried a couple of approaches to upload my preexisting models into ClearML:
- To send them directly from .68 via the following script:
from clearml import Task, InputModel
task = Task.init(project_name='LogSentinel', task_name='Register remote model from .68')
model_file_path = "file:///10.14.158.68/home/lab-usr/logsentinel/deeplog-bestloss.pth"
model = InputModel.import_model(
    name="deeplog_bilstm",
    weights_url=model_file_path,
    project="LogSentinel",
    framework="pytorch",
)
task.connect(model)
It registers the model without any visible errors, and it appears in the model repository.
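For what it's worth, I also checked how that weights URL actually parses. A file:/// URI with three slashes has an empty host, so the IP ends up as the first path component and the URL only resolves on a machine that literally has a /10.14.158.68/... directory (this is just my own diagnostic with the stdlib, nothing clearml-specific):

```python
from urllib.parse import urlparse

# The weights URL exactly as I registered it
url = "file:///10.14.158.68/home/lab-usr/logsentinel/deeplog-bestloss.pth"
parts = urlparse(url)

print(parts.scheme)  # file
print(parts.netloc)  # '' -- empty: no host in the URL at all
print(parts.path)    # /10.14.158.68/home/... -- the IP became a directory name
```

So as far as I can tell, nothing in that URL points at the .68 machine as a host.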
- To copy the model.pth file itself to the .69 machine, then run the script for LOCAL model file upload:
from clearml import Task, InputModel
task = Task.init(project_name='LogSentinel', task_name='Register model')
model_file_path = "file:///home/lab-usr/logsentinel/deeplog-bestloss.pth"
model = InputModel.import_model(name="deeplog_bilstm", weights_url=model_file_path, project="LogSentinel", framework="pytorch")
task.connect(model)
Both variants register the model in model storage with no errors, but neither works once clearml-serving is pointed at them via clearml-serving model add: Triton fails with an error saying it cannot find the model file, and requests to the endpoint return "405 - Method Not Allowed".
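My current working theory (my own assumption, not something from the docs) is that a file:// weights URL is only resolvable on the machine whose filesystem contains the path, so the Triton container has no way to fetch it, whereas an http:// URL pointing at the fileserver (10.14.158.69:8081 in my setup, assuming the default fileserver port) would be fetchable from anywhere. A quick heuristic I used to sanity-check the registered URLs:

```python
from urllib.parse import urlparse

def is_portable_weights_url(url: str) -> bool:
    """Rough check: can a *different* machine (e.g. the Triton container
    on .68) fetch this URL? file:// is local-only; remote schemes are OK.
    This is my own heuristic, not part of the clearml SDK."""
    scheme = urlparse(url).scheme
    return scheme in ("http", "https", "s3", "gs", "azure")

# The file:// URL I registered -- local-only, so serving can't fetch it
print(is_portable_weights_url("file:///home/lab-usr/logsentinel/deeplog-bestloss.pth"))  # False

# A hypothetical fileserver URL (host/port are my assumptions) would pass
print(is_portable_weights_url("http://10.14.158.69:8081/LogSentinel/deeplog-bestloss.pth"))  # True
```

If that theory is right, I'd need to actually upload the weights to the fileserver (or other shared storage) rather than just registering a local path.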