Hi @<1523701087100473344:profile|SuccessfulKoala55> , thanks for your message! 🙂 I am aware that the console is also logged on the server, but I find it suboptimal to dig through the console log for the relevant information and would rather place it somewhere more structured.
Hi @<1523701205467926528:profile|AgitatedDove14> , thanks for your answer! Can you tell me how specifically I map my clearml.conf into the containers? By the way, the credentials are already set (and working) in the clearml.conf.
Hi @<1523701070390366208:profile|CostlyOstrich36> , I just solved the issue! :) After calling `clearml-serving create --name "model serving"`, the printed task ID has to be filled into the values.yaml of the clearml-serving helm chart under `clearml.servingTaskId`. After installing the helm chart, the draft of the serving task is started automatically, so there is no need to enqueue it manually.
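For anyone following along, a rough sketch of the relevant values.yaml entry (the ID below is just a placeholder for whatever `clearml-serving create` prints):

```yaml
# values.yaml of the clearml-serving helm chart (sketch, placeholder value)
clearml:
  servingTaskId: "<task id printed by clearml-serving create>"
```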
Would it be possible to add this info to the docs? Maybe a small hint on this page [None](https...
Hi @<1523701118159294464:profile|ExasperatedCrab78> , I have a sad update on this issue. It does not seem to be completely solved yet. 😕 But I think I can at least describe it a bit better now:
- Models which are located on the ClearML server (created by `Task.init(..., output_uri=True)`) still run perfectly.
- Models which are located on Azure blob storage cause different problems in different scenarios (which made me think we had resolved this issue):
  - When I start the docker con...
Ok, I have found the issue. 🙌 When I try to serve a model which is saved on Azure (generated by `Task.init(..., output_uri='azure://...')`), I get the error `poll failed for model directory 'test_model_pytorch': failed to open text file for read /models/test_model_pytorch/config.pbtxt: No such file or directory`. A model which was saved on the ClearML server (generated by `Task.init(..., output_uri=True)`) can be served without any problems.
For now I am not sure why th...
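For context, these are roughly the two variants I use to create the output models (purely illustrative; project/task names and the Azure URI are placeholders):

```python
from clearml import Task

# Variant A: upload output models to the ClearML files server -- serving works
task = Task.init(
    project_name="demo",          # placeholder name
    task_name="train_pytorch",    # placeholder name
    output_uri=True,
)

# Variant B (used instead of A): upload output models to Azure blob storage --
# this is the case where Triton later fails with the missing config.pbtxt error
# task = Task.init(
#     project_name="demo",
#     task_name="train_pytorch",
#     output_uri="azure://<storage-account>/<container>",  # placeholder URI
# )
```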
I think you are correct with your guess that the services were not shut down properly. I noticed that some services were still shown as running on the ClearML dashboard. I aborted them all and at least got rid of the error `ValueError: triton-server process ended with error code 1`. But the two errors you named are still there, and I also got these two warnings:
`clearml-serving-triton | Warning: more than one valid Controller Tasks found, using Task ID=4709b0b383a04bb1a033e99fd325dc...`
Hi @<1523701205467926528:profile|AgitatedDove14> , thanks for your hint! I already convert it to TorchScript using tracing. Everything around the model should be fine, since it already worked with the docker clearml-serving setup.
I think the real issue is that I am not able to specify a platform for the model, as the error above tells me that no platform is given no matter how I try to pass it.
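For reference, this is roughly how I produce the traced model (a minimal sketch; the model architecture, input shape, and file name are placeholders, not my actual setup):

```python
import torch

# Placeholder model and example input -- the real model and shape differ
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))
model.eval()
example_input = torch.randn(1, 1, 28, 28)

# Trace the model and save the TorchScript artifact that is served later
traced = torch.jit.trace(model, example_input)
traced.save("test_model_pytorch.pt")
```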
Hi @<1523701827080556544:profile|JuicyFox94> I figured out what the problem is! For some recent experimentation I set an access_key and secret_key as environment variables in my OS. When I deleted them, everything worked fine, so the environment variables overwrote the keys given by the clearml.conf. Is that the desired default behaviour?
And just one tip for everybody having similar problems: switch to using the SDK instead of the CLI for better debugging. This helped me to find the cause of m...
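If the variables in question are the ClearML API credentials (an assumption on my part, based on my own case), a quick way to check whether they are set and might be shadowing clearml.conf:

```python
import os

# Assumption: these ClearML environment variables, when set, take precedence
# over the credentials configured in clearml.conf
for var in ("CLEARML_API_ACCESS_KEY", "CLEARML_API_SECRET_KEY"):
    print(var, "=>", "set" if os.environ.get(var) else "not set")
```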
By the way, the example which worked for me in the beginning now also produces the same error: `poll failed for model directory 'test_model_pytorch': failed to open text file for read /models/test_model_pytorch/config.pbtxt: No such file or directory`. So there really seems to be something wrong with the docker containers.
Hi @<1523701070390366208:profile|CostlyOstrich36> , of course! Here it is (with blurred URLs, paths, and account names).
What do you mean by "How are you creating the model?"? I ran a PyTorch model training and saved a traced version of the model, so it was stored together with the executed task. This was also no problem with the docker container setup.