Thank you for the quick response! I will give it a try 👍
Hi @JuicyFox94, no, I expose the services using NodePort.
It works as expected. Thank you again!
It turned out that the issue was caused by my network environment: the connection was being throttled for some reason, and that caused the upload failure. Switching to a better network made it work.
However, when I tried to upload several even larger artifacts in a row (around 200MB each), the uploads failed because the livenessProbe and readinessProbe of the fileserver pod failed. By default, the timeout of both probes is 1s. I increased the timeout to 100s and that fixed the issue. @<152370182708055...
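For anyone hitting the same probe failures, here is a rough sketch of the change described above. It builds a Kubernetes strategic-merge patch that only raises `timeoutSeconds` on both probes and leaves the rest of the probe settings alone. The container name `clearml-fileserver` is an assumption for illustration (check the actual name in your deployment), and the helper `build_probe_timeout_patch` is hypothetical, not part of ClearML or the Helm chart.

```python
import json


def build_probe_timeout_patch(container_name: str, timeout_seconds: int) -> dict:
    """Build a strategic-merge patch that sets only the probe timeouts.

    The container is matched by name, so the other probe fields
    (path, port, period, ...) are left as deployed.
    """
    probe = {"timeoutSeconds": timeout_seconds}
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": container_name,
                            "livenessProbe": dict(probe),
                            "readinessProbe": dict(probe),
                        }
                    ]
                }
            }
        }
    }


# Assumption: "clearml-fileserver" is a placeholder container name.
patch = build_probe_timeout_patch("clearml-fileserver", 100)
patch_json = json.dumps(patch, indent=2)
print(patch_json)
```

The printed JSON could be saved to a file and applied with `kubectl patch deployment <deployment-name> --patch-file patch.json`. A later `helm upgrade` would likely overwrite a manual patch like this, so if the chart exposes the probe settings in its values, setting them there is probably the more durable option.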
Is it possible that there is a bug in the fileserver that prevents uploading a large file (say, around 25MB)? Btw, if I switch the default output URI in the SDK to upload to Azure Blob Storage instead of the fileserver, the upload works fine.
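In case it helps others reproduce the workaround mentioned above, this is roughly what pointing a task at Azure Blob Storage instead of the fileserver can look like. It is only a sketch: it sets `output_uri` per task in `Task.init` rather than changing the SDK-wide default, the account, container, project, and artifact names are placeholders, the helper functions are hypothetical, and Azure credentials are assumed to be already configured for the SDK.

```python
def azure_output_uri(account: str, container: str, prefix: str = "") -> str:
    """Build an azure:// destination URI for a storage account and container."""
    base = f"azure://{account}.blob.core.windows.net/{container}"
    prefix = prefix.strip("/")
    return f"{base}/{prefix}" if prefix else base


def upload_artifact_to_azure(path: str, account: str, container: str) -> None:
    """Upload one local file as a task artifact to Azure instead of the fileserver."""
    # Imported lazily so the URI helper above works without clearml installed.
    from clearml import Task

    task = Task.init(
        project_name="examples",           # placeholder project name
        task_name="large artifact upload",  # placeholder task name
        output_uri=azure_output_uri(account, container),
    )
    task.upload_artifact(name="large-file", artifact_object=path)
    task.close()


# Example call (placeholder values, requires clearml and Azure credentials):
# upload_artifact_to_azure("model.bin", "myaccount", "artifacts")
print(azure_output_uri("myaccount", "artifacts"))
```

Running the same upload once with and once without `output_uri` is a quick way to tell whether a failure is specific to the fileserver path.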
Hi CostlyOstrich36, I deployed the ClearML server in a k8s cluster using version 5.5.0 of the Helm chart: https://github.com/allegroai/clearml-helm-charts/tree/clearml-5.5.0/charts/clearml , which deployed server v1.9.2, I think.
For the SDK, I am using v1.9.1.