But it is not related to network speed, rather to clearml. A simple file transfer test gives me a transfer rate of approximately 1 Gbit/s between the server and the agent, which is to be expected from the 1 Gbit/s network.
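For reference, the arithmetic behind that comparison can be sketched as below. This is only a back-of-the-envelope helper, not anything from clearml: the function names and the 10 GB artifact size are made up for illustration, and protocol overhead is ignored.

```python
def transfer_seconds(size_bytes: float, link_gbit_per_s: float = 1.0) -> float:
    """Ideal time to move size_bytes over the link, ignoring all overhead."""
    bytes_per_second = link_gbit_per_s * 1e9 / 8  # 1 Gbit/s is 125 MB/s
    return size_bytes / bytes_per_second


def effective_gbit_per_s(size_bytes: float, elapsed_seconds: float) -> float:
    """Throughput actually achieved by a transfer that took elapsed_seconds."""
    return size_bytes * 8 / elapsed_seconds / 1e9


if __name__ == "__main__":
    artifact_size = 10e9  # hypothetical 10 GB artifact
    print(f"ideal time on 1 Gbit/s: {transfer_seconds(artifact_size):.0f} s")
    # Plug in a measured upload time to see how far below line rate it is.
    print(f"rate if it took 400 s: {effective_gbit_per_s(artifact_size, 400):.2f} Gbit/s")
```

If the measured artifact upload comes out far below what the plain file transfer test reaches, the network itself is probably not the limiting factor.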
Yea, correct! No problem. Uploading such large artifacts as I am doing seems to be an absolute edge case 🙂
I guess this is from clearml-server and seems to be bottlenecking artifact transfer speed.
I just realized that I forgot again that I am using importlib, and this is probably why everything's weird. I tried to reproduce the error with a smaller project and was not able to get the error again. Sorry for having wasted your time! 😐
Thanks for your help again. I will just use detect_with_conda_freeze: true. Seems like a perfect solution for me!
I was wondering whether a solution is built into clearml, so I do not have to configure each server manually. However, from your answer I take it that this is not the case.
I just wanna add: I can run this task on the same workstation with the same conda installation just fine.
btw: I also tested the clearml-agent running on a different machine and with python 3.8 and I get the same problems.
https://clearml.slack.com/archives/CTK20V944/p1620855259093200 This thread may also be interesting for you.
So only a short update for today: I did not yet start a run with conda 4.7.12.
But one question: Actually conda cannot be at fault here, right? I can install pytorch just fine locally on the agent when I do not use clearml(-agent).
Or there should be an early error for trying to run conda based tasks on pip agents
Can you ping me when it is updated in None so I can update my installation?
Would it help you diagnose this problem if I ran conda env create --file=environment.yml to see whether it works?
In the logs, when run with --foreground, I do not see any conda create command.
Yes, that works fine. Just the http vs https was the problem. The UI will automatically change s3://<minio-address>:<port> to http://<minio-address>:<port> in http://myclearmlserver.org/settings/webapp-configuration . However, what is needed for me is https://<minio-address>:<port>
To answer my own question: In the WebUI where one inputs the credentials, use https for the host instead of the auto-added http
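To make the s3 to https rewrite concrete, here is a minimal sketch of the mapping. It is not part of clearml or its WebUI; minio_https_host is a hypothetical helper and the example address is made up. It only shows which host string to type in when the MinIO endpoint is served over TLS.

```python
from urllib.parse import urlsplit


def minio_https_host(s3_url: str) -> str:
    """Turn s3://<minio-address>:<port>/... into https://<minio-address>:<port>.

    Hypothetical helper for illustration only, not a clearml API.
    """
    parts = urlsplit(s3_url)
    if parts.scheme != "s3" or not parts.netloc:
        raise ValueError(f"expected s3://<host>:<port>/..., got {s3_url!r}")
    # Keep host and port, drop the bucket/path, force the https scheme.
    return f"https://{parts.netloc}"


if __name__ == "__main__":
    print(minio_https_host("s3://minio.example.com:9000/my-bucket/artifacts"))
```

The resulting string is what goes into the host field in place of the auto-added http one.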
This is my environment installed from the env file. Training works just fine here:
But I do not have anything linked correctly since I rely on conda installing cuda/cudnn for me
Thank you! I agree with CostlyOstrich36; that is why I said false sense of security 🙂
Thank you SuccessfulKoala55, so actually only the file-server needs to be secured.
You can add and remove clearml-agents to/from the clearml-server anytime.