Yea, that I knew 😄 But somehow I didn't think about the clearml.conf
Thank you! I agree with CostlyOstrich36, that is why I meant a false sense of security 🙂
Thank you SuccessfulKoala55, so actually only the file-server needs to be secured.
Yea, something like this seems to be the best solution.
Thank you very much. I am going to try that.
Okay, great! I just want to run the cleanup service; however, I am running into SSH issues, so I wanted to restart it to try to debug.
Yea, correct! No problem. Uploading artifacts as large as mine seems to be an absolute edge case 🙂
Thank you. Seems like this is not the best solution: https://serverfault.com/questions/132970/can-i-automatically-add-a-new-host-to-known-hosts#comment622492_132973
ca-certificates 2021.1.19 h06a4308_1
certifi 2020.12.5 py38h06a4308_0
cudatoolkit 11.0.221 h6bb024c_0
ld_impl_linux-64 2.33.1 h53a641e_7
libedit 3.1.20191231 h14c3975_1
libffi 3.3 he6710b0_2
libgcc-ng 9.1.0 hdf63c60_0
libstdcxx-ng 9.1.0 hdf63c60_0
ncurses ...
Do you know how I can make sure that I don't have CUDA installed, or that the installation is broken?
Okay, but are your logs still stored on MinIO when only using sdk.development.default_output_uri?
But does this mean the logger will use the default fileserver or not?
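For context, here is a minimal sketch of pointing a task's output destination at a MinIO bucket from code instead of clearml.conf; the endpoint, port, bucket, project name, and task name below are assumptions on my part, not values from this thread:

from clearml import Task

# Minimal sketch: upload artifacts/models to a MinIO bucket instead of the
# default fileserver. Host, port, and bucket are placeholders.
task = Task.init(
    project_name="test_clearml",          # hypothetical project name
    task_name="minio-output-example",     # hypothetical task name
    output_uri="s3://minio.example.com:9000/clearml-artifacts",
)

# Console output and scalars are still reported to the ClearML server;
# output_uri only affects where artifacts and model files are uploaded.
task.get_logger().report_text("logs go to the ClearML server, not to output_uri")

Credentials for the MinIO endpoint would still need to be configured under sdk.aws.s3 in clearml.conf (or via environment variables), otherwise the upload will fail.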
@<1576381444509405184:profile|ManiacalLizard2> Maybe you are using the enterprise version with the vault? I suppose the enterprise version runs differently, but I don't have experience with it.
For the open-source version, each clearml-agent uses its own clearml.conf
Let me try it another time. Maybe something else went wrong.
Wait, nvm. I just tried it again and now it worked.
Okay. It works now. I don't know what went wrong before. Probably a user error 😅
Perfect! That sounds like a good solution for me.
I will create a minimal example.
Good to know!
I think the current solutions are fine. I will try it first and probably will have some more questions/problems 🙂
test_clearml, so directly from the top level.
Type "help", "copyright", "credits" or "license" for more information.
>>> from clearml_agent.helper.gpu.gpustat import get_driver_cuda_version
>>> get_driver_cuda_version()
'110'
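As a follow-up, a small sketch one could use to cross-check the driver-reported CUDA version against what the framework actually sees; assuming PyTorch is the framework installed alongside cudatoolkit 11.0, which is my assumption here:

import torch  # assumption: PyTorch is the framework in this environment

# Runtime CUDA version the framework was built against (None for a CPU-only build),
# to compare with the driver version reported above as '110' (i.e. 11.0).
print("torch built with CUDA:", torch.version.cuda)
print("CUDA available at runtime:", torch.cuda.is_available())

if torch.cuda.is_available():
    # Sanity check that a GPU is actually visible to the runtime.
    print("device:", torch.cuda.get_device_name(0))

If torch.version.cuda is None or torch.cuda.is_available() returns False while the driver reports 11.0, that would point to a CPU-only or broken installation rather than a driver problem.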
@<1523701087100473344:profile|SuccessfulKoala55> Only when I delete on self-hosted.
@<1523712723274174464:profile|LazyFish41> WebApp: 1.10.0-357 • Server: 1.10.0-357 • API: 2.24
This has been happening with every version of clearml-server so far. Most probably there should be a queue in front of ES, so it does not process too many requests at the same time?