The package is just a subdirectory, by the way. So it should not be in the installed packages anyway, right?
In the logs from running with --foreground, I do not see any conda create command.
Okay, but are your logs still stored on MinIO when only sdk.development.default_output_uri is set?
Is this really working for you guys? I have no clue what's wrong. Seems so unlikely that my code works with artifacts, datasets, but not logging...
Seems more like a bug or something is not properly configured on my side.
conda env update -p .clearml/venvs-builds/3.8 ./environment.yml
with environment.yml
name: clearml
channels:
- pytorch
- anaconda
- conda-forge
- defaults
dependencies:
- pytorch==1.8.0
But it is not related to network speed, rather to clearml. A simple file transfer test gives me a transfer rate of approximately 1 Gbit/s between the server and the agent, which is to be expected from the 1 Gbit/s network.
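A transfer test of this kind could be sketched roughly as below. This is only an illustration, not necessarily the test that was run: the function name measure_copy_throughput and the file names are hypothetical, and it assumes the destination path lives on a mount backed by the other machine.

```python
import os
import shutil
import tempfile
import time


def measure_copy_throughput(src_path, dst_path):
    """Copy src_path to dst_path and return the observed rate in Gbit/s."""
    size_bytes = os.path.getsize(src_path)
    start = time.perf_counter()
    shutil.copyfile(src_path, dst_path)
    elapsed = time.perf_counter() - start
    # 8 bits per byte, 1e9 bits per Gbit; guard against a zero timer reading
    return (size_bytes * 8) / 1e9 / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Hypothetical demo: copies an 8 MiB file inside a local temp directory.
    # For a real test, dst would point at a network mount on the other host.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "payload.bin")
        dst = os.path.join(tmp, "payload_copy.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(8 * 1024 * 1024))
        print(f"{measure_copy_throughput(src, dst):.2f} Gbit/s")
```

Note that shutil.copyfile does not flush to disk, so small files may mostly measure the page cache; a payload of several GB gives a more honest number.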
Agent runs in docker mode. I ran the agent on the same machine as the server this time.
Yea, and the script ends with clearml.Task - INFO - Waiting to finish uploads
Yea, it was finished after 20 hours. Since the artifact only started uploading once the experiment had otherwise finished, there is no reporting for the time during which it uploaded. I will debug it and report what I find out.
Nothing changes, still bad owner or permissions.
I guess the supported storage media (e.g. S3, ceph, etc.) don't have this issue, right?
ca-certificates 2021.1.19 h06a4308_1
certifi 2020.12.5 py38h06a4308_0
cudatoolkit 11.0.221 h6bb024c_0
ld_impl_linux-64 2.33.1 h53a641e_7
libedit 3.1.20191231 h14c3975_1
libffi 3.3 he6710b0_2
libgcc-ng 9.1.0 hdf63c60_0
libstdcxx-ng 9.1.0 hdf63c60_0
ncurses ...
Would it help you diagnose this problem if I ran conda env create --file=environment.yml to see whether it works?
Seems possible because I didn't know I had to specify an entrypoint somewhere. I will do some additional tests.
But does this mean the logger will use the default fileserver or not?
Thank you for the quick reply. Does anyone know whether there is an option to let docker delete images after container exit?
Ah, very cool! Then I will try this, too.
An upload of 11 GB took around 20 hours, which cannot be right. Do you have any idea whether ClearML could have something to do with this slow upload speed? If not, I am going to start debugging the hardware/network.
It could be that the clearml-server misbehaves either while cleanup is ongoing or even afterwards.
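To put numbers on how far off this is, here is a quick back-of-the-envelope calculation. It assumes decimal gigabytes (1 GB = 1e9 bytes) and ignores protocol overhead; the helper names are made up for this sketch.

```python
def effective_rate_mbit_s(size_gb, hours):
    """Average rate in Mbit/s for size_gb (decimal GB) transferred in `hours`."""
    return size_gb * 1e9 * 8 / (hours * 3600) / 1e6


def expected_seconds(size_gb, link_gbit_s):
    """Ideal transfer time in seconds at the full link rate, ignoring overhead."""
    return size_gb * 8 / link_gbit_s


observed = effective_rate_mbit_s(11, 20)   # the 11 GB / 20 h upload
ideal = expected_seconds(11, 1.0)          # same payload on a 1 Gbit/s link

print(f"observed: {observed:.2f} Mbit/s")  # observed: 1.22 Mbit/s
print(f"ideal at 1 Gbit/s: {ideal:.0f} s")  # ideal at 1 Gbit/s: 88 s
```

So the upload averaged a bit over 1 Mbit/s, roughly 800 times slower than the 1 Gbit/s link would allow in the ideal case.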
Yes, I did not change this part of the config.