I don't have one; as I said, it is not very reproducible. The same code runs fine one time, and another time (running the exact same experiment) it behaves the same but with the logging issues. As I mentioned, IMO it is not related to the code itself but to connectivity with the ClearML servers. I'm running on GCP machines, and this is not the first time I've had connectivity issues with ClearML on them (we migrated from AWS EC2 a few weeks ago). The first issue was very long task.connect executions (up to several hours to connect some dictionaries, which should take seconds). Maybe the issue is related to requests from GCP being given (low) priority?
Hi DangerousBee35, do you have a stand-alone code snippet that reproduces this behaviour?