I monkey-patched this fix into my code, but it still did not help. As a workaround, the _add_external_files method that I'm patching now simply tries again: if the check fails, it checks again and lists the files again. I think that would also be worth adding to the _add_external_files method itself, so that it retries the call to StorageManager.exists_file, because that appears to be the main point of failure in this case. It is not a failure caused by ClearML per se, but it is where the rest of the process breaks: the file is not added even though it exists, simply because the server refuses the request. If there could be some way to retry that part (a configurable number of times and with a configurable delay, of course) or similar, that would be great. @<1523701435869433856:profile|SmugDolphin23>
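To make the request concrete, here is a rough sketch of the kind of retry I mean. It is not ClearML code: exists_with_retries, retries, delay and sleep are names I made up for illustration, and the existence check is passed in as a plain callable so the snippet runs without ClearML installed. I am also assuming a refused request shows up either as a False result or as an exception, so both are treated as "try again".

```python
import time
from typing import Callable


def exists_with_retries(
    exists_fn: Callable[[str], bool],
    url: str,
    retries: int = 3,
    delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call exists_fn(url) up to `retries` times, waiting `delay` seconds between attempts.

    Hypothetical helper, not part of ClearML. Returns True as soon as one
    attempt succeeds, and False if every attempt fails.
    """
    if retries < 1:
        raise ValueError("retries must be at least 1")

    for attempt in range(1, retries + 1):
        try:
            if exists_fn(url):
                return True
        except Exception:
            # Assumption: a refused request may surface as an exception;
            # treat it the same as a negative answer and retry.
            pass
        # Do not sleep after the final attempt.
        if attempt < retries:
            sleep(delay)
    return False
```

In my patch this would be called as exists_with_retries(StorageManager.exists_file, url, retries=5, delay=2.0), assuming exists_file takes the remote URL as its only argument. The retry count and delay could come from configuration instead of being hard-coded.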