So I am deploying clearml-server on an on-prem server, and the checkpoints etc. are quite large for the experiments I will do.
Instead I want to periodically upload / back up this data to s3, and free up local disk space. Is that something that is supported?
I see that in my docker-compose installation, most of the big files are in /opt/clearml/data
Hi @PerplexedRaccoon19, I'm not sure I understand what you mean. Can you elaborate on the use case?
That makes sense, but that would mean that each client/user has to manage the upload themselves, right?
(I'm trying to use clearml to create an abstraction over the compute / cloud)
I think you can periodically upload them to s3, and the StorageManager should help with that. Keep in mind that artifacts are registered in the system as links (each artifact is a link in the end). So even if you move the files to an s3 bucket in the backend, the registered link will still point to the file-server, and you would have to amend that somehow.
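A rough sketch of what the periodic upload could look like, assuming the `clearml` package is installed and S3 credentials are already configured in `clearml.conf`. The bucket name, the `clearml-backup` prefix and the helper names are hypothetical, and this only copies files; it does not rewrite the links already registered for existing artifacts.

```python
from pathlib import Path


def remote_url_for(local_dir: str, bucket: str, prefix: str = "clearml-backup") -> str:
    """Build the S3 destination URL for a local folder.

    The bucket and prefix are placeholders, not anything ClearML defines.
    """
    folder_name = Path(local_dir).name
    return f"s3://{bucket}/{prefix}/{folder_name}"


def backup_folder(local_dir: str, bucket: str) -> str:
    """Upload a local folder to S3 with ClearML's StorageManager."""
    # Imported lazily so remote_url_for() works without clearml installed.
    from clearml import StorageManager

    remote = remote_url_for(local_dir, bucket)
    StorageManager.upload_folder(local_folder=local_dir, remote_url=remote)
    return remote


# Example call (e.g. from a cron job); the bucket name is hypothetical:
# backup_folder("/opt/clearml/data", "my-backup-bucket")
```

Freeing the local disk afterwards would still break the file-server links mentioned above, so treat this as a backup step rather than a full migration.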
Why not upload specific checkpoints directly to s3 if they're extra heavy?
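For reference, a minimal sketch of sending checkpoints straight to s3 from the experiment code, assuming the `clearml` package is installed and the client has S3 credentials. The bucket, project and task names are made up for illustration.

```python
def s3_output_uri(bucket: str, project: str) -> str:
    """Build the destination URI for a project's checkpoints and artifacts.

    Both arguments are placeholders chosen for this example.
    """
    return f"s3://{bucket}/{project}"


def start_task(project: str, name: str, bucket: str):
    """Start a ClearML task whose models and artifacts are uploaded to S3."""
    # Imported lazily so s3_output_uri() works without clearml installed.
    from clearml import Task

    return Task.init(
        project_name=project,
        task_name=name,
        output_uri=s3_output_uri(bucket, project),
    )


# Example call; all names are hypothetical:
# task = start_task("demo-project", "train-large-model", "my-checkpoint-bucket")
```

If you don't want each user to pass `output_uri` themselves, it should also be possible to set `sdk.development.default_output_uri` in each client's `clearml.conf`, so the default destination is the bucket instead of the file-server.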
I'm thinking of using s3fs on the entire /opt/clearml/data folder. What do you think?
Either that or have a shared mount between the machines