Thanks AgitatedDove14 ! Just to make sure I’m understanding correctly, do you mean that the ClearML Web server in https://clear.ml/docs/latest/docs/deploying_clearml/clearml_server issues a delete command to the ClearML API server, which is then responsible for trying to delete the files in S3? And that I need to enter an AWS key/secret in the profile page of the web app here?
This seems to be more complicated than it looks (a UI/backend combination). It's not that we are not working on it, just that it might take some time, as it passes control to the backend (which by design does not touch external storage points).
Maybe we should create an S3 cleanup service that lists the buckets and removes files whose Task ID no longer exists. wdyt?
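Something along these lines is roughly what I have in mind (a minimal sketch: the bucket name, prefix, and the assumption that the task ID shows up as a 32-character hex segment in the object key are all placeholders to adjust):
```
# Rough sketch of an S3 cleanup service: delete objects whose owning task no longer exists.
# BUCKET, PREFIX and the key layout are assumptions - adjust them to your setup.
import re
from functools import lru_cache

import boto3
from clearml import Task

BUCKET = "my-clearml-bucket"   # placeholder
PREFIX = ""                    # optionally limit the scan to one project/prefix
TASK_ID_RE = re.compile(r"\.([0-9a-f]{32})/")  # assumes "<task_name>.<task_id>/" appears in the key

@lru_cache(maxsize=None)
def task_exists(task_id):
    """Return True if the ClearML server still knows this task ID."""
    try:
        return Task.get_task(task_id=task_id) is not None
    except Exception:
        return False

s3 = boto3.client("s3")
for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        match = TASK_ID_RE.search(obj["Key"])
        if match and not task_exists(match.group(1)):
            print("removing orphan:", obj["Key"])
            s3.delete_object(Bucket=BUCKET, Key=obj["Key"])
```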
I am referring to the UI. The default cleanup service should work with S3, given a correctly configured ClearML services agent, if I understand the workings correctly.
(It seems like the web server doesn’t log the call to AWS, I just see this:
{SERVER IP} - - [22/Dec/2021:23:58:37 +0000] "POST /api/v2.13/models.delete_many HTTP/1.1" 200 348 "...ID}/models/{MODEL ID}/general?{QUERY STRING PARAMS THAT DETERMINE TABLE APPEARANCE}" {BROWSER INFO} "-")
issues a delete command to the ClearML API server,...
Almost, it issues the boto S3 delete commands (directly to the S3 server, not through the clearml-server)
And that I need to enter an AWS key/secret in the profile page of the web app here?
correct
It seems like the web server doesn’t log the call to AWS, I just see this:
This points to the browser actually sending the AWS delete command. Let me check with FE tomorrow
Thanks Shay, good to know I just hadn't configured something correctly!
Hi UnevenDolphin73 sorry for the slow reply, been on leave!
We don’t have a solution right now, but if there’s no fix to the frontend in the near future we’ll probably try to write a script that queries the ClearML API for all artefacts, queries S3 for all artefacts, and figures out orphaned artefacts to delete.
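Roughly, a sketch of what that script could look like (the bucket name is a placeholder, and it assumes artefact URLs are reachable through the SDK's task.artifacts; models, debug samples etc. would need the same treatment):
```
# Rough sketch: compare the artifact URLs ClearML still references against the keys in the bucket,
# and flag the difference as orphan candidates. BUCKET is a placeholder.
import boto3
from clearml import Task

BUCKET = "my-clearml-bucket"   # placeholder
S3_PREFIX = f"s3://{BUCKET}/"

# 1. Every artifact URL the server still references
known_keys = set()
for task in Task.get_tasks():          # optionally filter with project_name=...
    for artifact in task.artifacts.values():
        url = artifact.url or ""
        if url.startswith(S3_PREFIX):
            known_keys.add(url[len(S3_PREFIX):])

# 2. Every key actually sitting in the bucket
s3 = boto3.client("s3")
all_keys = set()
for page in s3.get_paginator("list_objects_v2").paginate(Bucket=BUCKET):
    all_keys.update(obj["Key"] for obj in page.get("Contents", []))

# 3. Present in S3 but unknown to ClearML -> orphan candidate
for key in sorted(all_keys - known_keys):
    print("orphan candidate:", key)    # review (or delete) these
```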
Would you have any suggestions about where I could look to debug? Maybe the docker logs of the web server?
Let me check, we had the same issue reported today. Let me double-check with the front-end people and get back to you.
The default cleanup service should work with S3, given a correctly configured ClearML services agent, if I understand the workings correctly.
Yes I think you are correct
I am referring to the UI.
In that case, no 😞. This actually requires a backend server change (the UI part would be relatively simple). Is this somehow a showstopper?
No, it is just a pain to track down files that a user has deleted but that are not actually deleted from the fileserver/S3 🙂
But no worries, nothing that is crucial.
I created this issue today, which can alleviate the pain temporarily: https://github.com/allegroai/clearml-server/issues/133
QuaintPelican38 did you have a workaround for this then? Some cleanup service or similar?
Hi DeterminedCrab71 AgitatedDove14, do you have any updates on this? I have accumulated a lot of data (artifacts and datasets) in my bucket and I'll need a way to delete it whenever I remove my tasks.
Ok great! I’ve actually provided a key and secret so I guess it should be working. Would you have any suggestions about where I could look to debug? Maybe the docker logs of the web server?
I get a popup saying that the actual files weren’t deleted from S3 (so presumably only the metadata on the server gets deleted).
Hi QuaintPelican38
The browser client actually issues the delete "command" (the idea is separation of the metadata and the data, e.g. artifacts). That means you have to provide the key/secret to the UI (see the profile page)
Hi QuaintPelican38
We already tried to implement deleting S3 resources in the UI but got blocked by CORS.
We will implement deleting these resources through the server in an upcoming version.
In the meantime you can use the SDK to delete entities together with their artifacts.
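For example, something along these lines should remove a task together with its artifacts (the task ID is a placeholder, and the SDK needs valid S3 credentials configured, e.g. in clearml.conf, to actually remove the files):
```
from clearml import Task

# Placeholder task ID; with S3 credentials configured for the SDK, deleting the task
# should also remove its artifacts and models from storage.
task = Task.get_task(task_id="<TASK ID>")
task.delete(delete_artifacts_and_models=True, raise_on_error=False)
```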
Hey AgitatedDove14 is there any update on this?
Hi ReassuredTiger98
Are you referring to the UI? (As far as I understand there was an improvement, but generally speaking it still requires users to have the S3 credentials in the UI client, not the backend.)
Or are you asking about the cleanup service?
I don't mind running the cleanup service periodically as long as I have an avenue to do so. Actually, if you can build such a service, then wouldn't it make sense to just call it whenever a task ID is deleted?