I purged all Docker images and it still doesn't seem right.
I see no side panel and it doesn't ask for a login name.
This is what I see on a fresh ClearML install, where all my mounts are on /mnt/data/clearml-server instead of /opt/clearml.
Our datasets are more than 1TB in size and will keep growing (probably to 4TB and up). This means we also need 4TB of local storage just to upload the dataset back in zipped format. That is not a good solution.
What we could do, I guess, is handle the download locally in chunks of files: download 100 files locally, add them to the ClearML dataset, and repeat (see the sketch below).
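A minimal sketch of that chunked loop, assuming `list_remote_files()` and `download_file()` are hypothetical helpers for your storage backend (they are not part of the ClearML SDK):

```python
import os

from clearml import Dataset

# Hypothetical helpers for our storage backend: list_remote_files()
# returns all remote object keys, download_file(key) fetches one object
# and returns its local path.
from our_storage import list_remote_files, download_file

CHUNK_SIZE = 100

dataset = Dataset.create(
    dataset_name="big_dataset",      # placeholder name
    dataset_project="our_project",   # placeholder project
)

remote_files = list_remote_files()
for i in range(0, len(remote_files), CHUNK_SIZE):
    chunk = remote_files[i:i + CHUNK_SIZE]
    local_paths = [download_file(key) for key in chunk]
    for path in local_paths:
        dataset.add_files(path)
    dataset.upload()                 # zip and push this chunk now
    for path in local_paths:
        os.remove(path)              # keep local disk usage at ~100 files

dataset.finalize()
```

This keeps local disk usage bounded by the chunk size instead of the full dataset size.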
I know these keys work; the URL and everything else work too, because I use these creds daily.
@<1523701087100473344:profile|SuccessfulKoala55> Anything on this?
I need the zipping and chunking to manage millions of files.
Elasticsearch also takes around 15GB of RAM.
Yes, but does add_external_files create chunked zips the way add_files does?
I see the debug images in the fileserver folder.
We fixed the issue, thanks; we had to update everything to the latest version.
OK then, I have a solution, but it still produces duplicate names:
- new_dataset._dataset_link_entries = {} # clearing out all the raw/a.png entries
- Resize a.png and save it in another location as a_resized.png
- Add back the other files I need (excluding raw/a.png) by putting them into new_dataset._dataset_link_entries
- Use add_external_files to include them in the dataset. I'm also using dataset_path=[a list of relative paths]. (Rough sketch after the expected output below.)
What I would expect:
100 Files removed (all a.png)
100 Files added (all a_resized.png)
...
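Putting the steps above together, a rough sketch. Dataset and project names are placeholders, and note that `_dataset_link_entries` is a private SDK attribute, so this may break between ClearML versions:

```python
from clearml import Dataset

parent = Dataset.get(
    dataset_project="our_project",      # placeholder
    dataset_name="raw_images",          # placeholder
)
new_dataset = Dataset.create(
    dataset_name="resized_images",      # placeholder
    dataset_project="our_project",
    parent_datasets=[parent.id],
)

# Step 1: drop every inherited raw/*.png link so it is not carried over
new_dataset._dataset_link_entries = {
    path: entry
    for path, entry in new_dataset._dataset_link_entries.items()
    if not path.startswith("raw/")
}

# Steps 2-3 happen outside ClearML: resize each a.png, save it as
# a_resized.png, and push the results to the remote store.

# Step 4: register the resized copies as external files
new_dataset.add_external_files(
    source_url="s3://bucket/resized/",  # hypothetical remote location
    dataset_path="resized/",            # relative path inside the dataset
)
new_dataset.finalize()
```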
Hi, thanks for reaching out. Getting desperate here.
Yes, it's self-hosted.
No, only currently running experiments are deleted (the task itself is gone, but the debug images and models are still present in the fileserver folder).
What I do see is some random Elasticsearch errors popping up from time to time:

```
[2024-01-05 09:16:47,707] [9] [WARNING] [elasticsearch] POST
None ` [status:N/A requ...
```
Can I do it while I have multiple ongoing training runs?
WebApp: 1.14.1-451 • Server: 1.14.1-451 • API: 2.28
we are cleaning, but there is a major problem
When deleting a task from the web UI, nothing is deleted elsewhere.
Debug images are not deleted, models are not deleted, and I suspect that scalars and logs are not deleted either.
I'm not sure why that is.
- Is 50GB of Elasticsearch data normal? Have you seen it elsewhere, or are we doing something wrong? One thing I suspect is that we are logging too frequently.
- Is it possible to somehow clean this up?
I have also noticed that this incident usually happens in the morning, around 6-7AM.
Are there maybe some cleanup tasks or backups running on the ClearML server at those times?
What do you mean by reusing the task for a ClearML Dataset? Got a code example?
We have multiple different projects with multiple people working on each project.
This is our most-used code for dataset uploading:
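Roughly, it follows the standard Dataset pattern; the names below are placeholders:

```python
from clearml import Dataset

dataset = Dataset.create(
    dataset_name="my_dataset",       # placeholder
    dataset_project="our_project",   # placeholder
)
dataset.add_files(path="./data")     # everything under the local data dir
dataset.upload()                     # zips the files in chunks and pushes them
dataset.finalize()
```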
We had a similar problem. ClearML doesn't support data migration (not that I know of),
so you have two ways to fix this:
- Recreate the dataset once it's already in Azure
- Edit each Elasticsearch database entry to point to the new destination (we did this; rough sketch below)
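A hedged sketch of the second option. The index pattern and the `url` field are assumptions based on how our server stored debug-image events; take a full backup of the Elasticsearch data folder before running anything like this:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Rewrite the stored URL prefix in place for every matching event.
# Index pattern and field name are assumptions; verify against your
# own indices first (GET _cat/indices).
es.update_by_query(
    index="events-training_debug_image-*",   # assumed index pattern
    body={
        "query": {"prefix": {"url": "s3://old-host:9000/bucket/"}},
        "script": {
            "source": "ctx._source.url = ctx._source.url.replace(params.old, params.new)",
            "params": {
                "old": "s3://old-host:9000/bucket/",   # placeholder
                "new": "azure://container/path/",      # placeholder
            },
        },
    },
)
```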
I get the same when I copy /opt/clearml/data folder into /mnt/data/clearml/data
Here are my ClearML versions, with Elasticsearch taking up 50GB.
In which UI? There are two ways to do it. When clicking on the artifact URL there is a popup (but it has no way to change the host URL).
Our S3 host doesn't have a port (we didn't specify a port anywhere in clearml.conf, and upload works).
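For reference, a minimal sketch of the relevant clearml.conf section; host, key, and secret values are placeholders:

```
sdk {
    aws {
        s3 {
            credentials: [
                {
                    # host has no :port suffix and uploads still work
                    host: "s3.our-storage.example.com"
                    key: "ACCESS_KEY"
                    secret: "SECRET_KEY"
                    secure: true
                }
            ]
        }
    }
}
```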