@<1523701435869433856:profile|SmugDolphin23> Setting it without http is not possible, as it auto-fills them back in
But there are still some weird issues; I cannot see the uploaded files in the bucket
Elasticsearch also takes something like 15 GB of RAM
Will it be appended in ClearML?
"s3" is part of domain to the host
Sounds similar to our issue? We have a self-hosted S3
Here is also another bit of magic:
From docker inspect I can see that allegroai/clearml uses:
"CLEARML_SERVER_VERSION=1.11.0",
"CLEARML_SERVER_BUILD=373"
Image hash: ed05631045c4237f59ad48f477e06dd72274ab67e70d2f9adc489431d1ce75d7
I do notice another strange thing
Agent-services is down because it has no API key for ClearML
Hi, thanks for reaching out. Getting desperate here.
Yes, it's self-hosted
No, only currently running experiments are deleted (the task itself is gone, but the debug images and models are still present in the fileserver folder)
What I do see are some random Elasticsearch errors popping up from time to time:
[2024-01-05 09:16:47,707] [9] [WARNING] [elasticsearch] POST
None ` [status:N/A requ...
OK then, I have a solution, but it still produces duplicate names:
- `new_dataset._dataset_link_entries = {}`  # cleaning out all the raw/a.png files
- Resize a.png and save it in another location, named a_resized.png
- Add back the other files I need (excluding raw/a.png); I add them to `new_dataset._dataset_link_entries`
- Use add_external_files to include them in the dataset. I'm also using dataset_path=[a list of relative paths] (rough sketch after this list)
What I would expect:
100 Files removed (all a.png)
100 Files added (all a_resized.png)
...
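Roughly this, as a sketch (the project name, bucket URLs, and file lists are placeholders, and `_dataset_link_entries` is a private attribute, so it leans on internals):

```python
from clearml import Dataset

# placeholders: parent dataset id, project/name, and bucket URLs are examples only
parent = Dataset.get(dataset_id="<parent_dataset_id>")
new_dataset = Dataset.create(
    dataset_project="my_project",
    dataset_name="resized",
    parent_datasets=[parent.id],
)

# 1) drop all inherited external link entries (the raw/a.png files)
new_dataset._dataset_link_entries = {}  # private attribute, may change between versions

# 2) the resized copies were already written elsewhere as a_resized.png

# 3) re-register the links to keep plus the resized copies,
#    with matching relative paths passed via dataset_path
keep_links = ["s3://my-bucket/raw/b.png"]
resized_links = ["s3://my-bucket/resized/a_resized.png"]
new_dataset.add_external_files(
    source_url=keep_links + resized_links,
    dataset_path=["raw/b.png", "resized/a_resized.png"],
)

new_dataset.upload()
new_dataset.finalize()
```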
Our datasets are more than 1 TB in size and will keep growing (probably to 4 TB and up), which means we would also need 4 TB of local storage just to upload the dataset back in zipped format. This is not a good solution.
What we could do, I guess, is handle the downloading locally in chunks of files?
Download 100 files locally, add them to the ClearML dataset, repeat.
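Something like this rough sketch is what I have in mind (chunk size, URL list, and names are placeholders; I'm assuming upload() can be called per batch before finalize()):

```python
from clearml import Dataset, StorageManager

CHUNK_SIZE = 100
remote_files = ["s3://my-bucket/raw/0001.png"]  # placeholder list of source URLs

dataset = Dataset.create(dataset_project="my_project", dataset_name="chunked_upload")

for start in range(0, len(remote_files), CHUNK_SIZE):
    chunk = remote_files[start:start + CHUNK_SIZE]
    # pull down only this chunk locally
    local_copies = [StorageManager.get_local_copy(remote_url=url) for url in chunk]
    for local_path in local_copies:
        dataset.add_files(path=local_path)
    # push this batch so the local copies can be deleted before the next chunk
    dataset.upload()

dataset.finalize()
```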
Is there any way to see if I even have the data in MongoDB?
@<1523701070390366208:profile|CostlyOstrich36> Any news on this? We are currently stuck without this fix and can't finish up the ClearML setup.
WebApp: 1.14.1-451 • Server: 1.14.1-451 • API: 2.28
When I look at the LinkEntry objects, the link property is correct, with no duplicates. It's the relative_path that's duplicated, and also the key name in _dataset_link_entries.
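A quick way I'm inspecting it (new_dataset as in the earlier sketch):

```python
# dump each key plus its entry's link and relative_path to see where the duplication shows up
for key, entry in new_dataset._dataset_link_entries.items():
    print(key, "->", entry.link, "|", entry.relative_path)
```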
We fixed the issue, thanks; we had to update everything to the latest version.
In which UI? Because there are two ways to do it. When clicking on the artifact URL there is a popup (but it has no way to change the host URL).
Our S3 host doesn't have a port (we didn't specify a port anywhere in clearml.conf, and uploads work)
![image](https://clearml-web-assets.s3.amazonaws.com/scoold/images/TT9A...
@<1523701070390366208:profile|CostlyOstrich36> We updated the webserver and the problem still persists
This is the new stack:
WebApp: 1.15.1-478 • Server: 1.14.1-451 • API: 2.28
Note that we didn't update the API (we had running experiments)
Is it even known whether the bug is fixed in that version?
I already found the source code and modified it as needed.
How can I now get this info from the Task that is created when the Dataset is created?
Couldn't find anything like `clearml.Dataset(id=id).get_size()`
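For now I'm thinking of just summing the entry sizes the dataset already tracks, roughly like this (the id is a placeholder; file_entries_dict should be a public property, and _dataset_link_entries is the same private dict as earlier):

```python
from clearml import Dataset, Task

ds = Dataset.get(dataset_id="<dataset_id>")  # placeholder id

# sum the per-file sizes tracked for regular and external (link) entries
total_bytes = sum(entry.size or 0 for entry in ds.file_entries_dict.values())
total_bytes += sum(entry.size or 0 for entry in ds._dataset_link_entries.values())
print(f"approx. size: {total_bytes / 1024 ** 3:.2f} GiB")

# in recent versions the dataset id should match its backing task id
task = Task.get_task(task_id=ds.id)
```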