Here is also another magic stuff
@<1523701070390366208:profile|CostlyOstrich36> Any news on this? We are currently stuck without this fix and can't finish our ClearML setup.
@<1523701601770934272:profile|GiganticMole91> Those are rookie numbers. We are at 228 GB for Elasticsearch now.
Adding the bucket in clearml.conf causes the same error: clearml.storage - ERROR - Failed uploading: Could not connect to the endpoint URL: "None"


Our S3 host doesn't have a port (I didn't specify a port anywhere in clearml.conf and the upload works)


Will it be appended in ClearML?
"s3" is part of domain to the host
We use a Ceph Storage Cluster; its interface is the same as S3
I don't get what I have misconfigured.
The only thing I have not added is the "region" field in clearml.conf, because we literally don't have one; it's a self-hosted cluster.
You can try to replicate the S3 config I posted earlier.
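For reference, this is roughly the shape of the block I mean in clearml.conf; the host, bucket and keys below are placeholders, not our real values:
```
sdk {
    aws {
        s3 {
            # no region for a self-hosted Ceph cluster, so leave it empty
            region: ""
            credentials: [
                {
                    # placeholders - replace with your own endpoint, bucket and keys;
                    # the port can be dropped if your endpoint does not use one
                    host: "s3.my-ceph.example.com:443"
                    bucket: "my-bucket"
                    key: "ACCESS_KEY"
                    secret: "SECRET_KEY"
                    secure: true
                    multipart: false
                }
            ]
        }
    }
}
```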
What you want is a service script that cleans up archived tasks; here is what we used: None
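The linked script didn't make it into this thread, but a minimal sketch of that kind of cleanup service could look like this (the 30-day threshold and the filter values are assumptions on my side, adjust to your setup):
```python
from datetime import datetime, timedelta

from clearml import Task

# assumption: delete archived tasks that have not changed for 30 days
threshold = datetime.utcnow() - timedelta(days=30)

archived = Task.get_tasks(
    task_filter={
        "system_tags": ["archived"],
        "status_changed": ["<{}".format(threshold.strftime("%Y-%m-%dT%H:%M:%S"))],
    }
)

for task in archived:
    print(f"Deleting archived task {task.id} ({task.name})")
    # also remove the artifacts/models stored for the task
    task.delete(delete_artifacts_and_models=True)
```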
I know these keys work; the URL and everything else works because I use these credentials daily.
@<1523701070390366208:profile|CostlyOstrich36> I'm still unable to understand what I'm doing wrong.
We have a self-hosted S3 Ceph storage server.
Setting my config like this breaks Task.init
Sounds similar to our issue? We have self-hosted S3.
None
It is also possible to just make a copy of all the database files and move them to another server
When I look at the LinkEntry object, the link property is correct, no duplicates. It's the relative_path that's duplicated, and also the key name in _dataset_link_entries.
Our datasets are more than 1 TB in size and will keep growing (probably to 4 TB and up), which means we also need 4 TB of local storage just to upload the dataset back in zipped format. This is not a good solution.
What we could do, I guess, is do the downloading locally in chunks of files?
Download 100 files locally, add them to the ClearML dataset, repeat.
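Roughly what I have in mind, as a sketch only; the project/dataset names, the chunk size of 100, and the list/download/cleanup helpers are placeholders for our own code, not ClearML APIs:
```python
from clearml import Dataset

CHUNK_SIZE = 100  # placeholder chunk size

# hypothetical helpers from our own code
source_files = list_remote_files()        # list of remote file paths to ingest
parent = Dataset.get(dataset_project="my_project", dataset_name="raw_images")

dataset = Dataset.create(
    dataset_project="my_project",
    dataset_name="raw_images_chunked",
    parent_datasets=[parent.id],
)

for i in range(0, len(source_files), CHUNK_SIZE):
    chunk = source_files[i:i + CHUNK_SIZE]
    local_dir = download_chunk(chunk)     # hypothetical: fetch only these files locally
    dataset.add_files(path=local_dir)
    dataset.upload()                      # push this chunk to remote storage
    cleanup_local(local_dir)              # hypothetical: delete local copies to reclaim space

dataset.finalize()
```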
Yes, but does add_external_files make chunked zips like add_files does?
7 out of 30 GB is currently used and is quite stable
OK then, I have a solution, but it still produces duplicate names:
- new_dataset._dataset_link_entries = {} # clearing all raw/a.png entries
- Resize a.png and save it in another location as a_resized.png
- Add back the other files I need (excluding raw/a.png) by putting them into new_dataset._dataset_link_entries
- Use add_external_files to include it in the dataset. I'm also using dataset_path=[a list of relative paths] (rough sketch after this list)
What I would expect:
100 Files removed (all a.png)
100 Files added (all a_resized.png)
...
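Put together, the workaround looks roughly like this; it's a sketch only, it leans on the private _dataset_link_entries attribute exactly as described above, and the project/dataset names, URL and paths are placeholders:
```python
from clearml import Dataset

parent = Dataset.get(dataset_project="my_project", dataset_name="images")  # placeholders
new_dataset = Dataset.create(
    dataset_project="my_project",
    dataset_name="images_resized",
    parent_datasets=[parent.id],
)

# workaround for the duplicated relative_path entries: drop all inherited links
# (this touches a private attribute)
new_dataset._dataset_link_entries = {}

# re-register the links we keep (everything except raw/a.png) directly in the private dict
for rel_path, link in parent._dataset_link_entries.items():
    if not rel_path.endswith("a.png"):
        new_dataset._dataset_link_entries[rel_path] = link

# add the resized image that replaces the removed one
new_dataset.add_external_files(
    source_url="s3://my-bucket/resized/a_resized.png",  # placeholder URL
    dataset_path="raw/a_resized.png",
)

new_dataset.upload()
new_dataset.finalize()
```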
@<1523701070390366208:profile|CostlyOstrich36> Hello, I'm still unable to understand how to fix this.
I have also noticed that this incident usually happens in the morning, at around 6-7 AM.
Are there maybe some cleanup tasks or backups running on the ClearML server at those times?
The ClearML team should really write up a tutorial about this. I see this question weekly now. The short answer on what we did when we migrated servers: we wrote a Python script that takes data from the ClearML MongoDB (stores tasks and datasets) and Elasticsearch (stores debug image URLs, logs, scalars) and migrates them to the other ClearML instance's databases.
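A very rough sketch of what that kind of script can look like; the hosts, database/collection names and index pattern below are assumptions, check them against your own server, and stop both servers (or snapshot) before copying:
```python
from elasticsearch import Elasticsearch, helpers
from pymongo import MongoClient

# source and target servers; hosts/ports are placeholders
src_mongo = MongoClient("mongodb://old-server:27017")
dst_mongo = MongoClient("mongodb://new-server:27017")
src_es = Elasticsearch("http://old-server:9200")
dst_es = Elasticsearch("http://new-server:9200")

# copy Mongo collections (collection names are assumptions - list what your server actually has)
for collection in ["task", "project", "model"]:
    docs = list(src_mongo["backend"][collection].find())
    if docs:
        dst_mongo["backend"][collection].insert_many(docs)

# copy Elasticsearch indices (index pattern is an assumption)
for index in src_es.indices.get(index="events-*"):
    actions = (
        {"_index": index, "_source": hit["_source"]}
        for hit in helpers.scan(src_es, index=index)
    )
    helpers.bulk(dst_es, actions)
```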
I can add "source /workspace/.venv/bin/activate" to docker_init_bash_script in clearml.conf.
However, it then tries to access pip, but I don't need pip at all. How do I disable it? I already have my packages, and uv doesn't even require pip.
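What I am experimenting with at the moment, roughly; the skip-install environment variable is an assumption on my side, so verify it against your clearml-agent version:
```
agent {
    docker_init_bash_script: [
        "source /workspace/.venv/bin/activate",
    ]
    # assumption: tell the agent to reuse the existing environment instead of
    # creating a venv and running pip; check the exact variable for your agent version
    extra_docker_arguments: ["-e", "CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1"]
}
```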
@<1523701070390366208:profile|CostlyOstrich36> Hello John, we are still unable to use ClearML with our self-hosted S3 Ceph instances. Is there any update on the hotfix for 1.14?