I'm basically trying to force the agent to use the uv-defined Python.
I also think that if my package manager is set to uv, then it should only use uv and ignore pip entirely.
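For reference, a minimal clearml.conf fragment for that, assuming a clearml-agent version that supports uv as a package manager type:

agent {
    package_manager {
        # assumption: needs a clearml-agent release with uv support
        type: uv
    }
}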
@<1523701070390366208:profile|CostlyOstrich36> 👀
I have tried:
Airflow - Painful to set up, dated UI, and other problems.
Prefect - I literally just tried to set up a simple distributed system; it took me a week. I do not recommend this tool at all: horrible documentation, and no one helps on Slack.
Dagster - An absolute beauty: nice UI, easy to set up (as a pip package or just Docker + Postgres); I highly recommend this tool. It takes a bit to get used to. In the coming week I will try this combo of Dagster + ClearML, where I periodically check some things and if...
@<1709740168430227456:profile|HomelyBluewhale47> We have the same problem: millions of files stored on CEPH. I would not recommend doing it this way; everything gets insanely slow (dataset.list_files, downloading the dataset, removing files).
The way I use ClearML Datasets for a large number of samples now is to save a JSON that stores all the paths to samples in the Dataset metadata:
clearml_dataset.set_metadata(metadata, metadata_name=metadata_key)
However this then means that you need wrappe...
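A minimal sketch of that pattern (project, dataset name, and paths here are hypothetical), assuming Dataset.set_metadata/get_metadata accept a plain dict:

import clearml

# Hypothetical path index: sample name -> location on CEPH
metadata = {"sample_00001": "/ceph/datasets/project/sample_00001.png"}

ds = clearml.Dataset.create(dataset_project="project", dataset_name="paths-index")
ds.set_metadata(metadata, metadata_name="sample_paths")
ds.upload()
ds.finalize()

# Consumer side: read the path index back instead of listing files
ds = clearml.Dataset.get(dataset_project="project", dataset_name="paths-index")
paths = ds.get_metadata(metadata_name="sample_paths")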
I also don't have the side panel for some reason.
@<1523701070390366208:profile|CostlyOstrich36> Hello John, we are still unable to use ClearML with our self-hosted S3 CEPH instances. Is there any update on the hotfix for 1.14?
py file:
task: clearml.Task = clearml.Task.init(
project_name="project",
task_name="task",
output_uri="None",
)
clearml.conf:
{
# This will apply to all buckets in this host (unless key/value is specifically provided for a given bucket)
host: " our-host.com "
key: "xxx"
secret: "xxx"
multipart: false
...
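For context, that fragment sits inside the aws.s3.credentials list in clearml.conf; a sketch of the full nesting (values are placeholders):

sdk {
    aws {
        s3 {
            credentials: [
                {
                    host: "our-host.com:443"  # endpoint host, port included if needed
                    key: "xxx"
                    secret: "xxx"
                    multipart: false
                    secure: true
                }
            ]
        }
    }
}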
When I look at the LinkEntry object, the link property is correct, no duplicates. It's relative_path that's duplicated, and also the key name in _dataset_link_entries.
Yes, but does add_external_files make chunked zips the way add_files does?
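For comparison, a sketch of the two calls (paths and bucket are hypothetical); as far as I understand, add_external_files only records link entries, it does not zip or upload anything:

import clearml

ds = clearml.Dataset.create(dataset_project="project", dataset_name="mixed")
# Local files get uploaded and packed into (chunked) zip artifacts
ds.add_files(path="/data/local_samples")
# External files are stored as links only; no zipping, no re-upload
ds.add_external_files(source_url="s3://my-bucket/prefix/")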
Yes, the credentials seem to work.
I'm trying to figure out now why I don't see the uploaded files/folders:
- I checked whether the ClearML task uses the fileserver instead, but I don't see any files in the fileserver folder
- Nothing is uploaded to the bucket (I will ask the IT guy to check the logs to see whether I'm uploading any files)

Is it possible to split the large Elasticsearch indexes? I know Elasticsearch has something called rollover, but I'm not sure ClearML supports this.
It looks like I'm moving forward.
Setting the url in clearml.conf without "s3" as suggested works (but I don't add a port there; not sure if that breaks something, as we don't have a port):
host: " our-host.com "
Then in test_task.py
task: clearml.Task = clearml.Task.init(
project_name="project",
task_name="task",
output_uri="None",
)
I think the connection is created.
What I'm getting now is a bucket error; I suppose I have to specify it so...
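A sketch of what I'm trying next, assuming the s3://<host>[:<port>]/<bucket> form that ClearML uses for non-AWS endpoints (bucket name hypothetical):

import clearml

task: clearml.Task = clearml.Task.init(
    project_name="project",
    task_name="task",
    # host first, then the bucket; port only if the endpoint uses one
    output_uri="s3://our-host.com/my-bucket",
)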
@<1523701070390366208:profile|CostlyOstrich36> Still unable to understand what I'm doing wrong.
We have a self-hosted S3 CEPH storage server.
Setting my config like this breaks Task.init.
Hi, OK, I'm really close to a working system now.
Debug images are uploading to S3 and I can see the files; all OK there.
The problem now is viewing these images in the web UI.
Going to the Debug Samples panel in a Task pops up a dialog asking me to fill in S3 credentials.
I can't figure out the right setup for the creds to work.
This is what I have now (note that we don't have a region).
@<1523701070390366208:profile|CostlyOstrich36> Hello, I'm still unable to understand how to fix this.
@<1523701070390366208:profile|CostlyOstrich36> Updated the webserver and the problem still persists.
This is the new stack:
WebApp: 1.15.1-478 • Server: 1.14.1-451 • API: 2.28
Note that we didn't update the API (we had running experiments).
I have also noticed that this incident usually happens in the morning, around 6-7 AM.
Are there maybe some cleanup tasks or backups running on the ClearML server at those times?
Not really, but I think I will figure out the uv caching.
I have another question @<1523701070390366208:profile|CostlyOstrich36>
How can I make the ClearML agent just run the image with only uv,
and not install any packages, nothing?
I found docker_init_bash_script in clearml.conf.
I know there are some envs to pass in Task.init, but they do not fully do what I want - just simply run the image; I have all the dependencies.
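A sketch of what I mean, assuming a recent clearml SDK and that CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL makes the agent use the image's interpreter as-is (image name hypothetical):

import clearml

task = clearml.Task.init(project_name="project", task_name="task")
# Hypothetical image that already has every dependency baked in;
# the env var should tell the agent to skip venv creation and package installs
task.set_base_docker(
    docker_image="registry.local/ml-image:latest",
    docker_arguments="-e CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1",
)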
We are cleaning, but there is a major problem:
when deleting a task from the web UI, nothing is deleted elsewhere.
Debug images are not deleted, models are not deleted, and I suspect scalars and logs are not deleted either.
I'm not sure why that is.
Is it supposed to be that way? How do I fix it?
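For programmatic cleanup, a sketch using Task.delete, which (assuming this matches the SDK version in use) can also remove the task's artifacts and models (task id hypothetical):

import clearml

task = clearml.Task.get_task(task_id="abc123")
# Also remove artifacts/models stored for this task, not just the task entry
task.delete(
    delete_artifacts_and_models=True,
    skip_models_used_by_other_tasks=True,
    raise_on_error=False,
)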
Elasticsearch also takes around 15GB of RAM.
Here are my ClearML versions, with Elasticsearch taking up 50GB.
I'm getting errors in Elasticsearch when deleting tasks; it returns "can't delete experiment".