Reputation
Badges 1
123 × Eureka!In which ui? Because there are two ways to do it. When clicking on artifacti url there is a popup (but has no way to change host url)
Our s3 host doesnt have port (didnt specify port in clearml.conf anywhere and upload works)


host: " our-host.com "
Then in test_task.py
task: clearml.Task = clearml.Task.init(
project_name="project",
task_name="task",
output_uri=" None ",
)
I think connection is created
What im getting now is bucket error, i suppose I have to specify it so...
how to get rid of this auto appended line
I see in clearml-agent that it is created here
Where can i override this so that it uses uv instead of trying to install python with apt
Im basically trying to force the agent to use uv defined python
i also think that if my package manager is set to uv, then it should only use uv and ignore pip at all
@<1523701070390366208:profile|CostlyOstrich36> 👀
i can add "source /workspace/.venv/bin/activate", to clearml.conf docker_init_bash_script
However it then tries to access pip, but i dont need no pip, how to disable it, i already have my packages, and uv doesnt even require pip
Hi, ok im really close now to working system
Debug image is uploading to s3, im seeing the files, all ok there
Problem now is viewing these images in web UI
Going to Debug Samples panel in Task drops me a popup to fill in s3 credentials
I cant figure out what the right setup is for the creds to work
This is what I have now (Note that we dont have region)
I have tried:
Airflow - Pain to setup, old UI and other problems
Prefect - Literaly just tried to setup a simple distributed system, took me a week, I do not recommend this tool at all, horrible documentation, noone helps at slack.
Dagster - Absolute beauty, nice UI, easy to setup (as a pip package or just a docker + postgres), i highly recommend this tool. Takes a bit to get used to it. I will in coming week try this combo of dagster + clearml, where i periodically check some things and if...
@<1709740168430227456:profile|HomelyBluewhale47> We have the same problem. Millions of files, stored on CEPH. I would not recommend you to do it this way. Everything gets insanely slow (dataset.list_files, downloading the dataset, removing files)
The way I use Clearml Datasets for large number of samples now is to save a json which stores all paths to samples in Dataset metadata:
clearml_dataset.set_metadata(metadata, metadata_name=metadata_key)
However this then means that you need wrappe...
I also dont have side panel for some reason
@<1523701070390366208:profile|CostlyOstrich36> Hello John, we are still unable to use clearml with our self hosted s3 CEPH instances, is there any update on the hotfix for 1.14?
py file:
task: clearml.Task = clearml.Task.init(
project_name="project",
task_name="task",
output_uri=" None ",
)
clearml.conf:
{
# This will apply to all buckets in this host (unless key/value is specifically provided for a given bucket)
host: " our-host.com "
key: "xxx"
secret: "xxx"
multipart: false
...
When I look at LinkEntry object, link property is correct, no duplicates. Its relative_path thats duped and also key name in _dataset_link_entries
Yes, but does add_external_files makes chunked zips as add_files do?
Yes, credetials seems to work
Im trying to figure out not why I dont see the uploaded files / folders
- I checked maybe clearml task uses fileserver instead but i dont see any files in fileserver folder
- Nothing is uploaded in bucket (i will ask IT guy to check if im uploading any files in logs)

Not really, but i think i will figure out the uv caching
I have another question @<1523701070390366208:profile|CostlyOstrich36>
How can i make the clearml agent to just run the image with just the uv
dont install any packages, nothing
i found docker_init_bash_script in clearml.config
i know there are some envs to pass in task init but that does not fully do what i want - just simply run the image, i have all the dependencies
we are cleaning, but there is a major problem
When deleting a task from web UI, nothing is deleted elsewhere
Debug images are not deleted, models are not deleted. And I suspect that scalars and logs are not deleted too
Im not sure why is that so
Is that supposted to be so? How to fix it?
elastisearch also takes like 15GB of ram


