I can add "source /workspace/.venv/bin/activate" to docker_init_bash_script in clearml.conf.
However, it then tries to access pip. I don't need pip: I already have my packages, and uv doesn't even require pip. How do I disable it?
I need the zipping and chunking to manage millions of files.
Maybe someone on your end can try to parse such a config and see if they have the same problem?
Sounds similar to our issue? We have self-hosted S3.
What you want is a service script that cleans up archived tasks. Here is what we used: None
Where can I override this so that it uses uv instead of trying to install Python with apt?
How do I get rid of this auto-appended line?
I'm basically trying to force the agent to use the uv-defined Python.
I'm also batch uploading, maybe that's the problem?
- The dataset is about 1 TB and contains 1 million files
- I don't have the SSD space locally to do the upload
- So I download a part of the dataset, use add_files() and then upload() on that batch
- Upload the dataset
I noticed that each batch gets slower and slower.
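For reference, the batch workflow described above could look roughly like this. It is only a sketch: `chunked`, `upload_in_batches`, and the `fetch` callback are hypothetical helpers, not part of ClearML, and `dataset` is assumed to be a `clearml.Dataset` (only its `add_files()` and `upload()` methods are used). It does not by itself explain or fix the slowdown.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Iterator, List, Optional, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `size` entries."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def upload_in_batches(
    dataset,                                      # assumed: a clearml.Dataset
    keys: Sequence[str],                          # remote identifiers of all files
    batch_size: int,
    fetch: Callable[[List[str], Path], None],     # hypothetical: downloads a batch into a folder
    output_url: Optional[str] = None,
) -> int:
    """Download, register, and upload one batch at a time; return the batch count."""
    n_batches = 0
    for batch in chunked(keys, batch_size):
        # Stage each batch in its own temporary folder to keep local disk usage small.
        staging = Path(tempfile.mkdtemp(prefix="batch_"))
        try:
            fetch(batch, staging)
            dataset.add_files(path=str(staging))
            dataset.upload(output_url=output_url)
        finally:
            # Free the local space before the next batch is downloaded.
            shutil.rmtree(staging, ignore_errors=True)
        n_batches += 1
    return n_batches


# Possible usage (names and URL are placeholders):
# from clearml import Dataset
# ds = Dataset.create(dataset_name="my_dataset", dataset_project="my_project")
# upload_in_batches(ds, all_keys, 10_000, fetch=my_fetch, output_url="s3://my-host:9000/my-bucket")
# ds.finalize()
```

Timing each `add_files()` and `upload()` call separately inside the loop would show which of the two steps grows from batch to batch.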
You can check out the boto3 Python client (this is what we use to download/upload all S3 stuff), but minio-client probably already uses it under the hood.
We also use the AWS CLI to do some downloading; it is way faster than Python.
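A minimal boto3 sketch for pulling everything under a prefix, in case it helps. The helper names, the endpoint, and the credentials are placeholders for a self-hosted S3 setup; only `boto3.client`, the `list_objects_v2` paginator, and `download_file` are actual boto3 calls.

```python
from pathlib import Path


def make_s3_client(endpoint_url: str, access_key: str, secret_key: str):
    """Create an S3 client for a custom endpoint (e.g. self-hosted S3)."""
    import boto3  # imported here so the helper below can be used with any compatible client

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def download_prefix(client, bucket: str, prefix: str, dest: Path) -> int:
    """Download every object under `prefix` into `dest`; return the number of files."""
    count = 0
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Pages without matches have no "Contents" entry.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # skip folder placeholder objects
            target = dest / key
            target.parent.mkdir(parents=True, exist_ok=True)
            client.download_file(bucket, key, str(target))
            count += 1
    return count


# Possible usage (all values are placeholders):
# s3 = make_s3_client("http://my-host:9000", "ACCESS_KEY", "SECRET_KEY")
# download_prefix(s3, "my-bucket", "datasets/part-001/", Path("/tmp/part-001"))
```

This downloads files one at a time, so it will not match the speed of a parallel transfer tool.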
Regarding PDFs, yes, you have no choice but to preprocess them.
No, I specify where to upload.
I see the data is being uploaded to the S3 bucket. It's just that the log messages are really confusing.