@<1523701482157772800:profile|AnxiousSeal95> I see a lot of people here migrating data from one data source to another.
For us, it was that we experimented with ClearML to get a feel for it, and we used ClearML's built-in file storage to save debug images and all other artifacts.
Then we grew rapidly and we had to migrate to S3 storage.
I had to write a script that goes through Elasticsearch and MongoDB to point to the new S3 links where the data was migrated to.
I do however understand that migration...
We had a similar problem. ClearML doesn't support data migration (not that I know of).
So you have two ways to fix this:
- Recreate the dataset once it's already in Azure
- Edit each Elasticsearch database entry to point to the new destination (we did this; a sketch of that kind of script is below)
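For illustration only, a minimal sketch of such a rewrite script. The old fileserver URL, the new S3 prefix, the Elasticsearch index pattern, and the MongoDB database/collection/field names are all assumptions and have to be checked against your own deployment (and back up both databases first):

```python
# Hedged sketch only: rewrite stored URLs from the built-in fileserver to S3.
# Index pattern, database/collection names, and field names are assumptions.
from elasticsearch import Elasticsearch
from pymongo import MongoClient

OLD_PREFIX = "https://files.my-clearml.example.com"  # assumed old fileserver URL
NEW_PREFIX = "s3://my-bucket/clearml"                 # assumed new S3 destination

# Elasticsearch: rewrite debug-image/event URLs in place (elasticsearch-py 7.x style call;
# assumes the "url" field is queryable with a prefix query in your mapping).
es = Elasticsearch("http://localhost:9200")
es.update_by_query(
    index="events-*",  # assumed index pattern for ClearML events
    body={
        "query": {"prefix": {"url": OLD_PREFIX}},
        "script": {
            "source": "ctx._source.url = ctx._source.url.replace(params.old, params.new)",
            "params": {"old": OLD_PREFIX, "new": NEW_PREFIX},
        },
    },
)

# MongoDB: rewrite artifact URIs stored on task documents (db/collection names assumed).
tasks = MongoClient("mongodb://27017")["backend"]["task"]
for doc in tasks.find({"execution.artifacts.uri": {"$regex": "^" + OLD_PREFIX}}):
    for art in doc["execution"]["artifacts"]:
        if art.get("uri", "").startswith(OLD_PREFIX):
            art["uri"] = NEW_PREFIX + art["uri"][len(OLD_PREFIX):]
    tasks.update_one(
        {"_id": doc["_id"]},
        {"$set": {"execution.artifacts": doc["execution"]["artifacts"]}},
    )
```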
The incident happened last Friday (5 January).
I'm giving you logs from around that time.
@<1523701435869433856:profile|SmugDolphin23> Setting it without http is not possible, as it auto-fills them back in.
Is there any way to see if I even have the data in MongoDB?
Also, when uploading artifacts, I can see where they are stored in the S3 bucket, but I can't find where the debug images are stored.
Do I need clearml.conf on my ClearML server (in the config folder that is mounted in docker-compose), on the user's PC, or both?
It's self-hosted S3, that's all I know; I don't think it's MinIO.
Getting errors in Elasticsearch when deleting tasks; it returns "can't delete experiment".
@<1523701070390366208:profile|CostlyOstrich36> Hello, I'm still unable to understand how to fix this.
Is it even known whether the bug is fixed in that version?
Hey, I see that 1.14.2 dropped.
I tried it, but the issue is still there; maybe the hotfix is in the next patch?
Here is the setup so you can reproduce it (we don't have a region field):
clearml.conf:
```
s3 {
    use_credentials_chain: false
    credentials: [
        {
            host: "s3.somehost.com"
            key: "XXXXXXXXXXXXXXXXXXXX"
            secret: "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...
```
I solved the problem.
I had to add a TensorBoard logger and pass it to the pytorch_lightning Trainer with logger=logger.
Is that normal?
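For context, a minimal sketch of that fix, assuming placeholder project/task names and that the model and datamodule come from your own code:

```python
# Minimal sketch of the fix: attach a TensorBoard logger so ClearML's auto-logging
# picks up the scalars and debug images. Project/task names are placeholders.
from clearml import Task
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import TensorBoardLogger

task = Task.init(project_name="examples", task_name="lightning-tb-logging")

logger = TensorBoardLogger(save_dir="tb_logs")   # TensorBoard event files land here
trainer = Trainer(max_epochs=10, logger=logger)  # pass logger=logger as described above
# trainer.fit(model, datamodule=datamodule)      # model/datamodule from your own code
```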
@<1523701070390366208:profile|CostlyOstrich36> Any news on this? We are currently stuck without this fix and can't finish up the ClearML setup.
@<1523701070390366208:profile|CostlyOstrich36> 👀
Yes, the credentials seem to work.
I'm trying to figure out why I don't see the uploaded files/folders:
- I checked whether the ClearML task uses the fileserver instead, but I don't see any files in the fileserver folder
- Nothing is uploaded to the bucket (I will ask the IT guy to check the logs to see whether I'm uploading any files)
Yes, but does add_external_files make chunked zips like add_files does?
I need the zipping and chunking to manage millions of files.
I'm also batch uploading; maybe that's the problem?
- The dataset is about 1 TB and contains 1 million files
- I don't have the SSD space locally to do the whole upload at once
- So I download a part of the dataset, then call add_files() and upload() on that batch, as sketched below
- Finally, upload the dataset
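A hedged sketch of that batched flow with the ClearML Dataset API; the project name, local batch directories, and output bucket URL are placeholders:

```python
# Hedged sketch of the batched flow described in the list above.
# Project name, local batch directories, and output bucket are placeholders.
from clearml import Dataset

dataset = Dataset.create(
    dataset_project="my_project",
    dataset_name="large_dataset",
)

# Download one slice of the 1M-file dataset at a time, register it, push it to S3,
# then free the local SSD space before fetching the next slice.
for batch_dir in ["/data/batch_000", "/data/batch_001"]:
    dataset.add_files(path=batch_dir)                      # register this batch
    dataset.upload(output_url="s3://my-bucket/datasets")   # zip/chunk and push to S3

dataset.finalize()  # close the dataset version once every batch has been uploaded
```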
I noticed that each batch is slower and slower
No, I specify where to upload.
I see the data being uploaded to the S3 bucket; it's just that the log messages are really confusing.
It is also possible to just make a copy of all the database files and move them to another server
Can I do it while I have multiple ongoing trainings?
I can add "source /workspace/.venv/bin/activate" to docker_init_bash_script in clearml.conf.
However, it then tries to access pip, but I don't need pip. How do I disable it? I already have my packages, and uv doesn't even require pip.
Where can I override this so that it uses uv instead of trying to install Python with apt?
How do I get rid of this auto-appended line?
I see in clearml-agent that it is created here
I also think that if my package manager is set to uv, then it should only use uv and ignore pip entirely.
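A hedged clearml.conf sketch of the agent settings discussed above; whether setting the package manager type to uv fully bypasses pip depends on the clearml-agent version, so treat this as an assumption to verify:

```
agent {
    # runs inside the task container before the environment setup
    docker_init_bash_script: [
        "source /workspace/.venv/bin/activate",
    ]
    package_manager {
        # assumes a clearml-agent version that accepts uv as the package manager type
        type: uv
    }
}
```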