@<1523701482157772800:profile|AnxiousSeal95> I see a lot of people here migrating data from one data source to another.
For us it was that we experimented with Clearml to get the feeling and we used clearml built in file storage to save debug images an all other artifacts.
Then we grew rapidly and we had to migrate to S3 storage.
I had to write a script that goes through elasticsearch and mongo db to point to new S3 links wher the data was migrated to.
I do however understand that migration...
@<1523701070390366208:profile|CostlyOstrich36> Any news on this? We are currently stuck without this fix, cant finish up clearml setup
@<1523701070390366208:profile|CostlyOstrich36> It it still needed since Eugene thinks there is a bug?
The problem is that clearml.conf s3 config doesnt support empty region field, even empty strings crashes it
Our datasets are more than 1TB in size and will grow in size (probably 4TB and up), this means we also need 4TB local storage just to upload the dataset back in zipped format. This is not a good solution.
What we can do I guess is do the downloading locally by some chunks of files?
Download locally 100 files, add_to_clearml dataset, repeat
I guess I fucked up something when moving files
well, I connected to mongodb manually and it is empty, loaded with just examples
has 8 cores, so nothing fancy even
here is also another magic stuff
I hope that its all the experiments
I also see that elastisearch and mongo has some data
is there any way to see if I even have the data in mongodb?
WebApp: 1.16.0-494 • Server: 1.16.0-494 • API: 2.30
But be careful, upgrading is extremely dangerous
@<1523701601770934272:profile|GiganticMole91> Thats rookie numbers. We are at 228 GB for elastic now
We dont need a port
"s3" is part of url that is configured on our routers, without it we cannot connect
- Here is how client side clearml.conf looks like together with the script im using to create the tasks. Uploads seems to work and is fixed thanks to you guys 🙌



@<1523701070390366208:profile|CostlyOstrich36> Still unable to understand what im doing wrong.
We have self hosted S3 Ceph storage server
Setting my config like this breaks task.init
Hey, i see that 1.14.2 dropped
I tried it but the issue is still there, maybe the hotfix is in next patch?
Here is the setup so you can reproduce it (we dont have region field)
clearml.conf:s3 {use_credentials_chain: falsecredentials: [{host: " s3.somehost.com "key: "XXXXXXXXXXXXXXXXXXXX"
` secret: "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...
But it seems like the data is gone, not sure how to get them back
We had a similar problem. Clearml doesnt support data migration (not that I know of)
So you have two ways to fix this:
- Recreate the dataset when its already in Azure
- Edit each elasticsearch database file entry to point to new destination (we did this)
So from our IT guys i now know that
"s3" part of url is subdomain, we use it in all other libs like boto3 and cloudpathlib, never had any problems
This is where the crash happens inside the clearml Task
also, when uploading artifacts, I see where they are stored on the s3 bucket, but I cant find where the debug images are stored at
Can I do it while i have multiple ongoing training?




