AgitatedDove14 I just wanted to come back to the thread to let you know you can just copy /opt/clearml/data and /opt/clearml/config over from one VM to another and it worked fine!
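For anyone wanting to script that copy, here is a minimal sketch that packs the two directories into a tarball on the old VM and unpacks it on the new one. The function names are made up for illustration, and it assumes the ClearML server is stopped while the directories are archived; how the archive travels between VMs (scp, a bucket, etc.) is left out.

```python
import tarfile
from pathlib import Path


def archive_dirs(dirs, archive_path):
    """Pack each directory in `dirs` into one gzip-compressed tarball."""
    archive_path = Path(archive_path)
    with tarfile.open(archive_path, "w:gz") as tar:
        for d in dirs:
            d = Path(d)
            # Store each directory under its own name so the layout is
            # preserved when the archive is unpacked on the target machine.
            tar.add(d, arcname=d.name)
    return archive_path


def restore_dirs(archive_path, target_root):
    """Unpack the tarball under `target_root` (e.g. /opt/clearml on the new VM)."""
    target_root = Path(target_root)
    target_root.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target_root)
    return target_root
```

On the old VM this would be called as `archive_dirs(["/opt/clearml/data", "/opt/clearml/config"], "clearml_backup.tar.gz")`, and on the new one as `restore_dirs("clearml_backup.tar.gz", "/opt/clearml")`.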
Oh sweet! Are there instructions on how to disable the fileserver service?
I am adding 100k S3 images that are not in the same folder, and I can't move them in S3. So I have a list of S3 links, not a folder, unfortunately.
If I ever write one I'll add it to the examples
Thank you AgitatedDove14 !
Yes, I am. I found a way by modifying the user in MongoDB. Is there another way?
The use case is that we want a dataset of over 100k S3 images, and they are scattered all over our bucket due to how we organise those images. If I send an array of URLs as the source_url, it will eventually fail after around 40 requests due to the GCS rate limit for updating the state.json.
Yeah, I can write a script to transfer it over; I was just wondering if there was a built-in feature. I am asking because we might move from self-hosted to hosting with ClearML in the future, and I'd like to know how we would transfer our existing datasets. All good though!
That would only work for the data itself, no? I don't know how I would be able to get the description and name.
AgitatedDove14 I have tried that case, however if you go in the implementation you can see what actually happens is a for loop that will continuously call the method. This is a problem because it will update my external state.json file over 100k times, and the cloud provider will block my requests after around 40 😅
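To make the difference concrete, here is a toy model of the two patterns. It is not ClearML's code: `ToyExternalDataset`, `add_link`, `add_links` and the example URLs are all hypothetical, and the counter simply stands in for the number of state.json uploads.

```python
class ToyExternalDataset:
    """Hypothetical stand-in for a dataset that persists its state remotely."""

    def __init__(self):
        self.links = []
        self.state_writes = 0  # how many times state.json would be uploaded

    def _save_state(self):
        # In the real scenario this is the upload that gets rate limited.
        self.state_writes += 1

    def add_link(self, url):
        """Add one link and persist immediately (the per-item pattern)."""
        self.links.append(url)
        self._save_state()

    def add_links(self, urls):
        """Add many links, then persist once (the batched pattern)."""
        self.links.extend(urls)
        self._save_state()


# Made-up URLs, only to have something to iterate over.
urls = [f"s3://my-bucket/images/{i}.jpg" for i in range(1000)]

per_item = ToyExternalDataset()
for url in urls:
    per_item.add_link(url)

batched = ToyExternalDataset()
batched.add_links(urls)

print(per_item.state_writes, batched.state_writes)  # 1000 1
```

Both objects end up with the same links; only the number of state writes differs, which is what matters when the storage provider rate limits updates to a single object.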
For context, the Google Cloud Storage SDK allows authorized user credentials. This makes it a bit awkward for our developers who are already using their own creds.
The error in question:
2022-12-01 17:33:03,687 - clearml.storage - ERROR - Failed creating storage object gs://<my_bucket> Reason: Service account info was not in the expected format, missing fields token_uri, client_email.
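A small sketch of why that message shows up: an authorized user credentials file does not carry the fields a service account loader asks for. The helper below is hypothetical and only inspects a parsed credentials dict. The service account fields come from the error above; the authorized user field names are my assumption based on the file gcloud writes for application default credentials.

```python
import json

# Fields named as missing in the error above when loading service account info.
SERVICE_ACCOUNT_FIELDS = {"client_email", "token_uri"}
# Fields expected in authorized user credentials (assumption, see lead-in).
AUTHORIZED_USER_FIELDS = {"client_id", "client_secret", "refresh_token"}


def classify_gcs_credentials(info):
    """Return (kind, missing_fields) for a parsed credentials JSON dict."""
    kind = info.get("type", "unknown")
    if kind == "service_account":
        required = SERVICE_ACCOUNT_FIELDS
    elif kind == "authorized_user":
        required = AUTHORIZED_USER_FIELDS
    else:
        # Unknown type: nothing sensible to validate against.
        return kind, []
    return kind, sorted(required - set(info))


# Placeholder values, not real credentials.
user_creds = json.loads(
    '{"type": "authorized_user", "client_id": "id", '
    '"client_secret": "secret", "refresh_token": "token"}'
)

print(classify_gcs_credentials(user_creds))  # ('authorized_user', [])
# Treating the same dict as a service account reproduces the missing fields:
print(sorted(SERVICE_ACCOUNT_FIELDS - set(user_creds)))  # ['client_email', 'token_uri']
```

Checking the "type" field first and picking the loader accordingly would let both kinds of credentials work, rather than assuming every credentials file is a service account.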
I opened a PR for it, it has a bug attached to it too: https://github.com/allegroai/clearml/pull/841
Hi AgitatedDove14 I found the possible bug, I'll open a PR for it and we can discuss! It has to do with how we pass credentials to the GCS client.