I’ve worked around the issue by doing: `sys.modules['model'] = local_model_package`
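A minimal sketch of that workaround, assuming the original model was pickled on a machine where its class lived in a module named `model` that isn't importable here (`local_model_package` and `Net` below are stand-ins for the real locally imported package and model class):

```python
import pickle
import sys
import types

# Stand-in for the locally imported package that actually holds the
# model code; in the real case this would be your imported package.
local_model_package = types.ModuleType("model")

class Net:
    """Dummy model class the pickled stream expects to find as 'model.Net'."""
    pass

# Make the class look like it lived in the 'model' module at save time.
Net.__module__ = "model"
local_model_package.Net = Net

# The workaround: alias the module name that pickle will look up during
# deserialization, so 'model.Net' resolves even though no importable
# 'model' package exists on this machine.
sys.modules["model"] = local_model_package

data = pickle.dumps(Net())     # serialized with the reference 'model.Net'
restored = pickle.loads(data)  # resolved via sys.modules['model']
print(type(restored).__module__, type(restored).__qualname__)
```

The same aliasing applies before calling `torch.load()`, since it uses pickle's module lookup to resolve the saved classes.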
As far as I know, storage can be direct access: https://clear.ml/docs/latest/docs/integrations/storage/#direct-access
typical EBS is limited to being mounted to 1 machine at a time.
so in this sense, it won’t be too easy to create a solution where multiple machines consume datasets from this storage type
PS multi-attach ( https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volumes-multi.html ) is possible under some limitations
I think it has something to do with ClearML, since I can run this code as pure Python without ClearML; when I activate ClearML, I see that torch.load() hits `import_bind.__patched_import3` when trying to deserialize the saved model
AgitatedDove14 I haven’t done a full design for this 😉
Just referring to how DVC claims it can detect and invalidate changes in large remote files.
So I take it there is no such feature in http://clear.ml 🙂
CostlyOstrich36 If I delete the origin and all other info and set it to `tag_name='xxx'`, then it works
AgitatedDove14 let me reach out to my pocket there 😉
AgitatedDove14 nope… you can run md5 on the file as stored in the remote storage (NFS or S3)
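For the md5-on-the-stored-file idea, a small sketch (the helper name `md5_of_file` is hypothetical; it streams in chunks so large files on an NFS mount, or downloaded from S3, don't have to fit in memory):

```python
import hashlib
import tempfile

def md5_of_file(path, chunk_size=1 << 20):
    """Hypothetical helper: compute the md5 of a file by streaming
    1 MiB chunks, suitable for large files on remote-mounted storage."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Usage: checksum a small sample file.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")
    sample = f.name

print(md5_of_file(sample))  # 5d41402abc4b2a76b9719d911017c592
```

Comparing this digest against a previously recorded one is one way to detect that a remote file changed.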
AgitatedDove14 looks like service-writing-time for me!
PS can you point me to some official example/doc for how to persist/restore state so that tasks are restartable?
AgitatedDove14 can you share if there is a plan to put the gcp autoscaler in the open source?