It should store it on the fileserver, perhaps you're missing a configuration option somewhere?
Heh, well, John wrote that in the first reply in this thread 🙂
And in Task.init main documentation page (nowhere near the code), it says the following -
It is documented at None ... super deep in the code. If you don't know that output_uri in TASK's (!) init is relevant, you would never know...
I wouldn't put it past ClearML's automation (a lot of stuff depends on certain suffixes), but I don't think that's the case here, hmm.
Well, you could start by setting output_uri to True in Task.init.
@<1523701087100473344:profile|SuccessfulKoala55> : That is the link I posted as well. But this should also be mentioned wherever external versus non-external storage is discussed, and everywhere we talk about models, artifacts, etc. Not necessarily in detail, but at least with a sentence and a link.
@<1523701083040387072:profile|UnevenDolphin73> : I do not see any way to download the model manually from the web app either. All I see is the link to the file on my hard drive (see screenshot).
The second process says there is no file at all. I think all that happened is that update_weights only uploaded the location of the .zip file (which we denote as a .model file) on my hard drive, but not the file itself.
@<1523701083040387072:profile|UnevenDolphin73> : I see. I did not make the connection that output_uri=True is what I was missing. I thought this was the default. But the default is actually "None", which is different than "True".
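To make the None versus True distinction above concrete, here is a minimal sketch. The helper describe_output_uri and the bucket URI are hypothetical and only summarise what this thread reports; init_uploading_task assumes clearml is installed and configured, and treating every falsy value like None is an assumption.

```python
from typing import Union


def describe_output_uri(output_uri: Union[None, bool, str]) -> str:
    """Hypothetical helper: summarise the behaviour discussed in this thread."""
    # Assumption: falsy values behave like the default None, i.e. only the
    # local path of the model file is registered and nothing is uploaded.
    if not output_uri:
        return "no upload: only the local path is registered"
    if output_uri is True:
        return "upload to the default ClearML file server"
    # Any other value is taken to be a storage URI (e.g. an S3 bucket).
    return f"upload to {output_uri}"


def init_uploading_task(project_name: str, task_name: str):
    """Create a task whose models and artifacts are uploaded, not just linked."""
    # Imported lazily so the helper above can be used without clearml installed.
    from clearml import Task

    # output_uri=True is the setting suggested earlier in this thread.
    return Task.init(project_name=project_name, task_name=task_name, output_uri=True)
```

With the default (output_uri left as None), update_weights logs "No output storage destination defined, registering local model", which matches the log line quoted in this thread.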
@<1523701083040387072:profile|UnevenDolphin73> : How do you figure? In the past, my colleagues and I just shared the .zip file via email / MS Teams and it worked. So I don't think so.
@<1523704157695905792:profile|VivaciousBadger56> It seems like whatever you pickled in the zip file relies on some additional files that are not pickled.
Hi all, sorry for not being so responsive today 🙏
By the way, output_uri is also documented as part of the Task.init() docstring ( None )
@<1523701083040387072:profile|UnevenDolphin73> : Thanks, but it does not mention the File Storage of "ClearML Hosted Server".
FWIW, we prefer to set it in the agent’s configuration file, then it’s all automatic
FWIW It’s also listed in other places @<1523704157695905792:profile|VivaciousBadger56> , e.g. None says:
In order to make sure we also automatically upload the model snapshot (instead of saving its local path), we need to pass a storage location for the model files to be uploaded to.
For example, upload all snapshots to an S3 bucket…
We have the following, works fine (we also use internal zip packaging for our models):
model = OutputModel(task=self.task, name=self.job_name, tags=kwargs.get('tags', self.task.get_tags()), framework=framework)
model.connect(task=self.task, name=self.job_name)
model.update_weights(weights_filename=cc_model.save())
Heh, good @<1523704157695905792:profile|VivaciousBadger56> 😁
I was just repeating what @<1523701070390366208:profile|CostlyOstrich36> suggested, credits to him
@<1523701087100473344:profile|SuccessfulKoala55> : I referenced this conversation in the issue None
But, I guess @<1523701070390366208:profile|CostlyOstrich36> wrote that in a different chat, right?
@<1523701083040387072:profile|UnevenDolphin73> : I do not get this impression, because during update_weights I get the message
2023-02-21 13:54:49,185 - clearml.model - INFO - No output storage destination defined, registering local model C:\Users..._Demodaten_FF_2023-02-21_13-53-51.624362.model
@<1523701083040387072:profile|UnevenDolphin73> Yes, you're correct, I misread the exception.
Maybe it hasn't completed uploading? At least for Datasets one needs to explicitly wait IIRC
@<1523701087100473344:profile|SuccessfulKoala55> I think I might have made a mistake earlier - but not in the code I posted before. Now, I have the following situation:
- In my training Python process on my notebook, I train the custom-made model and put it on my hard drive as a zip file. Then I run the code
output_model = OutputModel(task=task, config_dict={...}, name=f"...")
output_model.update_weights(weights_filename=r"C:\path\to\mymodel.zip", is_package=True)
- I delete the "C:\path\to\mymodel.zip", because it would not be available on my colleagues' computers.
- In a second process, the model-inference process, I run
mymodel = task.models['output'][-1]
mymodel = mymodel.get_local_copy(extract_archive=True, raise_on_error=True)
and get the error
ValueError: Could not retrieve a local copy of model weights 8ad4db1561474c43b0747f7e69d241a6, failed downloading
I do not have an AWS S3 bucket or anything like that. This is why I would like to store my mymodel.zip file directly on the ClearML Hosted Service. The model is around 2 MB in size.
How should I proceed?
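One possible way to proceed, sketched under the assumption that output_uri=True (as suggested elsewhere in this thread) is the missing piece. The function names publish_model and fetch_model are hypothetical; the ClearML calls are the ones already used above, plus Task.get_task to look the task up again in the second process. This is a sketch, not a verified fix.

```python
from pathlib import Path


def publish_model(zip_path: str, project_name: str, task_name: str) -> str:
    """Training side: upload the packaged model and return the task id."""
    # Fail early if the package is missing, before touching ClearML at all.
    if not Path(zip_path).is_file():
        raise FileNotFoundError(zip_path)

    from clearml import OutputModel, Task  # lazy import, needs clearml installed

    # output_uri=True requests an upload to the ClearML file server, so the
    # model no longer points at a path on the local hard drive.
    task = Task.init(project_name=project_name, task_name=task_name, output_uri=True)
    output_model = OutputModel(task=task, name=task_name)
    output_model.update_weights(weights_filename=zip_path, is_package=True)
    return task.id


def fetch_model(task_id: str) -> str:
    """Inference side: download and extract the latest output model."""
    from clearml import Task

    task = Task.get_task(task_id=task_id)
    model = task.models["output"][-1]
    return model.get_local_copy(extract_archive=True, raise_on_error=True)
```

As raised in this thread, the upload may not have finished when update_weights returns, so it is probably safer to keep the local zip file until the model shows a remote URL in the web app.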
@<1523704157695905792:profile|VivaciousBadger56> regarding: None
Is this a discussion or a PR?
(general ranting is saved for our slack channel 🙂 )
I can only say I’ve found ClearML to be very helpful, even given the documentation issue.
I think they’ve been working on upgrading it for a while, hopefully something new comes out soon.
Maybe @<1523701205467926528:profile|AgitatedDove14> has further info 🙂