I am not sure if the fact that the name of the file ends with .model is an issue - but that would be a somewhat crazy design...
But, I guess @<1523701070390366208:profile|CostlyOstrich36> wrote that in a different chat, right?
By the way, output_uri is also documented as part of the Task.init() docstring ( None )
Do you mean "exactly" as in "you finally got it" or in the sense of "yes, that was easy to miss"?
FWIW It's also listed in other places @<1523704157695905792:profile|VivaciousBadger56>, e.g. None says:
In order to make sure we also automatically upload the model snapshot (instead of saving its local path), we need to pass a storage location for the model files to be uploaded to.
For example, upload all snapshots to an S3 bucket…
@<1523701083040387072:profile|UnevenDolphin73> : Thanks, but it does not mention the File Storage of "ClearML Hosted Server".
@<1523701087100473344:profile|SuccessfulKoala55> : That is the link I posted as well. But this should also be mentioned in the places that deal with external or non-external storage. It should also be mentioned everywhere we talk about models, artifacts, etc. Not necessarily in detail, but at least with a sentence and a link.
I have already been trying to contribute (I have three pull requests), but honestly it feels a bit weird that I need to update documentation about something I do not understand, while I am actually trying to evaluate whether ClearML is the right tool for our company...
@<1523704157695905792:profile|VivaciousBadger56> regarding: None
Is this a discussion or a PR?
(general ranting is saved for our Slack channel)
We're certainly working hard on improving the documentation (and I do apologize for the frustrating experience)
@<1523701083040387072:profile|UnevenDolphin73> : From which URL is your most recent screenshot?
@<1523701083040387072:profile|UnevenDolphin73> : How do you figure? In the past, my colleagues and I just shared the .zip file via email / MS Teams and it worked. So I don't think so.
We'll try to add references to that in other places as well
Heh, well, John wrote that in the first reply in this thread
And in the Task.init main documentation page (nowhere near the code), it says the following:
@<1523704157695905792:profile|VivaciousBadger56> It seems like whatever you pickled in the zip file relies on some additional files that are not pickled.
We have the following, works fine (we also use internal zip packaging for our models):
model = OutputModel(task=self.task, name=self.job_name, tags=kwargs.get('tags', self.task.get_tags()), framework=framework)
model.connect(task=self.task, name=self.job_name)
model.update_weights(weights_filename=cc_model.save())
The documentation is messy, I've complained about it in the past too
@<1523701070390366208:profile|CostlyOstrich36>
My training outputs a model as a zip file. The way I save and load the zip file to make up my model is custom made (no library is directly used), because we invented the entire modelling ourselves. What I did so far:
output_model = OutputModel(task=..., config_dict={...}, name=f"...")
output_model.update_weights("C:\io__path\...", is_package=True)
and I am trying to load the model in a different Python process with
mymodel = task.models['output'][0]
mymodel = mymodel.get_local_copy(extract_archive=True, raise_on_error=True)
and I get a .training.pt file in the clearml cache, which seems to be some kind of archive. Inside I have two files named data.pkl and version, and a folder with two files named 86922176 and 86934640.
I am not sure how to proceed after trying to use pickle, zip and joblib. I am kind of at a loss. I suspect, my original zip file might be somehow inside, but I am not sure.
Sure, we could simply use the generic artifacts sdk, but I would like to use the available terminological methods and functions.
How should I proceed?
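[Editor's note] One way to check whether the cached file is a zip container at all is to list its members with the standard library. This is only a sketch: `describe_archive` is a hypothetical helper, not part of ClearML, and the path you pass in is whatever `get_local_copy` returned. A layout with data.pkl, version and a folder of numerically named files resembles the zip container that `torch.save` writes, but that is an assumption about this particular file, not something the thread confirms.

```python
import zipfile
from pathlib import Path


def describe_archive(path):
    """Return the sorted member names if `path` is a zip archive, else None."""
    path = Path(path)
    # is_zipfile only checks for the zip end-of-central-directory record,
    # so it works regardless of the file extension (.pt, .model, .zip, ...).
    if not zipfile.is_zipfile(path):
        return None
    with zipfile.ZipFile(path) as zf:
        return sorted(zf.namelist())
```

If the original zip package is not among the listed members, the file in the cache is likely a different artifact from the one passed to `update_weights`.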
Hi @<1523704157695905792:profile|VivaciousBadger56> , you can configure Task.init(..., output_uri=True) and this will save the models to the clearml file server
Well you could start by setting the output_uri to True in Task.init.
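[Editor's note] A minimal sketch of the suggestion above, assuming the `clearml` package is installed and configured against a server. The function name and its arguments are hypothetical; only `Task.init` and its `output_uri` parameter come from the thread.

```python
def start_task_with_model_upload(project_name: str, task_name: str):
    """Create a ClearML task whose model files are uploaded instead of
    being registered by their local path."""
    # Imported lazily so this sketch can be loaded without clearml installed.
    from clearml import Task

    # output_uri=True asks ClearML to upload model files to the
    # ClearML file server, as described in the replies above.
    return Task.init(
        project_name=project_name,
        task_name=task_name,
        output_uri=True,
    )
```

Calling this at the start of the training script, before the `OutputModel` is created, should cause the weights file passed to `update_weights` to be uploaded rather than recorded as a local path.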
@<1523704157695905792:profile|VivaciousBadger56> I'm not sure I'm following you - is the issue not being able to upload to the ClearML server or to load the downloaded file?
@<1523701083040387072:profile|UnevenDolphin73>
@<1523701087100473344:profile|SuccessfulKoala55> : I referenced this conversation in the issue None
Yes, you're correct, I misread the exception.
Maybe it hasn't completed uploading? At least for Datasets one needs to explicitly wait IIRC