Answered
Is There A Way To Save The Models Completely On The Clearml Server? It Seems That Clearml Server Does Not Store The Models Or Artifacts Itself, But They Are Stored Somewhere Else (E.G., Aws S3-Bucket) Or On My Local Machine And Clearml Server Is Only Sto

Is there a way to save the models completely on the ClearML server?

It seems that ClearML Server does not store the models or artifacts itself, but they are stored somewhere else (e.g., AWS S3-bucket) or on my local machine and ClearML Server is only storing configuration parameters and previews (e.g., when the artifact is a pandas dataframe). Is that right?

  
  
Posted 9 months ago

Answers 45


@<1523701083040387072:profile|UnevenDolphin73> : I do not get this impression, because during update_weights I get the message

2023-02-21 13:54:49,185 - clearml.model - INFO - No output storage destination defined, registering local model C:\Users..._Demodaten_FF_2023-02-21_13-53-51.624362.model
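
That log line suggests the model was only registered under its local path rather than uploaded. One way to check where a registered model actually lives is to look at the scheme of its URL. The sketch below uses a hypothetical helper (`is_stored_remotely` is not part of ClearML) and assumes that remote stores use one of the listed schemes, while locally registered models show up as a `file://` URL or a plain path.

```python
from urllib.parse import urlparse

# Schemes treated as "remote" here; this set is an assumption for illustration.
REMOTE_SCHEMES = {"http", "https", "s3", "gs", "azure"}


def is_stored_remotely(model_url: str) -> bool:
    """Return True if the URL scheme points at a remote store.

    Plain Windows paths such as C:\\Users\\... parse with the drive letter
    as the scheme, so they are correctly reported as not remote.
    """
    scheme = urlparse(model_url).scheme.lower()
    return scheme in REMOTE_SCHEMES
```

With a task at hand, something like `is_stored_remotely(task.models['output'][-1].url)` could then be used to confirm whether the weights left the local machine.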

  
  
Posted 9 months ago

Well, you could start by setting output_uri to True in Task.init.

  
  
Posted 9 months ago

@<1523701070390366208:profile|CostlyOstrich36>

My training outputs a model as a zip file. The way I save and load the zip file to make up my model is custom made (no library is directly used), because we invented the entire modelling ourselves. What I did so far:

output_model = OutputModel(task=..., config_dict={...}, name=f"...")
output_model.update_weights(r"C:\io__path\...", is_package=True)

and I am trying to load the model in a different Python process with

mymodel = task.models['output'][0]
mymodel = mymodel.get_local_copy(extract_archive=True, raise_on_error=True)

and I get in the clearml cache a . training.pt file, which seems to be some kind of archive. Inside I have two files named data.pkl and version, and a folder with two files named 86922176 and 86934640.

I am not sure how to proceed after trying to use pickle, zip and joblib. I am kind of at a loss. I suspect, my original zip file might be somehow inside, but I am not sure.

Sure, we could simply use the generic artifacts sdk, but I would like to use the available terminological methods and functions.

How should I proceed?

  
  
Posted 9 months ago

FWIW, we prefer to set it in the agent’s configuration file, then it’s all automatic

  
  
Posted 9 months ago

Hi @<1523704157695905792:profile|VivaciousBadger56> , you can configure Task.init(..., output_uri=True) and this will save the models to the clearml file server
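
A minimal sketch of that suggestion: the helper below builds the keyword arguments for Task.init and adds output_uri=True only when uploads are wanted. The names `task_init_kwargs` and `start_task` are made up for illustration, and clearml is imported lazily so the snippet can be loaded without a configured server.

```python
from typing import Any, Dict


def task_init_kwargs(project_name: str, task_name: str,
                     upload_models: bool = True) -> Dict[str, Any]:
    """Build keyword arguments for Task.init (hypothetical helper)."""
    kwargs: Dict[str, Any] = {"project_name": project_name, "task_name": task_name}
    if upload_models:
        # output_uri=True asks ClearML to upload models to the file server
        # instead of only registering their local path.
        kwargs["output_uri"] = True
    return kwargs


def start_task(project_name: str, task_name: str):
    """Start a task whose output models get uploaded."""
    from clearml import Task  # lazy import: requires clearml installed and configured

    return Task.init(**task_init_kwargs(project_name, task_name))
```

Leaving `upload_models` off corresponds to not passing output_uri at all, which is the situation described in the question.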

  
  
Posted 9 months ago

The documentation is messy, I’ve complained about it in the past too 🙈

  
  
Posted 9 months ago

missing a configuration option

Which one, where? Any idea? I did not set output_uri - do I have to do that?

I am referring to

  
  
Posted 9 months ago

Heh, well, John wrote that in the first reply in this thread 🙂
And in Task.init main documentation page (nowhere near the code), it says the following -
image

  
  
Posted 9 months ago

@<1523701083040387072:profile|UnevenDolphin73> : If I do, what should I configure how?

  
  
Posted 9 months ago

We have the following, works fine (we also use internal zip packaging for our models):

model = OutputModel(task=self.task, name=self.job_name, tags=kwargs.get('tags', self.task.get_tags()), framework=framework)
model.connect(task=self.task, name=self.job_name)
model.update_weights(weights_filename=cc_model.save())
  
  
Posted 9 months ago

@<1523701087100473344:profile|SuccessfulKoala55> : That is the link I posted as well. But this should also be mentioned in the places that are about external or non-external storage. It should also be mentioned everywhere we talk about models or artifacts etc. Not necessarily in detail, but at least with a sentence and a link.

  
  
Posted 9 months ago

It should store it on the fileserver, perhaps you're missing a configuration option somewhere?

  
  
Posted 9 months ago

🙂

  
  
Posted 9 months ago

Heh, good @<1523704157695905792:profile|VivaciousBadger56> 😁
I was just repeating what @<1523701070390366208:profile|CostlyOstrich36> suggested, credits to him

  
  
Posted 9 months ago

Do you mean "exactly" as in "you finally got it" or in the sense of "yes, that was easy to miss"?

  
  
Posted 9 months ago

I am not sure whether the fact that the name of the file ends with .model is an issue - but that would be somewhat crazy design...

  
  
Posted 9 months ago

@<1523701083040387072:profile|UnevenDolphin73> : From which URL is your most recent screenshot?

  
  
Posted 9 months ago

@<1523701087100473344:profile|SuccessfulKoala55> I think I might have made a mistake earlier - but not in the code I posted before. Now, I have the following situation:

  1. In my training Python process on my notebook, I train the custom-made model and put it on my hard drive as a zip file. Then I run the code
output_model = OutputModel(task=task, config_dict={...}, name=f"...")
output_model.update_weights(weights_filename=r"C:\path\to\mymodel.zip", is_package=True)
  2. I delete the "C:\path\to\mymodel.zip", because it would not be available on my colleagues' computers.

  3. In a second process, the model-inference process, I run

mymodel = task.models['output'][-1]
mymodel = mymodel.get_local_copy(extract_archive=True, raise_on_error=True)

and get the error

ValueError: Could not retrieve a local copy of model weights 8ad4db1561474c43b0747f7e69d241a6, failed downloading

I do not have an AWS S3 instance or something like that. This is why I would like to store my mymodel.zip file directly on the ClearML Hosted Service. The model is around 2 MB in size.

How should I proceed?

  
  
Posted 9 months ago

Hi all, sorry for not being so responsive today

  
  
Posted 9 months ago

@<1523704157695905792:profile|VivaciousBadger56> I'm not sure I'm following you - is the issue not being able to upload to the ClearML server or to load the downloaded file?

  
  
Posted 9 months ago

@<1523701083040387072:profile|UnevenDolphin73>

  
  
Posted 9 months ago

@<1523701083040387072:profile|UnevenDolphin73> : I see. I did not make the connection that output_uri=True is what I was missing. I thought this was the default. But the default is actually "None", which is different than "True".
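
Put together, the round trip discussed in this thread could look roughly like the sketch below. It assumes the task was created with Task.init(..., output_uri=True); the function names are hypothetical, and the calls to OutputModel, update_weights and get_local_copy mirror the ones quoted earlier in the thread.

```python
def upload_packaged_model(task, zip_path: str, name: str):
    """Register a zip package as an output model of `task`.

    The file is only uploaded if the task has an output destination,
    e.g. because it was started with output_uri=True.
    """
    from clearml import OutputModel  # lazy import: requires clearml installed

    output_model = OutputModel(task=task, name=name)
    output_model.update_weights(weights_filename=zip_path, is_package=True)
    return output_model


def fetch_latest_output_model(task) -> str:
    """Download the most recent output model of `task` and return its local path."""
    model = task.models["output"][-1]
    return model.get_local_copy(extract_archive=True, raise_on_error=True)
```

In the inference process the task object would first be looked up, for instance with Task.get_task and the ID of the training task, and then passed to `fetch_latest_output_model`.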

  
  
Posted 9 months ago

But we do use S3

  
  
Posted 9 months ago

Yes, you're correct, I misread the exception.
Maybe it hasn't completed uploading? At least for Datasets one needs to explicitly wait IIRC
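
If an unfinished upload is the concern, one option is to flush the task and wait for pending uploads before the local file is removed. This is a hedged sketch: `upload_and_wait` is a hypothetical helper, and it assumes that Task.flush accepts a wait_for_uploads argument, as it does in recent clearml releases.

```python
def upload_and_wait(task, output_model, zip_path: str) -> None:
    """Upload the weights package, then block until pending uploads finish."""
    output_model.update_weights(weights_filename=zip_path, is_package=True)
    # Assumption: flush(wait_for_uploads=True) returns only once
    # outstanding uploads for this task have completed.
    task.flush(wait_for_uploads=True)
```

Only after this call returns would it be reasonably safe to delete the original zip file.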

  
  
Posted 9 months ago

It is documented at None ... super deep in the code. If you don't know that output_uri in TASK's (!) init is relevant, you would never know...

  
  
Posted 9 months ago

Exactly 🙂

  
  
Posted 9 months ago

Either? 🙂

  
  
Posted 9 months ago

@<1523701087100473344:profile|SuccessfulKoala55> : I referenced this conversation in the issue None

  
  
Posted 9 months ago

Unbelievable! That worked.

  
  
Posted 9 months ago

I have already been trying to contribute (have three pull requests), but honestly I feel it is a bit weird, that I need to update a documentation about something I do not understand, while I actually try to evaluate if ClearML is the right tool for our company...

  
  
Posted 9 months ago