Hmmm maybe 🤔 I thought that was expected behavior from poetry's side actually
That's probably in the newer ClearML server pages then, I'll have to wait still 🙂
Running a self-hosted server indeed. It's part of some code that simply adds or uploads an artifact 🤔
We have a read-only user with a personal access token for these things; it works seamlessly throughout our current on-premise servers... So perhaps something is missing in the autoscaler definitions?
@<1523704157695905792:profile|VivaciousBadger56> It seems like whatever you pickled in the zip file relies on some additional files that are not pickled.
FWIW it's also listed in other places @<1523704157695905792:profile|VivaciousBadger56>, e.g. None says:
In order to make sure we also automatically upload the model snapshot (instead of saving its local path), we need to pass a storage location for the model files to be uploaded to.
For example, upload all snapshots to an S3 bucket…
Well, you could start by setting the output_uri to True in Task.init.
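For reference, a minimal sketch of what that looks like (the project and task names here are made up):

```python
from clearml import Task

# With output_uri=True, model snapshots are uploaded to the default file server
# instead of only their local path being recorded.
task = Task.init(
    project_name="examples",    # hypothetical project/task names
    task_name="train-model",
    output_uri=True,            # or an explicit target, e.g. "s3://my-bucket/models"
)
```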
We have the following, works fine (we also use internal zip packaging for our models):
from clearml import OutputModel
model = OutputModel(task=self.task, name=self.job_name, tags=kwargs.get('tags', self.task.get_tags()), framework=framework)
model.connect(task=self.task, name=self.job_name)  # register the model on the task
model.update_weights(weights_filename=cc_model.save())  # upload the serialized weights file
It should store it on the fileserver, perhaps you're missing a configuration option somewhere?
I'm not sure how the decorators achieve that; from the available examples and trials I've done, it seems that:
- Components anyway need to be available when you define the pipeline controller/decorator, i.e. same codebase
- The component code still needs to be self-contained (though a function component can also be quite complex)
- Decorators do not allow any dynamic build, because you must know how the components are connected at decoration time (see the sketch below)
With that said, it could be that the provided example...
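To illustrate the decoration-time constraint, a rough sketch (all names are made up, assuming the standard PipelineDecorator API):

```python
from clearml import PipelineDecorator

# The component must exist in the same codebase when the pipeline is defined,
# and its body must be self-contained (imports live inside the function).
@PipelineDecorator.component(return_values=["data"], cache=True)
def load_data(source: str):
    import pandas as pd  # imported inside: the component runs as a standalone task
    return pd.read_csv(source)

# The graph is fixed here, at decoration time: run() wires load_data() directly.
@PipelineDecorator.pipeline(name="demo-pipeline", project="examples", version="0.1")
def run(source: str = "data.csv"):
    data = load_data(source)

if __name__ == "__main__":
    PipelineDecorator.run_locally()  # debug locally; drop this to enqueue on agents
    run()
```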
- in the second scenario, I might not have changed the results of the step, but my refactoring changed the speed considerably, and this is something I measure.
- in the third scenario, I might not have changed the results of the step and my refactoring just cleaned the code; besides that, nothing substantial changed. Thus I do not want a rerun.
Well, I would say then that in the second scenario it's just rerunning the pipeline, and in the third it's not running it at all 🙂
(I ...
Could also be that the use of ./ is the issue? I'm not sure what else I can provide you with, SweetBadger76
I think -
- Creating a pipeline from tasks is useful when you already ran some of these tasks in a given format, and you want to replicate the exact behaviour (ignoring any new code changes for example), while potentially changing some parameters (see the sketch after this list).
- From decorators - when the pipeline logic is very straightforward and you'd like to mostly leverage pipelines for parallel execution of computation graphs
- From functions - as I described earlier :)
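For the task-based flavour, a rough sketch (assuming the usual PipelineController.add_step API; all names are made up):

```python
from clearml import PipelineController

# Each step clones an already-registered task, replicating its exact recorded
# behaviour while optionally overriding some of its parameters.
pipe = PipelineController(name="demo-pipeline", project="examples", version="0.1")
pipe.add_step(
    name="train",
    base_task_project="examples",
    base_task_name="train-model",               # the pre-existing task to clone
    parameter_override={"General/lr": "0.01"},  # change parameters, not code
)
pipe.start()  # enqueue the controller; pipe.start_locally() also works
```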
My current approach with pipelines basically looks like a GH CI/CD YAML config, btw, so I give the user a lot of control over which steps to run, why, and how; the default simply caches all results so as to minimize the number of reruns.
The user can then override and choose exactly what to do (or not do).
So caching results for steps with the same arguments is trivial. Ultimately I would say you can combine the task-based pipeline with a function-based pipeline to achieve such dynamic control as you specified in the first two scenarios.
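As a rough sketch of that combination (function steps plus built-in step caching; names are made up):

```python
from clearml import PipelineController

def preprocess(source: str):
    import pandas as pd  # imports inside: the function becomes a standalone task
    return pd.read_csv(source)

# cache_executed_step=True reuses a previous execution of the step when its
# code and arguments are unchanged, so default runs skip unchanged steps.
pipe = PipelineController(name="demo-pipeline", project="examples", version="0.1")
pipe.add_function_step(
    name="preprocess",
    function=preprocess,
    function_kwargs={"source": "data.csv"},
    function_return=["data"],
    cache_executed_step=True,
)
pipe.start_locally(run_pipeline_steps_locally=True)  # or pipe.start() to enqueue
```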
About the third scenario I'm not sure. If the configuration has changed, shouldn't the relevant steps (the ones where the configuration changed and their dependent steps) be rerun?
In any case, I think if you stay away from the decorators, at the cost of a bi...
Also full disclosure - I'm not part of the ClearML team and have only recently started using pipelines myself, so all of the above is just learnings from my own trials 🙂
I think this is maybe about the credential.helper used
Also, creating from functions allows dynamic pipeline creation without requiring the tasks to pre-exist in ClearML, which is IMO the strongest point to make about it
Heh, my bad, the term "user" is very much ingrained in our internal way of working. You can think of it as basically any technically-inclined person in your team or company.
Indeed the options in the WebUI are too limited for our use case, so we've developed "apps" that take a YAML configuration file and build a matching pipeline.
With that, our users do not need to code directly, and we can offer much more fine control over the pipeline.
As for the imports, what I meant is that I encounter...
I guess it depends on what you'd like to configure.
Since we let the user choose parents, component name, etc - we cannot use the decorators. We also infer required packages at runtime (the autodetection based on import statements fails with a non-trivial namespace) and need to set that to all components, so the decorators do not work for us.
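As a sketch of that last point (hypothetical names; packages is the add_function_step parameter we set explicitly on every step):

```python
from clearml import PipelineController

# Pass an explicit package list per step instead of relying on import-based
# auto-detection, which fails for us with a non-trivial namespace.
required_packages = ["pandas==2.1.*", "scikit-learn>=1.3"]  # inferred at runtime

def train(seed: int = 0):
    import sklearn  # noqa: F401  (imports still live inside the step)

pipe = PipelineController(name="demo-pipeline", project="examples", version="0.1")
pipe.add_function_step(
    name="train",
    function=train,
    packages=required_packages,  # overrides the agent's auto-detection
)
```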
Ah, you meant "free python code" in that sense. Sure, I see that. The repo arguments also exist for functions though.
Sorry for hijacking your thread @<1523704157695905792:profile|VivaciousBadger56>
That's fine as well - the code simply shows the name of the environment variable, not its value, since that's taken directly from the agent listening to the services queue (which is then running the scaler)
I'd like to set up both with and without GPUs. I can use any region, preferably some EU one.
I guess I'll have to rerun the experiment without tags for this?
FWIW, we prefer to set it in the agent's configuration file, then it's all automatic
Different AMI image / installing older Python versions that don't enforce this...
For future reference though, the environment variable should be PIP_USE_PEP517=false
Sure! It looks like this