But since this has come up a lot recently, any updates on #340?
proj_suffix = ""
i = 2
while Task.get_project_id(f"{proj_name}{proj_suffix}") is not None:
    tasks = Task.get_tasks(project_name=f"{proj_name}{proj_suffix}")
    if not [task for task in tasks if not task.get_archived()]:
        # Empty project, we can use this one...
        break
    proj_suffix = f"_{i}"
    i += 1
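The same suffix search can be tried without a server. This is only a minimal sketch: `first_free_name` and its `is_taken` predicate are hypothetical names, with `is_taken` standing in for the "project exists and still has non-archived tasks" check done above with `Task.get_project_id` and `Task.get_tasks`.

```python
from typing import Callable


def first_free_name(base: str, is_taken: Callable[[str], bool], start: int = 2) -> str:
    """Return `base` if it is free, else `base_<n>` for the smallest free n >= start."""
    candidate = base
    i = start
    while is_taken(candidate):
        # Same progression as the snippet above: name, name_2, name_3, ...
        candidate = f"{base}_{i}"
        i += 1
    return candidate


# Usage: pretend these project names already hold non-archived tasks.
existing = {"demo", "demo_2"}
chosen = first_free_name("demo", existing.__contains__)
print(chosen)
```

Keeping the check behind a predicate makes the naming logic testable on its own, separately from any calls to the server.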
And task = Task.init(project_name=conf.get("project_name"), ...) is basically a no-op in remote execution so it does not matter if conf is empty, right?
Maybe they shouldn't be placed under /tmp if they're mission critical, but rather the clearml cache folder?
Another example - trying to validate dataset interactions ends with
` else:
self._created_task = True
dataset_project, parent_project = self._build_hidden_project_name(dataset_project, dataset_name)
task = Task.create(
project_name=dataset_project, task_name=dataset_name, task_type=Task.TaskTypes.data_processing)
if bool(Session.check_min_api_server_version(Dataset.__min_api_version)):
get_or_create_proje...
Right and then for text (file path) use some regex or similar for extraction, and for dictionary simply parse the values?
I would expect the service to actually implicitly inject it into new instances prior to applying the user's extra configuration
Indeed with ~ the .root call ends with an empty string, so it has a slightly different flow
I think -
- Creating a pipeline from tasks is useful when you already ran some of these tasks in a given format, and you want to replicate the exact behaviour (ignoring any new code changes for example), while potentially changing some parameters.
- From decorators - when the pipeline logic is very straightforward and you'd like to mostly leverage pipelines for parallel execution of computation graphs
- From functions - as I described earlier :)
So caching results for steps with the same arguments is trivial. Ultimately I would say you can combine the task-based pipeline with a function-based pipeline to achieve such dynamic control as you specified in the first two scenarios.
About the third scenario I'm not sure. If the configuration has changed, shouldn't the relevant steps (the ones where the configuration changed and their dependent steps) be rerun?
In any case, I think if you stay away from the decorators, at the cost of a bi...
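To make the "same arguments, reuse the result" idea concrete, here is a minimal standalone sketch. It is not ClearML's actual caching mechanism: `cache_key`, `run_step`, and the `code_version` argument are hypothetical names used only for illustration.

```python
import hashlib
import json
from typing import Any, Callable, Dict

# In-memory store of step results, keyed by a hash of the step's inputs.
_results: Dict[str, Any] = {}


def cache_key(step_name: str, kwargs: Dict[str, Any], code_version: str = "") -> str:
    """Hash the step name, its (JSON-serialisable) arguments and a code version tag."""
    payload = json.dumps(
        {"step": step_name, "kwargs": kwargs, "code": code_version},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_step(step_name: str, func: Callable[..., Any], kwargs: Dict[str, Any],
             code_version: str = "") -> Any:
    """Run `func(**kwargs)` only if this exact key has not been seen before."""
    key = cache_key(step_name, kwargs, code_version)
    if key not in _results:
        _results[key] = func(**kwargs)
    return _results[key]


calls = []


def square(x: int) -> int:
    calls.append(x)  # record real executions
    return x * x


run_step("square", square, {"x": 3})                     # executes
run_step("square", square, {"x": 3})                     # reused from the cache
run_step("square", square, {"x": 3}, code_version="v2")  # executes again
```

In this sketch, whether a code change forces a rerun depends only on whether the caller changes `code_version`; leaving it untouched after a pure refactor reuses the earlier result.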
We have a mini default config (if you remember from a previous discussion we had) that actually uses the second form you suggested.
I wrote a small "fixup" script that combines this default with the one generated by clearml-init, and it simply does:
def_config = ConfigFactory.parse_file(DEF_CLEARML_CONF, resolve=False)
new_config = ConfigFactory.parse_file(new_config_file, resolve=False)
updated_new_config = ConfigTree.merge_configs(new_config, def_config)
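For readers without pyhocon at hand, the effect of such a merge can be approximated with plain dictionaries. This is a sketch under assumptions: `deep_merge` is a hypothetical helper, the keys are made up, and it assumes (as I understand `ConfigTree.merge_configs(a, b)`) that values from the second argument win on conflicts.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict where `override` is merged into `base`, recursing into nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Leaf value or type mismatch: the override side wins.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical stand-ins for the generated config and the mini default config.
generated = {"api": {"host": "generated-host", "token": "abc"}, "extra": 1}
defaults = {"api": {"host": "default-host"}, "sdk": {"cache": "/data/cache"}}

updated = deep_merge(generated, defaults)
print(updated)
```

Keys that exist only in the generated config survive, while anything the defaults define overrides the generated value.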
- in the second scenario, I might not have changed the results of the step, but my refactoring changed the speed considerably and this is something I measure.
- in the third scenario, I might not have changed the results of the step and my refactoring just cleaned the code, but besides that, nothing substantial was changed. Thus I do not want a rerun.
Well, I would say then that in the second scenario it's just rerunning the pipeline, and in the third it's not running it at all
(I ...
I don't think there's a PR issue for that yet, at least I haven't created one.
I could have a look at this and maybe make a PR.
Not sure what the recommended flow would be, though
Why not give ClearML read-only access credentials to the repository?
You mean the host is considered the bucket, as I wrote in my earlier message as the root cause?
It's pulled from the remote repository, my best guess is that the uncommitted changes apply only after the environment is set up?
@EnthusiasticShrimp49 It'll still take me some time to find the MVC that generated this, but I do have the ClearML experiment page for it. I was running the thing from ipython, and was trying to create a task from a function:
Alternatively, it would be good to specify both some requirements and auto-detect
@AgitatedDove14 this
I will TIAS, but maybe worthwhile to also mention if it has to be the absolute path or if relative path is fine too!
Well you could start by setting the output_uri to True in Task.init.
If everything is managed with a git repo, does this also mean PRs will have a messy metadata file attached to them?
Heh, my bad, the term "user" is very much ingrained in our internal way of working. You can think of it as basically any technically-inclined person in your team or company.
Indeed the options in the WebUI are too limited for our use case, so we've developed "apps" that take a yaml configuration file and build a matching pipeline.
With that, our users do not need to code directly, and we can offer much more fine control over the pipeline.
As for the imports, what I meant is that I encounter...
No it does not show up. The instance spins up and then does nothing.
Thanks SuccessfulKoala55, I made https://github.com/allegroai/clearml-agent/issues/126 as a suggestion.
Do you have any thoughts on how to expose these... manually?
It does so already for environment variables that are prefixed with CLEARML_, so it would be nice to have some control over that.
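As an illustration of what "some control" could look like, here is a minimal sketch of selecting variables by a configurable list of prefixes. `select_env` is a hypothetical helper, not part of the agent, and the `MYAPP_` prefix is made up.

```python
import os
from typing import Dict, Iterable, Mapping, Optional


def select_env(prefixes: Iterable[str] = ("CLEARML_",),
               environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the variables whose names start with any of the given prefixes."""
    env = os.environ if environ is None else environ
    wanted = tuple(prefixes)  # str.startswith accepts a tuple of prefixes
    return {name: value for name, value in env.items() if name.startswith(wanted)}


# Usage with a fake environment, so the result does not depend on the machine.
fake_env = {
    "CLEARML_API_HOST": "http://localhost:8008",
    "MYAPP_SECRET": "s3cr3t",
    "PATH": "/usr/bin",
}
exposed = select_env(("CLEARML_", "MYAPP_"), environ=fake_env)
print(sorted(exposed))
```

With the default argument only the `CLEARML_` variables are selected; extra prefixes opt further variables in explicitly.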
Actually SuccessfulKoala55 , there is something like that happening behind the scenes.
I have an AWS Autoscaler running on a services queue, so the autoscaler inherits the configuration used by the services agent, right?
Now, when my autoscaler launched new EC2 instances, they used the same fileserver as the one that was defined in the services agent too
I've been answering there as well