From the traceback (backend_interface/task/task.py, line 178, in __init__), notice it's not Task.init
I'm not sure why internally ClearML tries to initialize a task when get_task is called...
proj_suffix = "" i = 2 while Task.get_project_id(f"{proj_name}{proj_suffix}") is not None: tasks = Task.get_tasks(project_name=f"{proj_name}{proj_suffix}") if not [task for task in tasks if not task.get_archived()]: # Empty project, we can use this one... break proj_suffix = f"_{i}" i += 1
It's a small snippet that ensures identically named projects are still made unique with a running number.
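For completeness, a hypothetical usage of the result (assuming proj_name holds the base project name and the loop above has run; the task name is made up):

task = Task.init(project_name=f"{proj_name}{proj_suffix}", task_name="my-task")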
On an unrelated note, when cloning an experiment via the WebUI, shouldn't the cloned experiment have the original experiment as a parent? It seems to be empty
Unfortunately not, each task defines and constructs its own dataset. I want the cloned task to save that link 🤔
Say I have Task A that works with some dataset (which is not hard-coded, but e.g. defined by the task itself at runtime).
I'd now like to clone Task A and modify some stuff, but still use the same dataset (no need to recreate it, but since it's not hard-coded, I have to maintain a reference somewhere to the dataset ID).
Since the Dataset SDK offers use_current_task, I would have also expected there to be something like dataset.link(task) or task.register_dataset(ds) ...
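In the meantime, a minimal workaround sketch, assuming the dataset ID is stored as a connected parameter so clones carry it along (the parameter name dataset_id and the project/dataset names are made up):

from clearml import Task, Dataset

task = Task.init(project_name="examples", task_name="task-a")  # hypothetical names

# Connected parameters survive cloning and are editable in the WebUI
params = {"dataset_id": ""}
task.connect(params)

if params["dataset_id"]:
    # Clone / re-run: reuse the dataset referenced by the parameter
    dataset = Dataset.get(dataset_id=params["dataset_id"])
    local_copy = dataset.get_local_copy()
else:
    # First run: the task defines its own dataset and records the ID for clones
    dataset = Dataset.create(dataset_name="my-dataset", dataset_project="examples")
    # ... add files, upload, finalize ...
    task.set_parameter("General/dataset_id", dataset.id)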
The only thing I could think of is that the output of pip freeze would be a URL?
e.g. a separate structured user guide with common tips, usability, best practices - https://pandas.pydata.org/pandas-docs/stable/user_guide/index.html
vs the API reference, where each function has its own page, e.g. https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html
I realized it might work too, but I'm looking for a more definitive answer 🙂 Has no one attempted this? 🤔
I'm trying, let's see; our infra person is away on holidays :X Thanks! Uh, which configuration exactly would you like to see? We're running using the Helm charts on K8s, so I don't think I have direct access to the agent configuration or a way to update it separately?
Well the individual tasks do not seem to have the expected environment.
Then the username and password would be visible in the autoscaler task 🙂
But it should work out of the box, since it works like that out of the box regardless of ClearML. The user and personal access token are used as-is and propagate down to submodules, since those are simply another git repository.
I've run further checks on a different machine and it works as well 🤔
FWIW, we prefer to set it in the agent's configuration file, then it's all automatic
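For reference, the relevant keys in the agent's clearml.conf (a sketch; the values are placeholders, and as far as I remember a personal access token goes into git_pass):

agent {
    git_user: "my-git-user"
    git_pass: "my-personal-access-token"
}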
Couldn't the agent just come with the toml library? Kinda easy to load up and check if poetry is present then... 🤔
But yes, it did use poetry correctly, though it would fail in other circumstances
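Something like this is all it would take (a sketch, assuming the check runs against the cloned repository root; tomllib is stdlib from Python 3.11, the older toml package would do the same):

from pathlib import Path
import tomllib

def uses_poetry(repo_root: str) -> bool:
    # Heuristic: does the repo declare a [tool.poetry] section in pyproject.toml?
    pyproject = Path(repo_root) / "pyproject.toml"
    if not pyproject.is_file():
        return False
    with pyproject.open("rb") as f:
        data = tomllib.load(f)
    return "poetry" in data.get("tool", {})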
Let me verify a hypothesis...
As the meme goes, well yes but actually no, since the input path is provided via argparse? I'm also not sure how this would help debug from the WebUI - you can't really see the contents of a zipped file, and the configuration tab is too messy for a configuration as nested as ours. It's best suited as an artifact.
EDIT: Or am I missing something? Point being, when the remote execution begins, the entry point tries to run e.g. python train.py --config_file path/to/local/file.yaml ...
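To illustrate what I mean, a rough sketch of the artifact route (the names are made up; it assumes the YAML is uploaded while running locally and fetched back when the agent executes the task - whether artifacts survive cloning is a separate question):

from argparse import ArgumentParser
from clearml import Task

task = Task.init(project_name="examples", task_name="train")  # hypothetical names

parser = ArgumentParser()
parser.add_argument("--config_file", type=str)
args = parser.parse_args()

if task.running_locally():
    # Attach the local YAML to the task so it is not just a dangling local path
    task.upload_artifact("config_file", artifact_object=args.config_file)
else:
    # On the agent, swap the (non-existent) local path for the task's own copy
    args.config_file = task.artifacts["config_file"].get_local_copy()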
Thought it might be via docker, thanks!
Is there some default Docker image you ship with ClearML that you'd recommend, or can/should we use our own? 🙂
Uhhh, not really unfortunately :white_frowning_face:. I have ~20 tasks running in a single file, and it's quite random if/when this happens. I've just noticed it tends to happen with the shorter tasks
I guess it's mixed. If #340 is resolved, then this initializer task will be a no-op: detach, and init-close new tasks as needed.
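i.e. something along these lines (a sketch of the init/close pattern I have in mind, not necessarily what #340 will end up looking like; the names are made up):

from clearml import Task

for i in range(20):
    # Start a fresh task per unit of work, then close it before the next one
    task = Task.init(project_name="examples", task_name=f"task-{i}",
                     reuse_last_task_id=False)
    # ... do the actual work for this task ...
    task.close()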
I mean, I see these are defined here https://github.com/allegroai/clearml-agent/blob/master/clearml_agent/definitions.py
But I do not see where an EnvironmentConfig.set() is called...
The agent also uses a different clearml.conf, so it should not matter?
FWIW running clearml==1.9.1 with WebApp: 1.9.2-317 • Server: 1.9.2-317 • API: 2.23
Nothing I can spot --
ClearML results page:
ClearML pipeline page:
Launching the next 2 steps
Launching step [...]
Launching step [...]
Launching step: ...
Parameters:
{...}
Configurations:
{}
Overrides:
{}
Launching step: ...
Parameters:
{...}
Configurations:
{}
Overrides:
{}
ClearML Monitor: GPU monitoring failed getting GPU reading, switching off GPU monitoring
2023-02-21 13:53:48
ClearML Monitor: Could not detect iteration reporting, falling back to itera...