Hey @<1523701435869433856:profile|SmugDolphin23>, thanks for the reply! I'm aware of the caching; that's not the issue I'm trying to resolve.
This is related to my other thread, so I'll provide an example there -->
@<1523701827080556544:profile|JuicyFox94> we have it up and running, hurray!
One thing I noticed in the k8s logs is frequent warnings about Python 3.6..? Is the helm chart built with that Python version?
/usr/lib/python3/dist-packages/secretstorage/dhcrypto.py:15: CryptographyDeprecationWarning: Python 3.6 is no longer supported by the Python core team. Therefore, support for it is deprecated in cryptography and will be removed in a future release.
from cryptography.utils import int_...
It actually fails on quite a few of our tasks because of this Python 3.6.
I tried to set a different image (agentk8sglue.defaultContainerImage: "ubuntu:20.04") but that did not change much.
I suspect the culprit is agentk8sglue.image, which is set to tag 1.24-21 of clearml-agent-k8s-base. That image is quite old… Any updates on that?
Thanks! To clarify, all the agent does is then spawn new nodes to cover the tasks?
i.e. It does not process tasks on its own?
I am; it seems like maybe a couple of hours?
Thanks CostlyOstrich36 !
Not that I recall
The deferred_init input argument to Task.init is bool by default, so checking type(deferred_init) == int makes no sense to begin with, and it alters the flow.
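To illustrate why that check never matches a boolean default: in Python, bool is a subclass of int, but an exact type comparison does not follow subclassing. A minimal sketch in plain Python (no ClearML involved; the variable only mirrors the argument name and is a stand-in, not the library's code):

```python
# Stand-in for the bool default described above (not taken from ClearML itself).
deferred_init = False

# An exact type comparison ignores subclassing, so a bool is never "== int" here.
exact_match = type(deferred_init) == int

# isinstance follows subclassing, and bool is a subclass of int.
subclass_match = isinstance(deferred_init, int)

print(exact_match, subclass_match)  # False True
```

So a branch guarded by type(deferred_init) == int is skipped for True/False and only taken when a caller passes an actual integer.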
Thanks AgitatedDove14 , I'll first have to prove viability with the free version :)
I now believe it may be a race condition that's only tangential to ClearML...
I will TIAS, but maybe worthwhile to also mention if it has to be the absolute path or if relative path is fine too!
Yes, exactly! I've added instructions for the users on creating their account and running clearml-init, and then they run the snippet that updates the api and sdk sections.
Or did you mean I can couple a short "mini config" with the package and redirect clearml to use this local one (instead of the one at ~/clearml.conf)?
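For reference, a rough sketch of what that redirect could look like, assuming the CLEARML_CONFIG_FILE environment variable is used to override the default config location and is set before clearml is imported. The helper name and the mini_clearml.conf file name are hypothetical:

```python
import os
from pathlib import Path


def use_bundled_config(config_path: Path) -> str:
    """Point ClearML at config_path via CLEARML_CONFIG_FILE.

    Hypothetical helper: call it before importing clearml so the
    override is already in place when the SDK loads its configuration.
    """
    # Resolving to an absolute path sidesteps any relative-path ambiguity.
    resolved = str(config_path.expanduser().resolve())
    os.environ["CLEARML_CONFIG_FILE"] = resolved
    return resolved


# Hypothetical location of a mini config shipped alongside the package.
bundled = Path("mini_clearml.conf")
active = use_bundled_config(bundled)
print(active)
```

Only after this call would the package do its import of clearml; I have not verified how this interacts with an existing ~/clearml.conf.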
i.e.
ERROR Fetching experiments failed. Reason: Backend timeout (600s)
ERROR Fetching experiments failed. Reason: Invalid project ID
SuccessfulKoala55 This happens with pip >= 22.3 btw.
Another semi-related issue is that I now encounter this kind of error message:
clearml_agent: ERROR: __init__() got an unexpected keyword argument 'types'
My current approach with pipelines basically looks like a GH CICD yaml config btw, so I give the user a lot of control on which steps to run, why, and how, and the default simply caches all results so as to minimize the number of reruns.
The user can then override and choose exactly what to do (or not do).
From the traceback (backend_interface/task/task.py, line 178, in __init__), notice it's not Task.init.
I'm not sure why internally ClearML tries to initialize a task when get_task is called...
proj_suffix = ""
i = 2
while Task.get_project_id(f"{proj_name}{proj_suffix}") is not None:
    tasks = Task.get_tasks(project_name=f"{proj_name}{proj_suffix}")
    if not [task for task in tasks if not task.get_archived()]:
        # Empty project, we can use this one...
        break
    proj_suffix = f"_{i}"
    i += 1
It's a small snippet that ensures identically named projects are still unique'd with a running number.
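The same running-number logic can be sketched as a pure function, so it can be tried without a ClearML server. The function name and the two callables are hypothetical stand-ins for the project lookup and the "has non-archived tasks" check; the dictionary is toy data:

```python
from typing import Callable


def unique_project_name(
    proj_name: str,
    project_exists: Callable[[str], bool],
    has_active_tasks: Callable[[str], bool],
) -> str:
    """Return proj_name, or proj_name_2, proj_name_3, ... for the first
    candidate that either does not exist yet or has no active tasks."""
    suffix = ""
    i = 2
    # Keep counting while the candidate exists AND is still in use.
    while project_exists(f"{proj_name}{suffix}") and has_active_tasks(
        f"{proj_name}{suffix}"
    ):
        suffix = f"_{i}"
        i += 1
    return f"{proj_name}{suffix}"


# Toy registry: project name -> number of non-archived tasks.
projects = {"demo": 3, "demo_2": 1, "demo_3": 0}
name = unique_project_name(
    "demo",
    project_exists=lambda n: n in projects,
    has_active_tasks=lambda n: projects[n] > 0,
)
print(name)  # demo_3
```

Here demo and demo_2 are taken, while demo_3 exists but is empty, so it gets reused.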
On an unrelated note, when cloning an experiment via the WebUI, shouldn't the cloned experiment have the original experiment as a parent? It seems to be empty
Unfortunately not, each task defines and constructs its own dataset. I want the cloned task to keep that link.
Say I have Task A that works with some dataset (which is not hard-coded, but perhaps e.g. self-defined by the task itself).
I'd now like to clone Task A and modify some stuff, but still use the same dataset (no need to recreate it, but since it's not hard-coded, I have to maintain a reference somewhere to the dataset ID).
Since the Dataset SDK offers use_current_task, I would have also expected there to be something like dataset.link(task) or task.register_dataset(ds)...