
I'm not sure why internally ClearML tries to initialize a task when `get_task` is called...
From the traceback (`backend_interface/task/task.py, line 178, in __init__`), notice it's not `Task.init`
That's what I thought SuccessfulKoala55, but the server URL is correct (and the WebUI is functional and responsive).
In part of our code, we look for projects with a given name, and pull all tasks in that project. That's the crash point, and it seems to be related to having running tasks in that project.
Because setting env vars and ensuring they exist on the remote machine during execution etc is more complicated 😁
There are always ways around, I was just wondering what is the expected flow 🙂
You don't even need to set the CLEARML_WORKER_ID, it will automatically assign one based on the machine's name
Trying now with 1.4.1, but I believe the changes you're referring to SuccessfulKoala55 were also introduced in 1.4.0, right?
Still crashing, I think that may not be the correct virtual environment to edit 🤔
It's the one created later down the line
(in the current version, that is, we’d very much like to use them obviously :D)
So where should I install the latest clearml version? On the client that's running a task, or on the worker machine?
Thanks David! I appreciate that, it would be very nice to have a consistent pattern in this!
I'm using 1.1.6 (upgraded from 1.1.6rc0) - should I try 1.1.7rc0 or smth?
That still seems to crash SuccessfulKoala55 🤔
EDIT: No, wait, the environment still needs updating. One moment still...
I just ran into this too recently. Are you also passing these in the `extra_clearml_conf` for the autoscaler?
Hey EnthusiasticShrimp49! You’re mostly correct. The `Step` classes will be predefined (of course developers are encouraged to add/modify as needed), but as in the `DataTransformationStep`, there may be user-defined functions specified. That’s not a problem though, I can provide these functions with the `helper_functions` argument.
- The `.add_function_step` is indeed a failing point. I can’t really create a task from the notebook because calling `Ta...
Interesting, why won’t it be possible? Quite easy to get the source code using e.g. `dill`.
Is there a way to specify that flag within the config file, SuccessfulKoala55 ?
Happens pretty much consistently across all our projects:
- Have a project with over 15 tasks (i.e. one that needs the Load More button)
- Click Load More, select a task that's not in the first 15
- Let the page "rest" for a while (a couple of hours)
- Flip back to the page - the task is still active, but you cannot see it in the task list and there is no more Load More button
I think I may have brought this up multiple times in different ways :D
When dealing with long and complicated configurations (whether config objects, yaml, or otherwise), it's often useful to break them down into relevant chunks (think hydra, maybe).
In our case, we have a custom YAML instruction `!include`, i.e.
` # foo.yaml
bar: baz
# bar.yaml
obj: !include foo.yaml
maybe_another_obj: !include foo.yaml `
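For anyone following along, here is a minimal sketch of how such an `!include` tag could be implemented. It assumes PyYAML; `IncludeLoader` and `load_with_includes` are hypothetical names made up for illustration, and this is not necessarily how our actual loader works. Included paths are resolved relative to the file that references them.

```python
import os

import yaml  # assumption: PyYAML is installed


class IncludeLoader(yaml.SafeLoader):
    """SafeLoader that remembers the directory of the file being parsed."""

    def __init__(self, stream):
        # File objects expose .name; fall back to the current directory otherwise.
        self.base_dir = os.path.dirname(getattr(stream, "name", "")) or os.getcwd()
        super().__init__(stream)


def _include(loader, node):
    """Load the file named by the scalar node, relative to the including file."""
    path = os.path.join(loader.base_dir, loader.construct_scalar(node))
    with open(path, "r", encoding="utf-8") as handle:
        # Use the same loader class so nested !include tags also resolve.
        return yaml.load(handle, Loader=IncludeLoader)


IncludeLoader.add_constructor("!include", _include)


def load_with_includes(path):
    """Parse a YAML file, expanding every !include tag into the included content."""
    with open(path, "r", encoding="utf-8") as handle:
        return yaml.load(handle, Loader=IncludeLoader)
```

With the two files above, `load_with_includes("bar.yaml")` would give a dictionary in which both `obj` and `maybe_another_obj` hold the parsed content of `foo.yaml`. The result is a single flat object, so the information about which chunk came from which file is lost after loading.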
I can elaborate in more detail if you have the time, but generally the code is just defined in some source files.
I’ve been trying to play around with pipelines for this purpose, but as suspected, it fails finding the definition for the pickled object…
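To illustrate the failure mode (a sketch using only the standard library, not ClearML itself): `pickle` stores a reference to the defining module and class name rather than the class body, so loading fails wherever that module is not importable. The module name `user_steps` and the class `Scale` are hypothetical stand-ins for code defined in some source file.

```python
import pickle
import sys
import types

# Hypothetical module standing in for a source file that defines the object.
MODULE_NAME = "user_steps"


def make_module():
    """Create and register a throwaway module that defines a small class."""
    module = types.ModuleType(MODULE_NAME)
    source = (
        "class Scale:\n"
        "    def __init__(self, factor):\n"
        "        self.factor = factor\n"
    )
    exec(source, module.__dict__)
    sys.modules[MODULE_NAME] = module
    return module


def unpickle_without_definition():
    """Pickle an instance, drop the defining module, then try to load it."""
    module = make_module()
    payload = pickle.dumps(module.Scale(2))
    # The payload names the module and the class; it does not carry the class body.
    assert MODULE_NAME.encode() in payload and b"Scale" in payload
    del sys.modules[MODULE_NAME]  # simulate a machine without the source file
    try:
        pickle.loads(payload)
    except ImportError as error:  # ModuleNotFoundError is a subclass of ImportError
        return type(error).__name__
    return "loaded"
```

Calling `unpickle_without_definition()` reports the import failure instead of returning the object, which matches what I'm seeing: the remote side would need the defining source available (or the code shipped along with the object) for unpickling to work.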
I'm not sure about the intended use of `connect_configuration` now.
I was under the assumption that in `connect_configuration(configuration, name=None, description=None)`, the `configuration` is only used in local execution.
But when I run `config = task.connect_configuration({}, name='General')` (in remote execution), the configuration is set to the empty dictionary
Ah. Apparently getting a task ID while it’s running can cause this behaviour 🤔
AFAIK that's the only way right now (see my comment here - https://clearml.slack.com/archives/CTK20V944/p1657720159903739?thread_ts=1657699287.630779&cid=CTK20V944 )
Or, if you have the ClearML paid service, I believe there is a "vaults" service, right AgitatedDove14 ?
But it does work on linux 🤔 I'm using it right now and the environment variables are not defined in the terminal, only in the `.env` 🤔
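Roughly what I mean by "only in the `.env`", as a standard-library sketch: the file is read at startup and its entries are copied into the process environment, so nothing has to be exported in the terminal. `parse_dotenv` and `apply_dotenv` are hypothetical helpers written for illustration (libraries such as python-dotenv cover the same ground more thoroughly), and the exact parsing rules here are an assumption.

```python
import os


def parse_dotenv(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key] = value
    return values


def apply_dotenv(text, environ=None, override=False):
    """Copy parsed entries into environ; existing variables win unless override is set."""
    environ = os.environ if environ is None else environ
    for key, value in parse_dotenv(text).items():
        if override or key not in environ:
            environ[key] = value
    return environ
```

For example, `apply_dotenv(open(".env").read())` before creating the task would make the values visible through `os.environ` in that process only. Variables already set in the shell take precedence by default, which is one way a setup can behave differently from one machine to another.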
No it doesn't, the agent has its own clearml.conf file.
I'm not too familiar with clearml on docker, but I do remember there are config options to pass some environment variables to docker.
You can then set your environment variables in any way you'd like before the container starts
Uhhh, not really unfortunately ☹️. I have ~20 tasks happening in a single file, and it's quite random if/when this happens. I just noticed this tends to happen with the shorter tasks
Now, the original pyhocon does support include statements as you mentioned - https://github.com/chimpler/pyhocon