Thanks! To clarify, all the agent does is then spawn new nodes to cover the tasks?
Is `Task.create` the way to go here? 🤔
I can also do this via Mongo directly, but I was hoping to skip the K8S interaction there.
I wouldn't mind going the requests route if I could find the API endpoint from the SDK?
Hmmm, what 🙂
Hey @<1537605940121964544:profile|EnthusiasticShrimp49>! You're mostly correct. The Step classes will be predefined (of course developers are encouraged to add/modify as needed), but as in the `DataTransformationStep`, there may be user-defined functions specified. That's not a problem though, I can provide these functions with the `helper_functions` argument.
- The `.add_function_step` is indeed a failing point. I can't really create a task from the notebook because calling `Ta...
The tl;dr is that some of our users like poetry and others prefer pip. Since `pip install git+...` stores the git data, it seems trivial to first try installing with pip, and only fall back to poetry afterwards, since pip would crash in a poetry project because poetry stores its git data elsewhere (in `poetry.lock`).
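A minimal sketch of that fallback idea (all names here are illustrative, not the agent's actual implementation):

```python
import subprocess
from pathlib import Path


def install_dependencies(repo_dir, runner=subprocess.check_call):
    """Try a pip-based install first (covers repos installed via
    `pip install git+...`, where pip keeps the git data itself), and
    only fall back to poetry if pip fails, since poetry stores its git
    references in poetry.lock instead.

    `runner` is injected purely for illustration/testing; this is not
    how clearml-agent actually resolves installers.
    """
    try:
        runner(["pip", "install", "-r", str(Path(repo_dir) / "requirements.txt")])
        return "pip"
    except Exception:
        runner(["poetry", "install"])
        return "poetry"
```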
It's pulled from the remote repository, my best guess is that the uncommitted changes apply only after the environment is set up?
The network is configured correctly 🙂 But the newly spun-up instances need to be set to the same VPC/subnet somehow
You don't even need to set the CLEARML_WORKER_ID, it will automatically assign one based on the machine's name
I have seen this quite frequently as well tbh!
I think now there's the following:
- Resource type = Queue (name), which defines the resource + max instances

And I'm looking for:
- Resource type = "pool" of resources (type + max instances); a pool can be shared among queues
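To make the request concrete, here is a tiny hypothetical sketch of the "shared pool" semantics (these names are made up, not part of the ClearML autoscaler config):

```python
from dataclasses import dataclass


@dataclass
class ResourcePool:
    # Hypothetical: a resource type plus a max-instance cap that can be
    # shared by several queues, instead of each queue owning its own cap.
    instance_type: str
    max_instances: int
    in_use: int = 0

    def try_acquire(self) -> bool:
        """Claim one instance from the pool if the shared cap allows it."""
        if self.in_use < self.max_instances:
            self.in_use += 1
            return True
        return False


# Two queues drawing from one shared pool, rather than each queue
# defining its own resource + max instances:
shared = ResourcePool(instance_type="g4dn.xlarge", max_instances=2)
queues = {"training": shared, "services": shared}
```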
Yeah I will probably end up archiving them for the time being (or deleting if possible?).
Otherwise (regarding the code question), I think it's better if we continue the original thread, as it has a sample code snippet to illustrate what I'm trying to do.
Same result 🙁 This is frustrating, wtf happened :shocked_face_with_exploding_head:
This is also specifically the services queue worker I'm trying to debug 🤔
Internally yes, but in `Task.init` the default argument is a boolean, not an int.
We don't want to close the task, but we have a remote task that spawns more tasks. With this change, subsequent calls to `Task.init` fail because execution goes into the deferred-init clause and fails on `validate_defaults`.
We have a mini default config (if you remember from a previous discussion we had) that actually uses the second form you suggested.
I wrote a small "fixup" script that combines this default with the one generated by `clearml-init`, and it simply does:

```python
def_config = ConfigFactory.parse_file(DEF_CLEARML_CONF, resolve=False)
new_config = ConfigFactory.parse_file(new_config_file, resolve=False)
updated_new_config = ConfigTree.merge_configs(new_config, def_config)
```
It's of course not an MLOps issue so I understand it's not high on the priority list, but it would be kinda cool to just have a simple view presenting the content of `users.get_all` 🙂
Thanks CostlyOstrich36 !
Okay, I'll test it out by trying to downgrade to 4.0.0 and then upgrade to 4.1.2
Just to make sure, the chart_ref is allegroai/clearml right? (for some reason we had clearml/clearml and it seems like it previously worked?)
Nothing I can spot --
```
ClearML results page:
ClearML pipeline page:
Launching the next 2 steps
Launching step [...]
Launching step [...]
Launching step: ...
Parameters:
{...}
Configurations:
{}
Overrides:
{}
Launching step: ...
Parameters:
{...}
Configurations:
{}
Overrides:
{}
ClearML Monitor: GPU monitoring failed getting GPU reading, switching off GPU monitoring
2023-02-21 13:53:48
ClearML Monitor: Could not detect iteration reporting, falling back to itera...
```
TimelyPenguin76 I added `pip install --upgrade clearml-agent` to the `extra_vm_bash_script` for the autoscaler; that should at least guarantee the latest clearml-agent is used on the instance, right?
Ah I see, if the pipeline controller begins in a Task it does not add the tags to it…
I think this is about maybe the credential.helper used
Hm, just a small update - I just verified and it does indeed work on linux:
` import clearml
import dotenv
if name == "main":
dotenv.load_dotenv()
config = clearml.backend_api.Config.load() # Success, parsed with environment variables `
Maybe this is part of the paid version, but it would be cool if each user (in the web UI) could define their own secrets, and a task could then be assigned to some user and use those secrets during boot?
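Purely as a hypothetical sketch of the idea (this is not an existing ClearML feature or config section, just an invented shape for illustration):

```
# Hypothetical config shape, invented for illustration only
users {
  some_user {
    secrets {
      MY_SERVICE_TOKEN: "********"   # defined per user in the web UI
    }
  }
}
# A task assigned to some_user would get these secrets injected at boot.
```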