Hi @<1547028074090991616:profile|ShaggySwan64>
I have to admit I'm not personally familiar with pdm. Could you share some links and help us understand what the value is over pip/poetry/conda?
I want each remote task to execute one instance of the hydra multirun, but I suspect the remote will try to run the full multirun by itself
if config.clearml.remote and task.running_locally():
    task.execute_remotely(
        queue_name=config.clearml.queue_name,
        clone=True,
        exit_process=False
    )
    return
I think this ensures the local execution actually triggers the remote one, so it should be as you expect, no?
Hi BattyLion34
script_a.py generates file test.json in project folder
So let's assume "script_a" generates something and puts it under /tmp/my_data
Then it can create a dataset from the folder /tmp/my_data with Dataset.create() -> Dataset.sync -> Dataset.upload -> Dataset.finalize
See example: https://github.com/alguchg/clearml-demo/blob/main/process_dataset.py
Then "script_b" can get a copy of the dataset using "Dataset.get()", see examp...
Hmmm that is odd... based on the reply "'Task' object has no attribute 'hyperparams'", I would assume the API version is lower than 2.9. But you specifically said you see Session.api_version == 2.9
is that correct?
task._wait_for_repo_detection()
You can use the above, to wait until repository & packages are detected
(If this is something users need, we should probably make it a "public function" )
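For example, a minimal sketch (project/task names are placeholders, assuming the goal is to clone only after detection completes):

from clearml import Task

task = Task.init(project_name="examples", task_name="my experiment")
# ... imports / training code run here ...
task._wait_for_repo_detection()  # blocks until the repository & package list are logged on the Task
cloned_task = Task.clone(source_task=task, name="clone with detected packages")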
Also, for a single parameter you can use:
cloned_task.set_parameter(
    name="Args/artifact_name",
    value="test-artifact",
    description="my help text that will appear in the UI next to the value"
)
This way, you are not overwriting the other parameters, you are adding to them.
(Similar to update_parameters, only for a single parameter)
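For comparison, a minimal sketch of the dict-based form (the extra parameter name is just a placeholder):

cloned_task.update_parameters({
    "Args/artifact_name": "test-artifact",
    "Args/epochs": 20,  # placeholder value; merged with (not replacing) existing parameters
})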
Hmm is this similar to this one https://allegroai-trains.slack.com/archives/CTK20V944/p1597845996171600?thread_ts=1597845996.171600&cid=CTK20V944
JitteryCoyote63 try to add the prefix to the parameter name, e.g. instead of "artifact_name" use "Args/artifact_name"
Any chance @<1578918150261444608:profile|RoundJellyfish71> you can open a GitHub issue so that we can track it? (I think this is indeed a good idea)
Hmm, let me check, the link is definitely there but this is not a valid link
Hi SarcasticSparrow10 ,
So the bad news is that the UI is actually escaping the query, so you cannot search with a regexp from the UI. The good news: you can achieve that from python:
from trains import Task
tasks = Task._query_tasks(task_name='exp.*i1')
Yes, actually the first step would be a toggle button for regexp in the search, the second would be an even more advanced search.
May I suggest you post it on the UI suggestion issue https://github.com/allegroai/trains/issues/81 ?
Actually this should be a flag
the unclear part is how do I sample another point in the optimization space from the optimizer
Just so I'm clear on the issue, do you want multiple machines to access the internals of the optimizer class? Or do you just want a way to understand what the optimizer sampling space is (i.e. the parameters and the options per parameter)?
I have no idea what string reference could be used when steps come from Task?
Oh I see, you are correct, when it comes to Tasks the assumption is you are passing strings (with selectors on the strings, i.e. the curly brackets), but there is no fancy serialization/deserialization as you have with pipelines from decorators / functions. The reason for that is that the Task itself is a standalone, there is no way for the pipeline logic to actually "pull data" from it and "pass" it to the o...
Hi GreasyPenguin14
Could you tell me what the differences are and why we should use ClearML data?
The first difference is in the approach itself, DVC ties the data with the code (i.e. git repo), where we (ClearML - but not just us) actually think data should be abstracted from the Code-Base and become a standalone argument, allowing users to build/execute against different dataset/versions. ClearML Data becomes part of the workflow as it is visible from the UI including the abili...
Hi VexedCat68
So if I understand correctly, the issue is this argument:
parameter_override={
    'Args/dataset_id': '${split_dataset.split_dataset_id}',
    'Args/model_id': '${get_latest_model_id.clearml_model_id}',
},
I think what is missing is telling it this is an artifact:
parameter_override={
    'Args/dataset_id': '${split_dataset.artifacts.split_dataset_id.url}',
    'Args/model_id': '${get_latest_model_id.clearml_model_id}',
},
You can see the example here:
https://clear.ml/docs/latest/docs/ref...
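For context, a minimal sketch of how such an override plugs into a pipeline step (the project, task and step names here are placeholders, not taken from your setup):

from clearml import PipelineController

pipe = PipelineController(name="demo pipeline", project="examples", version="1.0.0")

pipe.add_step(
    name="split_dataset",
    base_task_project="examples",
    base_task_name="split dataset",
)
pipe.add_step(
    name="train",
    parents=["split_dataset"],
    base_task_project="examples",
    base_task_name="train model",
    parameter_override={
        # reference the artifact produced by the previous step
        "Args/dataset_id": "${split_dataset.artifacts.split_dataset_id.url}",
    },
)
pipe.start()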
Can the user be overwritten during task configuration (I don't see such an option in the documentation)?
Hm, not really 😞 this is tied to the security features on top.
That said,
stored them as a k8s secret and they are reused whenever anyone from our ML team starts a new ML model training
Does that mean you are running an agent on the k8s cluster? What exactly is the flow that causes your k8s credentials to be used?
TypeError: __init__() got an unexpected keyword argument 'base_pod_num'
Could you post the entire log?
I want to be able to install the venv on multiple servers and start the "simple" agents on each one of them. You can think of it as some kind of one-off agent for a specific (distributed) hyperparameter search task
ExcitedFish86 Oh if this is the case:
in your clearml.conf:
agent.package_manager.type: conda
agent.package_manager.conda_env_as_base_docker: true
https://github.com/allegroai/clearml-agent/blob/36073ad488fc141353a077a48651ab3fabb3d794/docs/clearml.conf#L60
https://git...
, how do different tasks know which arguments were already dispatched if the arguments are generated at runtime?
A bit on how clearml-agent works (and actually on how clearml itself works).
When running manually (i.e. not executed by an agent), Task.init (and similarly task.connect etc.) will log data on the Task itself (i.e. send arguments/parameters to the server). This includes logging the argparser, for example (and any other part of the automagic or manual connect).
When run...
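For illustration, a minimal sketch of that manual-run logging (project/task names and parameters are placeholders):

from argparse import ArgumentParser
from clearml import Task

task = Task.init(project_name="examples", task_name="manual run")

parser = ArgumentParser()
parser.add_argument("--batch_size", type=int, default=32)
args = parser.parse_args()  # argparse values are logged on the Task automatically

extra_config = {"lr": 0.001}
task.connect(extra_config)  # logged when run manually; values come back from the server when run by an agent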
I still see things being installed when the experiment starts. Why does that happen?
This only means no new venv is created; it basically means installing into the "default" python env (usually whatever is preset inside the docker).
Make sense?
Why would you skip the entire python env setup? Did you turn on the venvs cache? (basically caching the entire venv, even if running inside a container)
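If it helps, venv caching is enabled in the agent's clearml.conf by setting a cache path; a sketch based on the sample conf (values are the defaults there):

agent {
    venvs_cache: {
        # maximum number of cached venvs
        max_entries: 10
        # minimum required free space to allow for a cache entry
        free_space_threshold_gb: 2.0
        # set a path to enable virtual environment caching
        path: ~/.clearml/venvs-cache
    }
}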