y. In the second case you run a script that creates a task with Task.create(), which creates a draft task with execution parameters, output uri etc (Nothing in configuration I assume? Please check)
In the second case I only call Task.create, which specifies docker, repository, commit, script path and so on (but doesn't specify output_uri or tags; those should be set in train.py, when Task.init is called).
Afterwards on the remote machine task is pulled by agent (is it running in docker mode?)
I provide docker and docker_args to Task.create. I believe that means I run it in docker mode, right?
At which point were the init params changed in the second case?
In the second case the init params are not meant to be changed. They are meant to be set during execution of train.py by the clearml agent. But instead they are ignored completely.
So in the first case you just run a task locally. In the second case you run a script that creates a task with Task.create(), which creates a draft task with execution parameters, output uri etc (Nothing in configuration I assume? Please check). This task is then enqueued and the script exits.
Afterwards, on the remote machine, the task is pulled by the agent (is it running in docker mode?), the same code is pulled, and execution begins.
At which point were the init params changed in the second case? From my understanding you just create a script that creates a task with the Task.init() (train.py) from the repository. The code then runs and uses the params in the cloned repo. What am I missing?
Hi CharmingStarfish14 , I think it comes from the way the clearml-agent works, if I understand your issue correctly. When running remotely it uses the values stored on the backend. So for example, if you take a task and clone it, assuming the task uses parameters from the repo and they change, the agent will take the parameters that are logged in the ClearML backend. So for new parameters to take effect you need to clone the task, change the parameters in the cloned task (either in the UI or programmatically) and then enqueue the task.
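The clone, edit, enqueue flow described above could be sketched roughly as follows. This is only a minimal sketch: the task ID, the parameter name General/learning_rate and the queue name are placeholders, and the Task class is passed in as an argument purely so the helper can be tried without a server (in real use you would pass clearml.Task).

```python
def clone_edit_enqueue(task_cls, source_task_id, overrides, queue_name):
    """Clone a task, override parameters on the clone, then enqueue the clone.

    task_cls is expected to behave like clearml.Task (hypothetical helper,
    not part of the ClearML SDK).
    """
    # Fetch the task whose stored settings we want to reuse.
    source = task_cls.get_task(task_id=source_task_id)
    # The clone is a new draft task that copies the source's stored settings.
    cloned = task_cls.clone(source_task=source, name=source.name + " (clone)")
    # Change values on the clone; these are what the agent will read back.
    for name, value in overrides.items():
        cloned.set_parameter(name, value)
    # Hand the edited clone to an agent listening on the queue.
    task_cls.enqueue(cloned, queue_name=queue_name)
    return cloned


# Real usage would look something like this (needs a configured ClearML server):
#   from clearml import Task
#   clone_edit_enqueue(Task, "<task id>", {"General/learning_rate": 0.01}, "default")
```

The point of the sketch is the order of operations: the values are edited on the cloned task before it is enqueued, because that stored copy is what the agent uses.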
What is your use case? I think Pipelines might be beneficial to your use case.
I guess I can simplify it a little.
Basically there are two scenarios:
Scenario 1, on the local machine:
task = Task.init(output_uri="...", tags=["tag"])
Result: everything works. The remote URI is used, tags are set.
Scenario 2, on the local machine:
task = clearml.Task.create(...)
clearml.Task.enqueue(task, queue_name=queue)
Then, on the remote clearml agent, the same code is called:
task = Task.init(output_uri="...", tags=["tag"])
Result: init params are ignored, the remote URI is not set, tags are empty.
Does it make my problem description clearer? I'm not sure if it's a bug or if I'm missing something.
I figured out the problem.
Reason:
If you create a clearml task and put it into a queue, all further Task.init call arguments from the clearml worker will be ignored.
Solution:
enque_task.py
task = clearml.Task.create(...)
task.init(remote_uri=..., tags=...)
clearml.Task.enqueue(task, queue_name=queue)
train.py
task = Task.init(<whatever, all these args will be ignored>)
UPD: it doesn't solve anything 😞
This approach just creates a separate task corresponding to the enque_task.py script. But the task that is being run on the clearml agent still ignores output_uri 😞
CostlyOstrich36 It seems to be a critical bug.
Do you happen to know a support channel that can help with that?
CharmingStarfish14 , maybe SuccessfulKoala55 can assist
@SuccessfulKoala55
So basically my problem was that I couldn't specify output_uri with Task.create.
I ended up just using the CLI tool, clearml-task, which allows specifying output_uri (but not tags, though).
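For illustration, the clearml-task workaround might look something like the sketch below, which only builds the command line and prints it. The project, name, repository, script, queue, image and bucket values are all placeholders, and the flag names are written as I recall them from the CLI help, so treat them as assumptions and confirm with clearml-task --help.

```python
import shlex


def build_clearml_task_command(project, name, repo, script, queue,
                               output_uri, docker=None):
    """Build (but do not run) a clearml-task command line.

    Hypothetical helper; flag names are assumed, verify with `clearml-task --help`.
    """
    cmd = [
        "clearml-task",
        "--project", project,
        "--name", name,
        "--repo", repo,
        "--script", script,
        "--queue", queue,
        # The option this workaround relies on: where outputs get uploaded.
        "--output-uri", output_uri,
    ]
    if docker:
        # Optional container image for agents running in docker mode.
        cmd += ["--docker", docker]
    return cmd


# All values below are placeholders.
cmd = build_clearml_task_command(
    project="examples",
    name="remote train",
    repo="https://example.com/your/repo.git",
    script="train.py",
    queue="default",
    output_uri="s3://my-bucket/models",
    docker="python:3.10",
)
print(shlex.join(cmd))
```

To actually submit the task, the list could be passed to subprocess.run(cmd, check=True) on a machine where clearml is installed and configured.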
Maybe this is something we can add 🙂
@CharmingStarfish14 the explanation is very simple: this is not a bug, but part of how the ClearML SDK and Agent work.
When you run a task locally (or create it), everything that you provide is stored in the task metadata on the server.
When such a task is executed remotely by an agent (after you enqueued it), Task.init() is not ignored, it just does different things: instead of storing all settings to the server, it reads all previously stored settings from the server and applies them to the task object/setup being run. This is part of the concept that allows you to create tasks from code, and then clone them and change their parameters/settings from the UI (or using the API/SDK) before scheduling the cloned tasks for remote execution.
Without this, tasks would be static constructs that always use the same configuration hard-coded in the Task.init() call (or other configurations) and can never be affected externally by the system.
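The behaviour described above can be illustrated with a toy model. This is not ClearML code and not how the SDK is implemented; FakeServer, toy_init and the s3://my-bucket value are made up for the sketch, which only mimics the "store locally, read back remotely" idea.

```python
class FakeServer:
    """Stands in for the backend: maps a task id to its stored settings."""

    def __init__(self):
        self.tasks = {}


def toy_init(server, task_id, running_remotely, **code_settings):
    """Toy stand-in for an init call; returns the settings that take effect."""
    if running_remotely:
        # Remote run: values written in the code are not stored;
        # whatever is already on the server for this task is applied.
        return dict(server.tasks.get(task_id, {}))
    # Local run: values written in the code are stored on the server.
    server.tasks[task_id] = dict(code_settings)
    return dict(code_settings)


server = FakeServer()

# Scenario 1: local run, the arguments in the code are stored and used.
local = toy_init(server, "task-1", False,
                 output_uri="s3://my-bucket", tags=["tag"])

# Scenario 2: a draft task was created without output_uri or tags,
# so a remote run applies only what the server already holds.
server.tasks["task-2"] = {"docker": "python:3.10"}
remote = toy_init(server, "task-2", True,
                  output_uri="s3://my-bucket", tags=["tag"])

print(local)
print(remote)
```

In this model the second scenario from the thread falls out naturally: the draft task never had output_uri or tags stored, so the remote run has nothing to read back for them.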