It Seems Like ClearML Agent Does Not Support Argparse Subparsers, Right?

Answered

It seems like ClearML Agent does not support argparse subparsers, right?
https://docs.python.org/3/library/argparse.html#argparse.ArgumentParser.add_subparsers

This is what I get locally:
Namespace(checkpoint=None, checkpoint_every=1000, checkpoint_test_every=1000, command='train', device='cuda', enqueue='default', environment='walker_stand', jit=False, mixed_precision=False, name=None, nvidia_docker=False, preset='rlad.modules.dreamer.presets.dmc.original', render=False, steps=5000000, symbolic_obs=False, test_every=2000, test_steps=1000, track_remote=False, type='dmc')
and this is what I get when executed remotely:
Namespace(checkpoint_every=1000, checkpoint_test_every=1000, command="['train', 'rlad.modules.dreamer.presets.dmc.original', 'dmc', 'walker_stand', '5000000', '--test-steps', '1000', '--test-every', '2000', '--checkpoint-test-every', '1000', '--checkpoint-every', '1000', '--enqueue']", device='cuda', enqueue='default', environment='walker_stand', jit=False, mixed_precision=False, nvidia_docker=False, preset='rlad.modules.dreamer.presets.dmc.original', render=False, steps=5000000, symbolic_obs=False, test_every=2000, test_steps=1000, track_remote=False, type='dmc')
The code looks like this:
` import argparse
from argparse import ArgumentParser


def parser_add_clearml_arguments(parser: ArgumentParser):
    parser.add_argument(
        "--nvidia-docker",
        action="store_true",
        default=False,
        help="Use jit compilation.",
    )

    parser.add_argument(
        "--enqueue",
        nargs="?",
        const="default",
        type=str,
        help="Enqueue experiment in clearml queue.",
    )

    parser.add_argument(
        "--track-remote",
        action="store_true",
        default=False,
        help="",
    )


def parser_add_experiment_arguments(
    parser: ArgumentParser, parent_parser: ArgumentParser
):
    # Sub-parsers
    subparsers = parser.add_subparsers(help="sub-command help", dest="command")

    # Common
    parent_parser.add_argument(
        "--device",
        default="cuda",
        help="The name of the device to run the agent on (e.g. cpu, cuda, cuda:0)",
    )

    parent_parser.add_argument(
        "--name",
        default=None,
        help="The name of the experiment",
    )

    parent_parser.add_argument(
        "--render",
        default=False,
        action="store_true",
        help="Renders the environment in a separate window",
    )

    parent_parser.add_argument(
        "--checkpoint",
        default=None,
        help="",
    )

    parent_parser.add_argument(
        "--symbolic-obs",
        action="store_true",
        default=False,
        help="Learn with images as observation if True or symbolic states otherwise",
    )

    parent_parser.add_argument(
        "--mixed-precision",
        action="store_true",
        default=False,
        help="Use mixed-precision for GPUs with tensor cores.",
    )

    parent_parser.add_argument(
        "--jit",
        action="store_true",
        default=False,
        help="Use jit compilation.",
    )

    # Training
    parser_train = subparsers.add_parser("train", parents=[parent_parser])
    parser_train.add_argument(
        "steps",
        type=int,
        help="Number of environment steps to run",
    )

    parser_train.add_argument(
        "--checkpoint-every",
        default=-1,
        type=int,
        help="",
    )

    parser_train.add_argument(
        "--checkpoint-test-every",
        default=-1,
        type=int,
        help="",
    )

    parser_train.add_argument(
        "--test-every",
        default=-1,
        type=int,
        help="",
    )

    parser_train.add_argument(
        "--test-steps",
        default=0,
        type=int,
        help="",
    )

    # Testing
    parser_test = subparsers.add_parser("test", parents=[parent_parser])
    parser_test.add_argument(
        "steps",
        type=int,
        help="Number of environment steps to run",
    )


parser = argparse.ArgumentParser(description="Run an OpenAI Gym environment.")
parent_parser = argparse.ArgumentParser(add_help=False)
parent_parser.add_argument(
    "preset",
    help="Path to the preset",
)

parent_parser.add_argument(
    "type",
    help="Type of the environment: pybullet, gym or atari",
)

parent_parser.add_argument(
    "environment",
    help="Name of the Atari game (e.g. Pong). For deepmind environments use: dmc:domain:task",
)
parser_add_clearml_arguments(parent_parser)
parser_add_experiment_arguments(parser, parent_parser)

args = parser.parse_args() `

  				
Posted 4 years ago by ReassuredTiger98


Answers 30

Thanks @<1527459125401751552:profile|CloudyArcticwolf80> ! let me see if we can reproduce it

  				
Posted 2 years ago by AgitatedDove14

Yes. Here is some simple code that reproduces the issue:

import argparse
from datetime import datetime

import clearml


def train(args):
    print(f"Running training: {args}")


def test(args):
    print(f"Running testing: {args}")


def parse_args():
    parser = argparse.ArgumentParser()
    subparser = parser.add_subparsers(dest="subparser")

    train_parser = subparser.add_parser("train")
    train_parser.add_argument("project", type=str)
    train_parser.add_argument("queue", type=str)
    train_parser.add_argument("--epochs", type=int)

    test_parser = subparser.add_parser("test")
    test_parser.add_argument("model_path", type=str)
    test_parser.add_argument("--metric", type=str)

    args = parser.parse_args()
    return args


if __name__ == "__main__":
    args = parse_args()
    print(args)

    task = clearml.Task.init(args.project, datetime.now().strftime("%Y%m%d-%H%M%S"))
    task.execute_remotely(queue_name=args.queue)

    if args.subparser == "train":
        train(args)
    elif args.subparser == "test":
        test(args)
    else:
        raise ValueError(f"Invalid args: {args}")

Locally it prints: Namespace(subparser='train', project='my-clearml-project', queue='worker-queue', epochs=2)
In the remote console: Namespace(subparser="['train', 'my-clearml-project', 'worker-queue', '--epochs', '2']", project='my-clearml-project', queue='worker-queue', epochs=2)
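For comparison, the expected value can be checked without ClearML by handing an explicit argument list to parse_args. This is only a minimal sketch that rebuilds the parser from the snippet above; build_parser is a name introduced here for illustration and is not part of the original script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Same layout as the reproduction above: one sub-parser per command.
    parser = argparse.ArgumentParser()
    subparser = parser.add_subparsers(dest="subparser")

    train_parser = subparser.add_parser("train")
    train_parser.add_argument("project", type=str)
    train_parser.add_argument("queue", type=str)
    train_parser.add_argument("--epochs", type=int)

    test_parser = subparser.add_parser("test")
    test_parser.add_argument("model_path", type=str)
    test_parser.add_argument("--metric", type=str)
    return parser


# Plain argparse stores only the chosen sub-command name in `subparser`.
args = build_parser().parse_args(
    ["train", "my-clearml-project", "worker-queue", "--epochs", "2"]
)
print(args.subparser)
```

If the remote console shows anything other than the bare sub-command name for the same arguments, the difference comes from how the value was stored and restored, not from argparse itself.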

  				
Posted 2 years ago by CloudyArcticwolf80

And does the code above reproduce the issue/bug? Because this obviously should not happen.

  				
Posted 2 years ago by AgitatedDove14

When the execution starts locally the args are like:

Namespace(subparser='train', project='test', epochs=0)

then remotely they get converted to:

Namespace(subparser="['train', '--project', 'test', '--epoch', '0']", project='test', epochs=0)

Which is similar to what Tim reported a few messages above.

So when in the code I do something like if args.subparser == "train": ... I get the normal behaviour locally (i.e. True ), but not remotely, because args.subparser is actually that weird string.
I solved it by changing the condition to if "train" in args.subparser , which works in both situations, but it’s not very safe 🙂
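A slightly stricter variant of that workaround could parse the stringified list instead of doing a substring check. This is a sketch only: normalize_subcommand is a hypothetical helper, and it assumes the sub-command name is always the first element of the list, as in the outputs posted in this thread.

```python
import ast


def normalize_subcommand(value: str) -> str:
    """Return the sub-command name, whether or not it arrived as a stringified list."""
    text = value.strip()
    if text.startswith("[") and text.endswith("]"):
        try:
            # literal_eval only accepts Python literals, so it is safe on untrusted text.
            parsed = ast.literal_eval(text)
        except (ValueError, SyntaxError):
            return value
        # Assumption: the first list element is the sub-command name.
        if isinstance(parsed, list) and parsed and isinstance(parsed[0], str):
            return parsed[0]
    return value


print(normalize_subcommand("train"))
print(normalize_subcommand("['train', '--project', 'test', '--epoch', '0']"))
```

The condition then becomes if normalize_subcommand(args.subparser) == "train": ..., which avoids matching a project or path that merely contains the word "train".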

  				
Posted 2 years ago by CloudyArcticwolf80

@<1527459125401751552:profile|CloudyArcticwolf80> what are you seeing in the Args section ?
what exactly is not working ?

  				
Posted 2 years ago by AgitatedDove14

Hi, I am on clearml == 1.9.0 and I am having the same issue.
Is there a recommended workaround or plans to fix it?

  				
Posted 2 years ago by CloudyArcticwolf80

When I passed specific arguments (for example --steps) it ignored them...

I am not sure what you mean by this. It should not ignore anything.

  				
Posted 4 years ago by ReassuredTiger98

The script is intended to be used something like this:
script.py train my_model --steps 10000 --checkpoint-every 10000
or
script.py test my_model --steps 1000

  				
Posted 4 years ago by ReassuredTiger98

When I passed specific arguments (for example --steps) it ignored them...

script.py test blah1 blah2 blah3 42

Is this how it is intended to be used ?

  				
Posted 4 years ago by AgitatedDove14

Good, at least now I know it is not a user-error 😄

  				
Posted 4 years ago by ReassuredTiger98

I can verify the behavior. I think it has to do with the way the subparser was set up.
This was the only way for me to get it to run:
script.py test blah1 blah2 blah3 42
When I passed specific arguments (for example --steps) it ignored them...

  				
Posted 4 years ago by AgitatedDove14

Okay, let me quickly run a test

  				
Posted 4 years ago by AgitatedDove14

Python 3.8.8, clearml 1.0.2

  				
Posted 4 years ago by ReassuredTiger98

Thanks ReassuredTiger98 , yes that makes sense.
What's the python version you are using ?

  				
Posted 4 years ago by AgitatedDove14

And in the WebUI I can see arguments similar to the second print statement's.

  				
Posted 4 years ago by ReassuredTiger98

Here is some code that shows exactly what goes wrong. I do local execution only. It seems not to be related to remote execution as I thought, but more related to clearml.Task:

` args = parser.parse_args()
print(args) # FIRST OUTPUT

command = args.command
enqueue = args.enqueue
track_remote = args.track_remote
preset_name = args.preset
type_name = args.type
environment_name = args.environment
nvidia_docker = args.nvidia_docker

# Initialize ClearML Task
task = (
    Task.init(
        project_name="reinforcement-learning/" + type_name,
        task_name=args.name or preset_name,
        tags=[environment_name],
        output_uri=True,
    )
    if track_remote or enqueue
    else None
)

print(task.get_parameters()) # SECOND OUTPUT `

First output, print(args):
Namespace(checkpoint=None, checkpoint_every=1000, checkpoint_test_every=1000, command='train', device='cuda', enqueue=None, environment='walker_stand', jit=False, mixed_precision=False, name=None, nvidia_docker=False, preset='rlad.modules.dreamer.presets.dmc.original', render=False, steps=5000000, symbolic_obs=False, test_every=2000, test_steps=1000, track_remote=True, type='dmc')
Second output, print(task.get_parameters()):
{'Args/command': "['train', 'rlad.modules.dreamer.presets.dmc.original', 'dmc', 'walker_stand', '5000000', '--test-steps', '1000', '--test-every', '2000', '--checkpoint-test-every', '1000', '--checkpoint-every', '1000', '--track-remote']", 'Args/preset': 'rlad.modules.dreamer.presets.dmc.original', 'Args/type': 'dmc', 'Args/environment': 'walker_stand', 'Args/nvidia_docker': 'False', 'Args/enqueue': '', 'Args/track_remote': 'True', 'Args/device': 'cuda', 'Args/name': '', 'Args/render': 'False', 'Args/checkpoint': '', 'Args/symbolic_obs': 'False', 'Args/mixed_precision': 'False', 'Args/jit': 'False', 'Args/steps': '5000000', 'Args/checkpoint_every': '1000', 'Args/checkpoint_test_every': '1000', 'Args/test_every': '2000', 'Args/test_steps': '1000'}
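To spot which entries differ between the two outputs, something like the following could be used. It is a rough sketch: diff_args is a hypothetical helper (not a ClearML function), and it assumes stored parameters are strings under an "Args/" prefix with None stored as an empty string, which is what the output above suggests.

```python
from argparse import Namespace


def diff_args(args: Namespace, params: dict, section: str = "Args") -> dict:
    """Map argument name -> (local value, stored string) for every mismatch."""
    mismatches = {}
    for name, value in vars(args).items():
        key = f"{section}/{name}"
        if key not in params:
            mismatches[name] = (value, None)
            continue
        # Assumption: parameters are stored as strings, with None shown as ''.
        expected = "" if value is None else str(value)
        if params[key] != expected:
            mismatches[name] = (value, params[key])
    return mismatches


# A few entries copied from the two outputs above.
local_args = Namespace(command="train", steps=5000000, enqueue=None, track_remote=True)
stored = {
    "Args/command": (
        "['train', 'rlad.modules.dreamer.presets.dmc.original', 'dmc', "
        "'walker_stand', '5000000', '--test-steps', '1000', '--test-every', "
        "'2000', '--checkpoint-test-every', '1000', '--checkpoint-every', "
        "'1000', '--track-remote']"
    ),
    "Args/steps": "5000000",
    "Args/enqueue": "",
    "Args/track_remote": "True",
}
print(diff_args(local_args, stored))
```

With these entries only command is reported, which matches what is described in this thread: every other argument round-trips, while the sub-command dest ends up holding the whole argument list.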

  				
Posted 4 years ago by ReassuredTiger98

Just to make sure I understand: running locally creates the Args/command correctly (i.e. execute_remotely creates the correct Args/command), but when the agent actually executes the task on the remote machine, it updates the Args/command back as a list. Is that a correct description?

  				
Posted 4 years ago by AgitatedDove14

` args = parser.parse_args()
print(args) # args PRINTED HERE ON LOCAL

command = args.command
enqueue = args.enqueue
track_remote = args.track_remote
preset_name = args.preset
type_name = args.type
environment_name = args.environment
nvidia_docker = args.nvidia_docker

# Initialize ClearML Task
task = (
    Task.init(
        project_name="reinforcement-learning/" + type_name,
        task_name=args.name or preset_name,
        tags=[environment_name],
        output_uri=True,
    )
    if track_remote or enqueue
    else None
)
# Execute remotely via ClearML
if enqueue is not None:
    task.execute_remotely(queue_name=enqueue, clone=False, exit_process=True) `

  				
Posted 4 years ago by ReassuredTiger98

That seems to be the case. After parsing the args I run task = Task.init(...) and then task.execute_remotely(queue_name=args.enqueue, clone=False, exit_process=True) .

  				
Posted 4 years ago by ReassuredTiger98

if executed remotely...

You mean cloning the local execution, sending to the agent, then when running on the agent the Args/command is updated to a list ?

  				
Posted 4 years ago by AgitatedDove14

What I get for args when I print it locally is not the same as what ClearML extracts from args .

  				
Posted 4 years ago by ReassuredTiger98

If you compare the two outputs I put at the top of this thread, one from local execution and the other from remote execution, it seems like command is different and wrong on remote.

  				
Posted 4 years ago by ReassuredTiger98

With remote_execution it is

command="[...]"

, but on local it is

command='train'

like it is supposed to be.

I'm not sure I follow, could you expand ?

  				
Posted 4 years ago by AgitatedDove14

Ah, it actually is also a string with remote_execution, but still not what it should be.

  				
Posted 4 years ago by ReassuredTiger98

With remote_execution it is command="[...]" , but on local it is command='train' like it is supposed to be.

  				
Posted 4 years ago by ReassuredTiger98

Yes

  				
Posted 4 years ago by ReassuredTiger98

And command is a list instead of a single str

"command list", you mean the command argument ?

  				
Posted 4 years ago by AgitatedDove14

So args that are not specified are not None as intended, but simply do not exist in args . And command is a list instead of a single str.
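Until the missing attributes are sorted out, reading optional values defensively avoids an AttributeError on the remote side. This is a small sketch under the assumption that a missing attribute should be treated like an unset one; get_arg is a hypothetical helper, and the Namespace below only mimics the remote output at the top of the thread, where checkpoint and name are absent.

```python
from argparse import Namespace


def get_arg(args: Namespace, name: str, default=None):
    """Read an argument, falling back to `default` if the attribute is missing."""
    return getattr(args, name, default)


# Mimics the remote Namespace: no `checkpoint` and no `name` attribute.
remote_args = Namespace(command="train", steps=5000000)

print(get_arg(remote_args, "checkpoint"))
print(get_arg(remote_args, "steps"))
```

This only papers over the symptom; it does not change what ends up in the Args section.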

  				
Posted 4 years ago by ReassuredTiger98

Args is similar to what is shown in print(args) when executed remotely.

  				
Posted 4 years ago by ReassuredTiger98

Hi ReassuredTiger98
It's clearml that needs to support subparser, and it does support it.
What are you seeing in the Args section ?
(Notice that in the end all the parsed arguments are stored in the global "args" variable after you call parse_args(); clearml basically takes those variables and puts them into the Args section.)

  				
Posted 4 years ago by AgitatedDove14
