Hi Everyone, I’Ve Been Using Clearml For A While Now And I Wanted To Add The Option To Execute My Code Remotely As A Command Line Argument. I Have The Clearml Agents And Queues Set Up, And That Seems To Be Working Correctly (Cloning And Running Experiment

Answered

Hi everyone,
I’ve been using ClearML for a while now and I wanted to add the option to execute my code remotely as a command line argument. I have the ClearML agents and queues set up, and that seems to be working correctly (cloning and running experiments through the ClearML UI and API works).

However, when I run my code with python -m <path>.<to>.<module> --arg1 <arg1_value> --arg2 <arg2_value> ..., the ClearML agent treats the entire command line as the file name, which results in the following error:
clearml_agent: ERROR: [Errno 36] File name too long: '/root/.clearml/venvs-builds/task_repository/<repo_name>.git/-m <path>.<to>.<module> --arg1 <arg1_value> --arg2 <arg2_value> ...'
This is how I create the ClearML task and call execute_remotely:
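```python
from clearml import Task

args = parse_args()  # user-defined argument parsing

task = Task.init(project_name=args.project_name,
                 task_name=args.task_name,
                 output_uri=True,
                 reuse_last_task_id=False)

if args.execute_remotely:
    # Enqueue this task on the given queue and stop the local process
    task.execute_remotely(queue_name=args.remote_queue,
                          clone=False,
                          exit_process=True)

# <training code>
```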
Any ideas of what I could be doing wrong?
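
A note on the error message itself: Errno 36 is ENAMETOOLONG on Linux, which is reported when a single path component is longer than the filesystem allows (commonly 255 bytes). The path in the message suggests that the whole "-m <module> --arg ..." string ended up as one file name under the repository directory. The sketch below only illustrates that limit. The helper name exceeds_name_max and the sample strings are made up for the example and are not part of ClearML.

```python
import os


def exceeds_name_max(component: str, directory: str = ".") -> bool:
    """Return True if `component` is too long to be one file name in `directory`."""
    # PC_NAME_MAX is the per-component name limit of the filesystem (POSIX only).
    limit = os.pathconf(directory, "PC_NAME_MAX")
    return len(os.fsencode(component)) > limit


# Hypothetical stand-ins: a normal script name and a long "-m module --args" string.
script_name = "train.py"
whole_command = "-m pkg.module " + " ".join(f"--arg{i} value{i}" for i in range(100))

print(exceeds_name_max(script_name))
print(exceeds_name_max(whole_command))
```

This would explain why the message is "File name too long" instead of "No such file or directory": the length check fails before the lookup does.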

  				
Posted 2 years ago by SteepDeer88


Answers 11

Latest clearml version I believe (1.4.0)

  				
Posted 2 years ago by SteepDeer88

Hi SteepDeer88,
I wrote this script to try to reproduce the error. I pass more than 50 parameters to it, and so far everything works fine. Could you please give me some more details about your issue so that we can reproduce it?

```python
import argparse

from clearml import Task

'''
COMMAND LINE:
python -m my_script --project_name my_project --task_name my_task --execute_remotely true --remote_queue default --param_1 parameter --param_2 parameter <...etc>
'''

parser = argparse.ArgumentParser()
parser.add_argument("--project_name")
parser.add_argument("--task_name")
parser.add_argument("--execute_remotely")
parser.add_argument("--remote_queue")

# Add 50 extra arguments: --param_1 ... --param_50
for i in range(1, 51):
    arg_name = f"--param_{i}"  # renamed from `str` to avoid shadowing the builtin
    parser.add_argument(arg_name)
args = parser.parse_args()

task = Task.init(project_name=args.project_name,
                 task_name=args.task_name,
                 output_uri=True,
                 reuse_last_task_id=False)

if args.execute_remotely:
    # Enqueue this task on the given queue and stop the local process
    task.execute_remotely(queue_name=args.remote_queue,
                          clone=False,
                          exit_process=True)
```
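
One thing to watch in the script above: --execute_remotely is parsed as a plain string, so any non-empty value, including "false", makes the `if args.execute_remotely:` check pass. A minimal sketch of a boolean flag using argparse's store_true action follows. The build_parser helper, the "gpu" queue name, and the "default" fallback are assumptions for illustration only.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    parser.add_argument("--project_name")
    parser.add_argument("--task_name")
    # Present on the command line -> True, absent -> False.
    parser.add_argument("--execute_remotely", action="store_true")
    # Assumed fallback queue name; adjust to your own setup.
    parser.add_argument("--remote_queue", default="default")
    return parser


# Local run: the flag is simply omitted.
local_args = build_parser().parse_args(["--project_name", "my_project"])
# Remote run: the flag is passed without a value.
remote_args = build_parser().parse_args(["--execute_remotely", "--remote_queue", "gpu"])

print(local_args.execute_remotely, remote_args.execute_remotely)
```

With this variant the command line uses a bare --execute_remotely rather than --execute_remotely true, so the same command works locally once the flag is dropped.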

  				
Posted 2 years ago by SweetBadger76

SteepDeer88 which clearml version are you using?

  				
Posted 2 years ago by TimelyPenguin76 (Administrator)

and how did you run this script? just from the CLI? PyCharm? which OS?

  				
Posted 2 years ago by TimelyPenguin76 (Administrator)

Hi SweetBadger76, I have not been able to deal with this issue yet. I am getting all sorts of weird behaviour, which is likely due to some misconfiguration of my ClearML agents or of the experiments I am trying to run. The latest one is that the ClearML agents are ignoring my --docker flag and running everything on the host machine in a virtual environment. On this, can you clarify something for me: if I clone an experiment, will the configuration of the experiment override that of the agent? For example, if the experiment I am cloning has no docker image and parameters set, will that make the agent ignore the ones I set in clearml.conf?

  				
Posted 2 years ago by SteepDeer88

TimelyPenguin76 Thanks for your suggestion! I’ve considered using that but, from what I understood, it doesn’t offer the same functionality.
For example, when the task is created within the script, it already identifies the branch/commit being used and also includes uncommitted changes. I also have auxiliary functions to guarantee that the experiments go to the correct projects within ClearML according to the script that is being used. It would also mean that the command line I use for test runs on my dev machine would differ significantly from the one used to run on the training machine.

  				
Posted 2 years ago by SteepDeer88

Update on this one: I noticed I had different versions of clearml on my dev machine and on the training machine (host and container). Updating both to the latest version, 1.4.1, caused a different error (related to the other question I posted in the channel), where the agent tries to install the packages from my dev machine (Windows) in the docker container used on the training machine (Ubuntu container, Ubuntu host). The main issue I am trying to get around now is that I use pycocotools, which has a Windows-specific package (pycocotools-windows).
This happens even though I am setting the env var CLEARML_AGENT_SKIP_PIP_VENV_INSTALL=/usr/bin/python
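
For the platform-specific package, one option that may help is to express the dependency with PEP 508 environment markers, so that pip selects pycocotools-windows only on Windows and pycocotools elsewhere. The sketch below just builds such requirement lines. platform_requirements is a hypothetical helper, and whether the agent passes the markers through unchanged in this setup is an assumption that would need checking.

```python
from typing import List


def platform_requirements() -> List[str]:
    """Requirement lines with PEP 508 markers (hypothetical helper)."""
    return [
        # Installed by pip only when running on Windows.
        'pycocotools-windows; sys_platform == "win32"',
        # Installed by pip on every other platform (e.g. the Ubuntu container).
        'pycocotools; sys_platform != "win32"',
    ]


# Text that could be saved as a requirements.txt file.
requirements_text = "\n".join(platform_requirements()) + "\n"
print(requirements_text)
```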

  				
Posted 2 years ago by SteepDeer88

Hey SteepDeer88,
did you manage to get rid of that issue, or do you still need support with it?

  				
Posted 2 years ago by SweetBadger76

TimelyPenguin76 SweetBadger76 thanks for the support!
I ran the script in the terminal (PowerShell) using a command similar to python -m <path>.<to>.<module> --arg1 <arg1_value> --arg2 <arg2_value> ...
I ran it on Windows. The ClearML server is running on Ubuntu.

I will create a minimal program that reproduces the error and get back to you. I will also test on both WSL and Ubuntu to get a better idea of whether it is OS-specific.

  				
Posted 2 years ago by SteepDeer88

Hi SteepDeer88,

"For example, if the experiment I am cloning has no docker image and parameters set, will that make the agent ignore the ones I set in clearml.conf?"

No, the experiment should run in docker mode if the agent was started with the --docker flag.

  				
Posted 2 years ago by CostlyOstrich36

Hi SteepDeer88,

You can use clearml-task ( https://clear.ml/docs/latest/docs/apps/clearml_task ) for this. What do you think?

  				
Posted 2 years ago by TimelyPenguin76 (Administrator)
