Hello Channel, I Am Struggling A Lot On An Issue Linked To

Answered

Hello channel,

I am struggling a lot on an issue linked to ClearMl agent and AWS Autoscaler .
This issue is very problematic and urgent, please help me out! We need the autoscaler to work to be able to pull and execute our tasks correctly.

Context:
I want to install a poetry environment from a poetry.lock file located at the root of my repository.
I have a bash script running allowing to install poetry and install python 3.9.16 using Pyenv (needed by my poetry env).
The installation goes on, my repository is cloned, poetry is detected and then poetry installation fails directly without any logs (even when using the verbose mode).

Here is the clearml.conf of my AWS autoscaler config and the logs of one of the latest runs (you can find the bash script inside it) + snippet of the error :

Poetry Enabled: Ignoring requested python packages, using repository poetry lock file!
Creating virtualenv alfred in /root/.clearml/venvs-builds/3.9/task_repository/alfred.git/.venv
Using virtualenv: /root/.clearml/venvs-builds/3.9/task_repository/alfred.git/.venv
Installing dependencies from lock file
Finding the necessary packages for the current system
Package operations: 351 installs, 1 update, 1 removal, 22 skipped
failed installing poetry requirements: Command '['poetry', 'install', '-n', '-v']' returned non-zero exit status 1.
Ignoring pip: markers 'python_version >= "3.10"' don't match your environment

agent {
    vcs_cache.enabled: false

    package_manager: {
          type: poetry,
          poetry_version: "1.4.2",
          poetry_install_extra_args: ["-v"],
     }  
}

sdk {
    development {
         store_code_diff_from_remote: false,
    }

}

Issue and what has been done:

I tested all clearml.co nf parameters without success
I created an AWS instance, ssh-ed into it and try the installation of poetry and it worked...
Last but not least which leads to my full confusion : I setted up SSH access to the AWS autoscaler instances, SSHed to it during runtime, went into the created docker container, waited until the git clone command made effect and poetry installation failed to be able to try myself the poetry install -n -v command. Again, It worked... I am extremely confused because it works on the instance itself created by the autoscaler, but it does not work when running on ClearML. What am I missing??
Thank you a lot for your time,
CC @<1523701070390366208:profile|CostlyOstrich36> @<1523701205467926528:profile|AgitatedDove14> @<1523701087100473344:profile|SuccessfulKoala55>

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					SuccessfulRaven86
				
					0
					 × 1

Votes Newest

Answers 40

The autoscaler just runs it on an AWS instance, inside a docker container - there's no difference from running it yourself inside a docker container - did you try running it inside a docker container as well?

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					SuccessfulKoala55
				
					0
					 × 1

Yes same error

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					SuccessfulRaven86
				
					0
					 × 1

@<1556812486840160256:profile|SuccessfulRaven86> , to make things easier to debug, can you try running the agent locally?

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					CostlyOstrich36
				
					0

You can theoretically do that in the docker init bash script that will be executed before the task is cloned and run

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					SuccessfulKoala55
				
					0
					 × 1

And I just tried with Python 3.8 (default version of the image) and it still fails.

Poetry Enabled: Ignoring requested python packages, using repository poetry lock file!
Creating virtualenv debug in /root/.clearml/venvs-builds/3.8/task_repository/clearmldebug.git/.venv
Using virtualenv: /root/.clearml/venvs-builds/3.8/task_repository/clearmldebug.git/.venv
2023-04-18 15:03:52
Installing dependencies from lock file
Finding the necessary packages for the current system
Package operations: 6 installs, 0 updates, 0 removals
failed installing poetry requirements: Command '['poetry', 'install', '-n', '-v']' returned non-zero exit status 1.

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					SuccessfulRaven86
				
					0
					 × 1

@<1523701070390366208:profile|CostlyOstrich36> poetry is installed as part of the bash script of the task.

The init script of the AWS autoscaler only contains three export variables I set.

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					SuccessfulRaven86
				
					0
					 × 1

the autoscaler always uses docker mode

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					SuccessfulKoala55
				
					0
					 × 1

Is it a bug inside the AWS autoscaler??

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					SuccessfulRaven86
				
					0
					 × 1

So the same?

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					SuccessfulKoala55
				
					0
					 × 1

Yes should be correct. Inside the bash script of the task.

  				
Posted 
	2 years ago

					More
				  		
  Report
		
					SuccessfulRaven86
				
					0
					 × 1

Show more results

Write your answer

173K Views

40 Answers

2 years ago