Using a pyenv virtual env then exporting LOCALPYTHON env var
Yes, the problem is it's still really hidden (the error, I mean)
Yes I take the export statements from my bash script of the task
I guess it makes no sense because of the steps a clearml-agent works...
I also thought about going to pip
mode but not all packages are detected from our poetry.lock file unfortunately so cannot do that.
I literrally connected to it at runtime, and ran poetry install -n
and it worked
I see it's running inside 3.9, so I assume it's correct
I am currently trying with a new dummy repo and I iterate over the dependencies of the pyproject.toml.
You can theoretically do that in the docker init bash script that will be executed before the task is cloned and run
the autoscaler always uses docker mode
CostlyOstrich36 SuccessfulKoala55 I tried with dummy repo. Using Python and stripe packages ONLY in the pyproject.toml
Here is my result (still failing) :
Poetry Enabled: Ignoring requested python packages, using repository poetry lock file!
Creating virtualenv debug in /root/.clearml/venvs-builds/3.9/task_repository/clearmldebug.git/.venv
Using virtualenv: /root/.clearml/venvs-builds/3.9/task_repository/clearmldebug.git/.venv
Installing dependencies from lock file
Finding the necessary packages for the current system
Package operations: 6 installs, 0 updates, 0 removals
failed installing poetry requirements: Command '['poetry', 'install', '-n', '-v']' returned non-zero exit status 1.
Ignoring pip: markers 'python_version >= "3.10"' don't match your environment
The autoscaler just runs it on an AWS instance, inside a docker container - there's no difference from running it yourself inside a docker container - did you try running it inside a docker container as well?
How to make sure that the python version is correct?
Because I was ssh-ing to it before the fail. When poetry fails, it installs everything using PIP
SuccessfulRaven86 , did you install poetry inside the EC2 instance or inside the docker? Basically, where do you put the poetry installation bash script - in the 'init script' section of the autoscaler or on the task's 'setup shell script' in execution tab (This is basically the script that runs inside the docker)
It sounds like you're installing poetry on the ec2 instance itself but the experiment runs inside a docker container
This is really extremely hard to debug. I am thinking to create another repo and iterate on the packages to hopefully find the problem, but it will take ages.
I tried too. I do not have more logs inside the ClearML agent 😞
Yes should be correct. Inside the bash script of the task.
into the same docker container running the task?
When the task finally failed, I was kicked of from the container
SuccessfulRaven86 can you try with -vvv
instead of -v
?
CostlyOstrich36 poetry is installed as part of the bash script of the task.
The init script of the AWS autoscaler only contains three export variables I set.
Is it a bug inside the AWS autoscaler??
Yes indeed, but what about the possibility to do the clone/poetry installation ourself in the init bash script of the task?
How is it still up is the task failed?
SuccessfulKoala55 Do you think it is possible to ask to run docker mode in the aws autoscaler, and add the cloning and installation inside the init bash script of the task?
And I just tried with Python 3.8 (default version of the image) and it still fails.
Poetry Enabled: Ignoring requested python packages, using repository poetry lock file!
Creating virtualenv debug in /root/.clearml/venvs-builds/3.8/task_repository/clearmldebug.git/.venv
Using virtualenv: /root/.clearml/venvs-builds/3.8/task_repository/clearmldebug.git/.venv
2023-04-18 15:03:52
Installing dependencies from lock file
Finding the necessary packages for the current system
Package operations: 6 installs, 0 updates, 0 removals
failed installing poetry requirements: Command '['poetry', 'install', '-n', '-v']' returned non-zero exit status 1.
SuccessfulRaven86 , to make things easier to debug, can you try running the agent locally?