Yes, I take the export statements from the bash script of my task
I think you should try to manually start such a docker container and try to see what fails in the process. Attaching to an existing one has too many differences already
How to make sure that the python version is correct?
Because I was ssh-ing into it before the failure. When poetry fails, it installs everything using pip
Yes should be correct. Inside the bash script of the task.
I see it's running inside 3.9, so I assume it's correct
Using a pyenv virtual env then exporting LOCALPYTHON env var
but I still had time to go inside the container, export the PATH variables for my poetry and python versions, and run the poetry install command there
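For reference, what I ran manually inside the container was roughly the following (the pyenv/poetry paths below are illustrative placeholders, not my exact ones):

```shell
# Sketch of the manual workaround inside the still-running container:
# put the pyenv python and the poetry binary on PATH, then rerun the install.
# The version number and install locations are assumptions.
export PATH="$HOME/.pyenv/versions/3.9.16/bin:$PATH"
export PATH="$HOME/.local/bin:$PATH"   # where the poetry installer typically puts the binary
poetry install
```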
How do you explain that it works when I ssh-ed into the same AWS container instance from the autoscaler?
When the task finally failed, I was kicked out of the container
It just allows me to have access to poetry and python installed on the container
@<1523701070390366208:profile|CostlyOstrich36> poetry is installed as part of the bash script of the task.
The init script of the AWS autoscaler only contains three export variables I set.
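A minimal sketch of that init script — only `LOCALPYTHON` is the variable I actually named here, the other two are hypothetical examples of the same pattern:

```shell
# Autoscaler init script sketch: three exports, nothing else.
# LOCALPYTHON points at the pyenv interpreter; the path is a placeholder.
export LOCALPYTHON=/root/.pyenv/versions/3.9.16/bin/python
export POETRY_HOME=/root/.local/share/pypoetry   # hypothetical
export PATH="$POETRY_HOME/bin:$PATH"             # hypothetical
```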
@<1523701087100473344:profile|SuccessfulKoala55> Do you think it is possible to ask to run docker mode in the aws autoscaler, and add the cloning and installation inside the init bash script of the task?
How is it still up if the task failed?
the autoscaler always uses docker mode
@<1556812486840160256:profile|SuccessfulRaven86> , did you install poetry inside the EC2 instance or inside the docker? Basically, where do you put the poetry installation bash script - in the 'init script' section of the autoscaler, or in the task's 'setup shell script' in the execution tab (this is the script that runs inside the docker)?
It sounds like you're installing poetry on the ec2 instance itself but the experiment runs inside a docker container
My issue has been resolved going with pip.
Is it a bug inside the AWS autoscaler??
@<1556812486840160256:profile|SuccessfulRaven86> can you try with -vvv
instead of -v
?
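i.e. something along the lines of:

```shell
# Run the dependency install with maximum verbosity
# so the underlying resolver/build error is actually printed
poetry install -vvv
```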
Yes, the problem is it's still really hidden (the error, I mean)
I am currently trying with a new dummy repo and I iterate over the dependencies of the pyproject.toml.
Yes indeed, but what about the possibility to do the clone/poetry installation ourselves in the init bash script of the task?
You can theoretically do that in the docker init bash script that will be executed before the task is cloned and run
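A minimal sketch of such a docker init script, assuming the official poetry installer and a placeholder repo URL/paths:

```shell
# Hypothetical docker init script: install poetry and clone the repo
# before the task itself runs. URL and paths are placeholders.
curl -sSL https://install.python-poetry.org | python3 -
export PATH="$HOME/.local/bin:$PATH"
git clone https://github.com/your-org/your-repo.git /opt/your-repo
cd /opt/your-repo && poetry install
```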
The autoscaler just runs it on an AWS instance, inside a docker container - there's no difference from running it yourself inside a docker container - did you try running it inside a docker container as well?
I tried too. I do not have more logs inside the ClearML agent 😞
@<1556812486840160256:profile|SuccessfulRaven86> , to make things easier to debug, can you try running the agent locally?
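e.g. something like the following (the queue name is a placeholder):

```shell
# Run the agent locally in docker mode, kept in the foreground
# so all output stays visible in the terminal
clearml-agent daemon --queue default --docker --foreground
```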