Examples: query, "exact match", wildcard*, wild?ard, wild*rd
Fuzzy search: cake~ (finds cakes, bake)
Term boost: "red velvet"^4, chocolate^2
Field grouping: tags:(+work -"fun-stuff")
Escaping: Escape characters +-&|!(){}[]^"~*?:\ with \, e.g. \+
Range search: properties.timestamp:[1587729413488 TO *] (inclusive), properties.title:{A TO Z}(excluding A and Z)
Combinations: chocolate AND vanilla, chocolate OR vanilla, (chocolate OR vanilla) NOT "vanilla pudding"
Field search: properties.title:"The Title" AND text
Answered
Hello Channel, I Am Struggling A Lot On An Issue Linked To

Hello channel,

I am struggling a lot on an issue linked to ClearMl agent and AWS Autoscaler .
This issue is very problematic and urgent, please help me out! We need the autoscaler to work to be able to pull and execute our tasks correctly.

Context:
I want to install a poetry environment from a poetry.lock file located at the root of my repository.
I have a bash script running allowing to install poetry and install python 3.9.16 using Pyenv (needed by my poetry env).
The installation goes on, my repository is cloned, poetry is detected and then poetry installation fails directly without any logs (even when using the verbose mode).

Here is the clearml.conf of my AWS autoscaler config and the logs of one of the latest runs (you can find the bash script inside it) + snippet of the error :

Poetry Enabled: Ignoring requested python packages, using repository poetry lock file!
Creating virtualenv alfred in /root/.clearml/venvs-builds/3.9/task_repository/alfred.git/.venv
Using virtualenv: /root/.clearml/venvs-builds/3.9/task_repository/alfred.git/.venv
Installing dependencies from lock file
Finding the necessary packages for the current system
Package operations: 351 installs, 1 update, 1 removal, 22 skipped
failed installing poetry requirements: Command '['poetry', 'install', '-n', '-v']' returned non-zero exit status 1.
Ignoring pip: markers 'python_version >= "3.10"' don't match your environment 
agent {
    vcs_cache.enabled: false

    package_manager: {
          type: poetry,
          poetry_version: "1.4.2",
          poetry_install_extra_args: ["-v"],
     }  
}

sdk {
    development {
         store_code_diff_from_remote: false,
    }

}

Issue and what has been done:

  • I tested all clearml.co nf parameters without success
  • I created an AWS instance, ssh-ed into it and try the installation of poetry and it worked...
  • Last but not least which leads to my full confusion : I setted up SSH access to the AWS autoscaler instances, SSHed to it during runtime, went into the created docker container, waited until the git clone command made effect and poetry installation failed to be able to try myself the poetry install -n -v command. Again, It worked... I am extremely confused because it works on the instance itself created by the autoscaler, but it does not work when running on ClearML. What am I missing??
    Thank you a lot for your time,
    CC @<1523701070390366208:profile|CostlyOstrich36> @<1523701205467926528:profile|AgitatedDove14> @<1523701087100473344:profile|SuccessfulKoala55>
  
  
Posted 2 years ago
Votes Newest

Answers 40


My issue has been resolved going with pip.

  
  
Posted 2 years ago

and are you sure these are the same env vars available when the agent does the same?

  
  
Posted 2 years ago

When the task finally failed, I was kicked of from the container

  
  
Posted 2 years ago

How to make sure that the python version is correct?

  
  
Posted 2 years ago

Yes indeed, but what about the possibility to do the clone/poetry installation ourself in the init bash script of the task?

  
  
Posted 2 years ago

I also did that in the following way:

  • I put a sleep inside the bash script
  • I ssh-ed to the fresh container and did all commands myself (cloning, installation) and again it worked...
  
  
Posted 2 years ago

How do you explain that it works when I ssh-ed into the same AWS container instance from the autoscaler?

  
  
Posted 2 years ago

Using a pyenv virtual env then exporting LOCALPYTHON env var

  
  
Posted 2 years ago

I see it's running inside 3.9, so I assume it's correct

  
  
Posted 2 years ago

@<1556812486840160256:profile|SuccessfulRaven86> , did you install poetry inside the EC2 instance or inside the docker? Basically, where do you put the poetry installation bash script - in the 'init script' section of the autoscaler or on the task's 'setup shell script' in execution tab (This is basically the script that runs inside the docker)

It sounds like you're installing poetry on the ec2 instance itself but the experiment runs inside a docker container

  
  
Posted 2 years ago
149K Views
40 Answers
2 years ago
2 years ago
Tags
Similar posts