Answered
Trying To Run Aws Autoscaler With

Trying to run AWS autoscaler with poetry queue, and I get:
Traceback (most recent call last):
  File "/root/.local/bin/poetry", line 5, in <module>
    from poetry.console.application import main
ModuleNotFoundError: No module named 'poetry'
I know this is not strictly ClearML related, but I wonder if anyone has had any success?
(The command the agent is trying to run is poetry run python -u train.py )
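
For anyone hitting the same traceback: a launcher script such as /root/.local/bin/poetry runs under whichever interpreter its first line (the shebang) names, so one possible cause is that this interpreter cannot import the poetry package. Below is a minimal sketch for checking that, assuming a pip-style launcher script. launcher_interpreter is a hypothetical helper name, and the demo runs against a stand-in file rather than the real launcher.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the interpreter named on a script's shebang line.
launcher_interpreter() {
    head -n 1 "$1" | sed -n 's/^#![[:space:]]*//p'
}

# Demo on a stand-in launcher so nothing on the machine is touched.
demo_dir="$(mktemp -d)"
printf '#!/usr/bin/python3.10\nfrom poetry.console.application import main\n' > "$demo_dir/poetry"
launcher_interpreter "$demo_dir/poetry"
rm -rf "$demo_dir"

# On the instance one would instead run, for example:
#   launcher_interpreter /root/.local/bin/poetry
# and then check that the printed interpreter can import the package:
#   <printed interpreter> -c 'import poetry; print(poetry.__file__)'
```

If the printed interpreter differs from the one poetry was installed with, that mismatch would be consistent with the error above. This is only a diagnostic, not a confirmed fix.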

  
  
Posted one year ago

Answers 30


Ah. In the extra_vm_bash_script of the AWS autoscaler.

  
  
Posted one year ago

Sure SuccessfulKoala55 , and thanks for looking into it.

As an alternative (for now, or in general), we could consider reverting back to pip. The issue we encounter is that we have a monorepo, so frozen requirements should specify relative paths, but pip freeze does not seem to do that, so ClearML also fails in pip mode

  
  
Posted one year ago

Still crashing, I think that may not be the correct virtual environment to edit 🤔
It's the one created later down the line

  
  
Posted one year ago

SuccessfulKoala55 help me out here 🙂
It seems all the changes I make in the AWS autoscaler apply directly to the virtual environment set for the autoscaler, but nothing from that propagates down to the launched instances.
So e.g. the autoscaler environment has poetry installed, but then the instance fails because it does not have it available?

  
  
Posted one year ago

SuccessfulKoala55 no that did not solve the issue 😞

  
  
Posted one year ago

The agent creates a venv in which the script is run, are you sure this venv has access to the python system site packages?

  
  
Posted one year ago

I'll try a hacky workaround with sed -i 's/include-system-site-packages = false/include-system-site-packages = true/g' clearml_agent_venv/pyvenv.cfg and report back.
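
The substitution mentioned here can be tried safely on a throwaway copy first. The sketch below assumes GNU sed (as on the Ubuntu image implied by the apt-get calls in this thread). enable_system_site_packages is a hypothetical helper name, and the pyvenv.cfg contents are a made-up stand-in.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: flip the flag in <venv_dir>/pyvenv.cfg in place.
enable_system_site_packages() {
    sed -i 's/include-system-site-packages = false/include-system-site-packages = true/g' "$1/pyvenv.cfg"
}

# Stand-in venv directory with a minimal pyvenv.cfg (contents are illustrative).
demo_venv="$(mktemp -d)"
printf 'home = /usr/bin\ninclude-system-site-packages = false\nversion = 3.10.6\n' > "$demo_venv/pyvenv.cfg"

enable_system_site_packages "$demo_venv"

# Show the flag line after the edit.
grep '^include-system-site-packages' "$demo_venv/pyvenv.cfg"
rm -rf "$demo_venv"
```

Note that this only changes the venv whose pyvenv.cfg is edited. A venv created later by the agent for the task has its own pyvenv.cfg and is not affected.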

  
  
Posted one year ago

👍

  
  
Posted one year ago

I think it's not there since the main goal was supporting docker mode (and it was missed)

  
  
Posted one year ago

That still seems to crash SuccessfulKoala55 🤔
EDIT: No, wait, the environment still needs updating. One moment still...

  
  
Posted one year ago

I'll try that in a bit (that requires some access control changes). Any idea how I can modify the dynamically created virtualenv?

Poetry Enabled: Ignoring requested python packages, using repository poetry lock file!
The currently activated Python version 3.10.6 is not supported by the project (~3.8.0).
Trying to find and use a compatible version.
Using python3.8 (3.8.16)
Creating virtualenv ... in /root/.clearml/venvs-builds/3.10/task_repository/...git/.venv
Installing dependencies from lock file

  
  
Posted one year ago

SuccessfulKoala55 it does not

  
  
Posted one year ago

Ultimately we're trying to avoid docker in AWS autoscaler (virtualization on top of virtualization seems redundant), and instead we maintain an AMI for a faster boot sequence.
We had no issues when we used pip , but now when trying to work with poetry all these issues came up.
As I understand it, poetry expects a single system-wide installation that is used to create and manage virtual environments. So it would at least be desirable for the poetry installation to be inherited from the system-wide one?

  
  
Posted one year ago

But to be fair, I've also tried with python3.X -m pip install poetry etc. I get the same error.

  
  
Posted one year ago

I meant where is that done?

  
  
Posted one year ago

Or to be clear, the environment installed by the autoscaler under /clearml_agent_venv has poetry installed, and it uses that to set up the environment for the executed task, e.g. in /root/.clearml/venvs-builds/3.10/task_repository/.../.venv , but the latter does not have poetry installed, and so it crashes?

  
  
Posted one year ago

I also tried adding agent.package_manager.system_site_packages = true to ensure these virtual environments have access, btw; still to no avail

  
  
Posted one year ago

And?

  
  
Posted one year ago

Now my extra_vm_bash_script looks like so:
deactivate
apt-get install -y gfortran libopenblas-dev liblapack-dev libpq-dev python-is-python3 python3-pip python3-dev proj-bin libgraphviz-dev graphviz graphviz-dev libgdal-dev
apt-get install software-properties-common -y
add-apt-repository ppa:deadsnakes/ppa -y
apt update
apt install python3.7 python3.8 python3.9 python3.7-distutils python3.8-distutils python3.9-distutils python3.10-distutils python3.7-dev python3.8-dev python3.9-dev python3.10-dev -y
curl -sSL | python3 -
export PATH=\"/root/.local/bin:$PATH\"
poetry --version
sed -i 's/include-system-site-packages = false/include-system-site-packages = true/g' clearml_agent_venv/pyvenv.cfg
git config --system credential.helper \"store --file /root/.git-credentials\"
python3.7 -m pip install virtualenv
python3.8 -m pip install virtualenv
python3.9 -m pip install virtualenv
python3.10 -m pip install virtualenv
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
source clearml_agent_venv/bin/activate

  
  
Posted one year ago

Created this for follow up, SuccessfulKoala55 ; I'm really stumped. Spent the entire day on this 🥹
https://github.com/allegroai/clearml-agent/issues/134

  
  
Posted one year ago

Is there a way to specify that flag within the config file, SuccessfulKoala55 ?

  
  
Posted one year ago

We're not using the docker setup though. The CLI run by the autoscaler is python -m clearml_agent --config-file /root/clearml.conf daemon --queue aws_small , so no docker

  
  
Posted one year ago

I think the agent runs the script inside the machine in a docker container, I would assume this is missing from inside the docker container (and not really required in the vm machine itself)

  
  
Posted one year ago

I've also tried e.g. setting agent.package_manager.priority_packages = ["poetry"] , and/or agent.package_manager.poetry_version = ">1.2.0" , and other flags, but these affect only the main /clearml_agent_venv environment, and not the one actually generated by the clearml-agent when executing the task

  
  
Posted one year ago

I think the default command used to create the venv does not specify --system-site-packages

  
  
Posted one year ago

Let me have a quick look.

  
  
Posted one year ago

If you ssh into that machine and into the venv, can you see if it inherits the system packages?
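
One way to run this check is to read the venv's pyvenv.cfg, where the setting is recorded as include-system-site-packages. A minimal sketch, assuming the venv has a standard pyvenv.cfg. venv_inherits_system_packages is a hypothetical helper name, and the demo uses stand-in directories rather than a real venv.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: succeed if <venv_dir>/pyvenv.cfg enables system site packages.
venv_inherits_system_packages() {
    grep -Eq '^include-system-site-packages[[:space:]]*=[[:space:]]*true' "$1/pyvenv.cfg"
}

# Stand-in venv directories (illustrative contents only).
demo_root="$(mktemp -d)"
mkdir "$demo_root/isolated" "$demo_root/inheriting"
printf 'include-system-site-packages = false\n' > "$demo_root/isolated/pyvenv.cfg"
printf 'include-system-site-packages = true\n' > "$demo_root/inheriting/pyvenv.cfg"

for venv in "$demo_root/isolated" "$demo_root/inheriting"; do
    if venv_inherits_system_packages "$venv"; then
        echo "$(basename "$venv"): inherits system packages"
    else
        echo "$(basename "$venv"): isolated from system packages"
    fi
done
rm -rf "$demo_root"
```

On the instance, the directory to pass would be the task venv the agent created (the .venv path shown in the agent log), not clearml_agent_venv.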

  
  
Posted one year ago

It's possible for the agent, but I'm not sure it's supported by the SDK's cloud driver... If it solves your issue, this might be a good addition

  
  
Posted one year ago

Thanks for the details, UnevenDolphin73 , and sorry for the inconvenience - we'll try to nail this down...

  
  
Posted one year ago

Nothing?

  
  
Posted one year ago