Hi SuperficialGrasshopper36, that sounds like a good starting point 🙂
Thanks AgitatedDove14, I think your suggestion will work, since it means the client authenticates itself each time before it attempts to run an experiment.
However, I have another issue right now. I've manually authenticated on the instance running clearml-agent. We use poetry
to install packages for a given project. From the logs, the packages are installed correctly and a venv
is created. But when it comes to running the actual task, it seems that the created venv
isn't used, because the required modules can't be imported:
` Installing the current project: ****
Adding venv into cache: /home/ubuntu/.clearml/venvs-builds/3.8
Running task id [4ce5aedc75404225b37eda2d9bd9ad8f]:
[****/dataset_creation]$ /home/ubuntu/.clearml/venvs-builds/3.8/bin/python -u create_dataset.py
Summary - installed python packages:
pip: []
Environment setup completed successfully
Starting Task Execution:
Traceback (most recent call last):
  File "create_dataset.py", line 6, in <module>
    from pymongo.collection import Collection
ModuleNotFoundError: No module named 'pymongo' `
My bad, I realised that Adding venv into cache: /home/ubuntu/.clearml/venvs-builds/3.8
actually caches the entire repository with the venv inside it, i.e. /home/ubuntu/.clearml/venvs-builds/3.8/task_repository/***.git/.venv.
However, the default interpreter path it uses, /home/ubuntu/.clearml/venvs-builds/3.8/bin/python,
is a venv without all the required python packages.
Is there a way to detect the repository when initialising a task?
SuperficialGrasshopper36 This should have happened automatically when you call Task.init()
You can also set the agent.package_manager.extra_index_url, but since this is dynamic,...
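For reference, setting it statically in clearml.conf would look roughly like this (the URL is a hypothetical CodeArtifact index, not a real one):

```
agent {
    package_manager {
        # a CodeArtifact index URL requires a short-lived auth token,
        # which is why a static entry here does not fit this use case
        extra_index_url: ["https://my-domain-111122223333.d.codeartifact.us-east-1.amazonaws.com/pypi/my-repo/simple/"]
    }
}
```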
You are correct, since this is dynamic there is no need to set the extra_index_url
configuration in clearml.conf; the additional bash script will configure pip directly. Make sense?
Okay that seems to explain it. Now the question is why it installed it in the wrong place.
Hmm, two questions:
1. How come it did not detect the packages when you were running the original task manually?
2. Could it be the poetry manager option is not working correctly?! Can you verify the venv is created with all packages? If so, can you post the full log?
SuperficialGrasshopper36 regarding the codeartifact:
I think the easiest approach will be to have a bash script that authenticates against codeartifact with the aws command at the beginning of each docker spin-up. This can be done by adding it to:
https://github.com/allegroai/clearml-agent/blob/81edd2860fbc09e2a179985d8315ffaba851dcd7/docs/clearml.conf#L136
For example: extra_docker_shell_script: ["apt-get install -y aws_cli_or_something", "aws cli authenticate me command"]
wdyt?
AgitatedDove14
I'm guessing the agent is running in venv mode; it was started using clearml-agent daemon --queue default --detached
Under installed packages I have "No changes logged". I have already set the package manager to poetry in the clearml.conf file, in the agent section. In the execution it installs all the required packages fine and creates the venv, it just doesn't use it (I can access the cached venv, and when it's activated, the required packages can be imported)
PS: the experiment I am trying to run is a clone of a previous one
cheers,
AgitatedDove14 thanks for responding
The initial experiment ran fine. The "Installed Packages" section looks like:` # Python 3.8.5 (default, Jan 27 2021, 15:41:15) [GCC 9.3.0]
boto3 == 1.17.36
botocore == 1.20.36
clearml == 0.17.5
flake8 == 3.9.0
jmespath == 0.10.0
lxml == 4.6.3
matplotlib == 3.3.4
networkx == 2.5
numpy == 1.19.5
pyflakes == 2.3.1
pymongo == 3.11.3
pytest == 5.4.3
scikit_learn == 0.23.2
scipy == 1.6.1
setuptools == 52.0.0
tensorboard == 2.4.1
torch == 1.6.0+cpu
torch_geometric == 1.6.3
torch_optimizer == 0.1.0
tqdm == 4.59.0
wget == 3.2
Detailed import analysis
**************************
IMPORT PACKAGE boto3
clearml.storage: 0
IMPORT PACKAGE botocore
***/tests/mocks/mock_s3_helper.py: 5
IMPORT PACKAGE clearml
***/utils/initialise_clearml_experiment.py: 5
IMPORT PACKAGE flake8
.eggs/flake8-3.8.4-py3.9.egg/flake8/main.py: 2
.eggs/flake8-3.8.4-py3.9.egg/flake8/api/legacy.py: 10,11,12,13
.eggs/flake8-3.8.4-py3.9.egg/flake8/checker.py: 16,17,18,19
.eggs/flake8-3.8.4-py3.9.egg/flake8/formatting/base.py: 8,9
.eggs/flake8-3.8.4-py3.9.egg/flake8/formatting/default.py: 4,7
.eggs/flake8-3.8.4-py3.9.egg/flake8/main/application.py: 17,18,19,23
.eggs/flake8-3.8.4-py3.9.egg/flake8/main/cli.py: 5
.eggs/flake8-3.8.4-py3.9.egg/flake8/main/git.py: 16,17,41
.eggs/flake8-3.8.4-py3.9.egg/flake8/main/mercurial.py: 12,27
.eggs/flake8-3.8.4-py3.9.egg/flake8/main/options.py: 5,6,7
.eggs/flake8-3.8.4-py3.9.egg/flake8/main/setuptools_command.py: 8
.eggs/flake8-3.8.4-py3.9.egg/flake8/main/vcs.py: 7,8,9
.eggs/flake8-3.8.4-py3.9.egg/flake8/options/aggregator.py: 10,11
.eggs/flake8-3.8.4-py3.9.egg/flake8/options/config.py: 8
.eggs/flake8-3.8.4-py3.9.egg/flake8/options/manager.py: 11
.eggs/flake8-3.8.4-py3.9.egg/flake8/plugins/manager.py: 5,6,7
.eggs/flake8-3.8.4-py3.9.egg/flake8/plugins/pyflakes.py: 18
.eggs/flake8-3.8.4-py3.9.egg/flake8/processor.py: 10,11,12,13
.eggs/flake8-3.8.4-py3.9.egg/flake8/statistics.py: 6
.eggs/flake8-3.8.4-py3.9.egg/flake8/style_guide.py: 13,14,15,16,17
.eggs/flake8-3.8.4
... `
One question - you said you can also set the agent.package_manager.extra_index_url, but since this is dynamic: will pip install still pick up the extra index URL from the pip config file, or does it have to be set in this agent config variable?
Did the shell script route work? I have a similar question.
It's a little more complicated because the index URL is not fixed; it contains a token which is only valid for a max of 12 hours. That means the ~/.config/pip/pip.conf
file will also need to be updated every 12 hours. Fortunately, this file is edited automatically when you authenticate to AWS CodeArtifact by logging in on the command line.
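To illustrate, after running aws codeartifact login --tool pip ... the pip config ends up with something like the following (domain, repo, account id and token are placeholders):

```
[global]
index-url = https://aws:<short-lived-auth-token>@my-domain-111122223333.d.codeartifact.us-east-1.amazonaws.com/pypi/my-repo/simple/
```

Since the token expires after at most 12 hours, this entry has to be regenerated by logging in again.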
My current thinking is as follows:
1. Install the awscli: pip install awscli (could also use apt-get install awscli if pip is not already installed in the docker container)
2. Authenticate to AWS CodeArtifact, which should also update ~/.config/pip/pip.conf: aws codeartifact login --tool pip --repository my-repo --domain my-domain --domain-owner 111122223333. This should work as expected if the correct IAM permissions/roles are attached to the agent and child container.
I would then hope that pip installs inside the new container have access to the private repository as well as the public one.
What do you have under the "installed packages" section? Also you can configure the agent to use poetry to restore the environment (instead of pip)
TenseOstrich47 you can actually enter this script as part of the extra_docker_shell_script
This will be executed at the beginning of each Task inside the container, and as long as the execution time is under 12h, you should be fine. wdyt?
Are you running the agent in docker mode or venv mode?
the poetry.toml looks like this:
`[virtualenvs]
in-project = true`
Do you think this is the issue?
Can you post the toml file? Maybe the answer is there
Awesome, thank you. I will give that a try later this week and update if it worked as expected! May finally solve my private dependencies issue 🙂
Do you know of any related best practice? I've never used CodeArtifact before
I'd like to add that I changed it to in-project = false,
and the venv is still being created in a different place from the one the clearml-agent tries to use when running the experiment
Not sure, SuccessfulKoala55, but I think I've run into a bigger problem. I realised that the tasks get created without any link to the original repo, so when I clone a task on clearml-agent it doesn't have all the required packages that I need (including the private ones). Is there a way to detect the repository when initialising a task?
Correct: extra_docker_shell_script: ["apt-get install -y awscli", "aws codeartifact login --tool pip --repository my-repo --domain my-domain --domain-owner 111122223333"]
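In clearml.conf this sits under the agent section, roughly like so (same two commands, shown in place):

```
agent {
    # executed once at the start of every docker container the agent spins up
    extra_docker_shell_script: [
        "apt-get install -y awscli",
        "aws codeartifact login --tool pip --repository my-repo --domain my-domain --domain-owner 111122223333"
    ]
}
```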
By script, you mean entering these two lines separately as a list for that extra_docker_shell_script
argument?
- I confirm the venv is created with all packages; when I go to the cached location /home/ubuntu/.clearml/venvs-builds/3.8 and activate it, the packages are there. The full log:
` 1617282623837 ip-172-28-11-109:0 INFO task 4ce5aedc75404225b37eda2d9bd9ad8f pulled from 903741980edd416697ca8074f782bdb4 by worker ip-172-28-11-109:0
1617282629221 ip-172-28-11-109:0 DEBUG Current configuration (clearml_agent v0.17.2, location: /tmp/.clearml_agent.xo_7zbpi.cfg):
agent.worker_id = ip-172-28-11-109:0
agent.worker_name = ip-172-28-11-109
agent.force_git_ssh_protocol = true
agent.python_binary =
agent.package_manager.type = poetry
agent.package_manager.pip_version = <20.2
agent.package_manager.system_site_packages = false
agent.package_manager.force_upgrade = false
agent.package_manager.conda_channels.0 = defaults
agent.package_manager.conda_channels.1 = conda-forge
agent.package_manager.conda_channels.2 = pytorch
agent.package_manager.torch_nightly = false
agent.package_manager.extra_index_url.0 = https://***
agent.venvs_dir = /home/ubuntu/.clearml/venvs-builds
agent.venvs_cache.max_entries = 10
agent.venvs_cache.free_space_threshold_gb = 2.0
agent.vcs_cache.enabled = true
agent.vcs_cache.path = /home/ubuntu/.clearml/vcs-cache
agent.venv_update.enabled = false
agent.pip_download_cache.enabled = true
agent.pip_download_cache.path = /home/ubuntu/.clearml/pip-download-cache
agent.translate_ssh = true
agent.reload_config = false
agent.docker_pip_cache = /home/ubuntu/.clearml/pip-cache
agent.docker_apt_cache = /home/ubuntu/.clearml/apt-cache
agent.docker_force_pull = false
agent.default_docker.image = nvidia/cuda:10.1-runtime-ubuntu18.04
agent.enable_task_env = false
agent.default_python = 3.8
agent.cuda_version = 110
agent.cudnn_version = 0
sdk.storage.cache.default_base_dir = ~/.clearml/cache
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.metrics.matplotlib_untitled_history_size = 100
sdk.metrics.images.format = JPEG
sdk.metrics.images.quality = 87
sdk.metrics.images.subsampling = 0
sdk.metrics.tensorboard_single_series_per_graph = false
sdk.network.metrics.file_upload_threads = 4
sdk.network.metrics.file_upload_starvation_warning_sec = 120
sdk.network.iteration.max_retries_on_server_error = 5
sdk.network.iteration.retry_backoff_factor_sec = 10
sdk.aws.s3.key =
sdk.aws.s3.region =
sdk.aws.boto3.pool_connections = 512
sdk.aws.boto3.max_multipart_concurrency = 16
sdk.log.null_log_propagate = false
sdk.log.task_log_buffer_capacity = 66
sdk.log.disable_urllib3_info = true
sdk.development.task_reuse_time_window_in_hours = 72.0
sdk.development.vcs_repo_detect_async = true
sdk.development.store_uncommitted_code_diff = true
sdk.development.support_stopping = true
sdk.development.default_output_uri =
sdk.development.force_analyze_entire_repo = false
sdk.development.suppress_update_message = false
sdk.development.detect_with_pip_freeze = false
sdk.development.worker.report_period_sec = 2
sdk.development.worker.ping_period_sec = 30
sdk.development.worker.log_stdout = true
sdk.development.worker.report_global_mem_used = false
sdk.development.store_uncommitted_code_diff_on_train = true
api.version = 1.5
api.verify_certificate = true
api.default_version = 1.5
api.http.max_req_size = 15728640
api.http.retries.total = 240
api.http.retries.connect = 240
api.http.retries.read = 240
api.http.retries.redirect = 240
api.http.retries.status = 240
api.http.retries.backoff_factor = 1.0
api.http.retries.backoff_max = 120.0
api.http.wait_on_maintenance_forever = true
api.http.pool_maxsize = 512
api.http.pool_connections = 512
api.web_server = http://***
api.api_server = http://***
api.files_server = http://***
api.credentials.access_key = ***
api.host = ***
Executing task id [4ce5aedc75404225b37eda2d9bd9ad8f]:
repository =
branch = master
version_num = e267eca697341b8ccaf84be6f52480198b6e9682
tag =
docker_cmd = None
entry_point = create_dataset.py
working_dir = ***/dataset_creation
Using base prefix '/usr'
New python executable in /home/ubuntu/.clearml/venvs-builds/3.8/bin/python3.8
Also creating executable in /home/ubuntu/.clearml/venvs-builds/3.8/bin/python
Installing setuptools, pip, wheel...
1617282634455 ip-172-28-11-109:0 DEBUG done.
Using cached repository in "/home/ubuntu/.clearml/vcs-***"
Note: switching to 'e267eca697341b8ccaf84be6f52480198b6e9682'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:
git switch -c <new-branch-name>
Or undo this operation with:
git switch -
Turn off this advice by setting config variable advice.detachedHead to false
HEAD is now at e267eca Merge pull request #17 from ***
type: git
url:
branch: HEAD
commit: e267eca697341b8ccaf84be6f52480198b6e9682
root: /home/ubuntu/.clearml/venvs-builds/3.8/task_repository/***.git
Applying uncommitted changes
Poetry Enabled: Ignoring requested python packages, using repository poetry lock file!
Creating virtualenv *** in /home/ubuntu/.clearml/venvs-builds/3.8/task_repository/***.git/.venv
Installing dependencies from lock file
1617282639678 ip-172-28-11-109:0 DEBUG
Package operations: 125 installs, 0 updates, 0 removals
β’ Installing six (1.15.0)
1617282644955 ip-172-28-11-109:0 DEBUG β’ Installing certifi (2020.12.5)
β’ Installing chardet (4.0.0)
β’ Installing jmespath (0.10.0)
β’ Installing idna (2.10)
β’ Installing pyasn1 (0.4.8)
β’ Installing python-dateutil (2.8.1)
β’ Installing urllib3 (1.26.4)
β’ Installing botocore (1.20.36)
β’ Installing cachetools (4.2.1)
β’ Installing cycler (0.10.0)
β’ Installing kiwisolver (1.3.1)
β’ Installing numpy (1.19.5)
β’ Installing oauthlib (3.1.0)
β’ Installing pillow (6.2.2)
β’ Installing pyasn1-modules (0.2.8)
β’ Installing pytz (2021.1)
β’ Installing requests (2.25.1)
β’ Installing pyparsing (2.4.7)
β’ Installing rsa (4.7.2)
β’ Installing google-auth (1.28.0)
β’ Installing joblib (1.0.1)
β’ Installing matplotlib (3.3.4)
β’ Installing ply (3.11)
β’ Installing python-levenshtein (0.12.2)
β’ Installing requests-oauthlib (1.3.0)
β’ Installing scipy (1.6.1)
β’ Installing pandas (1.2.3)
β’ Installing requests-file (1.5.1)
β’ Installing soupsieve (2.2.1)
β’ Installing s3transfer (0.3.6)
β’ Installing texttable (1.6.3)
1617282650198 ip-172-28-11-109:0 DEBUG β’ Installing threadpoolctl (2.1.0)
β’ Installing tzlocal (2.1)
β’ Installing absl-py (0.12.0)
β’ Installing attrs (20.3.0)
β’ Installing beautifulsoup4 (4.9.3)
β’ Installing boto3 (1.17.36)
β’ Installing compress-pickle (1.2.0)
β’ Installing decorator (4.4.2)
β’ Installing future (0.18.2)
β’ Installing fuzzyset (0.0.19)
β’ Installing google-auth-oauthlib (0.4.3)
β’ Installing hjson (3.0.2)
β’ Installing html-parser (0.2)
β’ Installing grpcio (1.32.0)
1617282655480 ip-172-28-11-109:0 DEBUG β’ Installing lxml (4.6.3)
β’ Installing markdown (3.3.4)
β’ Installing orderedmultidict (1.0.1)
β’ Installing protobuf (3.15.6)
β’ Installing psycopg2-binary (2.8.6)
β’ Installing pyrsistent (0.17.3)
β’ Installing pyyaml (5.4.1)
β’ Installing rfc5424-logging-handler (1.4.3)
β’ Installing scikit-learn (0.23.2)
β’ Installing seaborn (0.10.1)
β’ Installing tensorboard-plugin-wit (1.8.0)
β’ Installing tldextract (2.2.3)
β’ Installing werkzeug (1.0.1)
1617282660768 ip-172-28-11-109:0 DEBUG β’ Installing astunparse (1.6.3)
β’ Installing click (7.1.2)
β’ Installing cython (0.29.14)
β’ Installing flatbuffers (1.12)
β’ Installing funcsigs (1.0.2)
β’ Installing gast (0.3.3)
β’ Installing google-pasta (0.2.0)
β’ Installing furl (2.1.0)
β’ Installing h5py (2.10.0)
β’ Installing humanfriendly (9.1)
β’ Installing isodate (0.6.0)
β’ Installing jsonschema (3.2.0)
β’ Installing keras-preprocessing (1.1.2)
β’ Installing llvmlite (0.34.0)
β’ Installing markupsafe (1.1.1)
β’ Installing more-itertools (8.7.0)
β’ Installing natebbcommon (3.1.0)
β’ Installing networkx (2.5)
β’ Installing opt-einsum (3.3.0)
β’ Installing packaging (20.9)
β’ Installing pathlib2 (2.3.5)
β’ Installing pluggy (0.13.1)
β’ Installing psutil (5.8.0)
β’ Installing py (1.10.0)
β’ Installing pyjwt (1.7.1)
β’ Installing regex (2021.3.17)
β’ Installing sentinels (1.0.0)
β’ Installing smart-open (4.2.0)
β’ Installing tensorboard (2.4.1)
β’ Installing tensorflow-estimator (2.4.0)
β’ Installing termcolor (1.1.0)
β’ Installing torch (1.6.0+cpu )
β’ Installing tqdm (4.59.0)
β’ Installing typing-extensions (3.7.4.3)
β’ Installing wcwidth (0.2.5)
β’ Installing wrapt (1.12.1)
1617282671036 ip-172-28-11-109:0 DEBUG β’ Installing ase (3.21.1)
β’ Installing gensim (3.8.3)
β’ Installing googledrivedownloader (0.4)
β’ Installing jinja2 (2.11.3)
β’ Installing mccabe (0.6.1)
β’ Installing mongomock (3.22.1)
β’ Installing natebbconnector (2.1.2)
β’ Installing nltk (3.5)
β’ Installing numba (0.51.2)
β’ Installing pycodestyle (2.7.0)
β’ Installing pyflakes (2.3.1)
β’ Installing pymongo (3.11.3)
β’ Installing pytest (5.4.3)
β’ Installing python-louvain (0.15)
β’ Installing pytorch-ranger (0.1.1)
β’ Installing rdflib (5.0.0)
β’ Installing tensorflow (2.4.1)
β’ Installing trains (0.16.4)
1617282686331 ip-172-28-11-109:0 DEBUG β’ Installing clearml (0.17.5)
β’ Installing flake8 (3.9.0)
β’ Installing natebblemae (4.1.0)
β’ Installing pytest-mock (3.5.1)
β’ Installing torch-cluster (1.5.8 )
β’ Installing pytest-mongodb (2.2.0)
β’ Installing torch-geometric (1.6.3)
β’ Installing torch-sparse (0.6.8 )
β’ Installing torch-spline-conv (1.2.0 )
β’ Installing torch-scatter (2.0.5 )
β’ Installing wget (3.2)
β’ Installing torch-optimizer (0.1.0)
Installing the current project: *** (2.2.0)
Adding venv into cache: /home/ubuntu/.clearml/venvs-builds/3.8
Running task id [4ce5aedc75404225b37eda2d9bd9ad8f]:
[***/dataset_creation]$ /home/ubuntu/.clearml/venvs-builds/3.8/bin/python -u create_dataset.py
Summary - installed python packages:
pip: []
Environment setup completed successfully
Starting Task Execution:
Traceback (most recent call last):
File "create_dataset.py", line 6, in <module>
from pymongo.collection import Collection
ModuleNotFoundError: No module named 'pymongo' `