Good morning, I'm wondering if someone has any advice/experience configuring clearml-agent to include private packages from AWS Codeartifact? So far I know I have to edit the extra_index_url of the clearml.conf file. AWS Codeartifact requires authentication via tokens that need to be refreshed every 12 hours or so.

My idea is to write a script that will run on a schedule and generate these tokens on the instances running clearml-agent, but maybe there is a better way to do this?
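To sketch the "refresh on a schedule" idea: a small helper that decides when a cached token is due for renewal, so the refresh job runs well before the 12-hour expiry. This is purely illustrative; the function name and safety margin are my own, not part of clearml or the AWS CLI:

```python
import time

# CodeArtifact tokens are valid for a limited time (up to 12 hours).
TOKEN_TTL_SECONDS = 12 * 60 * 60


def token_needs_refresh(issued_at, now=None, safety_margin=15 * 60):
    """Return True if the token should be refreshed before use.

    A safety margin is subtracted so a task starting near the end of the
    token's lifetime does not fail halfway through a pip install.
    """
    if now is None:
        now = time.time()
    return (now - issued_at) >= TOKEN_TTL_SECONDS - safety_margin


# A token issued 11.9 hours ago is inside the safety margin -> refresh
print(token_needs_refresh(0.0, now=11.9 * 3600))  # True
print(token_needs_refresh(0.0, now=6 * 3600))     # False
```

The scheduled job (cron, systemd timer, etc.) would call something like this and re-run the CodeArtifact login command whenever it returns True.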

  
  
Posted 3 years ago

Answers 25


Hi SuperficialGrasshopper36 , that sounds like a good starting point πŸ™‚

  
  
Posted 3 years ago

Thanks AgitatedDove14, I think your suggestion will work if it means the client authenticates itself each time before it attempts to run an experiment.

However, I have another issue right now. I've manually authenticated on the instance running clearml-agent. We use poetry to install packages for a given project. From the logs, the packages are installed correctly and a venv is created. But when it comes to running the initial task, it seems that it doesn't use the venv that was created, because it can't import the required modules:

Installing the current project: ****
Adding venv into cache: /home/ubuntu/.clearml/venvs-builds/3.8
Running task id [4ce5aedc75404225b37eda2d9bd9ad8f]:
[****/dataset_creation]$ /home/ubuntu/.clearml/venvs-builds/3.8/bin/python -u create_dataset.py
Summary - installed python packages:
pip: []
Environment setup completed successfully
Starting Task Execution:
Traceback (most recent call last):
  File "create_dataset.py", line 6, in <module>
    from pymongo.collection import Collection
ModuleNotFoundError: No module named 'pymongo'

  
  
Posted 3 years ago

My bad, I realised that Adding venv into cache: /home/ubuntu/.clearml/venvs-builds/3.8 actually adds the entire repository with the venv inside it, i.e. /home/ubuntu/.clearml/venvs-builds/3.8/task_repository/***.git/.venv . However, the default path that it uses, /home/ubuntu/.clearml/venvs-builds/3.8/bin/python , has a venv without all the required python packages.

  
  
Posted 3 years ago

Is there a way to detect the repository when initialising a task?

SuperficialGrasshopper36 This should have happened automatically when you call Task.init()

  
  
Posted 3 years ago

you can also set the agent.package_manager.extra_index_url , but since this is dynamic,...

You are correct, since this is dynamic there is no need to set the extra_index_url configuration in clearml.conf; the additional bash script will configure pip directly. Makes sense?
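Concretely, the static part of clearml.conf could then look something like this sketch (repository/domain/account values are placeholders, and the layout is illustrative):

```
# clearml.conf (agent section) -- illustrative sketch, values are placeholders
agent {
    # extra_index_url left unset: the login script below rewrites pip's
    # config with a fresh CodeArtifact index URL at each container start
    extra_docker_shell_script: [
        "apt-get install -y awscli",
        "aws codeartifact login --tool pip --repository my-repo --domain my-domain --domain-owner 111122223333"
    ]
}
```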

  
  
Posted 3 years ago

Okay that seems to explain it. Now the question is why it installed it in the wrong place.

  
  
Posted 3 years ago

Hmm, two questions:
1. How come it did not detect the packages when you were running the original task manually?
2. Could it be the poetry manager option is not working correctly?! Can you verify the venv is created with all packages? If so, can you post the full log?

  
  
Posted 3 years ago

SuperficialGrasshopper36 regarding the codeartifact:
I think the easiest will be to have a bash script authenticating with codeartifact via the aws command at the beginning of each docker spin. This can be done by adding it to:
https://github.com/allegroai/clearml-agent/blob/81edd2860fbc09e2a179985d8315ffaba851dcd7/docs/clearml.conf#L136
For example:
extra_docker_shell_script: ["apt-get install -y aws_cli_or_something", "aws cli authenticate me command"]
wdyt?

  
  
Posted 3 years ago

AgitatedDove14
I'm guessing the agent is running in venv mode; it was started using clearml-agent daemon --queue default --detached . Under installed packages I have "No changes logged". I have already set the package manager to poetry in the agent section of the clearml.conf file. During execution it installs all the required packages fine and creates the venv, it just doesn't use it (I can access the cached venv, and when it's activated, the required packages can be imported).

PS: the experiment I am trying to run is a clone of a previous one

cheers,

  
  
Posted 3 years ago

AgitatedDove14 thanks for responding
The initial experiment ran fine; the "Installed Packages" section looks like:` # Python 3.8.5 (default, Jan 27 2021, 15:41:15) [GCC 9.3.0]

boto3 == 1.17.36
botocore == 1.20.36
clearml == 0.17.5
flake8 == 3.9.0
jmespath == 0.10.0
lxml == 4.6.3
matplotlib == 3.3.4
networkx == 2.5
numpy == 1.19.5
pyflakes == 2.3.1
pymongo == 3.11.3
pytest == 5.4.3
scikit_learn == 0.23.2
scipy == 1.6.1
setuptools == 52.0.0
tensorboard == 2.4.1
torch == 1.6.0+cpu
torch_geometric == 1.6.3
torch_optimizer == 0.1.0
tqdm == 4.59.0
wget == 3.2

Detailed import analysis

**************************

IMPORT PACKAGE boto3

clearml.storage: 0

IMPORT PACKAGE botocore

***/tests/mocks/mock_s3_helper.py: 5

IMPORT PACKAGE clearml

***/utils/initialise_clearml_experiment.py: 5

IMPORT PACKAGE flake8

.eggs/flake8-3.8.4-py3.9.egg/flake8/main.py: 2

.eggs/flake8-3.8.4-py3.9.egg/flake8/api/legacy.py: 10,11,12,13

.eggs/flake8-3.8.4-py3.9.egg/flake8/checker.py: 16,17,18,19

.eggs/flake8-3.8.4-py3.9.egg/flake8/formatting/base.py: 8,9

.eggs/flake8-3.8.4-py3.9.egg/flake8/formatting/default.py: 4,7

.eggs/flake8-3.8.4-py3.9.egg/flake8/main/application.py: 17,18,19,23

.eggs/flake8-3.8.4-py3.9.egg/flake8/main/cli.py: 5

.eggs/flake8-3.8.4-py3.9.egg/flake8/main/git.py: 16,17,41

.eggs/flake8-3.8.4-py3.9.egg/flake8/main/mercurial.py: 12,27

.eggs/flake8-3.8.4-py3.9.egg/flake8/main/options.py: 5,6,7

.eggs/flake8-3.8.4-py3.9.egg/flake8/main/setuptools_command.py: 8

.eggs/flake8-3.8.4-py3.9.egg/flake8/main/vcs.py: 7,8,9

.eggs/flake8-3.8.4-py3.9.egg/flake8/options/aggregator.py: 10,11

.eggs/flake8-3.8.4-py3.9.egg/flake8/options/config.py: 8

.eggs/flake8-3.8.4-py3.9.egg/flake8/options/manager.py: 11

.eggs/flake8-3.8.4-py3.9.egg/flake8/plugins/manager.py: 5,6,7

.eggs/flake8-3.8.4-py3.9.egg/flake8/plugins/pyflakes.py: 18

.eggs/flake8-3.8.4-py3.9.egg/flake8/processor.py: 10,11,12,13

.eggs/flake8-3.8.4-py3.9.egg/flake8/statistics.py: 6

.eggs/flake8-3.8.4-py3.9.egg/flake8/style_guide.py: 13,14,15,16,17

.eggs/flake8-3.8.4

... `

  
  
Posted 3 years ago

One question - you can also set the agent.package_manager.extra_index_url , but since this is dynamic, will pip install still add the extra index URL from the pip config file? Or does it have to be set in this agent config variable?

  
  
Posted 3 years ago

Did the shell script route work? I have a similar question.

It's a little more complicated because the index URL is not fixed; it contains a token which is only valid for a maximum of 12 hours. That means the ~/.config/pip/pip.conf file will also need to be updated every 12 hours. Fortunately, this edit is done automatically when you authenticate with AWS CodeArtifact by logging in on the command line.

My current thinking is as follows:

1. Install the awscli: pip install awscli (could also use apt-get install awscli if pip is not already installed in the docker container).
2. Authenticate with AWS CodeArtifact, which should also update ~/.config/pip/pip.conf : aws codeartifact login --tool pip --repository my-repo --domain my-domain --domain-owner 111122223333 . This should work as expected if the correct IAM permissions/roles are attached to the agent and child container.
3. pip installs inside the new container should then have access to the private repository as well as the public one.
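For reference, after a successful login the user-level pip config ends up looking roughly like this. The domain, repo, region and token are placeholders, and the exact URL shape is an assumption on my part; check your own pip.conf after logging in:

```
# ~/.config/pip/pip.conf (illustrative)
[global]
index-url = https://aws:<auth-token>@my-domain-111122223333.d.codeartifact.us-east-1.amazonaws.com/pypi/my-repo/simple/
```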

  
  
Posted 3 years ago

What do you have under the "installed packages" section? Also you can configure the agent to use poetry to restore the environment (instead of pip)

  
  
Posted 3 years ago

TenseOstrich47 you can actually enter this script as part of the extra_docker_shell_script

https://github.com/allegroai/clearml-agent/blob/b196ab57931f3c67efcb561df0c8a2fe7c0e76f9/docs/clearml.conf#L140

This will be executed at the beginning of each Task inside the container, and as long as the execution time is under 12h, you should be fine. wdyt?

  
  
Posted 3 years ago

Are you running the agent in docker mode or venv mode?

  
  
Posted 3 years ago

the poetry.toml looks like this:
[virtualenvs]
in-project = true
Do you think this is the issue?
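For context, that setting controls where poetry creates the virtualenv. Lining it up with the paths from the log earlier in the thread (paths from the log, annotations mine):

```
# poetry.toml
[virtualenvs]
in-project = true
# -> poetry creates the venv inside the repository checkout:
#    /home/ubuntu/.clearml/venvs-builds/3.8/task_repository/***.git/.venv
# while the agent invokes:
#    /home/ubuntu/.clearml/venvs-builds/3.8/bin/python
```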

  
  
Posted 3 years ago

Sounds good to me. Thanks Martin πŸ™‚

  
  
Posted 3 years ago

Can you post the toml file? Maybe the answer is there

  
  
Posted 3 years ago

Awesome, thank you. I will give that a try later this week and update if it worked as expected! May finally solve my private dependencies issue πŸ˜‚

  
  
Posted 3 years ago

Do you know of any related best practice? I've never used CodeArtifact before

  
  
Posted 3 years ago

I'd like to add that I changed to in-project = false , and the venv is still being created in a different place to the one clearml-agent tries to use when running the experiment

  
  
Posted 3 years ago

Not sure, SuccessfulKoala55 , but I think I've run into a bigger problem. I realised that the tasks get created without any link to the original repo, so when I clone a task on clearml-agent it doesn't have all the required packages that I need (including the private ones). Is there a way to detect the repository when initialising a task?

  
  
Posted 3 years ago

Correct:
extra_docker_shell_script: ["apt-get install -y awscli", "aws codeartifact login --tool pip --repository my-repo --domain my-domain --domain-owner 111122223333"]

  
  
Posted 3 years ago

By script, you mean entering these two lines separately as a list for that extra_docker_shell_script argument?

  
  
Posted 3 years ago

1. I confirm, the venv is created with all packages: when I go to the cached location /home/ubuntu/.clearml/venvs-builds/3.8 and activate it, the packages are there. The full log:
` 1617282623837 ip-172-28-11-109:0 INFO task 4ce5aedc75404225b37eda2d9bd9ad8f pulled from 903741980edd416697ca8074f782bdb4 by worker ip-172-28-11-109:0

1617282629221 ip-172-28-11-109:0 DEBUG Current configuration (clearml_agent v0.17.2, location: /tmp/.clearml_agent.xo_7zbpi.cfg):

agent.worker_id = ip-172-28-11-109:0
agent.worker_name = ip-172-28-11-109
agent.force_git_ssh_protocol = true
agent.python_binary =
agent.package_manager.type = poetry
agent.package_manager.pip_version = <20.2
agent.package_manager.system_site_packages = false
agent.package_manager.force_upgrade = false
agent.package_manager.conda_channels.0 = defaults
agent.package_manager.conda_channels.1 = conda-forge
agent.package_manager.conda_channels.2 = pytorch
agent.package_manager.torch_nightly = false
agent.package_manager.extra_index_url.0 = https://***
agent.venvs_dir = /home/ubuntu/.clearml/venvs-builds
agent.venvs_cache.max_entries = 10
agent.venvs_cache.free_space_threshold_gb = 2.0
agent.vcs_cache.enabled = true
agent.vcs_cache.path = /home/ubuntu/.clearml/vcs-cache
agent.venv_update.enabled = false
agent.pip_download_cache.enabled = true
agent.pip_download_cache.path = /home/ubuntu/.clearml/pip-download-cache
agent.translate_ssh = true
agent.reload_config = false
agent.docker_pip_cache = /home/ubuntu/.clearml/pip-cache
agent.docker_apt_cache = /home/ubuntu/.clearml/apt-cache
agent.docker_force_pull = false
agent.default_docker.image = nvidia/cuda:10.1-runtime-ubuntu18.04
agent.enable_task_env = false
agent.default_python = 3.8
agent.cuda_version = 110
agent.cudnn_version = 0
sdk.storage.cache.default_base_dir = ~/.clearml/cache
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.metrics.matplotlib_untitled_history_size = 100
sdk.metrics.images.format = JPEG
sdk.metrics.images.quality = 87
sdk.metrics.images.subsampling = 0
sdk.metrics.tensorboard_single_series_per_graph = false
sdk.network.metrics.file_upload_threads = 4
sdk.network.metrics.file_upload_starvation_warning_sec = 120
sdk.network.iteration.max_retries_on_server_error = 5
sdk.network.iteration.retry_backoff_factor_sec = 10
sdk.aws.s3.key =
sdk.aws.s3.region =
sdk.aws.boto3.pool_connections = 512
sdk.aws.boto3.max_multipart_concurrency = 16
sdk.log.null_log_propagate = false
sdk.log.task_log_buffer_capacity = 66
sdk.log.disable_urllib3_info = true
sdk.development.task_reuse_time_window_in_hours = 72.0
sdk.development.vcs_repo_detect_async = true
sdk.development.store_uncommitted_code_diff = true
sdk.development.support_stopping = true
sdk.development.default_output_uri =
sdk.development.force_analyze_entire_repo = false
sdk.development.suppress_update_message = false
sdk.development.detect_with_pip_freeze = false
sdk.development.worker.report_period_sec = 2
sdk.development.worker.ping_period_sec = 30
sdk.development.worker.log_stdout = true
sdk.development.worker.report_global_mem_used = false
sdk.development.store_uncommitted_code_diff_on_train = true
api.version = 1.5
api.verify_certificate = true
api.default_version = 1.5
api.http.max_req_size = 15728640
api.http.retries.total = 240
api.http.retries.connect = 240
api.http.retries.read = 240
api.http.retries.redirect = 240
api.http.retries.status = 240
api.http.retries.backoff_factor = 1.0
api.http.retries.backoff_max = 120.0
api.http.wait_on_maintenance_forever = true
api.http.pool_maxsize = 512
api.http.pool_connections = 512
api.web_server = http://***
api.api_server = http://***
api.files_server = http://***
api.credentials.access_key = ***
api.host = ***

Executing task id [4ce5aedc75404225b37eda2d9bd9ad8f]:
repository =
branch = master
version_num = e267eca697341b8ccaf84be6f52480198b6e9682
tag =
docker_cmd = None
entry_point = create_dataset.py
working_dir = ***/dataset_creation

Using base prefix '/usr'
New python executable in /home/ubuntu/.clearml/venvs-builds/3.8/bin/python3.8
Also creating executable in /home/ubuntu/.clearml/venvs-builds/3.8/bin/python
Installing setuptools, pip, wheel...

1617282634455 ip-172-28-11-109:0 DEBUG done.

Using cached repository in "/home/ubuntu/.clearml/vcs-***"
Note: switching to 'e267eca697341b8ccaf84be6f52480198b6e9682'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

git switch -c <new-branch-name>

Or undo this operation with:

git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at e267eca Merge pull request #17 from ***
type: git
url:
branch: HEAD
commit: e267eca697341b8ccaf84be6f52480198b6e9682
root: /home/ubuntu/.clearml/venvs-builds/3.8/task_repository/***.git
Applying uncommitted changes

Poetry Enabled: Ignoring requested python packages, using repository poetry lock file!
Creating virtualenv *** in /home/ubuntu/.clearml/venvs-builds/3.8/task_repository/***.git/.venv
Installing dependencies from lock file

1617282639678 ip-172-28-11-109:0 DEBUG
Package operations: 125 installs, 0 updates, 0 removals

β€’ Installing six (1.15.0)

1617282644955 ip-172-28-11-109:0 DEBUG β€’ Installing certifi (2020.12.5)
β€’ Installing chardet (4.0.0)
β€’ Installing jmespath (0.10.0)
β€’ Installing idna (2.10)
β€’ Installing pyasn1 (0.4.8)
β€’ Installing python-dateutil (2.8.1)
β€’ Installing urllib3 (1.26.4)
β€’ Installing botocore (1.20.36)
β€’ Installing cachetools (4.2.1)
β€’ Installing cycler (0.10.0)
β€’ Installing kiwisolver (1.3.1)
β€’ Installing numpy (1.19.5)
β€’ Installing oauthlib (3.1.0)
β€’ Installing pillow (6.2.2)
β€’ Installing pyasn1-modules (0.2.8)
β€’ Installing pytz (2021.1)
β€’ Installing requests (2.25.1)
β€’ Installing pyparsing (2.4.7)
β€’ Installing rsa (4.7.2)
β€’ Installing google-auth (1.28.0)
β€’ Installing joblib (1.0.1)
β€’ Installing matplotlib (3.3.4)
β€’ Installing ply (3.11)
β€’ Installing python-levenshtein (0.12.2)
β€’ Installing requests-oauthlib (1.3.0)
β€’ Installing scipy (1.6.1)
β€’ Installing pandas (1.2.3)
β€’ Installing requests-file (1.5.1)
β€’ Installing soupsieve (2.2.1)
β€’ Installing s3transfer (0.3.6)
β€’ Installing texttable (1.6.3)

1617282650198 ip-172-28-11-109:0 DEBUG β€’ Installing threadpoolctl (2.1.0)
β€’ Installing tzlocal (2.1)
β€’ Installing absl-py (0.12.0)
β€’ Installing attrs (20.3.0)
β€’ Installing beautifulsoup4 (4.9.3)
β€’ Installing boto3 (1.17.36)
β€’ Installing compress-pickle (1.2.0)
β€’ Installing decorator (4.4.2)
β€’ Installing future (0.18.2)
β€’ Installing fuzzyset (0.0.19)
β€’ Installing google-auth-oauthlib (0.4.3)
β€’ Installing hjson (3.0.2)
β€’ Installing html-parser (0.2)
β€’ Installing grpcio (1.32.0)

1617282655480 ip-172-28-11-109:0 DEBUG β€’ Installing lxml (4.6.3)
β€’ Installing markdown (3.3.4)
β€’ Installing orderedmultidict (1.0.1)
β€’ Installing protobuf (3.15.6)
β€’ Installing psycopg2-binary (2.8.6)
β€’ Installing pyrsistent (0.17.3)
β€’ Installing pyyaml (5.4.1)
β€’ Installing rfc5424-logging-handler (1.4.3)
β€’ Installing scikit-learn (0.23.2)
β€’ Installing seaborn (0.10.1)
β€’ Installing tensorboard-plugin-wit (1.8.0)
β€’ Installing tldextract (2.2.3)
β€’ Installing werkzeug (1.0.1)

1617282660768 ip-172-28-11-109:0 DEBUG β€’ Installing astunparse (1.6.3)
β€’ Installing click (7.1.2)
β€’ Installing cython (0.29.14)
β€’ Installing flatbuffers (1.12)
β€’ Installing funcsigs (1.0.2)
β€’ Installing gast (0.3.3)
β€’ Installing google-pasta (0.2.0)
β€’ Installing furl (2.1.0)
β€’ Installing h5py (2.10.0)
β€’ Installing humanfriendly (9.1)
β€’ Installing isodate (0.6.0)
β€’ Installing jsonschema (3.2.0)
β€’ Installing keras-preprocessing (1.1.2)
β€’ Installing llvmlite (0.34.0)
β€’ Installing markupsafe (1.1.1)
β€’ Installing more-itertools (8.7.0)
β€’ Installing natebbcommon (3.1.0)
β€’ Installing networkx (2.5)
β€’ Installing opt-einsum (3.3.0)
β€’ Installing packaging (20.9)
β€’ Installing pathlib2 (2.3.5)
β€’ Installing pluggy (0.13.1)
β€’ Installing psutil (5.8.0)
β€’ Installing py (1.10.0)
β€’ Installing pyjwt (1.7.1)
β€’ Installing regex (2021.3.17)
β€’ Installing sentinels (1.0.0)
β€’ Installing smart-open (4.2.0)
β€’ Installing tensorboard (2.4.1)
β€’ Installing tensorflow-estimator (2.4.0)
β€’ Installing termcolor (1.1.0)
β€’ Installing torch (1.6.0+cpu )
β€’ Installing tqdm (4.59.0)
β€’ Installing typing-extensions (3.7.4.3)
β€’ Installing wcwidth (0.2.5)
β€’ Installing wrapt (1.12.1)

1617282671036 ip-172-28-11-109:0 DEBUG β€’ Installing ase (3.21.1)
β€’ Installing gensim (3.8.3)
β€’ Installing googledrivedownloader (0.4)
β€’ Installing jinja2 (2.11.3)
β€’ Installing mccabe (0.6.1)
β€’ Installing mongomock (3.22.1)
β€’ Installing natebbconnector (2.1.2)
β€’ Installing nltk (3.5)
β€’ Installing numba (0.51.2)
β€’ Installing pycodestyle (2.7.0)
β€’ Installing pyflakes (2.3.1)
β€’ Installing pymongo (3.11.3)
β€’ Installing pytest (5.4.3)
β€’ Installing python-louvain (0.15)
β€’ Installing pytorch-ranger (0.1.1)
β€’ Installing rdflib (5.0.0)
β€’ Installing tensorflow (2.4.1)
β€’ Installing trains (0.16.4)

1617282686331 ip-172-28-11-109:0 DEBUG β€’ Installing clearml (0.17.5)
β€’ Installing flake8 (3.9.0)
β€’ Installing natebblemae (4.1.0)
β€’ Installing pytest-mock (3.5.1)
β€’ Installing torch-cluster (1.5.8 )
β€’ Installing pytest-mongodb (2.2.0)
β€’ Installing torch-geometric (1.6.3)
β€’ Installing torch-sparse (0.6.8 )
β€’ Installing torch-spline-conv (1.2.0 )
β€’ Installing torch-scatter (2.0.5 )
β€’ Installing wget (3.2)
β€’ Installing torch-optimizer (0.1.0)

Installing the current project: *** (2.2.0)
Adding venv into cache: /home/ubuntu/.clearml/venvs-builds/3.8
Running task id [4ce5aedc75404225b37eda2d9bd9ad8f]:
[***/dataset_creation]$ /home/ubuntu/.clearml/venvs-builds/3.8/bin/python -u create_dataset.py
Summary - installed python packages:
pip: []

Environment setup completed successfully

Starting Task Execution:

Traceback (most recent call last):
File "create_dataset.py", line 6, in <module>
from pymongo.collection import Collection
ModuleNotFoundError: No module named 'pymongo' `

  
  
Posted 3 years ago