When I run my agent with Poetry as the package manager (this may also happen with pip, but it definitely happens with Poetry), strange things happen when I try to execute a pipeline:

  • First, the agent wants to create or use a directory at the root level, for example /foobar_home if the user that started the agent is foobar. Without that directory, the pipeline won't even start executing tasks.
  • If I create this directory, the task fails with exit code 255 and no further logs: "Starting Task Execution: Process failed, exit code 255".
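
One way to get more than a bare exit code is to run the command by hand and capture both output streams. This is only a minimal sketch: `run_and_report` is a hypothetical helper, not part of clearml-agent, and the demo uses a stand-in child process that exits with 255 rather than the real pipeline.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd and return (exit_code, stdout, stderr) so nothing is lost."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return proc.returncode, proc.stdout, proc.stderr


# Stand-in child process that prints one line and exits with 255,
# mimicking the failing task.
code, out, err = run_and_report(
    [sys.executable, "-c", "import sys; print('started'); sys.exit(255)"]
)
print(f"exit code: {code}")
print(f"stdout: {out!r}")
print(f"stderr: {err!r}")
```

For the real case, the command to pass would be the one the agent log shows, `["poetry", "run", "python", "-u", "main_pipeline.py"]`, run from the root of the checked-out repository.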
  
  
Posted 3 months ago
Answers 15


Package operations: 26 installs, 0 updates, 0 removals

  - Installing attrs (23.2.0)
  - Installing rpds-py (0.19.0)

1721735566405 dnsName:cpu:0:service:a08289ce6f2e47f2afc4e9f4e8540575 DEBUG   - Installing referencing (0.35.1)
  - Installing six (1.16.0)
  - Installing certifi (2024.7.4)
  - Installing charset-normalizer (3.3.2)
  - Installing idna (3.7)
  - Installing jsonschema-specifications (2023.12.1)
  - Installing orderedmultidict (1.0.1)
  - Installing urllib3 (2.2.2)
  - Installing furl (2.1.3)
  - Installing jsonschema (4.23.0)
  - Installing numpy (2.0.1)
  - Installing pillow (10.4.0)
  - Installing pathlib2 (2.3.7.post1)
  - Installing pyjwt (2.8.0)
  - Installing psutil (6.0.0)
  - Installing pyparsing (3.1.2)
  - Installing python-dateutil (2.9.0.post0)
  - Installing pytz (2024.1)
  - Installing pyyaml (6.0.1)
  - Installing requests (2.32.3)
  - Installing tzdata (2024.1)
  - Installing clearml (1.16.2)
  - Installing pandas (2.2.2)
  - Installing psycopg2-binary (2.9.9)

1721735571453 dnsName:cpu:0:service:a08289ce6f2e47f2afc4e9f4e8540575 DEBUG
Installing the current project: clearml_showcase (0.1)
DEBUG:clearml_agent.helper.package.poetry_api:running: Executing: ('poetry', 'show')
DEBUG:urllib3.connectionpool:Resetting dropped connection: ipAddrOfClearMLServer
DEBUG:urllib3.connectionpool:
 "GET /v2.14/tasks.get_all HTTP/1.1" 200 440
DEBUG:urllib3.connectionpool:Resetting dropped connection: ipAddrOfClearMLServer
DEBUG:urllib3.connectionpool:
 "GET /v2.5/tasks.set_requirements HTTP/1.1" 200 306
Running task id [a08289ce6f2e47f2afc4e9f4e8540575]:
[.]$ poetry run python -u main_pipeline.py
Summary - installed python packages:
pip:
- 'attrs==23.2.0 # Classes Without Boilerplate'
- 'certifi==2024.7.4 # Python package for providing Mozilla''...'
- 'charset-normalizer==3.3.2 # The Real First Universal Charset Dete...'
- 'clearml==1.16.2 # ClearML - Auto-Magical Experiment Man...'
- 'furl==2.1.3 # URL manipulation made simple.'
- 'idna==3.7 # Internationalized Domain Names in App...'
- 'jsonschema==4.23.0 # An implementation of JSON Schema vali...'
- 'jsonschema-specifications==2023.12.1 # The JSON Schema meta-schemas and voca...'
- 'numpy==2.0.1 # Fundamental package for array computi...'
- 'orderedmultidict==1.0.1 # Ordered Multivalue Dictionary'
- 'pandas==2.2.2 # Powerful data structures for data ana...'
- 'pathlib2==2.3.7.post1 # Object-oriented filesystem paths'
- 'pillow==10.4.0 # Python Imaging Library (Fork)'
- 'psutil==6.0.0 # Cross-platform lib for process and sy...'
- 'psycopg2-binary==2.9.9 # psycopg2 - Python-PostgreSQL Database...'
- 'pyjwt==2.8.0 # JSON Web Token implementation in Python'
- 'pyparsing==3.1.2 # pyparsing module - Classes and method...'
- 'python-dateutil==2.9.0.post0 # Extensions to the standard Python dat...'
- 'pytz==2024.1 # World timezone definitions, modern an...'
- 'pyyaml==6.0.1 # YAML parser and emitter for Python'
- 'referencing==0.35.1 # JSON Referencing + Python'
- 'requests==2.32.3 # Python HTTP for Humans.'
- 'rpds-py==0.19.0 # Python bindings to Rust''s persistent ...'
- 'six==1.16.0 # Python 2 and 3 compatibility utilities'
- 'tzdata==2024.1 # Provider of IANA time zone data'
- 'urllib3==2.2.2 # HTTP library with thread-safe connect...'

Environment setup completed successfully


1721735576451 dnsName:cpu:0:service:a08289ce6f2e47f2afc4e9f4e8540575 DEBUG DEBUG:urllib3.connectionpool:Resetting dropped connection: ipAddrOfClearMLServer
DEBUG:urllib3.connectionpool:
 "GET /v2.5/tasks.started HTTP/1.1" 200 396
Starting Task Execution:


1721735576473 dnsName:cpu:0:service:a08289ce6f2e47f2afc4e9f4e8540575 DEBUG Process failed, exit code 255
  
  
Posted 3 months ago

Hi @CumbersomeSealion22, can you provide a log of such a run?

  
  
Posted 3 months ago

@CumbersomeSealion22 can you print something at the start of your code, before any imports? It looks like your code starts but exits immediately.
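
Something like the following at the very top of `main_pipeline.py` would do. It is only a sketch: the marker text and the `startup_marker` helper are made up for illustration, and `flush=True` is there so the line reaches the log even if the process dies right afterwards.

```python
# Put this at the very top of the entry script, above every other import.
def startup_marker(script_name):
    """Build a line that is easy to search for in the agent log."""
    return f"[startup] {script_name}: interpreter reached the first line"


# flush=True pushes the text out immediately instead of leaving it in a buffer.
print(startup_marker("main_pipeline.py"), flush=True)

# ... the original imports and pipeline code follow here ...
```

If the marker never shows up in the task log, the process is failing before Python executes the script at all; if it does show up, the problem is in one of the imports or in the code after them.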

  
  
Posted 3 months ago

The file being run seems OK, since it should be the pipeline's controller (i.e. the main code that loads and defines the steps).

  
  
Posted 3 months ago

I noticed in the documentation that --services-mode only works together with Docker mode. That is probably the cause, since I did not start the agent in Docker mode. I would have expected an error when starting with --services-mode without Docker, though.

Now, with Docker mode activated, I get a new error: clearml_agent is not installed (using python:3.11.9-slim-bookworm as the base image).

  
  
Posted 3 months ago

Hi @CumbersomeSealion22, it seems the package you're looking for isn't there:

ERROR: Could not find a version that satisfies the requirement clearml_showcase==0.1 (from versions: none)
ERROR: No matching distribution found for clearml_showcase==0.1 
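
A rough sketch of what seems to go wrong, assuming the requirement list handed to pip includes the project itself after Poetry has installed it: that project exists only in the repository, so pip cannot find it on any index. `drop_local_project` and `requirement_name` are hypothetical helpers written for this illustration (they are not clearml-agent functions); they show which line would have to be left out for the install to resolve.

```python
import re


def requirement_name(line):
    """Return the normalized package name of a 'name==version' line."""
    name = re.split(r"[=<>!~;\[\s]", line.strip(), maxsplit=1)[0]
    # PEP 503 normalization: '-', '_' and '.' are equivalent, case is ignored.
    return re.sub(r"[-_.]+", "-", name).lower()


def drop_local_project(requirements, project_name):
    """Remove the line that refers to the local, unpublished project."""
    local = requirement_name(project_name)
    return [r for r in requirements if requirement_name(r) != local]


# Shortened, illustrative requirement list with the project itself at the end.
reqs = ["clearml==1.16.2", "pandas==2.2.2", "clearml_showcase==0.1"]
print(drop_local_project(reqs, "clearml-showcase"))
```

The name comparison is normalized because Poetry reports the project as `clearml-showcase` while the failing requirement is spelled `clearml_showcase`.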
  
  
Posted 3 months ago

Relevant pipeline logs:

1721797252114 myDNSName:cpu:1:service:a6d316cad2b54f36a0cb960dc04e5ab7 DEBUG Installing dependencies from lock file

Package operations: 26 installs, 0 updates, 0 removals

  - Installing attrs (23.2.0)
  - Installing rpds-py (0.19.0)
  - Installing referencing (0.35.1)
  - Installing six (1.16.0)
  - Installing certifi (2024.7.4)
  - Installing charset-normalizer (3.3.2)
  - Installing idna (3.7)
  - Installing jsonschema-specifications (2023.12.1)
  - Installing orderedmultidict (1.0.1)
  - Installing urllib3 (2.2.2)
  - Installing furl (2.1.3)
  - Installing jsonschema (4.23.0)
  - Installing pathlib2 (2.3.7.post1)
  - Installing numpy (2.0.1)
  - Installing pyjwt (2.8.0)
  - Installing pyparsing (3.1.2)
  - Installing pillow (10.4.0)
  - Installing psutil (6.0.0)
  - Installing python-dateutil (2.9.0.post0)
  - Installing pytz (2024.1)
  - Installing pyyaml (6.0.1)
  - Installing requests (2.32.3)
  - Installing tzdata (2024.1)

1721797262143 myDNSName:cpu:1:service:a6d316cad2b54f36a0cb960dc04e5ab7 DEBUG   - Installing clearml (1.16.2)
  - Installing pandas (2.2.2)
  - Installing psycopg2-binary (2.9.9)

1721797267204 myDNSName:cpu:1:service:a6d316cad2b54f36a0cb960dc04e5ab7 DEBUG
Installing the current project: clearml_showcase (0.1)
DEBUG:clearml_agent.helper.package.poetry_api:running: Executing: ('poetry', 'show')

1721797272278 myDNSName:cpu:1:service:a6d316cad2b54f36a0cb960dc04e5ab7 DEBUG DEBUG:urllib3.connectionpool:Resetting dropped connection: serverIPAddress
DEBUG:urllib3.connectionpool:
 "GET /v2.14/tasks.get_all HTTP/1.1" 200 440
DEBUG:urllib3.connectionpool:Resetting dropped connection: serverIPAddress
DEBUG:urllib3.connectionpool:
 "GET /v2.5/tasks.set_requirements HTTP/1.1" 200 306
Running task id [a6d316cad2b54f36a0cb960dc04e5ab7]:
[.]$ poetry run python -u main_pipeline.py
Summary - installed python packages:
pip:
- 'attrs==23.2.0 # Classes Without Boilerplate'
- 'certifi==2024.7.4 # Python package for providing Mozilla''...'
- 'charset-normalizer==3.3.2 # The Real First Universal Charset Dete...'
- 'clearml==1.16.2 # ClearML - Auto-Magical Experiment Man...'
- 'furl==2.1.3 # URL manipulation made simple.'
- 'idna==3.7 # Internationalized Domain Names in App...'
- 'jsonschema==4.23.0 # An implementation of JSON Schema vali...'
- 'jsonschema-specifications==2023.12.1 # The JSON Schema meta-schemas and voca...'
- 'numpy==2.0.1 # Fundamental package for array computi...'
- 'orderedmultidict==1.0.1 # Ordered Multivalue Dictionary'
- 'pandas==2.2.2 # Powerful data structures for data ana...'
- 'pathlib2==2.3.7.post1 # Object-oriented filesystem paths'
- 'pillow==10.4.0 # Python Imaging Library (Fork)'
- 'psutil==6.0.0 # Cross-platform lib for process and sy...'
- 'psycopg2-binary==2.9.9 # psycopg2 - Python-PostgreSQL Database...'
- 'pyjwt==2.8.0 # JSON Web Token implementation in Python'
- 'pyparsing==3.1.2 # pyparsing module - Classes and method...'
- 'python-dateutil==2.9.0.post0 # Extensions to the standard Python dat...'
- 'pytz==2024.1 # World timezone definitions, modern an...'
- 'pyyaml==6.0.1 # YAML parser and emitter for Python'
- 'referencing==0.35.1 # JSON Referencing + Python'
- 'requests==2.32.3 # Python HTTP for Humans.'
- 'rpds-py==0.19.0 # Python bindings to Rust''s persistent ...'
- 'six==1.16.0 # Python 2 and 3 compatibility utilities'
- 'tzdata==2024.1 # Provider of IANA time zone data'
- 'urllib3==2.2.2 # HTTP library with thread-safe connect...'

Environment setup completed successfully

DEBUG:urllib3.connectionpool:Resetting dropped connection: serverIPAddress
DEBUG:urllib3.connectionpool:
 "GET /v2.5/tasks.started HTTP/1.1" 200 394
Starting Task Execution:


1721797282293 myDNSName:cpu:1:service:a6d316cad2b54f36a0cb960dc04e5ab7 DEBUG ClearML results page: 

ClearML pipeline page: 

Starting the pipeline in queue  pipeline
Launching the next 1 steps
Launching step [create_labeled_dataset]

1721797287329 myDNSName:cpu:1:service:a6d316cad2b54f36a0cb960dc04e5ab7 DEBUG Launching step: create_labeled_dataset
Parameters:
None
Configurations:
{}
Overrides:
{}

1721797307400 myDNSName:cpu:1:service:a6d316cad2b54f36a0cb960dc04e5ab7 DEBUG Launching the next 0 steps

1721797317450 myDNSName:cpu:1:service:a6d316cad2b54f36a0cb960dc04e5ab7 DEBUG Setting pipeline controller Task as failed (due to failed steps) !

1721797317478 myDNSName:cpu:1:service:a6d316cad2b54f36a0cb960dc04e5ab7 DEBUG Process completed successfully
  
  
Posted 3 months ago

Traceback (most recent call last):
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/commands/worker.py", line 3221, in install_requirements_for_package_api
    package_api.load_requirements(cached_requirements)
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/helper/package/pip_api/venv.py", line 41, in load_requirements
    super(VirtualenvPip, self).load_requirements(requirements)
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/helper/package/pip_api/system.py", line 61, in load_requirements
    self.install_from_file(path)
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/helper/package/pip_api/system.py", line 35, in install_from_file
    self.run_with_env(('install', '-r', path) + self.install_flags(), cwd=self.cwd)
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/helper/package/pip_api/system.py", line 94, in run_with_env
    return (command.get_output if output else command.check_call)(stdin=DEVNULL, env=env, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/helper/process.py", line 198, in check_call
    return self.call_subprocess(subprocess.check_call, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/helper/process.py", line 245, in call_subprocess
    return func(list(self), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/clearml_user/.clearml/venvs-builds/3.11/bin/python', '-m', 'pip', '--disable-pip-version-check', 'install', '-r', '/tmp/cached-reqswqel5t8m.txt']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/__main__.py", line 87, in <module>
    sys.exit(main())
             ^^^^^^
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/__main__.py", line 83, in main
    return run_command(parser, args, command_name)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/__main__.py", line 46, in run_command
    return func(**args_dict)
           ^^^^^^^^^^^^^^^^^
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/commands/base.py", line 63, in newfunc
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/commands/worker.py", line 2676, in execute
    self.install_requirements(
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/commands/worker.py", line 3160, in install_requirements
    return self.install_requirements_for_package_api(execution, repo_info, requirements_manager,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/commands/worker.py", line 3225, in install_requirements_for_package_api
    raise ValueError("Could not install task requirements!\n{}".format(e))
ValueError: Could not install task requirements!
Command '['/home/clearml_user/.clearml/venvs-builds/3.11/bin/python', '-m', 'pip', '--disable-pip-version-check', 'install', '-r', '/tmp/cached-reqswqel5t8m.txt']' returned non-zero exit status 1.
  
  
Posted 3 months ago

I solved the error by switching to the image python:3.11.9-bookworm, since gcc was missing from the slim image. (It would be good to have a guideline on the minimum requirements a Docker image should meet.)
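
Until such a guideline exists, a quick check inside a candidate image can at least show whether build tools are present. A minimal sketch: `missing_tools` is a hypothetical helper, and the tool list is an assumption based on this thread (only gcc was confirmed missing; git is included because the agent log shows it running git commands).

```python
import shutil


def missing_tools(tools):
    """Return the tools from the list that are not found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Assumed list: gcc was the missing piece here; git is used by the agent.
missing = missing_tools(["gcc", "git"])
print("missing:", ", ".join(missing) if missing else "none")
```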

Now I am stuck on an error that seems to come from the interplay between Poetry and pip. Poetry installs my own project, which is then listed as a requirement for pip. Pip cannot find it on PyPI (obviously) and complains that it can't install it. I run the pipeline in services mode in a Docker container, and the regular tasks in normal mode in a virtualenv on my machine. The logs follow:

  
  
Posted 3 months ago

Local execution works fine.

  
  
Posted 3 months ago

1721735525702 myHostName info ClearML Task: created new task id=a08289ce6f2e47f2afc4e9f4e8540575
ClearML results page: 

1721735525998 myHostName info ClearML pipeline page: 

Starting the pipeline in queue  pipeline
1721735542512 dnsName:cpu:0 INFO task a08289ce6f2e47f2afc4e9f4e8540575 pulled from cbf8757ca3f44980a450f9ea8a1c300a by worker dnsName:cpu:0

1721735547620 dnsName:cpu:0 DEBUG DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): ipAddrOfClearMLServer:8008
DEBUG:urllib3.connectionpool:
 "GET /auth.login HTTP/1.1" 200 611
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): ipAddrOfClearMLServer:8008
DEBUG:urllib3.connectionpool:
 "GET /v2.5/tasks.get_by_id HTTP/1.1" 200 3225
DEBUG:urllib3.connectionpool:Resetting dropped connection: ipAddrOfClearMLServer
DEBUG:urllib3.connectionpool:
 "GET /tasks.dequeue HTTP/1.1" 200 313
DEBUG:urllib3.connectionpool:Resetting dropped connection: ipAddrOfClearMLServer
DEBUG:urllib3.connectionpool:
 "GET /v2.5/tasks.started HTTP/1.1" 200 330
DEBUG:urllib3.connectionpool:Resetting dropped connection: ipAddrOfClearMLServer
DEBUG:urllib3.connectionpool:
 "GET /workers.status_report HTTP/1.1" 200 283
Running task 'a08289ce6f2e47f2afc4e9f4e8540575'

1721735551432 dnsName:cpu:0:service:a08289ce6f2e47f2afc4e9f4e8540575 DEBUG DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): ipAddrOfClearMLServer:8008
DEBUG:urllib3.connectionpool:
 "GET /auth.login HTTP/1.1" 200 611
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): ipAddrOfClearMLServer:8008
DEBUG:urllib3.connectionpool:
 "GET /v2.5/tasks.get_by_id HTTP/1.1" 200 3232
DEBUG:urllib3.connectionpool:Resetting dropped connection: ipAddrOfClearMLServer
DEBUG:urllib3.connectionpool:
 "GET /tasks.dequeue HTTP/1.1" 200 313
DEBUG:urllib3.connectionpool:Resetting dropped connection: ipAddrOfClearMLServer
DEBUG:urllib3.connectionpool:
 "GET /v2.5/tasks.started HTTP/1.1" 200 331
DEBUG:clearml_agent.session:Run by interpreter: /opt/clearml/.venv/bin/python3.11
Current configuration (clearml_agent v1.8.1, location: /tmp/.clearml_agent.c06iczcd.cfg):
----------------------
api.version = 1.5
api.verify_certificate = true
api.default_version = 1.5
api.http.max_req_size = 15728640
api.http.retries.total = 240
api.http.retries.connect = 240
api.http.retries.read = 240
api.http.retries.redirect = 240
api.http.retries.status = 240
api.http.retries.backoff_factor = 1.0
api.http.retries.backoff_max = 120.0
api.http.wait_on_maintenance_forever = true
api.http.pool_maxsize = 512
api.http.pool_connections = 512
api.auth.token_expiration_threshold_sec = ****
api.api_server = 

api.web_server = 

api.files_server = 

api.credentials.access_key = XTM42TSDNPGYYO3GJDLS
api.credentials.secret_key = ****
api.host = 

agent.worker_id = dnsName:cpu:0:service:a08289ce6f2e47f2afc4e9f4e8540575
agent.worker_name = dnsName
agent.force_git_ssh_protocol = false
agent.python_binary =
agent.package_manager.type = poetry
agent.package_manager.pip_version.0 = <20.2 ; python_version < '3.10'
agent.package_manager.pip_version.1 = <22.3 ; python_version >\= '3.10'
agent.package_manager.system_site_packages = false
agent.package_manager.force_upgrade = false
agent.package_manager.conda_channels.0 = pytorch
agent.package_manager.conda_channels.1 = conda-forge
agent.package_manager.conda_channels.2 = nvidia
agent.package_manager.conda_channels.3 = defaults
agent.package_manager.priority_optional_packages.0 = pygobject
agent.package_manager.torch_nightly = false
agent.package_manager.poetry_files_from_repo_working_dir = false
agent.venvs_dir = /home/clearml_user/.clearml/venvs-builds.2
agent.venvs_cache.max_entries = 10
agent.venvs_cache.free_space_threshold_gb = 2.0
agent.venvs_cache.path = ~/.clearml/venvs-cache
agent.vcs_cache.enabled = true
agent.vcs_cache.path = /home/clearml_user/.clearml/vcs-cache
agent.venv_update.enabled = false
agent.pip_download_cache.enabled = true
agent.pip_download_cache.path = /home/clearml_user/.clearml/pip-download-cache
agent.translate_ssh = true
agent.reload_config = false
agent.docker_pip_cache = /home/clearml_user/.clearml/pip-cache
agent.docker_apt_cache = /home/clearml_user/.clearml/apt-cache.2
agent.docker_force_pull = false
agent.default_docker.image = python:3.11.9-slim-bookworm
agent.enable_task_env = false
agent.sanitize_config_printout = ****
agent.hide_docker_command_env_vars.enabled = true
agent.hide_docker_command_env_vars.parse_embedded_urls = true
agent.abort_callback_max_timeout = 1800
agent.docker_internal_mounts.sdk_cache = /clearml_agent_cache
agent.docker_internal_mounts.apt_cache = /var/cache/apt/archives
agent.docker_internal_mounts.ssh_folder = ~/.ssh
agent.docker_internal_mounts.ssh_ro_folder = /.ssh
agent.docker_internal_mounts.pip_cache = /root/.cache/pip
agent.docker_internal_mounts.poetry_cache = /root/.cache/pypoetry
agent.docker_internal_mounts.vcs_cache = /root/.clearml/vcs-cache
agent.docker_internal_mounts.venv_build = ~/.clearml/venvs-builds
agent.docker_internal_mounts.pip_download = /root/.clearml/pip-download-cache
agent.apply_environment = true
agent.apply_files = true
agent.custom_build_script =
agent.disable_task_docker_override = false
agent.git_user =
agent.git_pass = ****
agent.debug = true
agent.default_python = 3.11
agent.cuda_version = 0
agent.cudnn_version = 0
sdk.storage.cache.default_base_dir = /home/clearml_user/.clearml/cache.2
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.metrics.matplotlib_untitled_history_size = 100
sdk.metrics.images.format = JPEG
sdk.metrics.images.quality = 87
sdk.metrics.images.subsampling = 0
sdk.metrics.tensorboard_single_series_per_graph = false
sdk.network.metrics.file_upload_threads = 4
sdk.network.metrics.file_upload_starvation_warning_sec = 120
sdk.network.iteration.max_retries_on_server_error = 5
sdk.network.iteration.retry_backoff_factor_sec = 10
sdk.aws.s3.key =
sdk.aws.s3.secret = ****
sdk.aws.s3.region =
sdk.aws.boto3.pool_connections = 512
sdk.aws.boto3.max_multipart_concurrency = 16
sdk.log.null_log_propagate = false
sdk.log.task_log_buffer_capacity = 66
sdk.log.disable_urllib3_info = true
sdk.development.task_reuse_time_window_in_hours = 72.0
sdk.development.vcs_repo_detect_async = true
sdk.development.store_uncommitted_code_diff = true
sdk.development.support_stopping = true
sdk.development.default_output_uri =
sdk.development.force_analyze_entire_repo = false
sdk.development.suppress_update_message = false
sdk.development.detect_with_pip_freeze = false
sdk.development.worker.report_period_sec = 2
sdk.development.worker.ping_period_sec = 30
sdk.development.worker.log_stdout = true
sdk.development.worker.report_global_mem_used = false

DEBUG:urllib3.connectionpool:Resetting dropped connection: ipAddrOfClearMLServer
DEBUG:urllib3.connectionpool:
 "GET /v2.5/tasks.started HTTP/1.1" 200 389
Executing task id [a08289ce6f2e47f2afc4e9f4e8540575]:
repository = myGitRepo
branch = stable
version_num = 1d99307ce75a60fc7c53ba9be75f16d07258d11c
tag =
docker_cmd =
entry_point = main_pipeline.py
working_dir = .

DEBUG:clearml_agent.commands.worker:Searching for python3.11
DEBUG:clearml_agent.commands.worker:Found: python3.11
created virtual environment CPython3.11.2.final.0-64 in 442ms
  creator CPython3Posix(dest=/home/clearml_user/.clearml/venvs-builds.2/3.11, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/clearml_user/.local/share/virtualenv)
    added seed packages: pip==24.1.2, setuptools==70.1.1, wheel==0.43.0
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator


DEBUG:clearml_agent.helper.repo:Running: ['git', 'config', '--global', '--replace-all', 'safe.directory', '*']
Using cached repository in "/home/clearml_user/.clearml/vcs-cache/clearml_showcase.git.01582d138e03d7305278c7f2c2f25dd2/clearml_showcase.git"
pulling git
DEBUG:clearml_agent.helper.repo:Running: ['git', 'fetch', '--all', '--recurse-submodules']

1721735556359 dnsName:cpu:0:service:a08289ce6f2e47f2afc4e9f4e8540575 DEBUG pulling git completed
DEBUG:clearml_agent.helper.repo:Running: ['git', 'checkout', '1d99307ce75a60fc7c53ba9be75f16d07258d11c', '--force']
Note: switching to '1d99307ce75a60fc7c53ba9be75f16d07258d11c'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

  git switch -c <new-branch-name>

Or undo this operation with:
  git switch -

Turn off this advice by setting config variable
advice.detachedHead to false

HEAD is now at 1d99307 Update config
DEBUG:clearml_agent.helper.repo:Running: ['git', 'submodule', 'update', '--recursive']
type: git
url: myGitRepo
branch: HEAD
commit: 1d99307ce75a60fc7c53ba9be75f16d07258d11c
root: /home/clearml_user/.clearml/venvs-builds.2/3.11/task_repository/clearml_showcase.git
Applying uncommitted changes
INFO:clearml_agent.helper.repo:applying diff to /home/clearml_user/.clearml/venvs-builds.2/3.11/task_repository/clearml_showcase.git
DEBUG:clearml_agent.helper.repo:Running: ['git', 'apply', '--unidiff-zero']
INFO:clearml_agent.helper.repo:successfully applied uncommitted changes


DEBUG:clearml_agent.helper.package.poetry_api:PoetryConfig: calling initialize, enabled = True
Executing: (PosixPath('python3.11'), '-m', 'pip', '--disable-pip-version-check', 'install', "pip<20.2 ; python_version < '3.10'", "pip<22.3 ; python_version >= '3.10'", '--upgrade')
Looking in indexes: 
, 

Ignoring pip: markers 'python_version < "3.10"' don't match your environment
Requirement already satisfied: pip<22.3 in /opt/clearml/.venv/lib/python3.11/site-packages (22.2.2)
Executing: (PosixPath('python3.11'), '-m', 'pip', '--disable-pip-version-check', 'list')

1721735561359 dnsName:cpu:0:service:a08289ce6f2e47f2afc4e9f4e8540575 DEBUG Notice: Poetry was found, no specific version required, skipping poetry installation
DEBUG:clearml_agent.helper.package.poetry_api:running: Executing: ('poetry', 'config', '--local', 'virtualenvs.in-project', 'true')
Poetry Enabled: Ignoring requested python packages, using repository poetry lock file!
DEBUG:clearml_agent.helper.package.poetry_api:running: Executing: ('poetry', 'install', '-n')
Creating virtualenv clearml-showcase in /home/clearml_user/.clearml/venvs-builds.2/3.11/task_repository/clearml_showcase.git/.venv
Installing dependencies from lock file
  
  
Posted 3 months ago

I find this line a little strange:

poetry run python -u main_pipeline.py

because that's the file from which the pipeline is started, i.e. it should not be called again, I guess?

I have the main pipeline code in that file, and my steps in separate files which I load into the main file. This is all in one repo, so it should work.

  
  
Posted 3 months ago

Logs from the first task itself:

1721797284959 myDNSName:0 INFO task 20e5aaa6a8df4b1bbdc9fd5ccdcb5e7d pulled from 179eb7123cc04aa7a9a701890ac8ba0b by worker myDNSName:0

1721797290216 myDNSName:0 DEBUG DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): serverAddress:8008
DEBUG:urllib3.connectionpool:
 "GET /auth.login HTTP/1.1" 200 615
DEBUG:urllib3.connectionpool:Starting new HTTP connection (1): serverAddress:8008
DEBUG:urllib3.connectionpool:
 "GET /v2.5/tasks.get_by_id HTTP/1.1" 200 1790
DEBUG:urllib3.connectionpool:Resetting dropped connection: serverAddress
DEBUG:urllib3.connectionpool:
 "GET /tasks.dequeue HTTP/1.1" 200 313
DEBUG:urllib3.connectionpool:Resetting dropped connection: serverAddress
DEBUG:urllib3.connectionpool:
 "GET /v2.5/tasks.started HTTP/1.1" 200 331
DEBUG:clearml_agent.session:Run by interpreter: /opt/clearml/.venv/bin/python3.11
Current configuration (clearml_agent v1.8.1, location: /tmp/.clearml_agent.k9o67umy.cfg):
----------------------
api.version = 1.5
api.verify_certificate = true
api.default_version = 1.5
api.http.max_req_size = 15728640
api.http.retries.total = 240
api.http.retries.connect = 240
api.http.retries.read = 240
api.http.retries.redirect = 240
api.http.retries.status = 240
api.http.retries.backoff_factor = 1.0
api.http.retries.backoff_max = 120.0
api.http.wait_on_maintenance_forever = true
api.http.pool_maxsize = 512
api.http.pool_connections = 512
api.auth.token_expiration_threshold_sec = ****
api.api_server = 

api.web_server = 

api.files_server = 

api.credentials.access_key = XTM42TSDNPGYYO3GJDLS
api.credentials.secret_key = ****
api.host = 

agent.worker_id = myDNSName:0
agent.worker_name = myDNSName
agent.force_git_ssh_protocol = false
agent.python_binary =
agent.package_manager.type = poetry
agent.package_manager.pip_version =
agent.package_manager.system_site_packages = false
agent.package_manager.force_upgrade = false
agent.package_manager.conda_channels.0 = pytorch
agent.package_manager.conda_channels.1 = conda-forge
agent.package_manager.conda_channels.2 = nvidia
agent.package_manager.conda_channels.3 = defaults
agent.package_manager.priority_optional_packages.0 = pygobject
agent.package_manager.torch_nightly = false
agent.package_manager.poetry_files_from_repo_working_dir = false
agent.venvs_dir = /home/clearml_user/.clearml/venvs-builds
agent.venvs_cache.max_entries = 10
agent.venvs_cache.free_space_threshold_gb = 2.0
agent.venvs_cache.path = ~/.clearml/venvs-cache
agent.vcs_cache.enabled = true
agent.vcs_cache.path = /home/clearml_user/.clearml/vcs-cache
agent.venv_update.enabled = false
agent.pip_download_cache.enabled = true
agent.pip_download_cache.path = /home/clearml_user/.clearml/pip-download-cache
agent.translate_ssh = true
agent.reload_config = false
agent.docker_pip_cache = /home/clearml_user/.clearml/pip-cache
agent.docker_apt_cache = /home/clearml_user/.clearml/apt-cache
agent.docker_force_pull = false
agent.default_docker.image = python:3.11.9-slim-bookworm
agent.enable_task_env = false
agent.sanitize_config_printout = ****
agent.hide_docker_command_env_vars.enabled = true
agent.hide_docker_command_env_vars.parse_embedded_urls = true
agent.abort_callback_max_timeout = 1800
agent.docker_internal_mounts.sdk_cache = /clearml_agent_cache
agent.docker_internal_mounts.apt_cache = /var/cache/apt/archives
agent.docker_internal_mounts.ssh_folder = ~/.ssh
agent.docker_internal_mounts.ssh_ro_folder = /.ssh
agent.docker_internal_mounts.pip_cache = /root/.cache/pip
agent.docker_internal_mounts.poetry_cache = /root/.cache/pypoetry
agent.docker_internal_mounts.vcs_cache = /root/.clearml/vcs-cache
agent.docker_internal_mounts.venv_build = ~/.clearml/venvs-builds
agent.docker_internal_mounts.pip_download = /root/.clearml/pip-download-cache
agent.apply_environment = true
agent.apply_files = true
agent.custom_build_script =
agent.disable_task_docker_override = false
agent.git_user =
agent.git_pass = ****
agent.force_git_root_python_path = true
agent.debug = true
agent.default_python = 3.11
agent.cuda_version = 0
agent.cudnn_version = 0
sdk.storage.cache.default_base_dir = ~/.clearml/cache
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.metrics.matplotlib_untitled_history_size = 100
sdk.metrics.images.format = JPEG
sdk.metrics.images.quality = 87
sdk.metrics.images.subsampling = 0
sdk.metrics.tensorboard_single_series_per_graph = false
sdk.network.metrics.file_upload_threads = 4
sdk.network.metrics.file_upload_starvation_warning_sec = 120
sdk.network.iteration.max_retries_on_server_error = 5
sdk.network.iteration.retry_backoff_factor_sec = 10
sdk.aws.s3.key =
sdk.aws.s3.secret = ****
sdk.aws.s3.region =
sdk.aws.boto3.pool_connections = 512
sdk.aws.boto3.max_multipart_concurrency = 16
sdk.log.null_log_propagate = false
sdk.log.task_log_buffer_capacity = 66
sdk.log.disable_urllib3_info = true
sdk.development.task_reuse_time_window_in_hours = 72.0
sdk.development.vcs_repo_detect_async = true
sdk.development.store_uncommitted_code_diff = true
sdk.development.support_stopping = true
sdk.development.default_output_uri =
sdk.development.force_analyze_entire_repo = false
sdk.development.suppress_update_message = false
sdk.development.detect_with_pip_freeze = false
sdk.development.worker.report_period_sec = 2
sdk.development.worker.ping_period_sec = 30
sdk.development.worker.log_stdout = true
sdk.development.worker.report_global_mem_used = false

DEBUG:urllib3.connectionpool:Resetting dropped connection: serverAddress
DEBUG:urllib3.connectionpool: "GET /v2.5/tasks.started HTTP/1.1" 200 355
Executing task id [20e5aaa6a8df4b1bbdc9fd5ccdcb5e7d]:
repository =
branch =
version_num =
tag =
docker_cmd =
entry_point = foo.py
working_dir = .

DEBUG:clearml_agent.commands.worker:Searching for python3.11
DEBUG:clearml_agent.commands.worker:Found: python3.11
created virtual environment CPython3.11.2.final.0-64 in 447ms
  creator CPython3Posix(dest=/home/clearml_user/.clearml/venvs-builds/3.11, clear=False, no_vcs_ignore=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/clearml_user/.local/share/virtualenv)
    added seed packages: pip==24.1.2, setuptools==70.1.1, wheel==0.43.0
  activators BashActivator,CShellActivator,FishActivator,NushellActivator,PowerShellActivator,PythonActivator


INFO:clearml_agent.commands.worker:found literal script in `script.diff`
DEBUG:clearml_agent.commands.worker:selected execution directory: /home/clearml_user/.clearml/venvs-builds/3.11/code


Executing: (PosixPath('/home/clearml_user/.clearml/venvs-builds/3.11/bin/python'), '-m', 'pip', '--disable-pip-version-check', 'install', 'pip', '--upgrade')

1721797295107 myDNSName:0 DEBUG Looking in indexes: 
, 

Requirement already satisfied: pip in /home/clearml_user/.clearml/venvs-builds/3.11/lib/python3.11/site-packages (24.1.2)

1721797300154 myDNSName:0 DEBUG Executing: (PosixPath('/home/clearml_user/.clearml/venvs-builds/3.11/bin/python'), '-m', 'pip', '--disable-pip-version-check', 'list')
Executing: (PosixPath('/home/clearml_user/.clearml/venvs-builds/3.11/bin/python'), '-m', 'pip', '--disable-pip-version-check', 'install', 'Cython')
Looking in indexes: 
, 

Collecting Cython
  Using cached Cython-3.0.10-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl.metadata (3.2 kB)
Using cached Cython-3.0.10-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (3.4 MB)
Installing collected packages: Cython

1721797305206 myDNSName:0 DEBUG Process failed, exit code 1
1721797305257 myDNSName:0 DEBUG Successfully installed Cython-3.0.10
INFO:clearml_agent.commands.worker:Found task requirements section, trying to install
Executing: (PosixPath('/home/clearml_user/.clearml/venvs-builds/3.11/bin/python'), '-m', 'pip', '--disable-pip-version-check', 'install', '-r', '/tmp/cached-reqswqel5t8m.txt')
Looking in indexes: 
, 

ERROR: Could not find a version that satisfies the requirement clearml_showcase==0.1 (from versions: none)
ERROR: No matching distribution found for clearml_showcase==0.1
INFO:clearml_agent.commands.worker:Traceback (most recent call last):
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/commands/worker.py", line 3221, in install_requirements_for_package_api
    package_api.load_requirements(cached_requirements)
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/helper/package/pip_api/venv.py", line 41, in load_requirements
    super(VirtualenvPip, self).load_requirements(requirements)
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/helper/package/pip_api/system.py", line 61, in load_requirements
    self.install_from_file(path)
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/helper/package/pip_api/system.py", line 35, in install_from_file
    self.run_with_env(('install', '-r', path) + self.install_flags(), cwd=self.cwd)
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/helper/package/pip_api/system.py", line 94, in run_with_env
    return (command.get_output if output else command.check_call)(stdin=DEVNULL, env=env, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/helper/process.py", line 198, in check_call
    return self.call_subprocess(subprocess.check_call, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/clearml/.venv/lib/python3.11/site-packages/clearml_agent/helper/process.py", line 245, in call_subprocess
    return func(list(self), *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/subprocess.py", line 413, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/clearml_user/.clearml/venvs-builds/3.11/bin/python', '-m', 'pip', '--disable-pip-version-check', 'install', '-r', '/tmp/cached-reqswqel5t8m.txt']' returned non-zero exit status 1.
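
The step that fails in the log above is the `pip install -r` of the cached requirements file, where pip reports that it cannot find any distribution for `clearml_showcase==0.1`. As a minimal sketch, assuming the agent's console output has been saved as text, the unresolved requirements can be pulled out of a long log like this. The function name `unresolved_requirements` is hypothetical and not part of clearml-agent; the only thing relied on is the wording of pip's error line as it appears in the log.

```python
import re

# pip's error line for a requirement no configured index can serve.
# No line-start anchor, so lines carrying a timestamp/host prefix still match.
_NO_DIST = re.compile(r"ERROR: No matching distribution found for (\S+)")


def unresolved_requirements(log_text: str) -> list[str]:
    """Return the requirement specifiers pip could not resolve, in order, deduplicated."""
    found: list[str] = []
    for spec in _NO_DIST.findall(log_text):
        if spec not in found:
            found.append(spec)
    return found


# Two lines copied from the log above.
sample = (
    "ERROR: Could not find a version that satisfies the requirement "
    "clearml_showcase==0.1 (from versions: none)\n"
    "ERROR: No matching distribution found for clearml_showcase==0.1\n"
)

print(unresolved_requirements(sample))  # ['clearml_showcase==0.1']
```

For this log the only hit is `clearml_showcase==0.1`. Judging by its name, that looks like the project's own package rather than something published to an index (an assumption, not confirmed in the thread), which would explain why pip finds no versions for it.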
  
  
Posted 3 months ago

Do you have a log of this?

  
  
Posted 3 months ago

Even if I comment out the separate steps, create dummy ones in main_pipeline.py, and load those instead, the error persists.

  
  
Posted 3 months ago