Hey there,
I am trying to use a local Docker agent, but I cannot run tasks on it. The bare-metal variant with venvs works fine.
It seems the process cannot proceed after installing clearml-agent in the Docker container: the experiment stays in the running state, but there is no more console output in the WebUI after:

Using cached rpds_py-0.18.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)
Using cached zipp-3.18.1-py3-none-any.whl (8.2 kB)
Installing collected packages: distlib, zipp, urllib3, six, rpds-py, PyYAML, pyparsing, pyjwt, psutil, platformdirs, pkgutil-resolve-name, idna, filelock, charset-normalizer, certifi, attrs, virtualenv, requests, referencing, python-dateutil, pathlib2, orderedmultidict, importlib-resources, jsonschema-specifications, furl, jsonschema, clearml-agent
2024-04-15 16:04:21
Successfully installed PyYAML-6.0.1 attrs-23.2.0 certifi-2024.2.2 charset-normalizer-3.3.2 clearml-agent-1.8.0 distlib-0.3.8 filelock-3.13.4 furl-2.1.3 idna-3.7 importlib-resources-6.4.0 jsonschema-4.21.1 jsonschema-specifications-2023.12.1 orderedmultidict-1.0.1 pathlib2-2.3.7.post1 pkgutil-resolve-name-1.3.10 platformdirs-4.2.0 psutil-5.9.8 pyjwt-2.8.0 pyparsing-3.1.2 python-dateutil-2.8.2 referencing-0.34.0 requests-2.31.0 rpds-py-0.18.0 six-1.16.0 urllib3-1.26.18 virtualenv-20.25.1 zipp-3.18.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: 

Also, I noticed that only Python 3.8 is installed inside the container, contrary to the 3.10 defined in the conf file.
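
For anyone reading along: the Python 3.8 observation can be cross-checked from the log itself, since CPython-specific wheels carry the interpreter tag in their filename (the cp38 in the rpds_py wheel above). Below is a minimal sketch of that check. The helper name python_version_from_wheel is made up for illustration and is not part of clearml-agent or pip, and the regex assumes a single-digit major version.

```python
import re
from typing import Optional

# Matches the CPython tag in a wheel filename, e.g. "-cp38-" or "-cp310-".
# Assumption: the major version is a single digit (true for CPython 3.x).
_CP_TAG = re.compile(r"-cp(\d)(\d+)-")


def python_version_from_wheel(filename: str) -> Optional[str]:
    """Return 'major.minor' for a CPython-specific wheel, or None for
    interpreter-agnostic tags such as 'py3-none-any'."""
    match = _CP_TAG.search(filename)
    if match is None:
        return None
    return f"{match.group(1)}.{match.group(2)}"


# The two "Using cached" lines copied from the console output above.
log_lines = [
    "Using cached rpds_py-0.18.0-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.1 MB)",
    "Using cached zipp-3.18.1-py3-none-any.whl (8.2 kB)",
]

# Collect every interpreter version that pip resolved wheels for.
versions = {v for v in map(python_version_from_wheel, log_lines) if v}
print(versions)
```

If the agent were really running under 3.10 inside the container, pip would be expected to pick cp310 wheels instead, so this is a quick way to confirm which interpreter did the install without opening a shell in the container.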

Posted 2 months ago

Answers 3

I think I managed to find the problem. Inside the Docker container, it seems that clearml_agent execute cannot be performed.

Posted 2 months ago

Hi @<1690896105262288896:profile|EnergeticTiger5>, can you add a full log of the run please?

Posted 2 months ago

Thanks for the quick response.
I started the daemon with:
clearml-agent daemon --queue "4gb" --docker clearml/fractional-gpu:u22-cu11.7-4gb --force-current-version

This is my clearml.conf on the server:

agent {
    # Set GIT user/pass credentials (if user/pass are set, GIT protocol will be set to https)
    # all other domains will use public access (no user/pass). Default: always send user/pass for any VCS domain
    package_manager: {
        type: pip,
        pip_version: [""]
        pytorch_resolve: none
        extra_pip_install_flags: ["--user"]
        extra_index_url: ["XXX"]
    # Force GIT protocol to use SSH regardless of the git url (Assumes GIT user/pass are blank)
    force_git_ssh_protocol: false

    # unique name of this worker, if None, created based on hostname:process_id
    # Overridden with os environment: CLEARML_WORKER_NAME
    worker_id: ""
    docker_use_activated_venv: false
    extra_docker_arguments: ["--pid=host","-e","http_proxy=XXX", "-e","https_proxy=XXX"]
Posted 2 months ago