Answered
Hello! I have an issue reproducing my runs. The task.Create completes successfully. When I clone and enqueue a completed task the clone fails. It fails during the python requirements installation. Why is this? Do you know how I can debug? Thank you in advance!

  
  
Posted 2 months ago

Answers 29


Resetting and enqueuing task which has built successfully also fails 😞

  
  
Posted 2 months ago

@<1523701205467926528:profile|AgitatedDove14> if we go with the ultralytics case:

INSTALLED PACKAGES for working manual execution

absl-py==2.1.0
albucore==0.0.13
albumentations==1.4.14
anaconda-anon-usage @ file:///croot/anaconda-anon-usage_1710965072196/work
annotated-types==0.7.0
anyio==4.4.0
archspec @ file:///croot/archspec_1709217642129/work
astor==0.8.1
asttokens @ file:///opt/conda/conda-bld/asttokens_1646925590279/work
astunparse==1.6.3
attrs @ file:///croot/attrs_1695717823297/work
Automat==24.8.1
beautifulsoup4 @ file:///croot/beautifulsoup4-split_1681493039619/work
boltons @ file:///croot/boltons_1677628692245/work
Brotli @ file:///croot/brotli-split_1714483155106/work
cattrs==23.2.3
certifi @ file:///croot/certifi_1707229174982/work/certifi
cffi @ file:///croot/cffi_1714483155441/work
chardet @ file:///home/builder/ci_310/chardet_1640804867535/work
charset-normalizer @ file:///tmp/build/80754af9/charset-normalizer_1630003229654/work
chex==0.1.86
clearml==1.16.4
click @ file:///croot/click_1698129812380/work
coloredlogs==15.0.1
Comet==3.1.0
conda @ file:///croot/conda_1689269889729/work
conda-build @ file:///croot/conda-build_1710789183177/work
conda-content-trust @ file:///croot/conda-content-trust_1714483159009/work
conda-libmamba-solver @ file:///croot/conda-libmamba-solver_1691418897561/work/src
conda-package-handling @ file:///croot/conda-package-handling_1714483155348/work
conda_index @ file:///croot/conda-index_1706633791028/work
conda_package_streaming @ file:///croot/conda-package-streaming_1690987966409/work
constantly==23.10.4
contourpy==1.3.0
coremltools==7.2
cryptography @ file:///croot/cryptography_1714660666131/work
cycler==0.12.1
Cython==3.0.11
decorator @ file:///opt/conda/conda-bld/decorator_1643638310831/work
distlib==0.3.8
distro @ file:///croot/distro_1714488253808/work
dnspython==2.6.1
etils==1.7.0
eval_type_backport==0.2.0
exceptiongroup @ file:///croot/exceptiongroup_1706031385326/work
executing @ file:///opt/conda/conda-bld/executing_1646925071911/work
expecttest==0.2.1
filelock @ file:///croot/filelock_1700591183607/work
flatbuffers==24.3.25
flax==0.9.0
fonttools==4.53.1
frozendict @ file:///croot/frozendict_1713194832637/work
fsspec==2024.6.0
furl==2.1.3
gast==0.6.0
gmpy2 @ file:///tmp/build/80754af9/gmpy2_1645455533097/work
google-pasta==0.2.0
grpcio==1.66.0
h11==0.14.0
h5py==3.11.0
httpcore==1.0.5
httpx==0.27.2
humanfriendly==10.0
humanize==4.10.0
hyperlink==21.0.0
hypothesis==6.103.0
idna @ file:///croot/idna_1714398848350/work
imageio==2.35.1
importlib_resources==6.4.4
incremental==24.7.2
ipython @ file:///croot/ipython_1704833016303/work
jax==0.4.31
jaxlib==0.4.31
jedi @ file:///tmp/build/80754af9/jedi_1644315229345/work
Jinja2 @ file:///croot/jinja2_1716993405101/work
jsonpatch @ file:///croot/jsonpatch_1714483231291/work
jsonpointer==2.1
jsonschema @ file:///croot/jsonschema_1699041609003/work
jsonschema-specifications @ file:///croot/jsonschema-specifications_1699032386549/work
keras==3.5.0
kiwisolver==1.4.5
lark==1.1.9
lazy_loader==0.4
libarchive-c @ file:///tmp/build/80754af9/python-libarchive-c_1617780486945/work
libclang==18.1.1
libmambapy @ file:///croot/mamba-split_1714483352891/work/libmambapy
lxml==5.3.0
Markdown==3.7
markdown-it-py==3.0.0
MarkupSafe @ file:///croot/markupsafe_1704205993651/work
matplotlib==3.9.2
matplotlib-inline @ file:///opt/conda/conda-bld/matplotlib-inline_1662014470464/work
mdurl==0.1.2
menuinst @ file:///croot/menuinst_1716404372721/work
mkl-fft @ file:///croot/mkl_fft_1695058164594/work
mkl-random @ file:///croot/mkl_random_1695059800811/work
mkl-service==2.4.0
ml-dtypes==0.4.0
more-itertools @ file:///croot/more-itertools_1700662129964/work
mpmath @ file:///croot/mpmath_1690848262763/work
msgpack==1.0.8
namex==0.0.8
ncnn==1.0.20240820
nest-asyncio==1.6.0
networkx @ file:///croot/networkx_1717597493534/work
numpy==1.23.5
nvidia-cuda-runtime-cu12==12.6.37
onnx==1.16.2
onnx-graphsurgeon==0.5.2
onnx2tf==1.22.3
onnxruntime==1.19.0
onnxslim==0.1.32
opencv-python==4.10.0.84
opencv-python-headless==4.10.0.84
openvino==2024.3.0
openvino-telemetry==2024.1.0
opt-einsum==3.3.0
optax==0.2.3
optree==0.11.0
orbax-checkpoint==0.6.1
orderedmultidict==1.0.1
packaging @ file:///croot/packaging_1710807400464/work
paddlepaddle==2.6.1
pandas==2.2.2
parso @ file:///opt/conda/conda-bld/parso_1641458642106/work
pathlib2==2.3.7.post1
pexpect @ file:///tmp/build/80754af9/pexpect_1605563209008/work
pillow @ file:///croot/pillow_1714398848491/work
pkginfo @ file:///croot/pkginfo_1715695984887/work
platformdirs @ file:///croot/platformdirs_1692205439124/work
pluggy @ file:///tmp/build/80754af9/pluggy_1648024709248/work
portalocker==2.10.1
prompt-toolkit @ file:///croot/prompt-toolkit_1704404351921/work
protobuf==3.20.3
psutil @ file:///opt/conda/conda-bld/psutil_1656431268089/work
ptyprocess @ file:///tmp/build/80754af9/ptyprocess_1609355006118/work/dist/ptyprocess-0.7.0-py2.py3-none-any.whl
pure-eval @ file:///opt/conda/conda-bld/pure_eval_1646925070566/work
py-cpuinfo==9.0.0
pyaml==24.7.0
pybind11==2.13.5
pycocotools==2.0.8
pycosat @ file:///croot/pycosat_1714510623388/work
pycparser @ file:///tmp/build/80754af9/pycparser_1636541352034/work
pydantic==2.8.2
pydantic_core==2.20.1
Pygments @ file:///croot/pygments_1684279966437/work
PyJWT==2.8.0
pyOpenSSL @ file:///croot/pyopenssl_1708380408460/work
pyparsing==3.1.4
PySocks @ file:///home/builder/ci_310/pysocks_1640793678128/work
python-dateutil==2.8.2
python-etcd==0.4.5
pytz @ file:///croot/pytz_1713974312559/work
PyYAML @ file:///croot/pyyaml_1698096049011/work
referencing @ file:///croot/referencing_1699012038513/work
requests==2.31.0
rich==13.8.0
rpds-py @ file:///croot/rpds-py_1698945930462/work
ruamel.yaml @ file:///croot/ruamel.yaml_1666304550667/work
ruamel.yaml.clib @ file:///croot/ruamel.yaml.clib_1666302247304/work
scikit-image==0.24.0
scipy==1.14.1
seaborn==0.13.2
six @ file:///tmp/build/80754af9/six_1644875935023/work
sng4onnx==1.0.4
sniffio==1.3.1
sortedcontainers==2.4.0
sounddevice==0.5.0
soupsieve @ file:///croot/soupsieve_1696347547217/work
stack-data @ file:///opt/conda/conda-bld/stack_data_1646927590127/work
sympy==1.12.1
tensorboard==2.17.1
tensorboard-data-server==0.7.2
tensorflow==2.17.0
tensorflow-hub==0.16.1
tensorflow-io-gcs-filesystem==0.37.1
tensorflow_decision_forests==1.10.0
tensorflowjs==4.20.0
tensorrt-cu12==10.1.0
tensorrt-cu12-bindings==10.1.0
tensorrt-cu12-libs==10.1.0
tensorstore==0.1.64
termcolor==2.4.0
tf_keras==2.17.0
tflite-support==0.4.4
tifffile==2024.8.28
tomli @ file:///opt/conda/conda-bld/tomli_1657175507142/work
toolz @ file:///croot/toolz_1667464077321/work
torch==2.3.1
torchaudio==2.3.1
torchelastic==0.2.2
torchvision==0.18.1
tqdm @ file:///croot/tqdm_1716395931952/work
traitlets @ file:///croot/traitlets_1671143879854/work
triton==2.3.1
truststore @ file:///croot/truststore_1695244293384/work
Twisted==24.7.0
types-dataclasses==0.6.6
typing_extensions @ file:///croot/typing_extensions_1715268824938/work
tzdata==2024.1
-e git+

ultralytics-thop==2.0.5
urllib3==1.26.19
virtualenv==20.26.3
wcwidth @ file:///Users/ktietz/demo/mc3/conda-bld/wcwidth_1629357192024/work
Werkzeug==3.0.4
wrapt==1.16.0
wurlitzer==3.1.1
x2paddle==1.5.0
ydf==0.7.0
zipp==3.20.1
zope.interface==7.0.3
zstandard @ file:///croot/zstandard_1714677652653/work
  
  
Posted 2 months ago

Setting agent.venvs_cache path back to ~/.clearml/venvs-cache seems to have done the trick!

  
  
Posted 2 months ago

"Original PIP" is empty as for this task we can rely on the docker image to provide the python packages

  
  
Posted 2 months ago

How are you getting:

beautifulsoup4 @ file:///croot/beautifulsoup4-split_1681493039619/work

This comes with the docker image ultralytics/ultralytics:latest

  
  
Posted 2 months ago

Hi @<1523701205467926528:profile|AgitatedDove14>
ClearML Agent 1.9.0

  
  
Posted 2 months ago

Thank you for your help @<1523701205467926528:profile|AgitatedDove14>

  
  
Posted 2 months ago

Full log for the failed clone

  
  
Posted 2 months ago

Great to hear it got solved. BTW, network drives are supported, but you have to make sure the mounted file system supports locks (NFS does).

  
  
Posted 2 months ago

Docker mode

  
  
Posted 2 months ago

Thanks @<1523701205467926528:profile|AgitatedDove14> , will take a look

  
  
Posted 2 months ago

Task log

  
  
Posted 2 months ago

Hi @<1734020162731905024:profile|RattyBluewhale45>
What's the clearml agent version? And could you verify with the latest RC?
Lastly, how are you running the agent, docker mode? What's the base container?

  
  
Posted 2 months ago

Maybe it's related to this section?

WARNING:clearml_agent.helper.package.requirements:Local file not found [anaconda-anon-usage @ file:///croot/anaconda-anon-usage_1710965072196/work], references removed
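To see how many entries in an "installed packages" list carry this kind of local reference, a small script along the following lines might help. It is only a sketch: the function name `find_local_file_refs` and the sample text are illustrative assumptions, and it is not how the agent itself parses requirements. It simply looks for lines of the form `name @ file://...`, which point at paths that exist only in the environment where the list was frozen.

```python
import re
from typing import List, Tuple

# Matches pip-freeze lines of the form "name @ file:///some/path".
LOCAL_REF = re.compile(r"^\s*([A-Za-z0-9_.\-]+)\s*@\s*file://(\S+)")


def find_local_file_refs(freeze_text: str) -> List[Tuple[str, str]]:
    """Return (package name, local path) for every 'name @ file://...' line."""
    refs = []
    for line in freeze_text.splitlines():
        match = LOCAL_REF.match(line)
        if match:
            refs.append((match.group(1), match.group(2)))
    return refs


# Sample lines copied from the installed packages list in this thread.
sample = """\
absl-py==2.1.0
anaconda-anon-usage @ file:///croot/anaconda-anon-usage_1710965072196/work
beautifulsoup4 @ file:///croot/beautifulsoup4-split_1681493039619/work
clearml==1.16.4
"""

local_refs = find_local_file_refs(sample)
for name, path in local_refs:
    print(f"{name}: local path {path}")
```

Any package reported here was recorded with a build path rather than a version pin, so it cannot be reinstalled from that reference on another machine or container.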
  
  
Posted 2 months ago

It was pointing to a network drive before to avoid the local directory filling up

  
  
Posted 2 months ago

In a cloned run with new container ultralytics/ultralytics:latest I get this error:

clearml_agent: ERROR: Could not install task requirements!
Command '['/root/.clearml/venvs-builds/3.10/bin/python', '-m', 'pip', '--disable-pip-version-check', 'install', '-r', '/tmp/cached-reqs7171xfem.txt', '--extra-index-url', '
', '--extra-index-url', '
 returned non-zero exit status 1.
  
  
Posted 2 months ago

Is this what you had on the original manual execution?

Yes, this installed packages list is what succeeded via manual submission to the agent.

  
  
Posted 2 months ago

How are you getting:

beautifulsoup4 @ file:///croot/beautifulsoup4-split_1681493039619/work

Is this what you had on the original manual execution (i.e. not the one executed by the agent)? You can also look under the "org _pip" dropdown in the "installed packages" of the failed Task.

  
  
Posted 2 months ago

As I get a bunch of these warnings in both of the clones that failed

  
  
Posted 2 months ago

1724924574994 g-s:gpu1 DEBUG WARNING:root:Could not lock cache folder /root/.clearml/venvs-cache: [Errno 9] Bad file descriptor

You have an issue with your OS / mount. Specifically, "/mnt/clearml/" is the base folder for all the cached stuff, and it fails to create the lock files there. Either use a local folder, or try to understand what the issue is with the host machine's /mnt/ mounts (it looks like a network mount).
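One way to check whether a given folder supports file locks at all is a quick probe like the one below. This is a minimal sketch under stated assumptions: it uses the standard `fcntl.flock` call on Linux/macOS, the helper name `folder_supports_locks` and the probe file prefix are made up for illustration, and it is not the locking code the agent actually uses. A failure here would still suggest the mount is the problem rather than the Task.

```python
import fcntl
import os
import tempfile


def folder_supports_locks(folder: str) -> bool:
    """Return True if an exclusive, non-blocking lock can be taken on a file in `folder`."""
    os.makedirs(folder, exist_ok=True)
    # Hypothetical probe file; any writable file inside the folder would do.
    fd, path = tempfile.mkstemp(prefix=".lock-probe-", dir=folder)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        fcntl.flock(fd, fcntl.LOCK_UN)
        return True
    except OSError as exc:
        print(f"Locking failed in {folder}: {exc}")
        return False
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    # Point this at the folder mounted as the agent cache to test it.
    print(folder_supports_locks(tempfile.gettempdir()))
```

Running it against the network-mounted cache folder and against a local folder, from the host and from inside the container, should show which combination refuses the lock.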

  
  
Posted 2 months ago

The original run completes successfully, it's only the runs cloned from the GUI which fail

  
  
Posted 2 months ago

DEBUG   Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error
  
  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [21 lines of output]
      Traceback (most recent call last):
        File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
        File "/root/.clearml/venvs-builds/3.8/lib/python3.8/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 118, in get_requires_for_build_wheel
          return hook(config_settings)
        File "/tmp/pip-build-env-plkplfap/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 327, in get_requires_for_build_wheel
          return self._get_build_requires(config_settings, requirements=[])
        File "/tmp/pip-build-env-plkplfap/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 297, in _get_build_requires
          self.run_setup()
        File "/tmp/pip-build-env-plkplfap/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 497, in run_setup
          super().run_setup(setup_script=setup_script)
        File "/tmp/pip-build-env-plkplfap/overlay/lib/python3.8/site-packages/setuptools/build_meta.py", line 313, in run_setup
          exec(code, locals())
        File "<string>", line 9, in <module>
        File "/tmp/pip-req-build-_u883rml/build_tools/setup_helpers/__init__.py", line 1, in <module>
          from .extension import *  # noqa
        File "/tmp/pip-req-build-_u883rml/build_tools/setup_helpers/extension.py", line 6, in <module>
          from torch.utils.cpp_extension import BuildExtension as TorchBuildExtension, CppExtension
      ModuleNotFoundError: No module named 'torch'
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
clearml_agent: ERROR: Could not install task requirements!
Command '['/root/.clearml/venvs-builds/3.8/bin/python', '-m', 'pip', '--disable-pip-version-check', 'install', '-r', '/tmp/cached-reqs18jcb6cz.txt', '--extra-index-url', '
', '--extra-index-url', '
 returned non-zero exit status 1.
  
  
Posted 2 months ago

Thank you so much for your help @<1523701205467926528:profile|AgitatedDove14> !

  
  
Posted 2 months ago

WARNING:clearml_agent.helper.package.requirements:Local file not found [torch-tensorrt @ file:///opt/pytorch/torch_tensorrt/py/dist/torch_tensorrt-1.3.0a0-cp38-cp38-linux_x86_64.whl], references removed
  
  
Posted 2 months ago

agent.package_manager.pip_version=""

  
  
Posted 2 months ago

@<1734020162731905024:profile|RattyBluewhale45> could you attach the full Task log? Also what do you have under "installed packages" in the original manual execution that works for you?

  
  
Posted 2 months ago

Notice the error:

Cannot install albucore==0.0.13 and numpy==1.23.5 because these package versions have conflicting dependencies

What is the pip version you have configured in the clearml.conf? Also, can you provide the full Task log (i.e. click on Download in the web UI console tab)?

  
  
Posted 2 months ago

Lastly, try adding:

extra_pip_install_flags: ["--use-deprecated=legacy-resolver", ]

  
  
Posted 2 months ago
205 Views