@<1717350332247314432:profile|WittySeal70> what's strange is that I can import the package in the Docker container when I run it outside of ClearML
Hey yes it's self deployed
[2024-08-13 16:56:36,447] [9] [INFO] [clearml.service_repo] Returned 200 for workers.get_activity_report in 342ms
[2024-08-13 16:56:36,462] [9] [INFO] [clearml.service_repo] Returned 200 for workers.get_activity_report in 261ms
"Original PIP" is empty because, for this task, we can rely on the Docker image to provide the Python packages
It's hanging at
Installing collected packages: zipp, importlib-resources, rpds-py, pkgutil-resolve-name, attrs, referencing, jsonschema-specifications, jsonschema, certifi, urllib3, idna, charset-normalizer, requests, pyparsing, PyYAML, six, pathlib2, orderedmultidict, furl, pyjwt, psutil, python-dateutil, platformdirs, distlib, filelock, virtualenv, clearml-agent
Successfully installed PyYAML-6.0.2 attrs-23.2.0 certifi-2024.7.4 charset-normalizer-3.3.2 clearml-agent-1.8.1 distlib-0.3....
Solved that by setting docker_args=["--privileged", "--network=host"]
Isn't the problem that CUDA 12 is being installed?
pip install ultralytics --no-deps would also work. Is there a way to pass this to ClearML?
agent.package_manager.pip_version=""
Maybe it's related to this section?
WARNING:clearml_agent.helper.package.requirements:Local file not found [anaconda-anon-usage @ file:///croot/anaconda-anon-usage_1710965072196/work], references removed
We are getting the dataset like this:
from clearml import Dataset

# config: our own settings mapping (not shown here)
clearml_dataset = Dataset.get(
    dataset_id=config.get("dataset_id"), alias=config.get("dataset_alias")
)
dataset_dir = clearml_dataset.get_local_copy()
@<1523701205467926528:profile|AgitatedDove14> if we go with the ultralytics case:
INSTALLED PACKAGES for working manual execution
absl-py==2.1.0
albucore==0.0.13
albumentations==1.4.14
anaconda-anon-usage @ file:///croot/anaconda-anon-usage_1710965072196/work
annotated-types==0.7.0
anyio==4.4.0
archspec @ file:///croot/archspec_1709217642129/work
astor==0.8.1
asttokens @ file:///opt/conda/conda-bld/asttokens_1646925590279/work
astunparse==1.6.3
attrs @ file:///croot/attrs_169571782329...
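The entries of the form `name @ file:///...` in the list above point at paths that only exist on the machine where the packages were built, which is what the agent warning about "Local file not found ... references removed" refers to. As a rough illustration only (this is not the agent's actual implementation, and `split_requirements` is a hypothetical helper name), a pip-freeze style list could be split into portable pins and local file references like this:

```python
import re

# Matches pip-freeze entries of the form "name @ file:///some/path".
LOCAL_REF = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*@\s*file://")


def split_requirements(lines):
    """Separate portable requirement lines from local file references.

    Hypothetical helper for illustration; returns (portable_lines, local_names).
    """
    portable, local = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = LOCAL_REF.match(line)
        if match:
            # Keep only the package name; the file path is machine-specific.
            local.append(match.group(1))
        else:
            portable.append(line)
    return portable, local


# A few complete entries taken from the list above.
freeze = [
    "absl-py==2.1.0",
    "anaconda-anon-usage @ file:///croot/anaconda-anon-usage_1710965072196/work",
    "annotated-types==0.7.0",
    "archspec @ file:///croot/archspec_1709217642129/work",
    "astor==0.8.1",
]
portable, local = split_requirements(freeze)
print("portable:", portable)
print("local file references:", local)
```

Packages that end up in the second list have no version pin left, so they would need to come from the Docker image or be pinned by hand.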
I can install the correct torch version with this command: pip install --pre torchvision --force-reinstall --index-url None
Using docker="ultralytics/ultralytics:latest" and docker_args=["--privileged"] seems to work!
We are using the allegroai/clearml:latest API server
Full log for the failed clone
I have set agent.package_manager.pip_version="" which resolved that message
@<1523701070390366208:profile|CostlyOstrich36> do you have any ideas?
ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
Traceback (most recent call last):
  File "/root/.clearml/venvs-builds/3.10/task_repository/script.py", line 36, in <module>
Traceback (most recent call last):
  File "/opt/conda/lib/python3.10/multiprocessing/queues.py", line 244, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/opt/conda/lib/python3.10/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protoco...
Setting ultralytics workers=0 seems to work as per the thread above!
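The bus error above points at shared memory, and Docker containers get a 64 MB `/dev/shm` by default unless a larger size is requested. A minimal sketch of checking this before training might look like the following, assuming a Linux container; `shm_total_bytes`, `pick_workers`, and the 256 MiB-per-worker budget are hypothetical choices for illustration, not values taken from ultralytics or ClearML:

```python
import os
import shutil


def shm_total_bytes(path="/dev/shm"):
    """Return the size of the shared-memory mount in bytes, or None if absent."""
    if not os.path.isdir(path):
        return None
    return shutil.disk_usage(path).total


def pick_workers(requested, shm_bytes, min_bytes_per_worker=256 * 1024**2):
    """Cap the worker count so each worker gets an (assumed) shm budget."""
    if shm_bytes is None:
        return 0  # unknown shm size: load data in the main process
    return max(0, min(requested, shm_bytes // min_bytes_per_worker))


workers = pick_workers(8, shm_total_bytes())
print(f"dataloader workers: {workers}")
```

With the default 64 MB mount this yields 0, which matches the `workers=0` setting that worked here; the resulting value would be passed as the `workers` training argument.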

