Answered
Hello Everyone ! Problem Description: I Have My Virtual Environment (Conda) In Which I Do Have Detectron2 Installed. When I Run The Task Locally It Is Working (Some Training Script). I Also Have Clearml Agent Installed. In My Config I Do Have Python Binar

Hello everyone!
Problem description: I have a conda virtual environment with detectron2 installed. When I run the task (a training script) locally, it works. I also have the ClearML agent installed. In my config, the Python binary path is set to the virtual environment's Python, and the package manager type is set to conda. It still looks like there is a version problem. When I try to run my code from the ClearML dashboard, I get this:
` 1666780820893 ivanNTB:1 INFO task c888921a793644b9bb448db857605405 pulled from e70c64ad56b1460cbaf2f3c0ad756719 by worker ivanNTB:1

1666780827836 ivanNTB:1 DEBUG Current configuration (clearml_agent v1.4.1, location: /tmp/.clearml_agent.q92u9oak.cfg):

api.version = 1.5
api.verify_certificate = true
api.default_version = 1.5
api.http.max_req_size = 15728640
api.http.retries.total = 240
api.http.retries.connect = 240
api.http.retries.read = 240
api.http.retries.redirect = 240
api.http.retries.status = 240
api.http.retries.backoff_factor = 1.0
api.http.retries.backoff_max = 120.0
api.http.wait_on_maintenance_forever = true
api.http.pool_maxsize = 512
api.http.pool_connections = 512
api.api_server =
api.web_server =
api.files_server =
api.credentials.access_key = S44CA87918WHCEKQ1KSB
api.host =
sdk.storage.cache.default_base_dir = ~/.clearml/cache
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.metrics.matplotlib_untitled_history_size = 100
sdk.metrics.images.format = JPEG
sdk.metrics.images.quality = 87
sdk.metrics.images.subsampling = 0
sdk.metrics.tensorboard_single_series_per_graph = false
sdk.network.metrics.file_upload_threads = 4
sdk.network.metrics.file_upload_starvation_warning_sec = 120
sdk.network.iteration.max_retries_on_server_error = 5
sdk.network.iteration.retry_backoff_factor_sec = 10
sdk.aws.s3.key =
sdk.aws.s3.region =
sdk.aws.boto3.pool_connections = 512
sdk.aws.boto3.max_multipart_concurrency = 16
sdk.log.null_log_propagate = false
sdk.log.task_log_buffer_capacity = 66
sdk.log.disable_urllib3_info = true
sdk.development.task_reuse_time_window_in_hours = 72.0
sdk.development.vcs_repo_detect_async = true
sdk.development.store_uncommitted_code_diff = true
sdk.development.support_stopping = true
sdk.development.default_output_uri =
sdk.development.force_analyze_entire_repo = false
sdk.development.suppress_update_message = false
sdk.development.detect_with_pip_freeze = false
sdk.development.worker.report_period_sec = 2
sdk.development.worker.ping_period_sec = 30
sdk.development.worker.log_stdout = true
sdk.development.worker.report_global_mem_used = false
agent.worker_id = ivanNTB:1
agent.worker_name = ivanNTB
agent.force_git_ssh_protocol = false
agent.python_binary = /home/ivan/miniconda3/envs/detectron2/bin/python
agent.package_manager.type = conda
agent.package_manager.pip_version = 22.2.2
agent.package_manager.system_site_packages = false
agent.package_manager.force_upgrade = false
agent.package_manager.conda_channels.0 = pytorch
agent.package_manager.conda_channels.1 = conda-forge
agent.package_manager.conda_channels.2 = defaults
agent.package_manager.priority_optional_packages.0 = pygobject
agent.package_manager.torch_nightly = false
agent.venvs_dir = /home/ivan/.clearml/venvs-builds.1
agent.venvs_cache.max_entries = 10
agent.venvs_cache.free_space_threshold_gb = 2.0
agent.vcs_cache.enabled = true
agent.vcs_cache.path = /home/ivan/.clearml/vcs-cache
agent.venv_update.enabled = false
agent.pip_download_cache.enabled = true
agent.pip_download_cache.path = /home/ivan/.clearml/pip-download-cache
agent.translate_ssh = true
agent.reload_config = false
agent.docker_pip_cache = /home/ivan/.clearml/pip-cache
agent.docker_apt_cache = /home/ivan/.clearml/apt-cache.1
agent.docker_force_pull = false
agent.default_docker.image = nvidia/cuda:11.3-cudnn7-runtime-ubuntu22.04
agent.enable_task_env = false
agent.hide_docker_command_env_vars.enabled = true
agent.hide_docker_command_env_vars.parse_embedded_urls = true
agent.abort_callback_max_timeout = 1800
agent.docker_internal_mounts.sdk_cache = /clearml_agent_cache
agent.docker_internal_mounts.apt_cache = /var/cache/apt/archives
agent.docker_internal_mounts.ssh_folder = ~/.ssh
agent.docker_internal_mounts.ssh_ro_folder = /.ssh
agent.docker_internal_mounts.pip_cache = /root/.cache/pip
agent.docker_internal_mounts.poetry_cache = /root/.cache/pypoetry
agent.docker_internal_mounts.vcs_cache = /root/.clearml/vcs-cache
agent.docker_internal_mounts.venv_build = ~/.clearml/venvs-builds
agent.docker_internal_mounts.pip_download = /root/.clearml/pip-download-cache
agent.apply_environment = true
agent.apply_files = true
agent.custom_build_script =
agent.git_user =
agent.cuda_version = 113
agent.ignore_requested_python_version = true
agent.default_python = 3.9
agent.cudnn_version = 0

Executing task id [c888921a793644b9bb448db857605405]:
repository = [URL for my project on git]
branch = master
version_num = df1684f5902885c4183d2fc5416675f22a1b2f8b
tag =
docker_cmd =
entry_point = ml_framework/training/train.py
working_dir = .

Executing Conda: /home/ivan/miniconda3/condabin/conda env remove -p /home/ivan/.clearml/venvs-builds.1 --quiet --json

Remove all packages in environment /home/ivan/.clearml/venvs-builds.1:

::: Python virtual environment cache is disabled. To accelerate spin-up time set agent.venvs_cache.path=~/.clearml/venvs-cache :::

Executing Conda: /home/ivan/miniconda3/condabin/conda create --yes --mkdir --prefix /home/ivan/.clearml/venvs-builds.1 python=

1666780833406 ivanNTB:1 DEBUG

Using cached repository in "/home/ivan/.clearml/vcs-cache/ML_framework.git.18e1bea6b665775a38a1efc47ff1548c/ML_framework.git"
Note: switching to 'df1684f5902885c4183d2fc5416675f22a1b2f8b'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by switching back to a branch.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -c with the switch command. Example:

git switch -c <new-branch-name>

Or undo this operation with:

git switch -

Turn off this advice by setting config variable advice.detachedHead to false

HEAD is now at df1684f Clean up include
type: git
url: [git url]
branch: HEAD
commit: df1684f5902885c4183d2fc5416675f22a1b2f8b
root: /home/ivan/.clearml/venvs-builds.1/task_repository/ML_framework.git
Applying uncommitted changes
Executing: ('git', 'apply', '--unidiff-zero'): b'<stdin>:38: trailing whitespace.\n \nwarning: 1 line adds whitespace errors.\n'

Executing Conda: /home/ivan/miniconda3/condabin/conda install -p /home/ivan/.clearml/venvs-builds.1 -c pytorch -c conda-forge -c defaults pip==22.2.2 --quiet --json
Pass
Conda: Trying to install requirements:
['cudatoolkit=11.3']
Executing Conda: /home/ivan/miniconda3/condabin/conda env update -p /home/ivan/.clearml/venvs-builds.1 --file /tmp/conda_envexdz15x1.yml --quiet --json

1666780839072 ivanNTB:1 DEBUG By downloading and using the CUDA Toolkit conda packages, you accept the terms and conditions of the CUDA End User License Agreement (EULA):

1666780844933 ivanNTB:1 DEBUG Pass
Conda: Installing requirements: step 2 - using pip:
['Send2Trash==1.8.0', 'clearml==1.7.2', 'detectron2==0.6+cu113', 'fvcore==0.1.5.post20220512', 'imgaug==0.4.0', 'numpy==1.23.4', 'omegaconf==2.2.3', 'open3d==0.15.2', 'opencv_python==4.6.0.66', 'pycocotools==2.0.5', 'pytest==7.1.3', 'scikit_learn==1.1.2', 'scipy==1.9.2', 'tensorboard==2.10.1', 'torch==1.10.0+cu113', 'torch_cluster==1.6.0', 'torchvision==0.11.1+cu113', 'tqdm==4.64.1', 'trimesh==3.15.5']
Defaulting to user installation because normal site-packages is not writeable
Requirement already satisfied: numpy==1.23.4 in ./.local/lib/python3.10/site-packages (1.23.4)
Torch CUDA 113 download page found
Found PyTorch version torch==1.10.0 matching CUDA version 113
Found PyTorch version torchvision==0.11.1 matching CUDA version 113
Defaulting to user installation because normal site-packages is not writeable
ERROR: torch-1.10.0+cu113-cp39-cp39-linux_x86_64.whl is not a supported wheel on this platform.
Command 'source /home/ivan/miniconda3/etc/profile.d/conda.sh && conda activate /home/ivan/.clearml/venvs-builds.1 && pip install -r /tmp/cached-reqs0j86sz3n.txt' returned non-zero exit status 1.

clearml_agent: ERROR: Could not install task requirements!
Command 'source /home/ivan/miniconda3/etc/profile.d/conda.sh && conda activate /home/ivan/.clearml/venvs-builds.1 && pip install -r /tmp/cached-reqs0j86sz3n.txt' returned non-zero exit status 1. `1666780845548 ivanNTB:1 DEBUG Process failed, exit code 1

  
  
Posted one year ago

Answers 14


you can edit the requirements section directly

  
  
Posted one year ago

Still not solved. I don't know if these dependencies are cached somewhere, but when I change requirements.txt or add the requirements manually in code, it still has problems with torch and keeps looking for 'torch==1.10.0+cu113'.

  
  
Posted one year ago

"you can edit the requirements section directly" <- Where? If I create a requirements.txt, it seems to be ignored.

  
  
Posted one year ago

or add requirements manually via code
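
A minimal sketch of the in-code route, assuming the `clearml` SDK's `Task.add_requirements`, which has to be called before `Task.init`. The pins, project name, and task name below are hypothetical placeholders, and `register_requirements` is a helper made up for this illustration, not part of ClearML.

```python
# Hypothetical pins for illustration; replace with the versions your task needs.
REQUIREMENTS = {
    "torch": "1.10.0",
    "torchvision": "0.11.1",
}


def register_requirements(task_cls, requirements):
    """Call task_cls.add_requirements(name, version) for every pin.

    task_cls is expected to be clearml.Task; it is passed in as an argument
    so the helper can be exercised without a ClearML server.
    Returns the sorted package names that were registered.
    """
    for name, version in requirements.items():
        task_cls.add_requirements(name, version)
    return sorted(requirements)


def main():
    # Imported here so the module itself loads without clearml installed.
    from clearml import Task

    # Requirements have to be registered before Task.init() is called.
    register_requirements(Task, REQUIREMENTS)
    Task.init(project_name="examples", task_name="detectron2 training")
```

Calling `main()` at the start of the training script (before any other ClearML call) should make the listed pins show up in the task's requirements. Whether the agent then resolves them the way you expect under conda is a separate question.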

  
  
Posted one year ago

Also, in the original experiment, what pytorch version is detected?

  
  
Posted one year ago

Can you please add the ~/clearml.conf for the agent? Also, are you trying to run everything on the same machine or different ones?

  
  
Posted one year ago

Hi ExasperatedCrocodile76, I noticed you are trying to install 'torch==1.10.0+cu113':
Conda: Installing requirements: step 2 - using pip: ['Send2Trash==1.8.0', 'clearml==1.7.2', 'detectron2==0.6+cu113', 'fvcore==0.1.5.post20220512', 'imgaug==0.4.0', 'numpy==1.23.4', 'omegaconf==2.2.3', 'open3d==0.15.2', 'opencv_python==4.6.0.66', 'pycocotools==2.0.5', 'pytest==7.1.3', 'scikit_learn==1.1.2', 'scipy==1.9.2', 'tensorboard==2.10.1', 'torch==1.10.0+cu113', 'torch_cluster==1.6.0', 'torchvision==0.11.1+cu113', 'tqdm==4.64.1', 'trimesh==3.15.5']
I think it should be pytorch instead. Check this comment in a related GH thread: https://github.com/pytorch/pytorch/issues/47354#issuecomment-852909264

  
  
Posted one year ago

The original experiment has PyTorch 1.10.0 and CUDA 113 ['1.10.0+cu113']. Everything was run on my local computer. In the virtual env I have these versions (however, the system itself has slightly newer ones).

  
  
Posted one year ago

I think that might be the issue. Switching from the pip to the conda package manager can sometimes be problematic. Try to manually edit the requirements to reflect the settings in https://pytorch.org/
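
A rough way to check whether a Python version mismatch is behind the "is not a supported wheel on this platform" line in the log: the wheel there is tagged `cp39`, while pip reports `python3.10/site-packages`, which suggests (the log does not confirm it) that pip ran under a different interpreter than the wheel was built for. The sketch below compares only the Python tag of a wheel filename against an interpreter version. It assumes CPython, ignores the ABI and platform tags, and does not handle compressed tag sets such as `cp39.cp310`. The function names are made up for this illustration.

```python
import sys


def wheel_python_tag(wheel_filename):
    """Return the python tag (e.g. 'cp39') from a wheel filename.

    Wheel names follow name-version(-build)-python-abi-platform.whl,
    so the python tag is the third field from the end.
    """
    stem = wheel_filename
    if stem.endswith(".whl"):
        stem = stem[: -len(".whl")]
    return stem.split("-")[-3]


def interpreter_tag(version_info=sys.version_info):
    """Return the CPython tag for a (major, minor, ...) version tuple."""
    return f"cp{version_info[0]}{version_info[1]}"


def matches_interpreter(wheel_filename, version_info=sys.version_info):
    """True if the wheel's python tag equals the interpreter's CPython tag."""
    return wheel_python_tag(wheel_filename) == interpreter_tag(version_info)


if __name__ == "__main__":
    wheel = "torch-1.10.0+cu113-cp39-cp39-linux_x86_64.whl"
    print("wheel tag:", wheel_python_tag(wheel))
    print("this interpreter:", interpreter_tag())
    print("python tag matches:", matches_interpreter(wheel))
```

Running this with the Python that pip actually uses inside the agent's environment shows whether the `cp39` wheel could be accepted there at all. If the tags differ, pinning the environment's Python version is worth trying before touching the torch pin.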

  
  
Posted one year ago

When I ran it locally, it was python script.py, and for the remote you are right.

  
  
Posted one year ago

But for the local execution, the conda virtual env named detectron2 was used.

  
  
Posted one year ago

What do you mean regarding the requirements, please? Is it enough to add a requirements.txt listing the packages to the root directory, or do you have to specify somewhere that you want to use this file? Thanks

  
  
Posted one year ago

ExasperatedCrocodile76, did you run the original experiment on a Linux machine with pip, and is the remote machine Linux with the conda package manager?

  
  
Posted one year ago