About

About clearml-agent daemon --docker:
I tried to run a training task with the YOLOv5 model (by Ultralytics).
I enqueued the task to the right queue from the web UI,
and the Docker container (image: ultralytics/yolov5:latest) is running, but the training task doesn't seem to be progressing at all.
In the console, I found some lines about libraries being installed, but nothing further happens.

What should I do to execute the training task remotely?
Thanks in advance for any advice or opinions!

  				
Posted one year ago by DangerousStarfish38


Answers 4

Hi @DangerousStarfish38, can you please add the full log of the task/agent? Also, please add the configuration and the command line you used to run the agent 🙂

  				
Posted one year ago by CostlyOstrich36

@CostlyOstrich36 Hi! Actually, I've now switched to running the training in a local environment (not Docker).
I ran a queued task from the web UI by clicking 'Enqueue', and I got this error:

# Error logs
agent.package_manager.system_site_packages = false
agent.package_manager.force_upgrade = false
agent.package_manager.conda_channels.0 = pytorch
agent.package_manager.conda_channels.1 = conda-forge
agent.package_manager.conda_channels.2 = defaults
agent.package_manager.priority_optional_packages.0 = pygobject
agent.package_manager.torch_nightly = false
agent.package_manager.poetry_files_from_repo_working_dir = false
agent.venvs_dir = C:/Users/20220375/.clearml/venvs-builds
agent.venvs_cache.max_entries = 10
agent.venvs_cache.free_space_threshold_gb = 2.0
agent.venvs_cache.path = ~/.clearml/venvs-cache
agent.vcs_cache.enabled = true
agent.vcs_cache.path = C:/Users/20220375/.clearml/vcs-cache
agent.venv_update.enabled = false
agent.pip_download_cache.enabled = true
agent.pip_download_cache.path = C:/Users/20220375/.clearml/pip-download-cache
agent.translate_ssh = true
agent.reload_config = false
agent.docker_pip_cache = C:/Users/20220375/.clearml/pip-cache
agent.docker_apt_cache = C:/Users/20220375/.clearml/apt-cache
agent.docker_force_pull = false
agent.default_docker.image = nvidia/cuda:11.0.3-cudnn8-runtime-ubuntu20.04
agent.enable_task_env = false
agent.hide_docker_command_env_vars.enabled = true
agent.hide_docker_command_env_vars.parse_embedded_urls = true
agent.abort_callback_max_timeout = 1800
agent.docker_internal_mounts.sdk_cache = /clearml_agent_cache
agent.docker_internal_mounts.apt_cache = /var/cache/apt/archives
agent.docker_internal_mounts.ssh_folder = ~/.ssh
agent.docker_internal_mounts.ssh_ro_folder = /.ssh
agent.docker_internal_mounts.pip_cache = /root/.cache/pip
agent.docker_internal_mounts.poetry_cache = /root/.cache/pypoetry
agent.docker_internal_mounts.vcs_cache = /root/.clearml/vcs-cache
agent.docker_internal_mounts.venv_build = ~/.clearml/venvs-builds
agent.docker_internal_mounts.pip_download = /root/.clearml/pip-download-cache
agent.apply_environment = true
agent.apply_files = true
agent.custom_build_script = 
agent.disable_task_docker_override = false
agent.git_user = 
agent.default_python = 3.9
agent.cuda_version = 116
agent.cudnn_version = 0
api.version = 1.5
api.verify_certificate = true
api.default_version = 1.5
api.http.max_req_size = 15728640
api.http.retries.total = 240
api.http.retries.connect = 240
api.http.retries.read = 240
api.http.retries.redirect = 240
api.http.retries.status = 240
api.http.retries.backoff_factor = 1.0
api.http.retries.backoff_max = 120.0
api.http.wait_on_maintenance_forever = true
api.http.pool_maxsize = 512
api.http.pool_connections = 512
api.api_server =


api.web_server =


api.files_server =


api.credentials.access_key = 2HC4ALTUTJ6T8KYG06VS
api.host =


sdk.storage.cache.default_base_dir = ~/.clearml/cache
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.metrics.matplotlib_untitled_history_size = 100
sdk.metrics.images.format = JPEG
sdk.metrics.images.quality = 87
sdk.metrics.images.subsampling = 0
sdk.metrics.tensorboard_single_series_per_graph = false
sdk.network.metrics.file_upload_threads = 4
sdk.network.metrics.file_upload_starvation_warning_sec = 120
sdk.network.iteration.max_retries_on_server_error = 5
sdk.network.iteration.retry_backoff_factor_sec = 10
sdk.aws.s3.key = 
sdk.aws.s3.region = 
sdk.aws.boto3.pool_connections = 512
sdk.aws.boto3.max_multipart_concurrency = 16
sdk.log.null_log_propagate = false
sdk.log.task_log_buffer_capacity = 66
sdk.log.disable_urllib3_info = true
sdk.development.task_reuse_time_window_in_hours = 72.0
sdk.development.vcs_repo_detect_async = true
sdk.development.store_uncommitted_code_diff = true
sdk.development.support_stopping = true
sdk.development.default_output_uri =


sdk.development.force_analyze_entire_repo = false
sdk.development.suppress_update_message = false
sdk.development.detect_with_pip_freeze = false
sdk.development.worker.report_period_sec = 2
sdk.development.worker.ping_period_sec = 30
sdk.development.worker.log_stdout = true
sdk.development.worker.report_global_mem_used = false
Executing task id [dd9c55dba3b34d19b160c1a54cd6f446]:
repository =


branch = master
version_num = 2d6d601b686e7991a96cad4f20e8cde3b6cbccde
tag = 
docker_cmd = ultralytics/yolov5:latest --ipc=host -e=CLEARML_AGENT_SKIP_PYTHON_ENV_INSTALL=1
entry_point = train.py
working_dir = .
error getting python3 version: Command '['python3', '--version']' returned non-zero exit status 9009.
C:\Users\20220375\AppData\Local\Programs\Python\Python39\python.exe: No module named virtualenv
clearml_agent: ERROR: Command '['python', '-m', 'virtualenv', 'C:\\Users\\20220375\\.clearml\\venvs-builds\\3.9']' returned non-zero exit status 1.
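
For anyone hitting the same two errors at the end of this log, here is a minimal diagnostic sketch in Python. It only checks the two things the log complains about: whether the virtualenv module can be imported by the interpreter, and whether a python3 executable is on PATH (exit status 9009 on Windows usually means the command was not found). The function name diagnose_agent_python is hypothetical and not part of clearml-agent, and the script has to be run with the same python.exe the agent uses for the result to be meaningful.

```python
import importlib.util
import shutil
import sys


def diagnose_agent_python():
    """Return a list of human-readable problems found in the local Python setup.

    Hypothetical helper for illustration; it is not part of clearml-agent.
    """
    problems = []

    # The log shows the agent running `python -m virtualenv ...`, which fails
    # with "No module named virtualenv" when the package is missing.
    if importlib.util.find_spec("virtualenv") is None:
        problems.append(
            "virtualenv is not installed for {0}; "
            "try: {0} -m pip install virtualenv".format(sys.executable)
        )

    # The log shows `python3 --version` failing with exit status 9009,
    # which on Windows usually means the command was not found on PATH.
    if shutil.which("python3") is None:
        problems.append("no `python3` executable was found on PATH")

    return problems


if __name__ == "__main__":
    for line in diagnose_agent_python() or ["no obvious problems found"]:
        print(line)
```

If the first check reports a problem, installing virtualenv into that interpreter is the likely next step. This is an assumption based only on the error text above, not a confirmed fix from the thread.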

  				
Posted one year ago by DangerousStarfish38

But I would also like to know how to run this with Docker!
This is the log file!
Thanks a lot!!

  				
Posted one year ago by DangerousStarfish38

I found this in the docs.
Am I not supposed to run the agent in Docker mode on a Windows computer?

  				
Posted one year ago by DangerousStarfish38
