@<1585078763312386048:profile|ArrogantButterfly10> can you attach the pipeline controller's log?
[notice] To update, run: pip install --upgrade pip
1687953244763 si-sajalv:0 ERROR User aborted: stopping task (3)
1687953245766 si-sajalv:0 DEBUG Current configuration (clearml_agent v1.5.2, location: /tmp/clearml.conf):
----------------------
sdk.storage.cache.default_base_dir = /clearml_agent_cache
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.metrics.matplotlib_untitled_history_size = 100
sdk.metrics.images.format = JPEG
sdk.metrics.images.quality = 87
sdk.metrics.images.subsampling = 0
sdk.metrics.tensorboard_single_series_per_graph = false
sdk.network.metrics.file_upload_threads = 4
sdk.network.metrics.file_upload_starvation_warning_sec = 120
sdk.network.iteration.max_retries_on_server_error = 5
sdk.network.iteration.retry_backoff_factor_sec = 10
sdk.aws.s3.key =
sdk.aws.s3.region =
sdk.aws.boto3.pool_connections = 512
sdk.aws.boto3.max_multipart_concurrency = 16
sdk.log.null_log_propagate = false
sdk.log.task_log_buffer_capacity = 66
sdk.log.disable_urllib3_info = true
sdk.development.task_reuse_time_window_in_hours = 72.0
sdk.development.vcs_repo_detect_async = true
sdk.development.store_uncommitted_code_diff = true
sdk.development.support_stopping = true
sdk.development.default_output_uri =
sdk.development.force_analyze_entire_repo = false
sdk.development.suppress_update_message = false
sdk.development.detect_with_pip_freeze = false
sdk.development.worker.report_period_sec = 2
sdk.development.worker.ping_period_sec = 30
sdk.development.worker.log_stdout = true
sdk.development.worker.report_global_mem_used = false
agent.worker_id = si-sajalv:0
agent.worker_name = si-sajalv
agent.force_git_ssh_protocol = false
agent.python_binary =
agent.package_manager.type = pip
agent.package_manager.pip_version.0 = <20.2 ; python_version < '3.10'
agent.package_manager.pip_version.1 = <22.3 ; python_version >\= '3.10'
agent.package_manager.system_site_packages = true
agent.package_manager.force_upgrade = false
agent.package_manager.conda_channels.0 = pytorch
agent.package_manager.conda_channels.1 = conda-forge
agent.package_manager.conda_channels.2 = defaults
agent.package_manager.priority_optional_packages.0 = pygobject
agent.package_manager.torch_nightly = false
agent.package_manager.poetry_files_from_repo_working_dir = false
agent.package_manager.conda_env_as_base_docker = false
agent.venvs_dir = /root/.clearml/venvs-builds
agent.venvs_cache.max_entries = 10
agent.venvs_cache.free_space_threshold_gb = 2.0
agent.venvs_cache.path = /root/.clearml/venvs-cache
agent.vcs_cache.enabled = true
agent.vcs_cache.path = /root/.clearml/vcs-cache
agent.venv_update.enabled = false
agent.pip_download_cache.enabled = true
agent.pip_download_cache.path = /root/.clearml/pip-download-cache
agent.translate_ssh = true
agent.reload_config = false
agent.docker_pip_cache = /app/C:/Users/sajal.vasal/.clearml/pip-cache
agent.docker_apt_cache = /app/C:/Users/sajal.vasal/.clearml/apt-cache
agent.docker_force_pull = false
agent.default_docker.image = nvidia/cuda:10.2-cudnn7-runtime-ubuntu18.04
agent.enable_task_env = false
agent.hide_docker_command_env_vars.enabled = true
agent.hide_docker_command_env_vars.parse_embedded_urls = true
agent.abort_callback_max_timeout = 1800
agent.docker_internal_mounts.sdk_cache = /clearml_agent_cache
agent.docker_internal_mounts.apt_cache = /var/cache/apt/archives
agent.docker_internal_mounts.ssh_folder = ~/.ssh
agent.docker_internal_mounts.ssh_ro_folder = /.ssh
agent.docker_internal_mounts.pip_cache = /root/.cache/pip
agent.docker_internal_mounts.poetry_cache = /root/.cache/pypoetry
agent.docker_internal_mounts.vcs_cache = /root/.clearml/vcs-cache
agent.docker_internal_mounts.venv_build = ~/.clearml/venvs-builds
agent.docker_internal_mounts.pip_download = /root/.clearml/pip-download-cache
agent.apply_environment = true
agent.apply_files = true
agent.custom_build_script =
agent.disable_task_docker_override = false
agent.git_user = hotshotdragon
agent.default_python = 3.10
agent.cuda_version = 122
agent.cudnn_version = 0
api.version = 1.5
api.verify_certificate = true
api.default_version = 1.5
api.http.max_req_size = 15728640
api.http.retries.total = 240
api.http.retries.connect = 240
api.http.retries.read = 240
api.http.retries.redirect = 240
api.http.retries.status = 240
api.http.retries.backoff_factor = 1.0
api.http.retries.backoff_max = 120.0
api.http.wait_on_maintenance_forever = true
api.http.pool_maxsize = 512
api.http.pool_connections = 512
api.api_server = None
api.web_server = None
api.files_server = None
api.credentials.access_key = ***
api.host = None
Executing task id [080b31ec96124229bc5dc6a1955f58de]:
repository = None
branch = main
version_num = 4f79399df585f502828e8613cfc0688fe7cefada
tag =
docker_cmd = clearml-pipeline:0.4
entry_point = control.py
working_dir = .
Hey, I seem to have resolved this issue, but I'm stuck on another one.
Apparently, even after all the tasks of a pipeline have completed, the pipeline itself keeps running, and I had to abort it manually. Am I missing some code to stop it once all the tasks have finished executing?
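For reference, here is a minimal sketch of a controller script that waits for its steps and then closes the pipeline explicitly. It assumes the pipeline is built with `PipelineController` from the `clearml` package; the pipeline name, project name, template task name, and queue name are hypothetical placeholders, and only the step name `stage_data` comes from the console output in this thread. Whether a missing `wait()`/`stop()` is the actual cause here is not established in the thread, and on recent `clearml` versions some of these calls may be redundant.

```python
def run_pipeline(step_queue="default"):
    """Build a one-step pipeline, run it, and close the controller explicitly."""
    # Imported lazily so this module can be loaded without clearml installed.
    from clearml import PipelineController

    pipe = PipelineController(
        name="example-pipeline",  # hypothetical pipeline name
        project="examples",       # hypothetical project name
        version="0.0.1",
    )
    # Queue the steps are enqueued into; an agent must be serving it.
    pipe.set_default_execution_queue(step_queue)
    pipe.add_step(
        name="stage_data",
        base_task_project="examples",          # hypothetical project
        base_task_name="stage_data template",  # hypothetical template task
    )
    # Run the controller logic in this process; steps still go to the queue.
    pipe.start_locally()
    # Block until every step has finished, then mark the pipeline as done.
    pipe.wait()
    pipe.stop()
    return pipe
```

Calling `run_pipeline("default")` needs a configured ClearML server and an agent listening on that queue, so it is not something to run as-is.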
It was stuck here. I had to abort manually. All the tasks completed though.
Environment setup completed successfully
Starting Task Execution:
2023-06-28 12:31:52
ClearML results page: None
2023-06-28 12:31:58
ClearML pipeline page: None
2023-06-28 12:32:05
Launching the next 1 steps
Launching step [stage_data]
2023-06-28 12:32:16
Launching step: stage_data
Parameters:
{'General/dataset_url': 'None '}
Configurations:
OrderedDict()
Overrides:
OrderedDict()
This is all I can find in the log, or am I looking at the wrong thing?
If the step is pending, it basically means nothing takes it from the queue and executes it - look at the agent's log and try and see what's going on (is it monitoring the queue? is it pulling the tasks?)
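The check described above comes down to comparing the queues the steps are sent to with the queues that agents are listening on. A small sketch of that comparison follows; the data shape (a dict mapping worker ID to queue names) and the queue names `default` and `services` are assumptions for illustration, not something read from the server in this thread. Only the worker ID `si-sajalv:0` comes from the log.

```python
def unserved_queues(step_queues, worker_queues):
    """Return the queues that steps use but that no worker listens on."""
    served = {q for queues in worker_queues.values() for q in queues}
    return sorted(set(step_queues) - served)


# Hypothetical snapshot: worker id -> queues that worker monitors.
worker_queues = {"si-sajalv:0": ["services"]}

# Steps enqueued into "default" would stay pending in this snapshot.
print(unserved_queues(["default"], worker_queues))  # → ['default']
```

An empty result would mean every step queue has at least one agent monitoring it, in which case the agent's own log is the next place to look.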
I'll explain what happened: I ran this code, " None ", and all the steps of the pipeline ran.
So the individual steps of the pipeline ran, but when I look at the pipeline in the dashboard it shows as running continuously and does not end, even after all the tasks have completed.
The part above is from the pipeline's console.
You said the pipeline completed running (one stage? more than one stage?) but I don't see that in the log?
Hi @<1585078763312386048:profile|ArrogantButterfly10> , do you have an agent monitoring the queue into which the pipeline steps are enqueued?