Answered
Hi, while running a pipeline by task, the pipeline is stuck in the 1st stage only. Console is showing

Hi, while running a pipeline by task, the pipeline is stuck in the 1st stage only. The console is showing
Configurations:
OrderedDict()
Overrides:
OrderedDict()

and not moving forward. When inspecting the queue, the task is in a pending state. Can anyone help?

  
  
Posted one year ago

Answers 10


yes

  
  
Posted one year ago

[notice] To update, run: pip install --upgrade pip

1687953244763 si-sajalv:0 ERROR User aborted: stopping task (3)

1687953245766 si-sajalv:0 DEBUG Current configuration (clearml_agent v1.5.2, location: /tmp/clearml.conf):
----------------------
sdk.storage.cache.default_base_dir = /clearml_agent_cache
sdk.storage.cache.size.min_free_bytes = 10GB
sdk.storage.direct_access.0.url = file://*
sdk.metrics.file_history_size = 100
sdk.metrics.matplotlib_untitled_history_size = 100
sdk.metrics.images.format = JPEG
sdk.metrics.images.quality = 87
sdk.metrics.images.subsampling = 0
sdk.metrics.tensorboard_single_series_per_graph = false
sdk.network.metrics.file_upload_threads = 4
sdk.network.metrics.file_upload_starvation_warning_sec = 120
sdk.network.iteration.max_retries_on_server_error = 5
sdk.network.iteration.retry_backoff_factor_sec = 10
sdk.aws.s3.key =
sdk.aws.s3.region =
sdk.aws.boto3.pool_connections = 512
sdk.aws.boto3.max_multipart_concurrency = 16
sdk.log.null_log_propagate = false
sdk.log.task_log_buffer_capacity = 66
sdk.log.disable_urllib3_info = true
sdk.development.task_reuse_time_window_in_hours = 72.0
sdk.development.vcs_repo_detect_async = true
sdk.development.store_uncommitted_code_diff = true
sdk.development.support_stopping = true
sdk.development.default_output_uri =
sdk.development.force_analyze_entire_repo = false
sdk.development.suppress_update_message = false
sdk.development.detect_with_pip_freeze = false
sdk.development.worker.report_period_sec = 2
sdk.development.worker.ping_period_sec = 30
sdk.development.worker.log_stdout = true
sdk.development.worker.report_global_mem_used = false
agent.worker_id = si-sajalv:0
agent.worker_name = si-sajalv
agent.force_git_ssh_protocol = false
agent.python_binary =
agent.package_manager.type = pip
agent.package_manager.pip_version.0 = <20.2 ; python_version < '3.10'
agent.package_manager.pip_version.1 = <22.3 ; python_version >\= '3.10'
agent.package_manager.system_site_packages = true
agent.package_manager.force_upgrade = false
agent.package_manager.conda_channels.0 = pytorch
agent.package_manager.conda_channels.1 = conda-forge
agent.package_manager.conda_channels.2 = defaults
agent.package_manager.priority_optional_packages.0 = pygobject
agent.package_manager.torch_nightly = false
agent.package_manager.poetry_files_from_repo_working_dir = false
agent.package_manager.conda_env_as_base_docker = false
agent.venvs_dir = /root/.clearml/venvs-builds
agent.venvs_cache.max_entries = 10
agent.venvs_cache.free_space_threshold_gb = 2.0
agent.venvs_cache.path = /root/.clearml/venvs-cache
agent.vcs_cache.enabled = true
agent.vcs_cache.path = /root/.clearml/vcs-cache
agent.venv_update.enabled = false
agent.pip_download_cache.enabled = true
agent.pip_download_cache.path = /root/.clearml/pip-download-cache
agent.translate_ssh = true
agent.reload_config = false
agent.docker_pip_cache = /app/C:/Users/sajal.vasal/.clearml/pip-cache
agent.docker_apt_cache = /app/C:/Users/sajal.vasal/.clearml/apt-cache
agent.docker_force_pull = false
agent.default_docker.image = nvidia/cuda:10.2-cudnn7-runtime-ubuntu18.04
agent.enable_task_env = false
agent.hide_docker_command_env_vars.enabled = true
agent.hide_docker_command_env_vars.parse_embedded_urls = true
agent.abort_callback_max_timeout = 1800
agent.docker_internal_mounts.sdk_cache = /clearml_agent_cache
agent.docker_internal_mounts.apt_cache = /var/cache/apt/archives
agent.docker_internal_mounts.ssh_folder = ~/.ssh
agent.docker_internal_mounts.ssh_ro_folder = /.ssh
agent.docker_internal_mounts.pip_cache = /root/.cache/pip
agent.docker_internal_mounts.poetry_cache = /root/.cache/pypoetry
agent.docker_internal_mounts.vcs_cache = /root/.clearml/vcs-cache
agent.docker_internal_mounts.venv_build = ~/.clearml/venvs-builds
agent.docker_internal_mounts.pip_download = /root/.clearml/pip-download-cache
agent.apply_environment = true
agent.apply_files = true
agent.custom_build_script =
agent.disable_task_docker_override = false
agent.git_user = hotshotdragon
agent.default_python = 3.10
agent.cuda_version = 122
agent.cudnn_version = 0
api.version = 1.5
api.verify_certificate = true
api.default_version = 1.5
api.http.max_req_size = 15728640
api.http.retries.total = 240
api.http.retries.connect = 240
api.http.retries.read = 240
api.http.retries.redirect = 240
api.http.retries.status = 240
api.http.retries.backoff_factor = 1.0
api.http.retries.backoff_max = 120.0
api.http.wait_on_maintenance_forever = true
api.http.pool_maxsize = 512
api.http.pool_connections = 512
api.api_server = None
api.web_server = None
api.files_server = None
api.credentials.access_key = ***
api.host = None

Executing task id [080b31ec96124229bc5dc6a1955f58de]:
repository = None
branch = main
version_num = 4f79399df585f502828e8613cfc0688fe7cefada
tag =
docker_cmd = clearml-pipeline:0.4
entry_point = control.py
working_dir = .

  
  
Posted one year ago

It was stuck here. I had to abort manually. All the tasks completed though.

  
  
Posted one year ago

You said the pipeline completed running (one stage? more than one?), but I don't see that in the log.

  
  
Posted one year ago

Hey, I seem to have resolved this issue, but I'm stuck on another one.
Apparently, even after all the tasks of a pipeline are completed, the pipeline itself is still running, and I had to abort it manually. Am I missing any code to stop it after all the tasks execute?
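
In case it helps: a minimal sketch of a controller that ends cleanly once all steps finish, assuming the standard clearml PipelineController API (the project and task names below are placeholders, not the ones from this thread):

from clearml import PipelineController

# Placeholder project/task names, for illustration only.
pipe = PipelineController(name="pipeline demo", project="examples", version="0.0.1")
pipe.set_default_execution_queue("default")
pipe.add_step(
    name="stage_data",
    base_task_project="examples",
    base_task_name="pipeline step 1 dataset artifact",
)

# Option 1: run the controller logic in this process. This call returns only
# after every step has finished, and the controller task is then marked completed.
pipe.start_locally(run_pipeline_steps_locally=False)

# Option 2: launch the controller on an agent (by default via the "services" queue).
# pipe.start(queue="services")

print("pipeline completed")

If the controller is launched remotely with start(), an agent also has to be listening on the services queue, otherwise the controller task itself stays pending.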

  
  
Posted one year ago

Environment setup completed successfully
Starting Task Execution:
2023-06-28 12:31:52
ClearML results page: None
2023-06-28 12:31:58
ClearML pipeline page: None
2023-06-28 12:32:05
Launching the next 1 steps
Launching step [stage_data]
2023-06-28 12:32:16
Launching step: stage_data
Parameters:
{'General/dataset_url': ' None '}
Configurations:
OrderedDict()
Overrides:
OrderedDict()

This is all I can find in the log. Or am I looking at the wrong thing?
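
The "Launching step: stage_data" lines with the General/dataset_url parameter are what a pipeline built from existing tasks prints when it clones and enqueues a step. Continuing the sketch earlier in the thread, the declaration would look roughly like this (the task names and URL are placeholders; the real URL was scrubbed to None in the log above):

# Hypothetical declaration of the "stage_data" step seen in the console log.
pipe.add_step(
    name="stage_data",
    base_task_project="examples",                       # placeholder
    base_task_name="pipeline step 1 dataset artifact",  # placeholder
    # Overrides the cloned task's General/dataset_url hyperparameter; this is
    # the value echoed under "Parameters:" in the console output.
    parameter_override={"General/dataset_url": "https://example.com/data.csv"},
)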

  
  
Posted one year ago

If the step is pending, it basically means nothing is taking it from the queue and executing it. Look at the agent's log and try to see what's going on (is it monitoring the queue? is it pulling tasks?)
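
For reference, a worker is attached to a queue with the clearml-agent CLI, something like this (the queue name is a placeholder):

clearml-agent daemon --queue default

The agent's console output should then show it polling that queue and pulling the pipeline step tasks; if no agent is attached to the queue, enqueued steps stay pending indefinitely.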

  
  
Posted one year ago

Hi @<1585078763312386048:profile|ArrogantButterfly10> , do you have an agent monitoring the queue into which the pipeline steps are enqueued?

  
  
Posted one year ago

I'll explain what happened: I ran this code, " None " , so all the steps of the pipeline ran.

So the individual parts of the pipeline ran, but in the dashboard the pipeline shows as running continuously and does not end even after all the tasks are completed.
The part above is from the pipeline's console.

  
  
Posted one year ago

@<1585078763312386048:profile|ArrogantButterfly10> can you attach the pipeline controller's log?

  
  
Posted one year ago