Answered

Hello! I was hoping I could get some debug help. I've set up a ClearML pipeline using the PipelineController, and when running through pipeline.start_locally() (but run_pipeline_steps_locally=False) everything works perfectly, with tasks being queued and run successfully. When I run remotely (pipeline.start()), I can see the pipeline task created and the environment set up within that task. It seems to proceed normally, and the output ends with
Environment setup completed successfully
Starting Task Execution:
However, it seems to be entirely hanging here in the "Running" state. All stages of the pipeline are "Pending", none of them are actually enqueued anywhere, and it seems like nothing is happening.

Does this sound like some obvious mistake I'm making, or is there any advice on where to look to see what's going on? Using clearml==1.7.3rc1 and clearml-agent==1.4.1, and I have a worker listening to the services queue.

  
  
Posted one year ago

Answers 6


Yup, code/git reference is there. Will private message you the log

  
  
Posted one year ago

sets up the venv correctly, prints Starting Task Execution: then does nothing

Can you provide a log?
Do you see the code/git reference in the Pipeline Task details, under the Execution tab?

  
  
Posted one year ago

Yup! Have two queues: services, with one worker spun up in --services-mode, and another queue (say foo) that has a bunch of GPU workers on it. When I start the pipeline locally, jobs get sent off to foo and executed exactly how I'd expect. If I keep everything exactly the same, and just change pipeline.start_locally() -> pipeline.start(), the pipeline task itself is picked up by the worker in the services queue, sets up the venv correctly, prints Starting Task Execution: then does nothing 😕
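
For anyone following along, here is a rough sketch of the two launch modes being compared. It is not the original poster's code: the pipeline name, project, and the preprocess step are made up for illustration, and the queue names services and foo are taken from the description above. It assumes clearml is installed and configured against a server.

```python
def preprocess(x: int) -> int:
    # Hypothetical pipeline step; stands in for the real work.
    return x * 2


def build_and_start(run_remotely: bool):
    # Imported lazily so the sketch can be loaded without clearml configured.
    from clearml import PipelineController

    pipe = PipelineController(
        name="debug-pipeline",   # hypothetical name
        project="examples",      # hypothetical project
        version="0.0.1",
    )
    # Steps are enqueued on the compute queue ("foo" in the post above).
    pipe.set_default_execution_queue("foo")
    pipe.add_function_step(
        name="preprocess",
        function=preprocess,
        function_kwargs={"x": 21},
        function_return=["y"],
    )

    if run_remotely:
        # The controller task itself is enqueued, so an agent listening on
        # "services" has to pick it up and run the pipeline logic.
        # Note: the local process is expected to stop once it is enqueued.
        pipe.start(queue="services")
    else:
        # The controller logic runs in this process; the steps are still
        # sent to the execution queue rather than run locally.
        pipe.start_locally(run_pipeline_steps_locally=False)
    return pipe
```

The only difference between the working and hanging cases is which branch is taken, which suggests looking at what the controller does on the services agent after Starting Task Execution: rather than at the steps themselves.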

  
  
Posted one year ago

It just seems frozen at the place where it should be spinning up the tasks within the pipeline

And is there an agent for those? Usually there is one agent for running logic tasks (like pipelines), started with --services-mode, which means multiple Tasks can be executed by the same agent, and other agents for compute Tasks that run a single Task per agent (but you can run multiple agents on the same machine).

  
  
Posted one year ago

Yup, there was an agent listening to the services queue, it picked up the pipeline job and started to execute it. It just seems frozen at the place where it should be spinning up the tasks within the pipeline

  
  
Posted one year ago

Hi SteadySeagull18

However, it seems to be entirely hanging here in the "Running" state.

Did you set an agent to listen to the "services" queue?
Someone needs to run the pipeline logic itself; it is sometimes part of the clearml-server deployment, but not a must.

  
  
Posted one year ago