Answered

Hi there, does anyone have suggestions for best practice for deploying a pipeline so that it can run remotely on a ClearML server using a Docker image? I am finding the ClearML docs and videos insufficient to get the pipeline to actually run to completion remotely.

Steps I have done:

  • Written a pipeline with PipelineDecorator with one component and one pipeline function, executing_pipeline().
  • In the ClearML UI, created two queues: queue-services and queue-forecasting.
  • In the pipeline code, assigned the pipeline_execution_queue for the controller to queue-services.
  • Assigned the other queue, queue-forecasting, to the pipeline component using the execution_queue parameter.
  • Added a main.py which calls executing_pipeline().
  • Built a Docker image with the packages installed (as we use internal packages on CodeArtifact) and specified an entry point, main.py.
  • On the ClearML server, started two agents (workers), both in --docker mode using the above Docker image:
    • One called worker-services in --services-mode and attached to queue queue-services
    • One called worker-forecasting and attached to queue queue-forecasting
  • Back in my IDE, I run python main.py and it starts and then switches to remote execution.
    BUT, in the ClearML UI:
  • Under Pipelines, I can see the pipeline says running up until it gets to "Launching step [step_name]" and then just hangs there.
  • If I go to the Experiments tab I can see the task for this step just stays in "Pending" mode.
  • Under the Workers and queues tab, I can see the queue queue-forecasting with the worker assigned and under "Next experiment" is the step name. But, nothing happens.
    Any tips or ideas?!
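
For reference, the two agents described above can be sketched roughly as follows. This is a minimal illustration, not the exact commands that were run: the image name is a hypothetical placeholder, the helper agent_command is made up for this example, and setting the worker names (worker-services, worker-forecasting) is left out. The snippet only builds the command lines and does not start anything.

```python
from typing import List


def agent_command(queue: str, image: str, services_mode: bool = False) -> List[str]:
    """Build the argument list for one clearml-agent daemon running in Docker mode."""
    cmd = ["clearml-agent", "daemon", "--queue", queue, "--docker", image]
    if services_mode:
        # Services mode lets one agent run several lightweight tasks
        # (such as pipeline controllers) at the same time.
        cmd.append("--services-mode")
    return cmd


# Hypothetical image name; replace with the image that holds the internal packages.
IMAGE = "my-registry/forecasting:latest"

# Agent for the pipeline controller (queue-services).
services_cmd = agent_command("queue-services", IMAGE, services_mode=True)

# Agent for the pipeline component (queue-forecasting).
forecasting_cmd = agent_command("queue-forecasting", IMAGE)

print(" ".join(services_cmd))
print(" ".join(forecasting_cmd))
```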
  
  
Posted 9 months ago

Answers 3


Which gives me an idea. Could you please remove the entrypoint from the Docker image altogether and try again?

Overriding the entrypoint in the image can cause docker run/docker exec to fail, because everything is then run through your entrypoint instead of a shell.
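
One way to confirm whether an image still has an entrypoint baked in is to look at the Config.Entrypoint field in the JSON that docker inspect prints for the image. Below is a minimal sketch of that check; the helper name image_entrypoint and both sample payloads are made up for illustration, and the samples are trimmed down to the one field that matters here.

```python
import json
from typing import List, Optional


def image_entrypoint(inspect_output: str) -> Optional[List[str]]:
    """Return the entrypoint from `docker inspect <image>` JSON output, or None if unset."""
    data = json.loads(inspect_output)
    # docker inspect prints a JSON array with one object per inspected item.
    config = data[0].get("Config") or {}
    # An unset entrypoint appears as null (or an empty list).
    return config.get("Entrypoint") or None


# Made-up, trimmed payloads: one image with an entrypoint, one without.
with_entrypoint = '[{"Config": {"Entrypoint": ["python", "main.py"], "Cmd": null}}]'
without_entrypoint = '[{"Config": {"Entrypoint": null, "Cmd": ["bash"]}}]'

print(image_entrypoint(with_entrypoint))
print(image_entrypoint(without_entrypoint))
```

If the first kind of result comes back for your image, rebuilding it without the ENTRYPOINT line is the change suggested above.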

  
  
Posted 9 months ago

Ok, thanks! Going to try this now. I included an entry point after reading some other messages on Slack here while trying to figure out how to use Docker for running remotely.

  
  
Posted 9 months ago

Hey @GorgeousShrimp11, can you abort all pending experiments that are waiting to be fetched from this queue and try again? Off the top of my head, it could be that the clearml-agent can't pull the custom Docker image. In general, you should treat Docker images not as step definitions but only as the environment, so setting the entrypoint is not necessary.

  
  
Posted 9 months ago