Answered
Anyone Here With Any Idea Why My Service Tasks Get Aborted When Going To Sleep?

Anyone here with any idea why my service tasks get aborted when going to sleep?

  
  
Posted 11 months ago

Answers 9


Log

  
  
Posted 11 months ago

Hi @<1523701868901961728:profile|ReassuredTiger98> ! Looks like the task somehow gets run by both an agent and locally at the same time, so one of them is aborted. Any idea why this might happen?

  
  
Posted 11 months ago

Hmm okay let me check that, I think I understand the issue

  
  
Posted 11 months ago

Well, after restarting the agent (to set it into --detached mode), it set cleanup_task.py into service mode, but my monitoring tasks are just executed on the agent itself (no new service clearml-agent is started) and are then aborted right after starting.

  
  
Posted 11 months ago

@<1523701435869433856:profile|SmugDolphin23> Good catch. I have a good but unsatisfying message for you guys: I restarted the whole machine (server and agent) and now it works fine ...

  
  
Posted 11 months ago

Okay, I found something out: when I use the docker image ubuntu:22.04, it does not spin up a service agent and aborts the task. When I use python:latest, everything works fine!

  
  
Posted 11 months ago

There might be something wrong with the agent using ubuntu:22.04. Anyway, good to know everything works fine now.

  
  
Posted 11 months ago

Hi @<1523701868901961728:profile|ReassuredTiger98>

Anyone here with any idea why my service tasks get aborted when going to sleep?

I think I understand the issue. You are on clearml==1.4.0; try running with the latest clearml (1.10.x).
It will keep pinging the backend with "I'm alive" so the backend does not think this process is dead (which I suspect is what happened: after 2 hours the backend set the Task to aborted because it "thought" it was killed).
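
To illustrate the keep-alive idea described above, here is a minimal toy sketch using only the Python standard library. It is not ClearML's actual implementation: the `Backend` class, `run_sleeping_task`, and the timeout values are hypothetical names and numbers invented for this example. It only shows why a process that sleeps without pinging can be marked as aborted by a server-side watchdog, while one with a background heartbeat is not.

```python
import threading
import time


class Backend:
    """Toy stand-in (hypothetical) for a server that aborts silent tasks."""

    def __init__(self, timeout_sec):
        self.timeout_sec = timeout_sec
        self.last_seen = time.monotonic()
        self.status = "in_progress"
        self._lock = threading.Lock()

    def ping(self):
        # Record that the task reported "I'm alive".
        with self._lock:
            self.last_seen = time.monotonic()

    def check(self):
        # Watchdog: mark the task aborted if it has been silent too long.
        with self._lock:
            silent_for = time.monotonic() - self.last_seen
            if self.status == "in_progress" and silent_for > self.timeout_sec:
                self.status = "aborted"
            return self.status


def run_sleeping_task(backend, sleep_sec, heartbeat_sec=None):
    """Sleep for sleep_sec, optionally pinging the backend in the background."""
    stop = threading.Event()

    def heartbeat():
        # Event.wait returns False on timeout, so this pings every heartbeat_sec
        # until stop is set.
        while not stop.wait(heartbeat_sec):
            backend.ping()

    worker = None
    if heartbeat_sec is not None:
        worker = threading.Thread(target=heartbeat, daemon=True)
        worker.start()

    time.sleep(sleep_sec)  # the "going to sleep" part of the task

    stop.set()
    if worker is not None:
        worker.join()
    return backend.check()
```

Calling `run_sleeping_task(Backend(timeout_sec=0.3), sleep_sec=0.6)` should end with the toy backend reporting the task as aborted, while adding `heartbeat_sec=0.05` should keep it in progress. In the real setup, the timeout would be the roughly 2 hours mentioned above.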

  
  
Posted 11 months ago

With clearml==1.4.1 it works, but with the current version it aborts. Here is a log with the latest clearml.

  
  
Posted 11 months ago