Unanswered
Hi, what could be the reason that a task running on an agent just stopped updating? The status is still "Running" but it doesn't seem like it is. The agent is running in Docker on a GPU. It completed 92 epochs and started 93. Run started at 18:37 Feb 27, last


Looked in clearml_server/logs/apiserver.log:
The last report is at 2023-02-28 08:39:27,981. Nothing wrong.
Looking for the last update messages at 03:21:
[2023-02-28 03:21:21,380] [9] [INFO] [clearml.service_repo] Returned 200 for events.add_batch in 46ms
[2023-02-28 03:21:25,103] [9] [INFO] [clearml.service_repo] Returned 200 for tasks.ping in 8ms
[2023-02-28 03:21:25,119] [9] [INFO] [clearml.service_repo] Returned 200 for tasks.get_all in 7ms
[2023-02-28 03:21:25,128] [9] [INFO] [clearml.service_repo] Returned 200 for queues.get_all in 5ms
[2023-02-28 03:21:25,145] [9] [INFO] [clearml.service_repo] Returned 200 for queues.get_next_task in 10ms
[2023-02-28 03:21:26,454] [9] [INFO] [clearml.service_repo] Returned 200 for events.add_batch in 61ms
[2023-02-28 03:21:30,142] [9] [INFO] [clearml.service_repo] Returned 200 for tasks.ping in 8ms

Looks fine - these lines repeat throughout the entire log.
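A minimal sketch of how the same check could be scripted instead of read by eye: it scans lines in the format shown in the excerpt above and records the latest timestamp seen for each API endpoint. The function name `last_seen_per_endpoint` and the regular expression are illustrative assumptions, not part of ClearML. Note that lines in this format carry no task or agent ID, so a recent `tasks.ping` may come from any client, not necessarily the stalled task.

```python
import re
from datetime import datetime

# Assumed pattern, derived only from the log lines quoted in this post.
LINE_RE = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\] "
    r"\[\d+\] \[\w+\] \[clearml\.service_repo\] "
    r"Returned (?P<code>\d+) for (?P<endpoint>[\w.]+) in \d+ms"
)


def last_seen_per_endpoint(lines):
    """Return {endpoint: latest datetime} for every matching log line."""
    last = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip lines that are not in the expected format
        ts = datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S,%f")
        endpoint = m.group("endpoint")
        if endpoint not in last or ts > last[endpoint]:
            last[endpoint] = ts
    return last


# Hypothetical usage against the log file mentioned in this post:
# with open("clearml_server/logs/apiserver.log") as f:
#     for endpoint, ts in sorted(last_seen_per_endpoint(f).items()):
#         print(endpoint, ts)
```

If `events.add_batch` stops long before `tasks.ping` does, that would suggest the process is still alive but no longer reporting, which is worth comparing against the agent-side files.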

Looking in /tmp/.clearml_agent_daemon_outsw6p97f4.txt:
The file was last modified at 3:18. The last lines are from the middle of epoch 92, exactly as reported on the webserver.

Looking in /tmp/.clearml_agent_out.t3g81c0n.txt:
The file was last modified at 3:21, and the last line is exactly as reported on the webserver.
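For reference, the modification times of both agent output files can be read in one go with a small helper like the one below. This is only a sketch; `last_modified` is a hypothetical name, and the paths in the usage comment are the ones quoted in this post.

```python
import os
from datetime import datetime


def last_modified(paths):
    """Return {path: local modification datetime, or None if unreadable}."""
    result = {}
    for path in paths:
        try:
            result[path] = datetime.fromtimestamp(os.path.getmtime(path))
        except OSError:
            result[path] = None  # file missing or not accessible
    return result


# Hypothetical usage with the two files mentioned above:
# for path, ts in last_modified([
#     "/tmp/.clearml_agent_daemon_outsw6p97f4.txt",
#     "/tmp/.clearml_agent_out.t3g81c0n.txt",
# ]).items():
#     print(path, ts)
```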

Can't see anything abnormal.

  
  
Posted one year ago
185 Views
0 Answers