CourageousCoyote72
Moderator
1 Question, 4 Answers
  Active since 15 July 2025
  Last activity one month ago

Reputation

0

Badges 1

4 × Eureka!
0 Votes
16 Answers
243 Views
one month ago
0 Hello everyone, we’re encountering a persistent issue with our autoscaler setup and could really use some help. Despite having the autoscaler running and the queue (default_cpu) properly populated (87 jobs pending), the tasks are never picked up and exe

Unfortunately, the issue is only partially resolved: while some jobs are running on one instance, on another instance (default_gpu), our jobs are still pending… 😢
image
image

one month ago

Hello
Sorry for my late reply.

I’m running into an issue with my default_gpu queue: the ClearML auto-scaler detects the job and puts it into the “Pending” state, but it never actually runs. From the auto-scaler logs (see screenshot 1), this seems expected since it only checks the queue every 5 minutes. I’ve also attached the relevant log file.

However, I don’t see anything in the logs that clearly explains the problem. Looking at AWS, I can see that the instance starts, stays in “Initializi...

one month ago

I do not see any artifacts linked to the jobs in the default_gpu queue. We have not changed the configuration; as a debugging step, we simply restarted the instance.
image

one month ago