Answered
Hello, I Am Running My Own Instance Of The Clearml-Server. All Works As Expected, But Sometimes My Training Tasks Get Stuck For 40+ Minutes (While Usually Taking About 5 Minutes) With The Following Log :

Hello,
I am running my own instance of the clearml-server.
All works as expected, but sometimes my training tasks get stuck for 40+ minutes (while usually taking about 5 minutes) with the following log:
2022-02-01 15:58:35,921 - clearml.Task - INFO - Waiting for previous model to upload
Any idea what is happening?

  
  
Posted 2 years ago

Answers 10


Well, I assume ClearML SDK is waiting for models to be uploaded to the fileserver - is your self-hosted server remote, or on the same network? Can it be you have a limited bandwidth to it?

  
  
Posted 2 years ago

Yeah, it is async, but when taking a snapshot it will wait for the previous model to finish uploading, I think.

  
  
Posted 2 years ago

Okay, thanks! I'll try this; if it does not work, I'll just deactivate the automatic detection feature.
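
For reference, a minimal sketch of what deactivating the automatic detection might look like, assuming it is done through the auto_connect_frameworks argument of Task.init. The helper name task_init_kwargs, the project/task names, and the choice of "pytorch" as the framework are illustrative assumptions, not details from this thread.

```python
def task_init_kwargs(project_name, task_name, disabled_frameworks):
    """Build keyword arguments for clearml.Task.init with automatic
    logging switched off for the named frameworks (hypothetical helper)."""
    return {
        "project_name": project_name,
        "task_name": task_name,
        # Per-framework switches: each listed framework is explicitly disabled.
        "auto_connect_frameworks": {name: False for name in disabled_frameworks},
    }


# Assumed values for illustration only.
kwargs = task_init_kwargs("examples", "train", ["pytorch"])

# With ClearML installed and configured, the task would then be created as:
# from clearml import Task
# task = Task.init(**kwargs)
```

With automatic detection off for a framework, model snapshots from that framework are no longer picked up on their own, so they would have to be reported explicitly if they are still wanted.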

  
  
Posted 2 years ago

Is there a way to make it synchronous?

  
  
Posted 2 years ago

How are you reporting your models?

  
  
Posted 2 years ago

So if you have very large snapshots taken close to one another, one might wait for the other for quite some time.

  
  
Posted 2 years ago

I don't really know. I just detected it automatically from the start, so I haven't looked into it yet.

  
  
Posted 2 years ago

Oh, if it's using the automagic, then it's always in the background, which means it's also async.

  
  
Posted 2 years ago

You can try calling task.flush(wait_for_uploads=True).
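
A rough sketch of how that call might be placed in a training script, assuming the goal is to block right after each snapshot until pending uploads are done. The helper name save_and_wait and the save_fn callback are hypothetical; only task.flush(wait_for_uploads=True) comes from the suggestion itself. The task is passed in as an argument so the helper can be tried without a server.

```python
import time


def save_and_wait(task, save_fn):
    """Save a snapshot, then block until the task's pending uploads finish.

    task    -- an object exposing flush(wait_for_uploads=...), e.g. a clearml Task
    save_fn -- hypothetical callback that writes the model snapshot
    Returns the number of seconds spent waiting in flush().
    """
    save_fn()
    start = time.monotonic()
    # Blocks here instead of later at "Waiting for previous model to upload".
    task.flush(wait_for_uploads=True)
    return time.monotonic() - start


# With ClearML installed and configured, usage might look like:
# from clearml import Task
# task = Task.init(project_name="examples", task_name="train")
# waited = save_and_wait(task, lambda: torch.save(model.state_dict(), "model.pt"))
```

Logging the returned wait time per snapshot would also show whether the 40+ minute stalls really come from the upload step.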

  
  
Posted 2 years ago

The fileserver is remote, but the bandwidth is not an issue.
Is the automatic artifact storage of ClearML async? (Meaning that even if the task is finished, it could still be uploading associated artifacts?)

  
  
Posted 2 years ago