Answered
Why Does ClearML Still Waste Time On Requirement Analysis When I Provide Them? Any Tips For How I Can Reduce ClearML Overhead ... (The Time Before Work Actually Starts)?

Why does ClearML still waste time on requirements analysis when I provide the requirements myself?
Any tips for how I can reduce ClearML overhead ... (the time before work actually starts)?

  
  
Posted 6 months ago

Answers 9


Hi @<1689446563463565312:profile|SmallTurkey79>
This call sets the requirements of an existing (already created) Task. Since the Task was just created, it waits for the automatic package detection to finish before overriding it.
What you want is "Task.force_requirements_env_freeze" (notice it is a class-level method that needs to be called before Task.init):

from clearml import Task

# Class-level call: must run before Task.init()
Task.force_requirements_env_freeze(requirements_file="requirements.txt")
task = Task.init(...)
  
  
Posted 6 months ago

yup! that's what I was wondering if you'd help me find a way to change the timings of. Is there an option I can override to make the retry more aggressive?

I've definitely narrowed it down to the reverse proxy I'm behind. When I switch to a Cloudflare tunnel, the network overhead is <1s compared to localhost, and everything feels snappy!

But for security reasons, I need to keep using the reverse proxy, hence my question about configuring the silent clearml retries.

  
  
Posted 6 months ago

I've seen parameters connect and task create in seconds and other times it takes 4 minutes.

This might be your backend (clearml-server) replying slowly because of load?

Is there a way (at the class level) to control the retry logic on connecting to the API server?

The difference in the two screenshots is literally only the URLs in clearml.conf and it went from 30s down to 2-3s.

Yes, that could be the network. Also notice that there are automatic retries that are silent: if a request is dropped due to network issues, a timeout, etc., it will automatically retry (but of course it will look slow from the outside because of the retries).
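
To illustrate why silent retries can look like a long stall from the outside, here is a minimal sketch of how exponential backoff delays accumulate. The formula (retry n sleeps backoff_factor * 2^(n-1) seconds, up to a cap) is an assumption modeled on common HTTP retry schemes, not a confirmed description of ClearML's internals; the function name and the cap value are hypothetical.

```python
def total_backoff_seconds(backoff_factor: float, retries: int, cap: float = 120.0) -> float:
    """Sum the sleep time across `retries` consecutive failed attempts.

    Assumed model (not confirmed for ClearML): retry n sleeps
    min(cap, backoff_factor * 2 ** (n - 1)) seconds.
    """
    return sum(min(cap, backoff_factor * 2 ** (n - 1)) for n in range(1, retries + 1))


if __name__ == "__main__":
    # Compare a large and a small backoff factor over the same number of retries.
    for factor in (1.0, 0.1):
        print(f"factor={factor}: {total_backoff_seconds(factor, retries=6):.1f}s waited")
```

Under this assumed model the accumulated wait scales linearly with the backoff factor, so a smaller factor shortens the stall caused by a run of dropped requests without changing how many retries are attempted.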

  
  
Posted 6 months ago

Hi @<1689446563463565312:profile|SmallTurkey79>, the trigger for the detection is the call to Task.init(), so while calling set_packages() overrides any packages found, it will not prevent the detection from running.

  
  
Posted 6 months ago

yup! that's what I was wondering if you'd help me find a way to change the timings of. Is there an option I can override to make the retry more aggressive?

you mean wait for less?
add to your clearml.conf:

api.http.retries.backoff_factor = 0.1
  
  
Posted 6 months ago

thanks for the clarification. is there any bypass? (a git diff + git rev-parse should take mere milliseconds)

I'm working out of a monorepo, and am beginning to suspect it's a cause of the slowness. Next week I'll try moving a pipeline over to a new repo to test whether this theory holds any water.

  
  
Posted 6 months ago

thanks so much!
I've been running a bunch of tests with timers and seeing an absurd amount of variance. I've seen parameters connect and task create in seconds, and other times it takes 4 minutes.

Since I see timeout connection errors somewhat regularly, I'm wondering if perhaps I'm having networking errors. Is there a way (at the class level) to control the retry logic on connecting to the API server?

My operating theory is that some sort of backoff / timeout (e.g. 10s) is causing the high variance.

To test this, I spun up a local instance and pointed at localhost as well as the normal reverse proxy, and found that localhost had "overhead times" that were completely reasonable: practically none at all.

The difference in the two screenshots is literally only the URLs in clearml.conf and it went from 30s down to 2-3s.
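
For anyone wanting to reproduce this kind of comparison, a generic timing helper along these lines is one way to measure the spread. This is only a sketch: `time_call` and `fake_step` are hypothetical names, and in a real test the callable would wrap whichever step is being measured (for example, task creation against each server URL).

```python
import statistics
import time
from typing import Callable, List


def time_call(fn: Callable[[], object], repeats: int = 5) -> List[float]:
    """Run `fn` `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def fake_step() -> None:
    # Placeholder for the real step being measured.
    time.sleep(0.01)


if __name__ == "__main__":
    samples = time_call(fake_step, repeats=5)
    print(f"min={min(samples):.3f}s max={max(samples):.3f}s "
          f"stdev={statistics.stdev(samples):.3f}s")
```

Comparing min, max, and standard deviation across several runs makes it easier to tell a consistently slow path from one that is usually fast but occasionally stalls.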

(server has been destroyed already, not worried about the keys showing)
[two screenshots comparing the reverse-proxy and localhost runs]

  
  
Posted 6 months ago

thank you very much.

for remote workers, would this env variable get parsed correctly?
CLEARML_API_HTTP_RETRIES_BACKOFF_FACTOR=0.1

  
  
Posted 6 months ago

I'm not familiar with this one, I think you should be able to control it with:

CLEARML_AGENT__API__HTTP__RETRIES__BACKOFF_FACTOR
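
The variable name above appears to follow a pattern: a CLEARML_AGENT__ prefix, then the configuration path in upper case with each dot replaced by a double underscore. Here is a small sketch of that mapping, assuming the pattern generalizes to other keys. The helper name is hypothetical, and it only builds the name; it does not confirm that the agent honors the resulting variable.

```python
def agent_env_var(config_path: str) -> str:
    """Build an env var name from a dotted config path.

    Assumption: the pattern seen in this thread (prefix + upper-cased path,
    dots replaced by double underscores) applies to the given key.
    """
    return "CLEARML_AGENT__" + config_path.upper().replace(".", "__")


if __name__ == "__main__":
    print(agent_env_var("api.http.retries.backoff_factor"))
    # prints: CLEARML_AGENT__API__HTTP__RETRIES__BACKOFF_FACTOR
```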
  
  
Posted 6 months ago