Answered
Hi, I've Got A Quick Question About

Hi, I've got a quick question about Task.connect. I'm trying to connect a dictionary which contains a few thousand hashes. It works fine locally but takes a long time from the GCP VM that the task is running on. I think this is because of the limited egress available. Does Task.connect send each element of the dictionary as a separate API request? Has anyone else encountered this issue?

  
  
Posted 2 years ago

Answers 12


Yeah it's strange isn't it!

  
  
Posted 2 years ago

Where is the clearml-server running? GCP as well?

  
  
Posted 2 years ago

SuperiorPanda77 I have to admit, not sure what would cause the slowness only on GCP ... (if anything I would expect the network infrastructure would be faster)

  
  
Posted 2 years ago

connect_configuration seems to take about the same amount of time unfortunately!

I think it is a better solution. That said, from your description it sounds like the issue is the upload bandwidth (i.e. json-ing the dict itself), could that be it?
(and even 1000 entries seems like something that would end up at a 1 MB upload, which is not that much)
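To sanity-check the size estimate above, here is a minimal sketch that builds a dictionary of a few thousand hashes and measures its JSON-encoded size. The key format, the use of SHA-256, and the 3000-entry count are assumptions for illustration only; the real dictionary from the question may look different.

```python
import hashlib
import json


def build_hash_dict(n_entries):
    """Build a hypothetical {name: sha256-hex} dictionary with n_entries items."""
    return {
        f"file_{i:05d}": hashlib.sha256(str(i).encode("utf-8")).hexdigest()
        for i in range(n_entries)
    }


def json_payload_bytes(params):
    """Return the size in bytes of the dictionary once serialized to JSON."""
    return len(json.dumps(params).encode("utf-8"))


hashes = build_hash_dict(3000)
size = json_payload_bytes(hashes)
print(f"{len(hashes)} entries -> {size / 1024:.1f} KiB of JSON")
```

With entries of this shape the payload stays well under 1 MB, which is consistent with the point that the raw upload size alone is small.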

  
  
Posted 2 years ago

the time taken to upload halved. It is puzzling because as you say it's not that much to upload.

Maybe it was the load on the server? meaning dealing with multiple requests at the same time delayed the requests?!

For now I've whittled down the number of entries to a more select but useful few and that has solved the issue. If it crops up again I will try connect_configuration properly.
Thanks for your help!

My pleasure 🙂

  
  
Posted 2 years ago

That said, maybe the connect dict is not the best solution for thousand key dictionary

Seems like it isn't haha!
What is the difference with connect_configuration? The nice thing about it not being an artifact is that we can use the GUI to see which hashes have changed (which admittedly is tricky anyway when there are a few thousand)

  
  
Posted 2 years ago

I realise I made a mistake and hadn't actually used connect_configuration!

I think the issue is the bandwidth yeah, for example when I doubled the number of CPUs (which doubles the allowed egress) the time taken to upload halved. It is puzzling because as you say it's not that much to upload.

For now I've whittled down the number of entries to a more select but useful few and that has solved the issue. If it crops up again I will try connect_configuration properly.

Thanks for your help!

  
  
Posted 2 years ago

Does Task.connect send each element of the dictionary as a separate api request? Has anyone else encountered this issue?

Hi SuperiorPanda77
Task.connect ends up as a single call, with all the data sent in a single request.
That said, maybe the connect dict is not the best solution for a dictionary with thousands of keys ...
Maybe an artifact or connect_configuration is better suited?
wdyt?
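As a rough illustration of the three options mentioned here, the sketch below wraps them in one helper. `register_hashes`, the `mode` argument, and the project and task names are hypothetical names made up for this example; only `Task.init`, `Task.connect`, `Task.connect_configuration`, and `Task.upload_artifact` are ClearML calls. It is not a statement about which option is faster.

```python
def register_hashes(task, hashes, mode="configuration"):
    """Attach `hashes` to a task using one of the three approaches discussed."""
    if mode == "connect":
        # Dictionary is connected as task parameters.
        return task.connect(hashes)
    if mode == "configuration":
        # Dictionary is stored as a single named configuration object.
        return task.connect_configuration(hashes, name="hashes")
    if mode == "artifact":
        # Dictionary is uploaded as an artifact attached to the task.
        return task.upload_artifact("hashes", artifact_object=hashes)
    raise ValueError(f"unknown mode: {mode!r}")


def run_with_clearml():
    """Example entry point; needs the clearml package and a configured server."""
    from clearml import Task

    # Hypothetical project and task names, for illustration only.
    task = Task.init(project_name="examples", task_name="connect hashes")
    hashes = {f"file_{i:05d}": "0" * 64 for i in range(3000)}
    register_hashes(task, hashes, mode="configuration")
```

Calling `run_with_clearml()` on both the local machine and the GCP VM, once per mode and with a timer around `register_hashes`, would be one way to compare the approaches.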

  
  
Posted 2 years ago

Maybe it was the load on the server? meaning dealing with multiple requests at the same time delayed the requests?!

Possibly, but I think the server was fine as I could run the same task locally and it took a few seconds (rather than 75) to upload. The egress limit on the agent was 32 Gbps, which seems much larger than what I thought I was sending, but I don't have a good idea of what that limit actually means in practice!
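To put those numbers side by side, here is a small back-of-the-envelope sketch. The 1 MB payload is an assumption borrowed from the rough estimate elsewhere in this thread, not a measured value; the 32 Gbps limit and the 75 seconds are the figures quoted above.

```python
def transfer_seconds(payload_bytes, link_bits_per_second):
    """Ideal time to push payload_bytes through a link of the given speed."""
    return payload_bytes * 8 / link_bits_per_second


def implied_bits_per_second(payload_bytes, seconds):
    """Effective throughput implied by sending payload_bytes in `seconds`."""
    return payload_bytes * 8 / seconds


payload = 1_000_000   # assumed payload size in bytes (about 1 MB)
egress_limit = 32e9   # 32 Gbps egress limit quoted for the VM
observed = 75         # seconds observed on the GCP VM

ideal = transfer_seconds(payload, egress_limit)
effective = implied_bits_per_second(payload, observed)
print(f"ideal transfer time at the egress limit: {ideal * 1000:.2f} ms")
print(f"throughput implied by {observed} s: {effective / 1000:.1f} kbit/s")
```

Under that assumption the egress cap is several orders of magnitude away from explaining a 75 second upload, which suggests the delay comes from something other than raw bandwidth.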

  
  
Posted 2 years ago

connect_configuration seems to take about the same amount of time unfortunately!

  
  
Posted 2 years ago

Yep, GCP. I wonder if it's something to do with Container-Optimized OS, which is how I'm running the agents

  
  
Posted 2 years ago

Can't think of a reason it will have such an effect ...

  
  
Posted 2 years ago