Answered

Hello guys,
I have a strange situation with a Pipeline Controller I'm testing atm. If I run the controller directly from PyCharm on my notebook, it connects correctly to the K8s cluster with Trains installed. After this, if I go directly to the UI, reset the Controller, and enqueue it again (in the default queue), it runs until the end but doesn't clone the tasks and run them.
From the notebook I got
Launching step: stage_data Parameters: None
That is ok, while resetting and enqueueing ends without launching any step. The agent on the cluster is 0.16.4 with dockerMode false. Any suggestion?

  
  
Posted 3 years ago

Answers 29


\o/

  
  
Posted 3 years ago

No worries, just found it. Thanks!
I'll make sure to followup on the GitHub issue for better visibility 🙂

  
  
Posted 3 years ago

today I'm in the middle of sprint plannings for my team so I will not be probably fast to help if needed, but feel free to ping me just in case (I will try to do my best)

  
  
Posted 3 years ago

Hi JuicyFox94
I think you are correct, this bug will explain the entire thing.
Basically what happens is that remote_execute stops the local run before the configuration is set on the Task. Then, when running remotely, the code pulls the configuration, sees that it is empty, and does nothing.
Let me see if I can reproduce it...
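
The ordering problem described above can be modelled with a toy sketch. This uses only plain Python; `run_pipeline`, the `backend` dict, and the `stop_before_config_saved` flag are hypothetical names made up for illustration and are not part of Trains/ClearML. It only shows why an empty stored configuration leads to a remote run that launches nothing.

```python
def run_pipeline(backend, steps, remotely, stop_before_config_saved=False):
    """Toy model of a pipeline controller run (hypothetical, not the real API).

    backend: dict standing in for the configuration stored on the Task.
    Returns the list of step names that were launched.
    """
    launched = []
    if not remotely:
        if stop_before_config_saved:
            # The local run is stopped before the configuration is stored.
            return launched
        # Normal local run: store the pipeline configuration on the "Task".
        backend["Pipeline"] = list(steps)
    else:
        # Remote run: ignore the local definition and pull what was stored.
        steps = backend.get("Pipeline", [])
    for step in steps:
        launched.append(step)
    return launched


# Local run stopped too early, then executed remotely: nothing is launched.
broken_backend = {}
run_pipeline(broken_backend, ["stage_data"], remotely=False,
             stop_before_config_saved=True)
remote_steps = run_pipeline(broken_backend, ["stage_data"], remotely=True)
```

Under these assumptions `remote_steps` ends up empty, which matches the symptom in the question: the controller runs to the end without launching any step.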

  
  
Posted 3 years ago

I finally found where the issue is, I opened an issue on gh so it's more manageable:
https://github.com/allegroai/clearml/issues/273

I also found a not so good (at least for me) behaviour:
https://github.com/allegroai/clearml/issues/272

  
  
Posted 3 years ago

My pleasure

  
  
Posted 3 years ago

a big ty for now

  
  
Posted 3 years ago

Sure thing 🙂

  
  
Posted 3 years ago

and I will give you feedback here

  
  
Posted 3 years ago

maybe this can cause the issue?

Not likely.
In the original pipeline (the one executed from PyCharm), do you see the "Pipeline" section under Configuration -> "Config objects" in the UI?

  
  
Posted 3 years ago

perfect, now the whole process is clear to me

  
  
Posted 3 years ago

I mean this blob is then saved on the fs

It can if you do:
temp_file = task.connect_configuration('/path/to/config/file', name='configuration object is a config file')
Then temp_file is actually a local copy of the text coming from the Task.
When running in manual mode, the content of '/path/to/config/file' is stored on the Task. When running remotely by the agent, the content from the Task is dumped into a temp file and the path to the file is returned in temp_file.
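
A minimal stdlib-only sketch of the two modes described above. `FakeTask`, `backend_store`, and `demo` are hypothetical stand-ins and not the Trains/ClearML implementation; only the shape of the `connect_configuration(path, name=...)` call is taken from the post, and returning the original path in manual mode is an assumption.

```python
import os
import tempfile


class FakeTask:
    """Hypothetical stand-in for a Task; NOT the real Trains/ClearML class."""

    def __init__(self, backend_store, running_remotely):
        self.backend_store = backend_store  # plays the role of the server-side Task
        self.running_remotely = running_remotely

    def connect_configuration(self, path, name):
        if not self.running_remotely:
            # Manual run: read the local file and store its text on the "Task".
            with open(path) as f:
                self.backend_store[name] = f.read()
            return path  # assumption: the original path is handed back
        # Remote run: dump the stored text into a temp file, return that path.
        fd, temp_path = tempfile.mkstemp(suffix=".cfg")
        with os.fdopen(fd, "w") as f:
            f.write(self.backend_store.get(name, ""))
        return temp_path


def demo():
    backend = {}
    fd, local_cfg = tempfile.mkstemp(suffix=".cfg")
    with os.fdopen(fd, "w") as f:
        f.write("lr: 0.1\n")

    # Manual mode: the file content ends up in the backend store.
    FakeTask(backend, running_remotely=False).connect_configuration(
        local_cfg, name="config")
    os.remove(local_cfg)  # the agent machine does not have the original file

    # Remote mode: the content comes back from the store via a temp file.
    temp_file = FakeTask(backend, running_remotely=True).connect_configuration(
        local_cfg, name="config")
    with open(temp_file) as f:
        content = f.read()
    os.remove(temp_file)
    return content
```

In this sketch the remote side never touches '/path/to/config/file' itself, so agents that do not share a filesystem still see the same text.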

  
  
Posted 3 years ago

maybe this can cause the issue?

  
  
Posted 3 years ago

I have multiple agents not sharing /root/.trains

  
  
Posted 3 years ago

I mean this blob is then saved on the fs

  
  
Posted 3 years ago

what is?

  
  
Posted 3 years ago

?

  
  
Posted 3 years ago

btw it's on the filesystem

  
  
Posted 3 years ago

👍

  
  
Posted 3 years ago

got it

  
  
Posted 3 years ago

My bad, there is a mix-up in terms.
"configuration object" is just a dictionary (or plain text) stored on the Task itself.
It has no file representation (well, you could get it dumped to a file, but it is actually stored as a blob of text on the Task itself, on the backend side).

  
  
Posted 3 years ago

this configuration object is stored as a file in /root/.trains ?

  
  
Posted 3 years ago

If the manual execution (i.e. pycharm) was working it should have stored it on the Pipeline Task.

  
  
Posted 3 years ago

checking it 😄

  
  
Posted 3 years ago

I have a theory

  
  
Posted 3 years ago

Good, so we narrowed it down. Now the question is how come it is empty ?

  
  
Posted 3 years ago

yaya confirmed

  
  
Posted 3 years ago

then I enqueue it and it's created but obv empty

  
  
Posted 3 years ago

ok the issue must be there, After first creation nothing is there

  
  
Posted 3 years ago