SillyPuppy19
Moderator
2 Questions, 7 Answers
  Active since 10 January 2023
  Last activity one year ago

Reputation

0

Badges 1

7 × Eureka!
0 Votes
10 Answers
457 Views
3 years ago
0 Votes
4 Answers
579 Views
3 years ago
0 Another Conundrum: I Have A Single Script That Launches Training Jobs For Various Models. It Does This By Accepting A Flag Which Is The Model Name, And Dynamically Loading The Module To Train It. This Didn't Mesh Well With Trains, Because The Project And

AgitatedDove14 sorry if that wasn't clear. I think the issue is that when trains-agent runs the script, none of the flag values are set until the Task object is initialized. For that to happen, the task object needs to know what project/task to connect to, which I presume is via the project_name and task_name parameters.

If those parameters are themselves dependent on flags, then they will be uninitialized when trains-agent runs the script, as it does not run it with any comman...
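To make the dependency concrete, here is a minimal sketch of the pattern being described: a model-name flag drives both the dynamically loaded module and the project/task names. Everything named here (`--model`, `parse_model_flag`, `resolve_names`, the `training/` prefix, the fallback value) is hypothetical and not taken from my actual script. The fallback only stands in for the case where the script is started with no command-line arguments, and the real `Task.init(project_name=..., task_name=...)` call is left out so the snippet runs without trains installed.

```python
import argparse
import importlib


def parse_model_flag(argv=None):
    """Read only the (hypothetical) --model flag; ignore any other arguments."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--model", default=None)
    args, _unknown = parser.parse_known_args(argv)
    return args.model


def resolve_names(model_name, fallback="unknown_model"):
    """Derive project and task names from the flag value.

    When the script is launched without arguments the flag is None,
    so an explicit fallback is used instead of an uninitialized value.
    """
    name = model_name or fallback
    return f"training/{name}", f"train_{name}"


def load_model_module(model_name):
    """Dynamically import the module named by the flag."""
    return importlib.import_module(model_name)
```

With this shape, the names passed to the task are always defined, even when no flags are supplied. Whether the fallback names are acceptable depends on how the task is later matched to a project, which this sketch does not address.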

3 years ago
0 Hi Everyone! Quick Question: I Have A Script That Allows The Model To Be Saved Out In Case Of An Early Exit. At The Moment The Script Is Catching The Sigint And Sigterm Signals, Ending The Training And Writing Out The Model. I Understand I Could Use Check

SuccessfulKoala55 that's good to know. I moved the signal handler registration above the call to Task.init() as you suggested. This is what I should be seeing when the script is terminated manually:

` I0526 07:46:14.391154 140262441822016 engine.py:837] Engine run starting with max_epochs=100.
I0526 07:46:14.542132 140262441822016 train_utils.py:223] Epoch[1] Iter[1] Loss: 0.43599218130111694
I0526 07:46:24.078526 140262441822016 train_utils.py:46] 2 signal intercepted.
I0526 07:46:24.078...

3 years ago
0 Hi Everyone! Quick Question: I Have A Script That Allows The Model To Be Saved Out In Case Of An Early Exit. At The Moment The Script Is Catching The Sigint And Sigterm Signals, Ending The Training And Writing Out The Model. I Understand I Could Use Check

AgitatedDove14 I'm definitely after a graceful abort from a long experiment. I don't necessarily want to throw the state away but I don't want to have to recover everything from checkpoints, hence the save-on-terminate. If there's another way I should be looking at it I'd love to get your thoughts.
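For reference, the save-on-terminate idea looks roughly like the sketch below. It is a simplified stand-in, not my actual training code: `StopRequest`, `install`, `train`, `run_epoch` and `save_model` are all hypothetical names. The handler only records that a signal arrived; the loop checks the flag between epochs, stops, and writes the model out. It does not account for any handler the Task object installs itself.

```python
import signal


class StopRequest:
    """Signal handler that only records which signal was received."""

    def __init__(self):
        self.signum = None

    def __call__(self, signum, frame):
        self.signum = signum

    @property
    def requested(self):
        return self.signum is not None


def install(stop):
    """Register the handler for SIGINT and SIGTERM (main thread only)."""
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, stop)


def train(max_epochs, stop, run_epoch, save_model):
    """Run epochs until done or a stop is requested, then save once."""
    completed = 0
    for epoch in range(1, max_epochs + 1):
        if stop.requested:
            break
        run_epoch(epoch)
        completed = epoch
    save_model(completed)
    return completed
```

Keeping the handler down to setting a flag means the save happens in normal control flow rather than inside the signal handler, at the cost of waiting for the current epoch to finish.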

3 years ago
0 Hi Everyone! Quick Question: I Have A Script That Allows The Model To Be Saved Out In Case Of An Early Exit. At The Moment The Script Is Catching The Sigint And Sigterm Signals, Ending The Training And Writing Out The Model. I Understand I Could Use Check

Ah, the 2 second grace period answers a question I had. I tried to hijack the Task's signal handler to see if I could do my exit cleanup and then run the Task's handler, but it didn't seem to work. I think I must have triggered the 2s cooldown and had my task terminated.

I think I can work around this right now by running my tasks manually without trains-agent, but I'd love a way to do something on exit. AgitatedDove14 I'd be happy to create an issue. I think the solution might be a bit more in...
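What I attempted was roughly the chaining pattern below, shown as a generic sketch using only the standard `signal` module. `make_chained` and `install_chained` are hypothetical helper names, and nothing here reflects how the Task's own handler is implemented. As noted above, cleanup that takes longer than the grace period may still be cut short, and this sketch does nothing about that.

```python
import signal


def make_chained(cleanup, previous):
    """Build a handler that runs cleanup, then defers to the previous handler."""

    def handler(signum, frame):
        cleanup()
        # signal.SIG_DFL / signal.SIG_IGN / None are not callable, so skip them.
        if callable(previous):
            previous(signum, frame)

    return handler


def install_chained(signum, cleanup):
    """Wrap whatever handler is currently registered for signum (main thread only)."""
    previous = signal.getsignal(signum)
    signal.signal(signum, make_chained(cleanup, previous))
    return previous
```

The order of registration matters: `install_chained` has to run after the handler it is meant to wrap has been registered, otherwise it wraps the default handler instead.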

3 years ago