I'm not sure that this is exactly that, though I wish to continue from a given checkpoint.
Also, will this overwrite graphs starting at a given step?
That's great for continuing from the last checkpoint, but, unless I misunderstand you, my intention is different:
Suppose I trained a model for 30k epochs overnight, and looking at the graphs, I wish to get back to the 22k-th epoch and retrain it from there differently, while preserving all the history up to that point.
So, I start by cloning the task, and... what can I do then to "get back" to the previous epoch? This means that I would like all metrics, logs, checkpoints, etc. from the 22k...
You might be able to also find out exactly what needs to be pickled using the f_code of the function (but that's limited to the C implementation of Python).
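A rough sketch of that idea, assuming the function's code object is what is meant (on a function it is exposed as `__code__`; `f_code` is the equivalent attribute on a frame). The helper name `referenced_globals` is made up for illustration, and it only looks one level deep:

```python
import types


def referenced_globals(func: types.FunctionType) -> dict:
    """Return the module-level names that `func` refers to, with their current values."""
    found = {}
    for name in func.__code__.co_names:
        # co_names also lists builtins and attribute names, so keep only
        # names that really exist in the function's global namespace.
        if name in func.__globals__:
            found[name] = func.__globals__[name]
    return found


SCALE = 3


def helper(x):
    return x * SCALE


def entry_point(x):
    return helper(x) + len(str(x))


needed_by_entry_point = referenced_globals(entry_point)
needed_by_helper = referenced_globals(helper)
```

Here `entry_point` depends on `helper`, and `helper` in turn depends on `SCALE`, so a real implementation would have to walk the dependencies recursively before deciding what to pickle.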
AgitatedDove14
how would you specify the main python script entry point?
If you want to use a script, then the entry point should be the trivial one (either __main__ or main()).
wouldn't that make more sense rather than a function call?
From what I can tell, W&B provide both options - either to specify a script path/module (that will be run regularly) or specify a function as an entry point.
Analysis of the actual repository (i.e. it will actually look for imp...
Hey AgitatedDove14 ,
I wish to be able to continue a previous run, but from a certain checkpoint onward (perhaps with changed data, perhaps with a different LR...). So I wish to be able to "go back" to the epoch of the checkpoint, and continue from there while retaining the entire history up to that point.
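To make the request concrete, here is a toy sketch of the behaviour I have in mind. It uses no real experiment-manager API: `RunHistory` and `fork_at` are hypothetical names, and the history is just an in-memory list.

```python
from dataclasses import dataclass, field


@dataclass
class RunHistory:
    """Hypothetical stand-in for a run's reported scalars and checkpoints."""
    scalars: list = field(default_factory=list)      # (step, value) pairs
    checkpoints: dict = field(default_factory=dict)  # step -> checkpoint path

    def fork_at(self, step):
        """Return a new history holding only what was recorded up to `step`."""
        if step not in self.checkpoints:
            raise KeyError(f"no checkpoint stored at step {step}")
        return RunHistory(
            scalars=[(s, v) for s, v in self.scalars if s <= step],
            checkpoints={s: p for s, p in self.checkpoints.items() if s <= step},
        )


# The overnight run: 30k steps, one checkpoint every 1k steps.
original = RunHistory()
for step in range(0, 30001, 1000):
    original.scalars.append((step, 1.0 / (step + 1)))
    original.checkpoints[step] = f"ckpt_{step}.pt"

# Go back to step 22k; the original run is left untouched.
resumed = original.fork_at(22000)
# Training would restart from resumed.checkpoints[22000] and report from there.
resumed.scalars.append((23000, 0.5))
```

The point is that the forked run keeps everything up to the chosen step and nothing after it, so new reports continue the graphs instead of overwriting them.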
AgitatedDove14 It is not ideal, but might suffice. If I decide to pursue this path, I'll get back to you with this. Thanks!
Could you please elaborate on how to use Task.create to achieve this?
Any chance you could create an issue on GitHub with this feature suggestion? If we have some support, we could accelerate the implementation.
Sure.
Do you have a specific PBT implementation you are considering ?
Yes. Ray has a nice implementation of that and of many other things (it also supports Optuna & BOHB internally). This could relate to a comment I posted on GitHub: https://github.com/allegroai/trains/issues/227#issuecomment-716513710
TPE and BOHB are competitive with PBT in terms of performance
Yes, and I believe ASHA is also, but I wanted to compare PBT.
Thanks anyway.
Manually should be the simplest, so let's start from there...
OK, that looks like a nice workaround. Thanks!
Thanks to you both. When I get back to this, I'll have a deeper look and see if it fits my needs. I do, however, suggest that you implement a simple entry-point API like W&B does. Maybe something along the lines of Task.create("Name", function=start_task_func)
Implementing the whole thing Emanuel wrote above, or using raw JSON, seems very tedious.
I think it's nicer when you want to wrap some execution path, and not just use it. If you could also provide the aforementioned pickled extra parameters, then this will be extremely useful.
The reason I'm reluctant is that you might have calls/functions/variables in the global scope of the file storing the function, and then users will not know why something broke, and it will be very cumbersome to debug.
The global scope for that function is the local scope of the current function. You ...
If you want the "magic" property. Otherwise, you could also allow specifying a globals argument like in timeit.
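For reference, a small sketch of what an explicit globals argument could look like. `timeit.timeit(..., globals=...)` is the real standard-library call; `run_with_globals` is a hypothetical helper written only to illustrate the idea:

```python
import builtins
import timeit
import types


def run_with_globals(func, custom_globals, *args, **kwargs):
    """Call `func` so that its global lookups resolve in `custom_globals`."""
    # Copy the mapping and make sure builtins stay reachable.
    namespace = {"__builtins__": builtins, **custom_globals}
    rebound = types.FunctionType(
        func.__code__,
        namespace,
        func.__name__,
        func.__defaults__,
        func.__closure__,
    )
    rebound.__kwdefaults__ = func.__kwdefaults__
    return rebound(*args, **kwargs)


def scaled(x):
    return x * FACTOR  # FACTOR is deliberately not defined in this module


result = run_with_globals(scaled, {"FACTOR": 10}, 4)

# timeit does the same kind of thing: names used in the statement are
# resolved through the mapping passed as `globals`.
elapsed = timeit.timeit("f(4)", globals={"f": lambda x: x * 10}, number=100)
```

With an explicit mapping, nothing from the defining file's global scope leaks in by accident, which addresses the debugging concern at the cost of the "magic".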
That's actually very easy. The correct packages and repo are the same as now - the loaded ones (if you pass a function as an argument, you already loaded its module and related packages from the relevant git repo & commit).
For the arguments, you could extract them using task.get_parameters_as_dict(). You could also allow passing additional arguments that will be pickled (but that's unnecessary): Task.create("Name", function=start_task_func, arg1, arg2, arg3=arg3)
The W&B interface i...
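A minimal sketch of the function entry-point idea, in plain Python and not tied to any existing Task API. `describe_entry_point` and `run_entry_point` are hypothetical names, and the function is passed positionally here because Python does not allow positional arguments after a keyword argument:

```python
import pickle


def describe_entry_point(name, function, *args, **kwargs):
    """Hypothetical helper: capture what a worker would need to re-run `function`."""
    return {
        "name": name,
        "module": function.__module__,
        "qualname": function.__qualname__,
        # Extra arguments are pickled so they can travel with the task.
        "payload": pickle.dumps((args, kwargs)),
    }


def run_entry_point(spec, registry):
    """Look up the function by qualified name and call it with the unpickled arguments."""
    args, kwargs = pickle.loads(spec["payload"])
    function = registry[spec["qualname"]]
    return function(*args, **kwargs)


def start_task_func(lr, epochs=1):
    return f"lr={lr}, epochs={epochs}"


spec = describe_entry_point("Name", start_task_func, 0.1, epochs=3)
# A real worker would import spec["module"] from the recorded repo and commit;
# a plain dict stands in for that lookup here.
output = run_entry_point(spec, {"start_task_func": start_task_func})
```

Only the module path, the qualified name, and the pickled arguments need to be stored; the function itself is re-imported on the executing side.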
GrumpyPenguin23 Actually, no. I wish to create an experiment from scratch starting at a well-defined entry point (either a script or a function).
I wish to do this in order to wrap my existing framework with a new entry-point such that, at least for the time being, I will not need to modify the innards of the framework in order to deploy it well. I would also like to do this dynamically, such that the wrapped entry point could be configured externally.
GrumpyPenguin23 Might be. Like I wrote before - I put that path on hold for now. Thanks.
AgitatedDove14 Possibly. You could also specify additional packages to require just like you do now (in params['script']['requirements']['pip']).
How do you determine which packages to require now?