Answered

Hi, Is There A Way To Create A Draft Experiment Manually? That Is - Give It A Some File To Run, Or, Better Yet, A Function To Run Which Will Be The Start Of The Experiment? In W&B, For Example It Is Possible To Simply Write (Their

Hi,
Is there a way to create a draft experiment manually? That is, give it some file to run or, better yet, a function to run that will be the start of the experiment?
In W&B, for example, it is possible to simply write (their Sweep is similar to your Task):
wandb.agent(sweep_id, function=train)

  				
Posted 4 years ago by OddAlligator72


Answers 30

GrumpyPenguin23 Might be. Like I wrote before - I put that path on hold for now. Thanks.

  				
Posted 4 years ago by OddAlligator72

That's actually very easy. The correct packages and repo are the ones already loaded (if you pass a function as an argument, you have already loaded its module and related packages from the relevant git repo and commit).
For the arguments, you could extract them using task.get_parameters_as_dict(). You could also allow passing additional arguments that will be pickled (but that's unnecessary):
Task.create("Name", function=start_task_func, arg1, arg2, arg3=arg3)
The W&B interface is very intuitive and simple at that point:
```python
import time

import wandb


def train():
    run = wandb.init()
    print("config:", dict(run.config))
    for epoch in range(35):
        print("running", epoch)
        wandb.log({"metric": run.config.param1, "epoch": epoch})
        time.sleep(1)


# sweep_id is the ID of a sweep that was created beforehand
wandb.agent(sweep_id, function=train)
```
I was hoping that you had something similar.
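To illustrate the "already loaded packages" idea above, here is a minimal sketch. It assumes the top-level name of each imported module matches an installed distribution name, which is not always true (for example `yaml` vs `PyYAML`). The helper name `loaded_distributions` is hypothetical and is not part of trains or W&B.

```python
import sys
from importlib import metadata


def loaded_distributions():
    """Return {top_level_module_name: version} for modules imported so far.

    Assumption: module name == distribution name. Modules without a matching
    installed distribution (stdlib, local files) are silently skipped.
    """
    versions = {}
    for module_name in list(sys.modules):
        top_level = module_name.split(".")[0]
        if top_level in versions:
            continue
        try:
            versions[top_level] = metadata.version(top_level)
        except Exception:  # not an installed distribution, or an invalid name
            continue
    return versions


requirements = [f"{name}=={version}" for name, version in sorted(loaded_distributions().items())]
```

The resulting `requirements` list is only a rough starting point; an analysis of the repository's imports, as described elsewhere in this thread, avoids the name-mismatch problem.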

  				
Posted 4 years ago by OddAlligator72

GrumpyPenguin23 Actually, no. I wish to create an experiment from scratch, starting at a well-defined entry point (either a script or a function).
I wish to do this in order to wrap my existing framework with a new entry point such that, at least for the time being, I will not need to modify the innards of the framework in order to deploy it well. I would also like to do this dynamically, such that the wrapped entry point could be configured externally.

  				
Posted 4 years ago by OddAlligator72

Or use the gc. That could also be used.

  				
Posted 4 years ago by OddAlligator72

Great!

  				
Posted 4 years ago by OddAlligator72

AgitatedDove14

> how would you specify the main python script entry point?

If you want to use a script, then the entry point should be the trivial one (either __main__ or main()).

> wouldn't that make more sense rather than a function call?

From what I can tell, W&B provides both options: either specify a script path/module (that will be run regularly) or specify a function as an entry point.

> Analysis of the actual repository (i.e. it will actually look for imports), this way you get the exact versions you have, but not the clutter of the entire virtual environment

You could also do that here instead of what I suggested if it's easier. In what I suggested you also get the exact versions.

  				
Posted 4 years ago by OddAlligator72

I think it's nicer when you want to wrap some execution path, and not just use it. If you could also provide the aforementioned pickled extra parameters, then this will be extremely useful.

> The reason I'm reluctant is that you might have calls/functions/variables in global scope of the file storing the function, and then users will not know why something broke, and it will be very cumbersome to debug.

The global scope for that function is the local scope of the current function. You could always pickle locals() (and warn regarding unpicklable parameters), or just warn that all used parameters must be passed explicitly (as arguments, or like in timeit.timeit(..., globals=...)).
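A minimal sketch of the "pickle locals() and warn about unpicklable parameters" idea, using only the standard library. The helper names `pickle_namespace` and `restore_namespace` are hypothetical; nothing here is an existing trains API.

```python
import pickle
import threading
import warnings


def pickle_namespace(namespace):
    """Pickle each value separately so one bad value does not block the rest."""
    payload, skipped = {}, []
    for name, value in namespace.items():
        try:
            payload[name] = pickle.dumps(value)
        except Exception:  # pickle raises different exception types per object
            skipped.append(name)
    if skipped:
        warnings.warn(f"not picklable, pass these explicitly: {sorted(skipped)}")
    return payload, skipped


def restore_namespace(payload):
    """What a remote worker would do before calling the entry-point function."""
    return {name: pickle.loads(blob) for name, blob in payload.items()}


def launch():
    learning_rate = 0.1
    layers = [64, 32]
    lock = threading.Lock()  # locks cannot be pickled
    return pickle_namespace(locals())
```

Calling `launch()` pickles `learning_rate` and `layers` and reports `lock` as skipped, which is the point where a real implementation would have to ask the user to pass the value explicitly.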

  				
Posted 4 years ago by OddAlligator72

I do this:
```python
import os
import subprocess

from trains import Task

# BASE_TASKS, EXECUTABLES, REPO_NAME, block_type, engine and task_type
# are defined elsewhere in my project.
base_task = Task.create(project_name=self.regression_project_name,
                        task_name=BASE_TASKS[block_type][engine], task_type=task_type)
params = base_task.export_task()

# Git repo
params['script']['repository'] = subprocess.check_output(
    ['git', 'config', '--get', 'remote.origin.url'], cwd=REPO_NAME).decode().strip()

# Git commit
params['script']['version_num'] = subprocess.check_output(
    ['git', 'rev-parse', 'HEAD'], cwd=REPO_NAME).decode().strip()

# Git branch
params['script']['branch'] = subprocess.check_output(
    ['git', 'rev-parse', '--abbrev-ref', 'HEAD'], cwd=REPO_NAME).decode().strip()

# Git diff
params['script']['diff'] = subprocess.check_output(
    ['git', 'diff'], cwd=REPO_NAME).decode()

# Dir to execute code from - . corresponds to base git repo
params['script']['working_dir'] = '.'

# Code execution script path
params['script']['entry_point'] = os.path.relpath(EXECUTABLES[(engine, block_type)], REPO_NAME)

# Conda env path to run from
# TODO: change this to the env on host to interpreter once trains stabilizes
params['execution']['docker_cmd'] = '/home/egarbin/.conda/envs/regression'

# So it wouldn't try to install packages from cache (as it takes the given conda env)
params['script']['requirements']['pip'] = '\n \n'
params['script']['requirements']['conda'] = '\n \n'
base_task.update_task(params)
```

  				
Posted 4 years ago by SmarmySeaurchin8

OddAlligator72 can you link to the wandb docs? Looks like you want a custom entry point, I'm thinking "maybe" but probably the answer is that we do it a little differently here.

  				
Posted 4 years ago by GrumpyPenguin23

OddAlligator72 FYI, in your current code you can always do
```python
if use_trains:
    from trains import Task
    Task.init()
```
Might be easier 😉

  				
Posted 4 years ago by AgitatedDove14

I like the idea of using the timeit interface, and I think we could actually hack it to do most of the heavy lifting for us 🙂

  				
Posted 4 years ago by AgitatedDove14

AgitatedDove14 Possibly. You could also specify additional packages to require just like you do now (in params['script']['requirements']['pip']).
How do you determine which packages to require now?

  				
Posted 4 years ago by OddAlligator72

OddAlligator72 FYI you can also import / export an entire Task (basically allowing you to create it from scratch/JSON, even without calling Task.create):
Task.import_task(...)
Task.export_task(...)

  				
Posted 4 years ago by AgitatedDove14

> You might be able to also find out exactly what needs to be pickled using the f_code of the function (but that's limited to C implementation of python).

Nice!

  				
Posted 4 years ago by AgitatedDove14

OddAlligator72 so if I get you correctly, it is equivalent to creating a file called driver.py with all your entry points with an argparser and using it instead of train.py?
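A minimal sketch of such a driver.py, assuming two stand-in entry points. The functions `train` and `evaluate` and the `--epochs` flag are hypothetical placeholders for whatever the wrapped framework actually exposes.

```python
import argparse


def train(epochs):
    # Placeholder for a call into the existing framework.
    return f"trained for {epochs} epochs"


def evaluate(epochs):
    # Placeholder for a second entry point.
    return f"evaluated after {epochs} epochs"


# Registry of the entry points the driver is allowed to dispatch to.
ENTRY_POINTS = {"train": train, "evaluate": evaluate}


def main(argv=None):
    parser = argparse.ArgumentParser(description="Dispatch to a named entry point")
    parser.add_argument("entry", choices=sorted(ENTRY_POINTS))
    parser.add_argument("--epochs", type=int, default=1)
    args = parser.parse_args(argv)
    return ENTRY_POINTS[args.entry](args.epochs)


# As a script, this would end with:
#     if __name__ == "__main__":
#         print(main())
# and be launched as: python driver.py train --epochs 3
```

Because the driver is an ordinary script with argparse arguments, it can be launched and debugged locally like any other script entry point.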

  				
Posted 4 years ago by GrumpyPenguin23

Hmm interesting ...
Any chance you could create an issue on GitHub with this feature suggestion?
If we have some support we could accelerate the implementation.

  				
Posted 4 years ago by AgitatedDove14

Of course you can edit which parameters you like

  				
Posted 4 years ago by SmarmySeaurchin8

Could you please elaborate on how to use Task.create to achieve this?

  				
Posted 4 years ago by OddAlligator72

OddAlligator72 I like this idea.
The single thing I'm not sure about is the "function entry point".
Why would one do that? Meaning, why wouldn't you have a proper Python entry point?
The reason I'm reluctant is that you might have calls/functions/variables in the global scope of the file storing the function, and then users will not know why something broke, and it will be very cumbersome to debug.
A simple script entry point seems trivial to launch and debug locally.
What do you think? What would be your specific use case for that?

  				
Posted 4 years ago by AgitatedDove14

OddAlligator72 okay, that is possible. How would you specify the main Python script entry point? (Wouldn't that make more sense than a function call?)

> How do you determine which packages to require now?

Analysis of the actual repository (i.e. it will actually look for imports 🙂). This way you get the exact versions you have, but not the clutter of the entire virtual environment.

  				
Posted 4 years ago by AgitatedDove14

OddAlligator72 what you are saying is, take the repository / packages from the runtime, aka the python code calling the "Task.create(start_task_func)" ?
Is that correct ?
BTW: notice that the execution itself will be launched on other remote machines, not on this local machine

  				
Posted 4 years ago by AgitatedDove14

OddAlligator72 I think you got sidetracked into the wrong corner here. Let's decompose what you are asking for; please tell me if I am getting somewhere near what you mean:
- you have an experiment you already ran
- you want to change the parameters in it and run it again
- if possible, you only want to run a single function in the file attached to that experiment

  				
Posted 4 years ago by GrumpyPenguin23

You might also be able to find out exactly what needs to be pickled using the f_code of the function (but that's limited to the C implementation of Python).
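A minimal sketch of inspecting a function's code object. Note that on function objects the code object is reached through `__code__` (`f_code` is the equivalent attribute on frame objects). The helper `referenced_globals` and the example functions are hypothetical. `co_names` also lists attribute names, so the sketch keeps only names that actually exist in the given namespace.

```python
SCALE = 3


def helper(x):
    return x + 1


def train(x):
    return helper(x) * SCALE


def referenced_globals(func, namespace=None):
    """Return the subset of `namespace` that the function's bytecode refers to.

    `namespace` defaults to the function's own module globals.
    """
    if namespace is None:
        namespace = func.__globals__
    names = func.__code__.co_names  # global and attribute names used in the body
    return {name: namespace[name] for name in names if name in namespace}


needed = referenced_globals(train, {"helper": helper, "SCALE": SCALE, "unused": 0})
```

Here `needed` contains `helper` and `SCALE` but not `unused`. This only looks one level deep: names used inside `helper` itself would need a recursive walk.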

  				
Posted 4 years ago by OddAlligator72

https://github.com/allegroai/trains/issues/230

  				
Posted 4 years ago by OddAlligator72

OddAlligator72 just so I'm sure I understand your suggestion:
Pickle the entire locals() on the current machine.
On the remote machine, create a mock Python entry point, restore the "locals()" and execute the function?

BTW:
Making this actually work regardless of the machine is some major magic in motion ... 😉

  				
Posted 4 years ago by AgitatedDove14

If you want the "magic" property. Otherwise, you could also allow specifying a globals argument like in timeit.
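For reference, this is how the `globals` argument of `timeit.timeit` works, followed by a sketch of a launcher that borrows the same interface. The function `run_entry_point` is hypothetical and only illustrates the explicit-namespace idea; it is not a trains API.

```python
import timeit


def train(epochs):
    return sum(range(epochs))


# timeit only sees the names passed in `globals`, nothing from the caller's scope.
seconds = timeit.timeit(
    "train(epochs)", globals={"train": train, "epochs": 1000}, number=10
)


def run_entry_point(expression, namespace):
    """Hypothetical launcher: evaluate the entry point with an explicit namespace."""
    # Copy the dict so that eval does not add __builtins__ to the caller's dict.
    return eval(expression, dict(namespace))


result = run_entry_point("train(epochs)", {"train": train, "epochs": 5})
```

With this interface nothing is captured implicitly, so a missing name fails immediately with a NameError instead of breaking later on the remote machine.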

  				
Posted 4 years ago by OddAlligator72

> Any chance you create an Issue on GitHub with this feature suggestion, If we have some support we could accelerate the implementation

Sure.

  				
Posted 4 years ago by OddAlligator72

AgitatedDove14 It is not ideal, but might suffice. If I decide to pursue this path, I'll get back to you about it. Thanks!

  				
Posted 4 years ago by OddAlligator72

Thanks to you both. When I get back to this, I'll have a deeper look and see if it fits my needs. I do, however, suggest that you implement a simple entry-point API like W&B does. Maybe something along the lines of
Task.create("Name", function=start_task_func)
Implementing the whole thing Emanuel wrote above, or using raw JSON, seems very tedious.

  				
Posted 4 years ago by OddAlligator72

OddAlligator72 quick question:

> suggest that you implement a simple entry-point API

How would the system get the correct packages / git repo / arguments if you are only passing a single function entry point?

  				
Posted 4 years ago by AgitatedDove14
