Hmm, I think so. It doesn't sound exactly compatible with Snakemake, more like a replacement, though the pipelining trains does is quite different. Snakemake really is all about the DAG. I just tell it what output I want, and it figures out what jobs to run to get there, runs them massively in parallel, and, very importantly, it NEVER repeats work it has already done and already has artifacts for (well, unless you force it to, but that's a conscious choice you have to make). This is super important for big, expensive jobs. Does trains handle that? I haven't seen it in the docs yet.
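To make the contrast concrete, here's a minimal Snakefile sketch of what I mean (the file names and `train.py` script are made up for illustration):

```
# Ask for the final artifact; Snakemake works out the job DAG backwards from it.
rule all:
    input: "results/model.pkl"

# This rule only runs if results/model.pkl is missing or older than its inputs.
rule train:
    input: "data/train.csv"
    output: "results/model.pkl"
    shell: "python train.py {input} {output}"
```

If `results/model.pkl` already exists and is up to date, running `snakemake` is a no-op, which is the never-repeat-expensive-work behavior I'm asking about.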
I read the trains pipeline example (https://github.com/allegroai/trains/blob/0.15.1/examples/automation/task_piping_example.py and https://allegro.ai/docs/examples/automation/task_piping/), but it confused me. Is this somehow the analog of the DAG? It looks like it's just showing how to enqueue a task, unless I'm misunderstanding it.
My understanding of trains so far is that it's great at tracking one-off script runs and storing artifacts and metadata for training jobs, but it doesn't replace Kubeflow or Snakemake's DAG as a first-class citizen. How does Allegro handle DAGgy workflows?