This benchmark measures parallel speedup for assembly and solve.
It is organized as follows:

  assemble-poisson: simple program for assembling the Poisson matrix
  solve-poisson:    simple program for solving Poisson's equation

  submit-bench:     script for submitting jobs using dolfin_utils.pjobs
  analyse-bench:    script for analysing output from submit-bench

  bench:            script used on benchbot for simple checking of speedup
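
Speedup is taken here in the usual sense: the wall-clock time on one
process divided by the wall-clock time on N processes. The sketch below
shows that arithmetic only. The function name speedup_table and the
sample timings are hypothetical and are not taken from analyse-bench or
bench, which may compute and report the numbers differently.

```python
def speedup_table(timings):
    """Return {nprocs: (speedup, efficiency)} from {nprocs: seconds}.

    Hypothetical helper: `timings` must contain an entry for 1 process,
    which is used as the baseline.
    """
    base = timings[1]
    table = {}
    for nprocs in sorted(timings):
        speedup = base / timings[nprocs]      # T(1) / T(N)
        efficiency = speedup / nprocs         # 1.0 means ideal scaling
        table[nprocs] = (speedup, efficiency)
    return table


if __name__ == "__main__":
    # Made-up timings in seconds, for illustration only.
    example = {1: 8.0, 2: 4.0, 4: 2.5}
    for nprocs, (speedup, efficiency) in speedup_table(example).items():
        print("%3d procs: speedup %.2f, efficiency %.2f"
              % (nprocs, speedup, efficiency))
```

An efficiency well below 1.0 at higher process counts suggests that
communication or serial work is starting to dominate.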

Possible future improvements:

 1. Comparing MUMPS with UMFPACK gives skewed results (is MUMPS faster?)
 2. Time assemble_cells and apply separately
 3. Use barriers between timings
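
Items 2 and 3 could be approached along the following lines: synchronize
all processes before starting and before stopping the clock, and time
each phase in its own call. This is a minimal sketch, not code from this
benchmark. The helper timed_section and its barrier argument are
hypothetical; the barrier defaults to a no-op so the snippet also runs
without MPI.

```python
import time


def timed_section(work, barrier=lambda: None):
    """Run work() between two barriers and return (result, seconds).

    Hypothetical helper: `barrier` is any zero-argument callable that
    blocks until all processes have reached it.
    """
    barrier()                      # everyone starts together
    start = time.perf_counter()
    result = work()
    barrier()                      # wait for the slowest process
    elapsed = time.perf_counter() - start
    return result, elapsed


if __name__ == "__main__":
    # Stand-in workload; a real run would time each phase separately.
    total, seconds = timed_section(lambda: sum(range(100000)))
    print("result %d in %.6f s" % (total, seconds))
```

Assuming the programs run under MPI with mpi4py available, the barrier
could be passed as barrier=MPI.COMM_WORLD.Barrier. With the second
barrier in place, every process reports roughly the time of the slowest
one, which is the figure that matters for speedup.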
