Bagel assembler kernel generator
The Bagel assembler generation library is written and maintained by Peter Boyle.
It is composed of two key parts: a library to which one can programme a
generic RISC assembler kernel, and a set of programmes that use the
library to produce key QCD and linear algebra operations.
The kernels it generates are commonly used in both the Chroma and
Physics System QCD code bases when targeting QCDOC and
BlueGene. It has
comms hooks for FAKE (single processor), SCU
calls (QCDOC) and QMP calls (for Chroma). The kernels are designed to
be callable from any C or C++ code base, however.
The generator is retargetable, but for now key targets are
ppc440, bgl, bgq and powerIII.
If you use this software, please cite it as
Computer Physics Communications Volume 180, Issue 12, December 2009, Pages 2739–2748
"P.A. Boyle, http://www.ph.ed.ac.uk/~paboyle/bagel/Bagel.html, 2005"
Production Bagel for BlueGene/Q includes a multilevel HDCG solver.
Bagel for BlueGene/Q prerelease versions (beta quality):
Remember the old joke about optimising your code until the lights flicker?
This time I really have: this is the Edinburgh BG/Q running the BFM inverter and the lights are
DC LEDs on the bulk power modules.
They only flicker when running Bagel!
Bagel-3 will change the interface substantially. The data layout
becomes opaque and architecture dependent with an import/export interface.
Entire algorithms are implemented in Bagel to amortize the cost of layout changes:
Preconditioned conjugate gradient
Unpreconditioned conjugate gradient
Mixed precision defect correction inversion
Multi-mass conjugate gradient
Implicitly restarted shifted Lanczos (Rudy Arthur)
Wilson twisted mass
Domain Wall (5d even odd)
Domain Wall (4d even odd)
Overlap general Cayley form Möbius kernels (tanh or Zolotarev)
Overlap general continued fraction Wilson kernel (tanh or Zolotarev)
Overlap general partial fraction Wilson kernel (tanh or Zolotarev)
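The idea of an opaque, architecture-dependent data layout behind an import/export interface can be illustrated with a small sketch. Everything here is hypothetical: the class name `OpaqueField`, its methods, and the choice of a structure-of-arrays internal layout are assumptions made for illustration and are not taken from Bagel. The point is only that the layout conversion happens once on the way in and once on the way out, while the repeated work runs on the internal layout.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical illustration only: none of these names come from Bagel.
// The "opaque" internal layout is structure-of-arrays (all real parts,
// then all imaginary parts); callers only ever see std::complex<double>.
class OpaqueField {
public:
    explicit OpaqueField(std::size_t n) : n_(n), data_(2 * n, 0.0) {}

    // Import: convert from the caller's canonical layout to the internal one.
    void import_from(const std::vector<std::complex<double>>& src) {
        assert(src.size() == n_);
        for (std::size_t i = 0; i < n_; ++i) {
            data_[i] = src[i].real();
            data_[n_ + i] = src[i].imag();
        }
    }

    // Export: convert back from the internal layout to the canonical one.
    std::vector<std::complex<double>> export_to() const {
        std::vector<std::complex<double>> dst(n_);
        for (std::size_t i = 0; i < n_; ++i) {
            dst[i] = std::complex<double>(data_[i], data_[n_ + i]);
        }
        return dst;
    }

    // Stand-in for a whole algorithm: many passes over the data, all on the
    // internal layout, so the two conversions above are paid only once.
    void scale_repeatedly(double a, int times) {
        for (int t = 0; t < times; ++t) {
            for (double& x : data_) x *= a;
        }
    }

private:
    std::size_t n_;
    std::vector<double> data_;
};
```

A caller would construct an `OpaqueField`, call `import_from` once, run the algorithm, and call `export_to` once at the end; the cost of the two conversions is then spread over every pass the algorithm makes.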
configure CXX=mpicxx LIBS="-lSPI -lSPI_l1p -L$BGQ/spi/lib/" CC=mpicc CXXFLAGS="-fpermissive -fopenmp -I$BGQ -I$BGQ/spi/include/kernel/cnk/" \
env CXX=g++ CC=gcc ~/BGQ_sfw/src/revision_controlled/bagel/configure --enable-itype=uint64_t --enable-isize=8 --enable-ifmt=%lx --prefix=$INST --enable-istype=uint32_t --enable-issize=4 --enable-isfmt=%x
- Supports Double Hummer complex primitives
- Mixed precision support
- Simple runtime selection of reduced precision
- New BlueGene kernels in addition to QCDOC kernels
- Multi-core enabled (POSIX threads and BG/L co-routines; tested
on MacOS-X/pthreads, with due care taken to ensure L1 coherency on BG/L)
I'm hoping the mixed precision approach will yield a modest speed
up. More on this when I've benchmarked it.
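For readers unfamiliar with the mixed precision approach, here is a minimal sketch of defect correction on a small dense symmetric positive definite system. It is not Bagel's implementation or interface: the function names (`matvec`, `dot`, `cg`, `defect_correction`), the dense row-major matrix, and the tolerances are all assumptions chosen to keep the example short. The residual is computed in double precision, while the correction is obtained from a conjugate gradient solve run entirely in single precision.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical illustration only; these names are not Bagel's API.

// y = A x for a dense row-major n x n matrix, in precision T.
template <typename T>
std::vector<T> matvec(const std::vector<T>& A, const std::vector<T>& x) {
    const std::size_t n = x.size();
    assert(A.size() == n * n);
    std::vector<T> y(n, T(0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            y[i] += A[i * n + j] * x[j];
    return y;
}

template <typename T>
T dot(const std::vector<T>& a, const std::vector<T>& b) {
    T s = T(0);
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Plain conjugate gradient for symmetric positive definite A, in precision T.
// Stops when the recursive residual norm falls below rel_tol times its start.
template <typename T>
std::vector<T> cg(const std::vector<T>& A, const std::vector<T>& b,
                  T rel_tol, int max_iter) {
    std::vector<T> x(b.size(), T(0)), r = b, p = b;
    T rr = dot(r, r);
    const T stop = rel_tol * rel_tol * rr;
    for (int k = 0; k < max_iter && rr > stop; ++k) {
        const std::vector<T> Ap = matvec(A, p);
        const T alpha = rr / dot(p, Ap);
        for (std::size_t i = 0; i < x.size(); ++i) {
            x[i] += alpha * p[i];
            r[i] -= alpha * Ap[i];
        }
        const T rr_new = dot(r, r);
        const T beta = rr_new / rr;
        for (std::size_t i = 0; i < p.size(); ++i) p[i] = r[i] + beta * p[i];
        rr = rr_new;
    }
    return x;
}

// Defect correction: double precision residual, single precision inner solve.
std::vector<double> defect_correction(const std::vector<double>& A,
                                      const std::vector<double>& b,
                                      double tol, int max_outer) {
    const std::vector<float> Af(A.begin(), A.end());  // reduced precision copy
    std::vector<double> x(b.size(), 0.0);
    const double bnorm = std::sqrt(dot(b, b));
    for (int outer = 0; outer < max_outer; ++outer) {
        const std::vector<double> Ax = matvec(A, x);
        std::vector<double> r(b.size());
        for (std::size_t i = 0; i < r.size(); ++i) r[i] = b[i] - Ax[i];
        if (std::sqrt(dot(r, r)) <= tol * bnorm) break;   // converged in double
        const std::vector<float> rf(r.begin(), r.end());
        const std::vector<float> ef = cg(Af, rf, 1e-3f, 100);  // cheap inner solve
        for (std::size_t i = 0; i < x.size(); ++i) x[i] += ef[i];
    }
    return x;
}
```

Because the outer loop always measures the true residual in double precision, the final answer can reach double precision accuracy even though most of the arithmetic is done in single precision. Whether this is actually faster depends on the hardware and the operator, which the sketch does not attempt to model.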
Please do cite the bagel paper.
It would also be kind to mail me
and let me know if you like it!