Large Scale Biomolecular Simulation:
Blue Matter Molecular Dynamics on Blue Gene/L
Frank Suits
Biomolecular Dynamics & Scalable Modeling
http://www.research.ibm.com/bluegene
Large Scale Biomolecular Simulation
Blue Gene protein science goals and history
Overview of molecular dynamics
Ways to use power of BG/L for protein science
Our current simulation efforts and results
Blue Matter design goals
Optimization efforts
Large team effort
Blue Gene Hardware
System Software
Blue Matter development
– Biomolecular dynamics and Scalable Modeling Group:
• Bob Germain, Blake Fitch, Mike Pitman, Yuriy Zhestkov, Alex Rayshubskiy,
Maria Eleftheriou, Alan Grossfield
• Almaden science team: William Swope, Jed Pitera, Hans Horn
Protein Science collaborators
My own background:
– Physics (there are a lot of us)
– Current role: mostly analysis of scientific results
– Touched much of the code base, but specialists assigned to key code
IBM Announces $100 Million Research Initiative to
build World's Fastest Supercomputer
"Blue Gene" to Tackle Protein Folding Grand Challenge
YORKTOWN HEIGHTS, NY, December 6, 1999 -- IBM today announced a new $100 million exploratory research initiative to build a supercomputer 500 times more powerful than the world's fastest computers today. The new computer -- nicknamed "Blue Gene" by IBM researchers -- will be capable of more than one quadrillion operations per second (one petaflop). This level of performance will make Blue Gene 1,000 times more powerful than the Deep Blue machine that beat world chess champion Garry Kasparov in 1997, and about 2 million times more powerful than today's top desktop PCs.
Blue Gene's massive computing power will initially be used to model the folding of human proteins, making this fundamental study of biology the company's first computing "grand challenge" since the Deep Blue experiment. Learning more about how proteins fold is expected to give medical researchers better understanding of diseases, as well as potential cures.
Blue Gene program
December 1999: Blue Gene project announcement
November 2001: Research partnership with Lawrence Livermore National
Laboratory (LLNL).
June 2003: First chips completed
November 2003: BG/L Half rack prototype (512 nodes) ranked #73 on 22nd
Top500 List announced at SC2003 (1.435 TFlop/s).
– 32 node system folding proteins live on the demo floor at SC2003
February 2, 2004: Second pass BG/L chips delivered to Research
March 2, 2004: 1024 node prototype achieves 2.8 TFlop/s on Linpack – would
qualify as #23
April 16, 2004: 2048 node prototype achieves 5.6 TFlop/s on Linpack – would
qualify as #10
May 11, 2004: 4096 node prototype (500 MHz) achieves 11.68 TFlop/s on
Linpack – #4 on Top500
May 18, 2004: First production Blue Matter runs on membrane systems
June 2, 2004: 2048 node prototype (pass 2 chips, 700 MHz) achieves 8.655
TFlop/s on Linpack -- #8 on Top500
September 29, 2004: 8192 node system (pass 2 chips) achieves 36.01 TFlop/s on
Linpack (passes Earth Simulator)
October 2004: 120ns on rhodopsin in membrane (NVE)
November 2004: #1 in Top500 at 70 TFlop/s (1/4 of completed system)
Blue Gene Science Mission
Advance our understanding of biologically
important processes via simulation, in particular the
mechanisms behind protein folding
Current Activities include:
– Thermodynamic & kinetic studies of model systems
– Structural and dynamical studies of membrane and
membrane/protein systems
Scaling Directions
[Figure: directions of scaling, including increased statistical certainty]
Time Scales: Biopolymers and Membranes
Helix-Coil Transition
Lipid exchange via diffusion
Ligand-Protein Binding
Torsional correlation in lipid headgroups
Electron Transfer
Adapted from "The Protein Folding Problem", Chan and Dill, Physics Today, Feb. 1993
The science plan – a spectrum of projects
systematically cover a range of system sizes and topological complexity
– discovering the "rules" of folding
– applying those rules to have impact on disease
address a broad range of scientific questions and impact areas:
– thermodynamics
– folding kinetics
– folding-related disease (CF, Alzheimer's, GPCRs)
improve our understanding not just of protein folding but protein function
β-hairpin Simulation
Free Energy Landscape of Beta Hairpin (PNAS 2001)
Free energy surface with trajectories: Kinetics
Each color is a separate trajectory
Some overlap, others are distinct
Can they be chained together?
J. Phys. Chem. B, 2004 (2 papers)
"trp-cage" folding (PNAS 2003)
Small 20 amino acid miniprotein
Simulations started from a completely unfolded state
Simulations could reproduce & explain sequence-dependent folding
Membrane Proteins
Membrane processes enable cell signal detection and ion and nutrient transport
– Infection processes target specific membranes
– Over 50% of drug discovery research targets are membrane proteins
Experiment and simulation play a concerted role in understanding membrane biophysics
– Simulation can be validated by experiment
– Simulation can then help to interpret experiment
Lipid Membrane Simulation
Overview of Blue Gene Membrane Protein Studies
– Extensive hydrogen bonding network with headgroups
– Excellent agreement with experiment for both structural
and dynamic properties
– Cholesterol induces dramatic lateral organization
– Cholesterol shows preference of STEA over DHA
– Significant angular anisotropy of the cholesterol environment
GPCR in a membrane environment
– Rhodopsin with 2:2:1 SDPC/SDPE/CHOL
– 100 ns cis-retinal - 200+ ns trans-retinal
– Current production rate: 15 hrs/ns on 512 BG/L nodes
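To put the production rate in perspective, a rough back-of-envelope estimate using only the 15 hrs/ns figure quoted above:
\[
100\ \text{ns} \times 15\ \tfrac{\text{hrs}}{\text{ns}} = 1500\ \text{hrs} \approx 62\ \text{days of wall-clock time on 512 nodes}.
\]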
Most Recent Publication (yesterday)
Molecular-Level Organization of Saturated and Polyunsaturated Fatty Acids
in a Phosphatidylcholine Bilayer Containing Cholesterol
Pitman, Suits, MacKerell, Feller, Biochemistry 2004
Rhodopsin and the Eye
GPCR-based drugs among the 200 best-selling prescriptions, and their GPCR targets
[Table: drug, GPCR target, indication, and 2000 sales (US $m); surviving entries include Serevent, Atrovent, congestive heart failure, Johnson & Johnson, and Eli Lilly]
Current Simulations of Rhodopsin in Membrane
Some analysis examples:
Lipid neighborhood around a cholesterol
Each lipid has two different "chains," shown in red and blue
2D contours give some idea of the neighborhood, but only in a slice. 3D possibilities?
3D isosurfaces of density show lipid distributed symmetrically, while cholesterols show a strong orientation preference…
Red: lipid; Blue: other cholesterols
Also see water pulled in from above, and cholesterols preferentially oriented to each other…
Blue: other cholesterols
Selected Publications
Molecular-Level Organization of Saturated and Polyunsaturated Fatty
Acids in a Phosphatidylcholine Bilayer Containing Cholesterol; Biochemistry, in press, 2004
Describing Protein Folding Kinetics by Molecular Dynamics Simulations.
1. Theory; The Journal of Physical Chemistry B; 2004; 108(21); 6571-6581
Describing Protein Folding Kinetics by Molecular Dynamics Simulations.
2. Example Applications to Alanine Dipeptide and a beta-Hairpin Peptide; The Journal of Physical Chemistry B; 2004; 108(21); 6582-6594
Understanding folding and design: Replica-exchange simulations of "Trp-
cage" miniproteins, PNAS USA, Vol. 100, Issue 13, June 24, 2003, pp.
7587-7592
Can a continuum solvent model reproduce the free energy landscape of
a beta-hairpin folding in water?, Proc. Natl. Acad. Sci. USA, Vol. 99, Issue 20, October 1, 2002, pp. 12777-12782
The free energy landscape for beta-hairpin folding in explicit water, Proc.
Natl. Acad. Sci. USA, Vol. 98, Issue 26, December 18, 2001, pp. 14931-14936
BG/L communication network
Ocean view with Torus
Why another MD program?
Blue Matter "Porting" issues
Written from scratch
Small memory footprint and state (megabytes)
Low i/o needs
– Still, can accumulate large amount of data
– Staged reduction with archive/spinning
– "streaming" demo
Strong scaling needs (small #atoms per node)
For large node counts, communication bound
– Novel strategies for decomposition
Design scalable MD environment for large node counts
Address strong scalability problems
Research novel modular programming techniques
Build reusable framework components
C++ with templates
Database oriented
Scientific functions "registered" as User-Defined Functions
Molecular system represented as xml and stored in database
Each molecular system generates unique C++ code
– No single executable
– Java pulls system from database and generates code based
on run-time parameters
– Optimization due to compile-time constants and reduced code size
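As a minimal illustration of why compile-time constants help (a sketch with invented names, not Blue Matter source): the generated code can instantiate templates with per-system values, letting the compiler unroll loops and eliminate branches.

```cpp
// Illustrative sketch only (invented names, not Blue Matter source): values
// emitted by the code generator become compile-time constants, so loop bounds
// and feature flags can be folded, unrolled, or dead-code-eliminated.
template <int SitesPerMolecule, bool UseSwitch>
double moleculeEnergy(const double* coords)   // coords holds 3*SitesPerMolecule doubles
{
    double e = 0.0;
    for (int i = 0; i < SitesPerMolecule; ++i)  // bound known at compile time
        e += coords[3 * i] * coords[3 * i];     // placeholder arithmetic
    if (UseSwitch)                              // branch resolved at compile time
        e *= 0.5;                               // placeholder switch factor
    return e;
}

// The generated driver instantiates exactly one specialization per system,
// e.g. a 3-site water model with a switching function:
double energyForThisSystem(const double* c) { return moleculeEnergy<3, true>(c); }
```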
Blue Matter Overview
Separate MD program into multiple subpackages (offload
function to host where possible)
– MD core engine (massively parallel, minimal in size)
– Setup programs to set up force field assignments, etc.
– Monitoring and analysis tools to analyze MD trajectories, etc.
Run time parameters have already been built in
Blue Matter Overview
[Component diagram: Blue Matter Runtime, Regression Test Driver Scripts, Parallel Application, Datagrams Management]
Blue Matter Molecular Dynamics code
Multiple Force Field Support
– CHARMM, OPLS-AA, AMBER, GROMOS (in progress), Polarizable
Explicit water models
– TIP3P, SPC, SPCE, rigid or floppy
Integrators, time reversible
– Verlet, rRESPA
Temperature control
– Andersen, Nosé-Hoover
Pressure control
– Andersen (time reversible)
Methods for long-range electrostatics
– Implemented: Ewald, P3ME (FFT-based), Lekner (pairwise)
– Tentative: Fast Multipole
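For readers unfamiliar with the integrators named above, a generic, time-reversible velocity-Verlet step looks like the following (textbook sketch, not Blue Matter's implementation):

```cpp
// Generic, time-reversible velocity-Verlet step (textbook form); a sketch for
// illustration, not Blue Matter's integrator code.
#include <cstddef>
#include <vector>

struct Vec3 { double x, y, z; };

void velocityVerletStep(std::vector<Vec3>& r,          // positions
                        std::vector<Vec3>& v,          // velocities
                        std::vector<Vec3>& f,          // forces at current positions
                        const std::vector<double>& m,  // masses
                        double dt,
                        void (*computeForces)(const std::vector<Vec3>&, std::vector<Vec3>&))
{
    const std::size_t n = r.size();
    for (std::size_t i = 0; i < n; ++i) {              // half-kick, then drift
        v[i].x += 0.5 * dt * f[i].x / m[i];  r[i].x += dt * v[i].x;
        v[i].y += 0.5 * dt * f[i].y / m[i];  r[i].y += dt * v[i].y;
        v[i].z += 0.5 * dt * f[i].z / m[i];  r[i].z += dt * v[i].z;
    }
    computeForces(r, f);                               // forces at the new positions
    for (std::size_t i = 0; i < n; ++i) {              // second half-kick
        v[i].x += 0.5 * dt * f[i].x / m[i];
        v[i].y += 0.5 * dt * f[i].y / m[i];
        v[i].z += 0.5 * dt * f[i].z / m[i];
    }
}
```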
What is being calculated?
Bonded atoms 1-2, 1-3, 1-4 (quick, list based)
Non-bond (N², but switch-truncated range)
– Lennard-Jones
Periodic imaging
– Ewald (DFT) or
– P3ME (3D FFT)
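A hedged sketch of the switch-truncated Lennard-Jones pair term mentioned above; a common CHARMM-style switching function is shown for illustration, and Blue Matter's exact switch may differ.

```cpp
// Switch-truncated Lennard-Jones pair energy. The switching function shown is
// the common CHARMM-style form, used here for illustration only; Blue Matter's
// exact switch may differ.
#include <cmath>

double ljSwitchedEnergy(double r, double epsilon, double sigma,
                        double rOn, double rOff)
{
    if (r >= rOff) return 0.0;                                // beyond the cutoff
    const double sr6 = std::pow(sigma / r, 6);
    const double eLJ = 4.0 * epsilon * (sr6 * sr6 - sr6);     // 4eps[(s/r)^12 - (s/r)^6]
    if (r <= rOn) return eLJ;                                 // inside the switch-on radius
    const double r2 = r * r, on2 = rOn * rOn, off2 = rOff * rOff;
    const double s = (off2 - r2) * (off2 - r2) * (off2 + 2.0 * r2 - 3.0 * on2)
                   / ((off2 - on2) * (off2 - on2) * (off2 - on2));
    return s * eLJ;                                           // smoothly goes to zero at rOff
}
```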
How is the problem partitioned?
CURRENTLY (very good to 1024 nodes):
– Atoms in fragments of 1-5 or so
– Fragments distributed across nodes
– Load balancing occurs based on measured times
– All nodes know positions of all atoms
– Each node calculates forces on its atoms
– Parallel 3D FFT across all nodes, leaving piece of result on each node
– Each node applies force due to FFT piece to all atoms
– All forces are combined (all reduce), each node knows forces on all atoms
– All nodes update positions
FUTURE (Many K nodes):
– Interaction decomposition
– "N 2" interactions are distributed rather than N atoms
Excellent energy conservation – validation of code
Optimization and Scalability
With empirical results
Timing (512-way) for rhodopsin, lipids, water; 43k atoms
[Chart: per-time-step timing breakdown, including:]
– assign charge (4.6 ms)
– floating point MPI all-reduce
– setting up globalized positions
– update positions (2 ms)
– bonded force computation
– convolution (1.25 ms)
– pairwise non-bonded
– MPI call in globalizing positions (5.1 ms)
Optimization of non-bond interactions
Verlet lists
– Check only O(N) interactions with particles on the list
– Lists are recalculated only when particles cross the guard zone
– Dynamic tuning of the guard zone size for optimization
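A compact sketch of the Verlet-list-with-guard-zone idea described above (illustrative, not Blue Matter source): pairs within rCut + rGuard are listed, and the list is rebuilt once any particle has moved more than rGuard/2.

```cpp
// Verlet (neighbor) list with a guard ("skin") zone, as described above.
// Illustrative sketch only, not Blue Matter source code.
#include <cstddef>
#include <utility>
#include <vector>

struct Vec3 { double x, y, z; };

static double dist2(const Vec3& a, const Vec3& b)
{
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

void buildVerletList(const std::vector<Vec3>& r, double rCut, double rGuard,
                     std::vector<std::pair<int, int> >& pairs,
                     std::vector<Vec3>& rAtBuild)
{
    pairs.clear();
    const double rList2 = (rCut + rGuard) * (rCut + rGuard);
    const int n = static_cast<int>(r.size());
    for (int i = 0; i < n; ++i)
        for (int j = i + 1; j < n; ++j)
            if (dist2(r[i], r[j]) < rList2)
                pairs.push_back(std::make_pair(i, j));
    rAtBuild = r;                                   // remember positions at build time
}

bool needsRebuild(const std::vector<Vec3>& r, const std::vector<Vec3>& rAtBuild,
                  double rGuard)
{
    const double half2 = 0.25 * rGuard * rGuard;    // (rGuard/2)^2
    for (std::size_t i = 0; i < r.size(); ++i)
        if (dist2(r[i], rAtBuild[i]) > half2)
            return true;                            // a particle crossed the guard zone
    return false;
}
```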
Verlet list tuning to find optimum
Verlet list tuning
[Plots: time per step before and after guard-zone tuning; values summarized below]
Verlet list tuning
Before tuning
– Short step 0.25 ms/time step - most steps
– Long step 0.39 ms/time step - infrequently
– Average about 0.25 ms/time step
After tuning
– Short step 0.2 ms/time step – 5 out of 6 steps
– Long step 0.3 ms/time step – 1 out of 6 steps
– Average about 0.22 ms/time step
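As a quick check, the tuned average follows from the 5:1 mix of short and long steps:
\[
\bar{t}_{\text{after}} = \frac{5 \times 0.2\ \text{ms} + 1 \times 0.3\ \text{ms}}{6} \approx 0.217\ \text{ms per time step},
\]
consistent with the quoted 0.22 ms; before tuning, the infrequent 0.39 ms rebuild steps barely raised the average above 0.25 ms.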
What Limits the Scalability of MD?
Inherent limitations on concurrency:
– Bonded force evaluation
Represents only a small fraction of the computation; can be distributed moderately well.
– Real space non-bond force evaluation
Large fraction of the computation, but good distribution can be achieved using volume or interaction decomposition
– Reciprocal space contribution to force evaluation for Ewald
P3ME uses 3D FFT with global communication
Ewald with direct evaluation uses floating point reduction
Load balancing
System software
Long range electrostatics
Ewald method
– Replaces a slowly, conditionally convergent infinite sum for the electrostatic force with two rapidly converging sums, one in real space and another in reciprocal (Fourier-transformed) space
Real space term is computed together with other pairwise terms
Reciprocal space term is computed either directly or using an FFT (in the P3ME method)
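For reference, the standard Ewald splitting has the familiar textbook form (Gaussian units, self term included; Blue Matter's exact conventions may differ):
\[
E = \frac{1}{2}\sum_{i\neq j} q_i q_j\,\frac{\operatorname{erfc}(\alpha r_{ij})}{r_{ij}}
  + \frac{1}{2V}\sum_{\mathbf{k}\neq 0}\frac{4\pi}{k^{2}}\,e^{-k^{2}/4\alpha^{2}}
    \Bigl|\sum_j q_j e^{i\mathbf{k}\cdot\mathbf{r}_j}\Bigr|^{2}
  - \frac{\alpha}{\sqrt{\pi}}\sum_j q_j^{2},
\]
where the real-space sum converges quickly because of the complementary error function and the reciprocal-space sum converges quickly because of the Gaussian factor.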
Parallel decomposition of P3ME
Reciprocal term in P3ME algorithm using FFT
– Charges redistributed over points on a mesh
– Fourier transform takes the mesh into reciprocal space (FFT)
– Convolution in reciprocal space with the Green function
– Inverse Fourier transform to get electric potentials on the mesh
– Interpolation of these potentials to particle locations
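The reciprocal-space steps above amount to an FFT-based convolution. A small single-node sketch using FFTW for illustration (Blue Matter uses its own scalable volumetric 3D FFT; charge assignment, the actual influence function, and force interpolation are omitted):

```cpp
// FFT-based convolution corresponding to the reciprocal-space steps above,
// shown single-node with FFTW for illustration only.
#include <fftw3.h>

void p3meReciprocal(int n,
                    fftw_complex* mesh,     // n*n*n meshed charge, transformed in place
                    const double* greenK)   // n*n*n influence (Green) function values
{
    fftw_plan fwd = fftw_plan_dft_3d(n, n, n, mesh, mesh, FFTW_FORWARD,  FFTW_ESTIMATE);
    fftw_plan bwd = fftw_plan_dft_3d(n, n, n, mesh, mesh, FFTW_BACKWARD, FFTW_ESTIMATE);

    fftw_execute(fwd);                                  // charge mesh -> reciprocal space
    const long total = static_cast<long>(n) * n * n;
    const double norm = 1.0 / static_cast<double>(total);  // FFTW transforms are unnormalized
    for (long idx = 0; idx < total; ++idx) {
        mesh[idx][0] *= greenK[idx] * norm;             // convolution = pointwise multiply
        mesh[idx][1] *= greenK[idx] * norm;
    }
    fftw_execute(bwd);                                  // back to the mesh: potential values

    fftw_destroy_plan(fwd);
    fftw_destroy_plan(bwd);
}
```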
Multi-dimensional FFTs are important kernels for the molecular dynamics
algorithm used in the Blue Gene science program
– We use the P3ME (Particle-Particle-Particle-Mesh Ewald) method to compute
long-range interactions between charges in the simulated system
– P3ME requires computation of a 3D FFT of the charge distribution in every
time step of the simulation
– Target simulation sizes on BG/L: 5K-200K atoms
• Typical sizes of 3D FFT needed are 64³ to 256³
Because of their importance we need a 3D FFT solution for
BG/L that scales to very large node counts.
Existing implementations and Challenge for BG/L
Typical parallel 3D FFT implementations (e.g., FFTW) use slab
decomposition to minimize communication
– In principle the scalability is limited to N processors (for an N x N x N FFT)
– Typical sizes of FFT used for MD are 64³ to 256³
Our application in Blue Gene/L must scale to 2048 nodes or more
In theory, row-column decomposition can scale to N² nodes without
parallelizing individual 1D FFTs
In volumetric decomposition, each computation phase is separated by
data movement (transposition)
Important to perform the transposes efficiently, because they can
become very expensive
The 3D FFT Algorithm
Volumetric decomposition divides the 3D FFT computation into three stages, each computing N² independent 1D FFTs of length N that can be done in parallel:
N x N 1D FFTs along the z-dim
N x N 1D FFTs along the y-dim
N x N 1D FFTs along the x-dim
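This staging works because the 3D DFT separates into successive 1D DFTs along each axis; for an N x N x N mesh,
\[
F(k_x,k_y,k_z)=\sum_{x=0}^{N-1} e^{-2\pi i k_x x/N}\sum_{y=0}^{N-1} e^{-2\pi i k_y y/N}\sum_{z=0}^{N-1} e^{-2\pi i k_z z/N}\, f(x,y,z),
\]
so each stage consists of N² independent length-N transforms, with only data transposes between stages.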
3D-FFT for Blue Gene/L
Requirement: Scalable 3D FFT for meshes ranging from 32³ to 256³ as part of particle-mesh molecular dynamics
Design goal: 3D FFT decomposition with strong scaling
characteristics for mesh sizes of interest (as an alternative
to "slab" based 3D FFT decomposition).
Prototyping: "Active Packet" and MPI programming model
versions of volumetric 3D FFT have been implemented.
Results: MPI version shows scaling on SP (Power4)
superior to that of FFTW; BG/L versions show continued
speedups through 1024 nodes.
Conclusion: Volumetric 3D FFT will scale well enough to
support many biomolecular simulation experiments,
including mesh sizes around 128³.
Volumetric 3D-FFT on Power4 Cluster
[Chart: 3D-FFT time (seconds) vs. node count for a 128x128x128 FFT]
MPI version shows scaling on SP (Power4) superior to that of FFTW
Original goals of Blue Gene science program
Blue Matter provides scalable MD environment
with innovative design approaches
Large simulations are running right now, and will
get bigger as nodes arrive
Stay tuned
Alex Balaeff
Mike Pitman
Bruce Berne
Alex Rayshubskiy
Maria Eleftheriou
Yuk Sham
Scott Feller
Frank Suits
Blake Fitch
Bill Swope
Klaus Gawrisch
Chris Ward
Alan Grossfield
Yuri Zhestkov
Jed Pitera
Ruhong Zhou
Blue Gene Hardware and
System Software teams