RMSD calculation in AWK

1. Overview

The RMSD (root-mean-square deviation) is a common measure for comparing the structures of biomolecules or solid bodies [3]. The basic idea is a least-squares optimization that superimposes the two models. The AWK script performs the optimal alignment and prints the RMSD of two PDB files. The program can be used as follows:

./pdbalign.awk file1.pdb file2.pdb
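
To make the quantity being reported concrete, the following is a minimal Python sketch of the RMSD formula alone, without any superposition. It is not taken from the AWK script; the function name rmsd and the representation of atoms as (x, y, z) tuples are assumptions made for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) tuples.

    No alignment is performed here: atoms are compared pairwise
    in the order given (hypothetical helper, for illustration only).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        # Squared Euclidean distance between corresponding atoms.
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))
```

Without a prior superposition this value also contains any rigid translation or rotation between the two structures, which is why the script aligns them first.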

The implementation of the RMSD follows Coutsias et al. [1], who presented a quaternion-based method for finding the optimal rotation. The algorithm first centers both structures on their center of geometry (COG) and then performs a rotational alignment using quaternions [4]. To compute the eigenvectors of the symmetric 4x4 matrix that arises in the quaternion formulation, a Householder transformation first reduces the matrix to tridiagonal form, and the QL decomposition then calculates the eigenvalues [2].
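
The steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code of the AWK script: all function names are hypothetical, atoms are (x, y, z) tuples already matched one to one, and for brevity a cyclic Jacobi iteration is swapped in for the Householder/QL eigenvalue solver described above. The minimal RMSD is obtained from the largest eigenvalue of the 4x4 matrix, so the rotation itself is never applied.

```python
import math


def center(coords):
    """Shift coordinates so that their center of geometry is the origin."""
    n = len(coords)
    cog = [sum(c[i] for c in coords) / n for i in range(3)]
    return [(x - cog[0], y - cog[1], z - cog[2]) for x, y, z in coords]


def key_matrix(a, b):
    """Symmetric 4x4 matrix built from the 3x3 correlation matrix of a and b."""
    s = [[sum(p[i] * q[j] for p, q in zip(a, b)) for j in range(3)]
         for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    return [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]


def largest_eigenvalue(m, sweeps=30):
    """Largest eigenvalue of a symmetric matrix via cyclic Jacobi rotations.

    Assumption: Jacobi is used only to keep the sketch short; it stands in
    for the Householder + QL route and gives the same eigenvalues.
    """
    n = len(m)
    a = [row[:] for row in m]
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.pi / 4
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                # Rotate columns p and q, then rows p and q (A <- J^T A J);
                # this zeroes the off-diagonal element a[p][q].
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return max(a[i][i] for i in range(n))


def superposed_rmsd(coords_a, coords_b):
    """Minimal RMSD over all rigid translations and proper rotations."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    a, b = center(coords_a), center(coords_b)
    g_a = sum(x * x + y * y + z * z for x, y, z in a)
    g_b = sum(x * x + y * y + z * z for x, y, z in b)
    lam = largest_eigenvalue(key_matrix(a, b))
    # Clamp at zero: rounding can push the difference slightly negative.
    return math.sqrt(max(0.0, (g_a + g_b - 2.0 * lam) / len(a)))
```

For example, a structure compared with a rotated and translated copy of itself should give a value close to zero, up to rounding error. The eigenvector belonging to the largest eigenvalue is the unit quaternion of the optimal rotation; the sketch omits it because only the RMSD is printed.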



3. References

[1] Coutsias E.A., Seok C., Dill K.A., Using quaternions to calculate RMSD. J Comput Chem. (2004) 25(15):1849-1857.
[2] Press W.H., Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press (1992).
[3] RMSD, http://en.wikipedia.org/wiki/Root-mean-square_deviation_(bioinformatics)
[4] Quaternions, http://en.wikipedia.org/wiki/Quaternions_and_spatial_rotation