PHENIX
Structural genomics seeks to rapidly expand the number of known protein structures so that the maximum amount of information can be extracted from the genomic sequence databases. The advent of several large-scale pilot projects, funded by NIH/NIGMS, has created many new challenges in the field of macromolecular structure determination. One of the primary goals of the CCI is the development of a novel software package called PHENIX (Python-based Hierarchical Environment for Integrated Xtallography). This new software will provide the algorithms needed to proceed from reduced intensity data to a refined molecular model, and will facilitate structure solution for both the novice and the expert crystallographer.
The development of the PHENIX software involves the extended international community of crystallographers, both through workshops and through formal collaborations. The work is part of an international collaboration funded by NIH and headed by the CCI group. Those currently involved are Tom Terwilliger (Los Alamos National Laboratory), Randy Read (University of Cambridge, U.K.), and Tom Ioerger and Jim Sacchettini (Texas A&M University). PHENIX is designed with an open and flexible architecture to encourage its use by other developers and to promote easy incorporation into the home lab and synchrotron beamline environments. It will be supported on UNIX and Windows platforms and openly distributed to non-profit users.
Our experience with the development of the Crystallography & NMR System (CNS) has shown us that it is preferable to work with a scripting language when implementing large, high-level applications. Python is an interpreted, dynamically typed scripting language for the rapid development of maintainable applications. Rapid development is facilitated by its clear and simple syntax, its support for object-oriented design, and its rich support for a modular program architecture. To provide fast execution and memory efficiency, we are developing many of the fundamental crystallographic methods and data objects of PHENIX in C++. The Boost Python Library is used to make these toolbox classes and functions available at the Python scripting level.
Computational Crystallography Toolbox
Recent software design concepts have changed the way large applications are written. Modern programs typically feature object-oriented design, databases, graphical user interfaces, distributed computing, and platform independence. For use in the PHENIX software package we are developing a new open-source toolbox for fundamental crystallographic data structures and algorithms. This Computational Crystallography Toolbox (cctbx) is also intended to allow other developers to efficiently implement crystallographic applications that exploit modern programming techniques.
We have implemented algorithms in C++ for the handling and manipulation of unit cell parameters, space group symmetry, and reciprocal-space arrays; for the calculation of statistical quantities binned by resolution range; and for the conversion between reciprocal-space arrays and real-space arrays by Fourier transformation. At this early stage we have tested the PHENIX toolkit with examples of some basic crystallographic tasks, including importing data, performing Wilson scaling, and calculating anomalous difference Patterson maps.
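To make two of the tasks named above more concrete, here is a minimal sketch in pure Python of computing the resolution (d-spacing) of a reflection from the unit cell parameters and averaging intensities in resolution shells. It is an illustration only and is not the cctbx API: the function names `d_spacing` and `mean_intensity_by_shell`, the tuple layout of the cell, and the choice of shells of equal width in 1/d^2 are all assumptions made for this example.

```python
import math


def d_spacing(cell, hkl):
    """Resolution d of reflection hkl for a general (triclinic) unit cell.

    cell is (a, b, c, alpha, beta, gamma), lengths in Angstrom and angles in
    degrees. Hypothetical helper for illustration; hkl must not be (0, 0, 0).
    """
    a, b, c, alpha, beta, gamma = cell
    h, k, l = hkl
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    # Squared unit cell volume.
    vol2 = (a * b * c) ** 2 * (1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)
    # Standard triclinic expression for 1/d^2.
    inv_d2 = (
        h * h * b * b * c * c * (1 - ca * ca)
        + k * k * a * a * c * c * (1 - cb * cb)
        + l * l * a * a * b * b * (1 - cg * cg)
        + 2 * h * k * a * b * c * c * (ca * cb - cg)
        + 2 * k * l * a * a * b * c * (cb * cg - ca)
        + 2 * h * l * a * b * b * c * (cg * ca - cb)
    ) / vol2
    return 1.0 / math.sqrt(inv_d2)


def mean_intensity_by_shell(cell, reflections, n_bins):
    """Average intensity in n_bins shells of equal width in 1/d^2.

    reflections is a list of ((h, k, l), intensity) pairs. Returns one
    (count, mean_intensity) pair per shell, ordered from low to high
    resolution; mean_intensity is None for an empty shell.
    """
    s2 = [1.0 / d_spacing(cell, hkl) ** 2 for hkl, _ in reflections]
    lo, hi = min(s2), max(s2)
    width = (hi - lo) / n_bins or 1.0  # guard against a zero-width range
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for value, (_, intensity) in zip(s2, reflections):
        i = min(int((value - lo) / width), n_bins - 1)
        sums[i] += intensity
        counts[i] += 1
    return [
        (counts[i], sums[i] / counts[i] if counts[i] else None)
        for i in range(n_bins)
    ]


if __name__ == "__main__":
    cubic = (10.0, 10.0, 10.0, 90.0, 90.0, 90.0)
    data = [((1, 0, 0), 100.0), ((2, 0, 0), 50.0), ((3, 0, 0), 10.0)]
    print(d_spacing(cubic, (1, 1, 1)))
    print(mean_intensity_by_shell(cubic, data, 2))
```

A production implementation would also take space group symmetry into account (for example, systematic absences and reflection multiplicity), which this sketch deliberately omits.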
The cctbx is now available as an open-source package on SourceForge. Currently it contains a unit cell toolbox (uctbx), a space group toolbox (sgtbx), an element toolbox (eltbx) for the handling of scattering factors and other element properties, a structure-factor toolbox (sftbx), a Fast Fourier Transform toolbox (fftbx), and an atom displacement parameter toolbox (adptbx). The code is designed with an open and flexible architecture to promote extensibility and easy incorporation into other software environments.