Rigid Body Refinement

Overview

This code example implements the core algorithms used in rigid body refinement:

  • Computation of moved sites given three Euler angles and a three-dimensional linear translation vector (six degrees of freedom total).
  • Summation of gradients w.r.t. Cartesian coordinates, to obtain gradients w.r.t. the Euler angles and the linear translation vector.

The script consists of two parts of roughly equal size. The first part (just 80 lines) implements a rigid_body class with this interface:

Here ea in ea_gradients() is for "Euler angles". The rigid body parameters are stored as rigid_body.ea and rigid_body.lt.

The second part of the script (less than 70 lines) is a unit test, to automatically check the analytical gradient calculations in the first part via finite differences.

To keep the example concise, the target function is a simple least-squares function restraining the sites to the original positions. In a real application, e.g. refinement against x-ray data with other cctbx facilities, the computation of the gradients w.r.t. the Cartesian coordinates of the atomic sites is far more complex, in particular if space group symmetry is involved. However, the computation of the gradients w.r.t. the Euler angles and the linear translation of the rigid body as implemented in this script is completely general. I.e., any refinement program that already includes computation of the gradients w.r.t. the Cartesian coordinates can be easily extended to also support rigid body refinement.

The unit_cell_refinement.py example shows how the target function results (functional and gradients) can be used in combination with the general purpose quasi-Newton LBFGS minimizer to iteratively update the Euler angle and linear translation parameters.

Use print and help()!

Before you study the example script, download and install the cctbx. This will enable you to run the script. While analyzing the script, insert print statements and run the script to find out more about the objects. It may also be useful to insert help(obj) to see the attributes and methods of obj, where obj can be any of the objects created in the script.

If you don't know what an object is: thing is a pretty good approximation.

The rigid_body constructor

The "constructor" (called "__init__" method in Python) of the rigid_body class is:

The only argument is an array of sites, i.e. the current coordinates of the points ("atoms") in the rigid body. To see the list of coordinates, insert this print statement:

This will show a list of Python tuples:

The self.sites_orig = sites assignment keeps a reference to this array for use in other methods (also known as "member functions" in some other languages). In view of repeated use in other methods, the center of mass is computed via sites.mean(). sites is expected to be a one-dimensional C++ array of type scitbx.array_family.flex.vec3_double. This is not formalized in the interface, but implied by the use of certain methods, in this case .mean(). This approach is known colloquially as "duck typing". Any other type with compatible methods could be used instead. As far as Python is concerned: if it walks like a duck and quacks like a duck, it is a duck.

The scitbx.matrix module implements many commonly used matrix algorithms, e.g. the matrix product, dot product, cross product, matrix inversion etc. The rigid_body constructor uses the scitbx.matrix.col (column) type to store the center of mass, the Euler angles, and the linear translation vector. It is highly recommended to spend a few minutes reading the matrix/__init__.py script. Since it is quite short and mostly implemented in pure Python it is largely self-explanatory.

Conversion from Euler angles to a rotation matrix

The rigid_body.rotation_matrix() method passes the Euler angles to the euler_xyz_matrix() function further up in the script:

The relevant code is:

In general, there are 12 different conventions for Euler angles. In addition to these, the rotation can be defined in two ways: a counterclockwise rotation of the basis system (equivalent to a clockwise rotation of vectors w.r.t. that basis system) or a counterclockwise rotation of vectors (equivalent to a clockwise rotation of the basis system). The function above implements the "xyz" convention with counterclockwise rotation of vectors. The "xyz" convention is advantageous for rigid-body refinement since the Gimbal lock problem occurs only if the rotation around y is plus or minus 90 degrees, a situation that is very unlikely to be reached since the convergence radius of rigid body refinement is typically much smaller.

The Mathematica code for generating the rotation matrix is included as a Python "docstring". From this it is easy to see exactly how the rotations are defined, and how to reproduce the matrix.

The angle_scale was introduced as a way to balance the scale of the angular parameters compared to the linear translation parameters. For the LBFGS minimizer to work optimally, it is helpful to balance the relative scale of all parameters. The exact angle_scale value is not too critical, as long as it is in the right order of magnitude. The value used in the script is the result of a few empirical tests.

For completeness it is noted that we could have used scitbx.math.euler_angles.xyz_matrix instead, which is implemented in C++. However, for clarity, and to allow experimenting with the angle_scale value, the function is re-implemented in the example script.

Computation of moved sites

These two methods are concerned with the computation of the moved sites, given self.rotation_matrix() as introduced above and the linear translation self.lt:

It will be helpful to insert print statements to inspect the objects involved. self.center_of_mass_orig + self.lt uses the scitbx.matrix implementation for element-wise addition of the three vector components. self.sites_orig - self.center_of_mass_orig uses the overloaded binary minus operator of the flex.vec3_double C++ array type. The multiplication with self.rotation_matrix() and the final addition of self.center_of_mass_moved() are again C++ operations. Therefore, the code will work efficiently even for a large number of sites. Since all operations involving large arrays are performed in C++, we can use Python's concise syntax without incurring a significant performance penalty, compared to a more arcane pure C++ implementation.

Computation of gradients

The gradient calculation code is the most complex part of the script (as usual):

The first action in this method is to compute sites_moved as explained above. The next step is to call an external function object energy_cart_function. In the "duck typing" spirit (see above), the requirements for energy_cart_function are implied by the implementation. The implementation of the trivial least-squared restraints to the original sites serves as an example:

Again, it will be helpful to insert print statement to inspect the types involved. nodes and homes are flex.vec3_double arrays. Therefore the calculations in .functional() and .gradients() are fast C++ array operations.

Any function object with the same interface could be used instead. I.e. the rigid_body code could be used unmodified in refinement against x-ray data. All application-specific calculations will be concentrated in the (much more complex) energy_cart equivalent.

The transformation of the gradients w.r.t. Cartesian coordinates to gradients w.r.t. the six rigid body parameters consists of two parts and follows the procedure described by Schwieters & Clore (2001). The first part is a subset of the "Newton-Euler" equations for the motion of a rigid body. The Cartesian gradients are summed to obtain the total translational gradients (linear force changing the linear velocity in the context of dynamics) and the total angular gradients (angular force changing the angular velocity):

In the case of a freely moving rigid body, the pivot is the center of mass. (In the context of dynamics, any other pivot implies external forces.)

The implementation of the summations is rather inefficient since the loop over the (potentially large number of) sites is in Python. While sum_grads for the translational gradients could be obtained as with the C++ array operation flex.sum(d_potential_energy_d_site), there is no existing C++ array version of the more complex steps for calculating the angular gradients. For clarity, to keep the function uniform, both results are computed in Python. For actual applications, the entire function should be re-implemented in C++.

Given the total angular gradients from the solution of the Newton-Euler equations, it is just a matter of transforming these values into the frame of reference for the Euler angles. This step is described in Goldstein (2002), towards the end of section 4.9. The formula for the Euler angle xyz convention is given in the appendix of Goldstein (2002):

Since Goldstein (2002) defines the Euler angles as a rotation of the basis system, the sign of the sinus terms is reversed in the function above.

The transformation of the angular velocity to time derivatives of the Euler angles is, evidently, a simple linear transformation. Luckily, from this it follows that the transformation of the gradients is simply given by the transpose of the same matrix (Murshudov et al., 1999, very end of appendix B). This leads to the last three lines of the .ea_gradients() implementation. The preceding line simply reverses the sign of the Newton-Euler forces; this could be avoided by changing newton_euler_f(), but we prefer to keep the implementation compatible with the Schwieters & Clore (2001) equations to avoid confusion.

Unit tests

The nested functions inside the exercise() function are unit tests to ensure the correctness of the gradient code discussed above.

For the idea behind the finite difference tests, please refer back to the Unit Cell Refinement tutorial. With this background, the test code should be easy to follow. The only slightly unobvious fragments are in incr_position() function, e.g.:

The detour through a Python list is necessary because the scitbx.matrix objects (in this case rb.ea) are understood to be immutable. This is not actually fully enforced to keep the implementations simple, but is true for many objects in the cctbx. The motivation is to avoid surprises due to "side effects" caused by changing objects in place. Of course, the rb.ea = matrix.col(v) assignment is doing just that, which tells us that it is sometimes impractical to be pure about the "no in-place operations" idea, taking runtime and memory consumption considerations into account. However, our practical experience strongly suggests it is important to apply the "no in-place operations" idea in most cases, and that it is a good excuse for the three lines of gymnastics above.

Typical (good) unit tests should produce no output. However, in this special case we chose to simply print the analytical and finite-difference gradients for visual inspection. In most other cctbx unit tests, the following would be used instead (to be inserted in show_gradients()):

This replaces the visual inspection with an automatic test. If included an routine testing script, it ensures that the results do not change as a side-effect of development work on the underlying libraries. This form of automatic testing is the only practical approach to quality assurance in an ever-growing and evolving library, in particular if there are many developers.

References

  • Goldstein, H. (2002). Classical Mechanics, 3rd edition.
  • Murshudov, G.N., Vagin, A.A., Lebedev, A., Wilson, K.S., Dodson, E.J. (1999). Acta Cryst. D55, 247-255.
  • Schwieters, C.D., Clore, G.M. (2001). J. Mag. Res., 152(2), 288-302.