Calculate R-factors for a crystal structure

This example script shows how to calculate R-factors with a model and a mtz file.

Required imports

This script requires the following imports:

from __future__ import absolute_import, division, print_function
import iotbx.pdb
import mmtbx.model
import mmtbx.f_model
from iotbx import reflection_file_reader

To calculate R-factors we need a model (coordinates, b-factors, atom-type, occupancy, etc.) to obtain calculated structure factors and data (structure factor amplitudes or intensities, R-free-flags for Rfree). The following sections explain how to obtain the model and data objects from the input files.

If you don't have examples files at hand, you can use files that come with the phenix installation:

import os
import libtbx.load_env
pdb_file = libtbx.env.find_in_repositories(
  relative_path="phenix_regression/pdb/1yjp_h.pdb",
  test=os.path.isfile)
mtz_file = libtbx.env.find_in_repositories(
  relative_path="phenix_regression/reflection_files/1yjp.mtz",
  test=os.path.isfile)
assert (not None in [pdb_file, mtz_file])

Get the model object

First, let's create a model object from the input file pdb_file.

# Read in the model file and create model object
pdb_inp = iotbx.pdb.input(file_name = pdb_file)
model = mmtbx.model.manager(model_input = pdb_inp)

Read the mtz file and get the miller array

We need data to calculate R-factors. First we'll read the mtz file, then we get an appropriate Miller array. In this example, we know that the miller arrays have the names FOBS,SIGFOBS and R-free-flags in the mtz file.

# Get miller arrays for data and Rfree flags
miller_arrays = reflection_file_reader.any_reflection_file(file_name =
  mtz_file).as_miller_arrays()
for ma in miller_arrays:
  print(ma.info().label_string())
  if(ma.info().label_string()=="FOBS_X,SIGFOBS_X"):
    f_obs = ma
  if(ma.info().label_string()=="R-free-flags"):
    r_free_flags = ma

Now we have to remove reflections that don't have an equivalent in the other array, in other words, both arrays need to have the same number of reflections. We also bring the r_free_flags array into a boolean format suitable for further calculations.

# Obtain a common set of reflections
f_obs, r_free_flags = f_obs.common_sets(r_free_flags)
r_free_flags = r_free_flags.array(data = r_free_flags.data()==0)

To doublecheck, let's print the number of reflections that are "free" and "work", respectively:

print(r_free_flags.data().count(True), r_free_flags.data().count(False))

Calculate R-factors

The following lines of code yield the f_model object that combines the data and model:

fmodel = mmtbx.f_model.manager(
  f_obs          = f_obs,
  r_free_flags   = r_free_flags,
  xray_structure = model.get_xray_structure())

You can see that we did not pass the model object itself to f_model. The reason is that model contains way more infomation than necessary to calculate structure factors, for example it may carry restraint information. To calculate R-factors, this is not needed. Instead, the minimal structural information is contained in the xray_structure object.

We need one more step before calculating R-factors: The calculated and observed structure factor amplitudes are usually not on a common scale and low resolution structure factors need to be adjusted. To address this, the update_all_scales() method will calculate scale factors and add a bulk solvent model.

fmodel.update_all_scales()

Finally, we print the results:

print("r_work=%6.4f r_free=%6.4f"%(fmodel.r_work(), fmodel.r_free()))
fmodel.show(show_header=False, show_approx=False)