This example script shows how to calculate R-factors with a model and a mtz file.
This script requires the following imports:
from __future__ import absolute_import, division, print_function import iotbx.pdb import mmtbx.model import mmtbx.f_model from iotbx import reflection_file_reader
To calculate R-factors we need a model (coordinates, b-factors, atom-type, occupancy, etc.) to obtain calculated structure factors and data (structure factor amplitudes or intensities, R-free-flags for Rfree). The following sections explain how to obtain the model and data objects from the input files.
If you don't have examples files at hand, you can use files that come with the phenix installation:
import os import libtbx.load_env pdb_file = libtbx.env.find_in_repositories( relative_path="phenix_regression/pdb/1yjp_h.pdb", test=os.path.isfile) mtz_file = libtbx.env.find_in_repositories( relative_path="phenix_regression/reflection_files/1yjp.mtz", test=os.path.isfile) assert (not None in [pdb_file, mtz_file])
First, let's create a model object from the input file
# Read in the model file and create model object pdb_inp = iotbx.pdb.input(file_name = pdb_file) model = mmtbx.model.manager(model_input = pdb_inp)
We need data to calculate R-factors. First we'll read the mtz file, then we get an appropriate Miller array. In this example, we know that the miller arrays have the names
R-free-flags in the mtz file.
# Get miller arrays for data and Rfree flags miller_arrays = reflection_file_reader.any_reflection_file(file_name = mtz_file).as_miller_arrays() for ma in miller_arrays: print(ma.info().label_string()) if(ma.info().label_string()=="FOBS_X,SIGFOBS_X"): f_obs = ma if(ma.info().label_string()=="R-free-flags"): r_free_flags = ma
Now we have to remove reflections that don't have an equivalent in the other array, in other words, both arrays need to have the same number of reflections. We also bring the r_free_flags array into a boolean format suitable for further calculations.
# Obtain a common set of reflections f_obs, r_free_flags = f_obs.common_sets(r_free_flags) r_free_flags = r_free_flags.array(data = r_free_flags.data()==0)
To doublecheck, let's print the number of reflections that are "free" and "work", respectively:
The following lines of code yield the
f_model object that combines the data and model:
fmodel = mmtbx.f_model.manager( f_obs = f_obs, r_free_flags = r_free_flags, xray_structure = model.get_xray_structure())
You can see that we did not pass the model object itself to
f_model. The reason is that
model contains way more infomation than necessary to calculate structure factors, for example it may carry restraint information. To calculate R-factors, this is not needed. Instead, the minimal structural information is contained in the
We need one more step before calculating R-factors: The calculated and observed structure factor amplitudes are usually not on a common scale and low resolution structure factors need to be adjusted. To address this, the
update_all_scales() method will calculate scale factors and add a bulk solvent model.
Finally, we print the results:
print("r_work=%6.4f r_free=%6.4f"%(fmodel.r_work(), fmodel.r_free())) fmodel.show(show_header=False, show_approx=False)