Phenix and CCTBX Programming Suggestions

This site gives helpful programming tips to get started with cctbx.

Run tests on your code

Before committing anything to the Phenix or CCTBX repository, run all the tests and make sure they all pass.

phenix_regression.test_all_parallel nproc=72 sort

When the run finishes, test_1.log will contain a summary of your tests. The "sort" keyword runs the slow tests first.

During a development cycle in which many tests are failing, you may want to run only the tests affected by changes you have made but not yet committed. To do this, use the keyword "git" instead of "sort". It will run only a subset of the tests.
Note: be sure to run all tests before you commit your changes.

Make tests for your code

  • "If it isn't tested, it doesn't work"
  • CCTBX methods: Make a unit test for every functional unit.
  • Phenix methods (often high-level): Make unit tests if feasible, otherwise make very quick-running tests that exercise all important functionalities.
  • You can read how to make a unit test here
  • You can read detailed expectations about contributing to cctbx here
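
As an illustration of the first points above, here is a minimal sketch of a quick-running unit test for one functional unit. The function centroid is a hypothetical stand-in for your own code and is not part of cctbx. The layout (an exercise function containing plain assertions, with "OK" printed on success) is assumed here as a common convention and is not a requirement stated on this page.

```python
# Hypothetical functional unit under test (a stand-in, not part of cctbx).
def centroid(sites):
  """Return the mean (x, y, z) of a non-empty list of (x, y, z) tuples."""
  n = len(sites)
  return tuple(sum(site[i] for site in sites) / n for i in range(3))

def exercise_centroid():
  # Known input with a known answer.
  sites = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)]
  result = centroid(sites)
  expected = (1.0, 2.0, 3.0)
  assert all(abs(a - b) < 1.e-6 for a, b in zip(result, expected))
  # Edge case: a single site is its own centroid.
  assert centroid([(1.5, -2.0, 0.25)]) == (1.5, -2.0, 0.25)

if __name__ == "__main__":
  exercise_centroid()
  print("OK")
```

A test like this runs in well under a second, so it can be part of every test run without slowing it down.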

Write code that is compatible with mmCIF format

Overall suggestions

  • Use the Program template and do all other reading and writing with the data_manager present in the template. You can read how to make a Program template here.
  • When you write a file, capture the actual file name (which may end in .cif):
    fn = self.data_manager.write_model_file(model, target_file_name)
  • Create tests with models that cannot fit in PDB format. You can do this with pdbtools:
    phenix.pdbtools model.pdb output.file_name=model.cif old_id=A new_id=AXZLONG
  • When you have Phenix set up, you can see details of how to write mmCIF-compliant code by typing:
    phenix.pdb_cif_conversion

Notes for existing code

  • Convert to using the Program template and the data_manager if possible.
  • Convert all the following to use the data_manager (none are mmCIF-compliant):
    • model.model_as_pdb()
    • hierarchy.as_pdb_string()
    • hierarchy.write_pdb_file()
    • For models, use:
      fn = self.data_manager.write_model_file(model, target_file_name)
    • For hierarchies, write out the model that it is part of instead.
    • If your hierarchy is not part of a model, you can use:
      m = hierarchy.as_model_manager(crystal_symmetry = crystal_symmetry)
      file_name = 'mypdb.pdb'
      fn = self.data_manager.write_model_file(m, file_name)
  • Find and fix problems in myfile.py with:
    libtbx.find_pdb_mmcif_problems myfile.py
  • Code that parses PDB-formatted text should be rewritten as it is not mmCIF-compatible.

Be aware of how the objects in Phenix/cctbx work

"Is it a new object or a pointer to the old one?" This is an important but subtle feature of Python programming. Any time you make an object you need to be able to answer this question.

Member functions of cctbx objects sometimes return a new object and other times return a pointer to the old one. This matters because if you get a pointer, changing the object you received also changes the original.

CCTBX functions may change the object itself, or return a part of the object, or return something entirely new.

You can read all about this here.
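
The difference can be seen in plain Python. The sketch below uses ToyModel, a hypothetical stand-in class that is not a cctbx object. Its method names are chosen only to echo the idea, so do not assume any particular cctbx method behaves like either of them without checking.

```python
import copy

class ToyModel:
  """Hypothetical stand-in for an object that holds coordinates."""
  def __init__(self, sites):
    self.sites = sites

  def get_sites(self):
    # Returns a pointer to the internal list, not a copy.
    return self.sites

  def deep_copy(self):
    # Returns a new, fully independent object.
    return copy.deepcopy(self)

original = ToyModel([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]])

# A second name for the same object: changes show up in the original.
alias = original
alias.sites[0][0] = 9.0
assert original.sites[0][0] == 9.0

# A deep copy is unattached: changes do not reach the original.
independent = original.deep_copy()
independent.sites[0][0] = -1.0
assert original.sites[0][0] == 9.0

# A getter can hand back a pointer to internal data.
sites = original.get_sites()
assert sites is original.sites
```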

Safe use of map_model_manager, map_manager, and model in Phenix

These managers are available when you use the Program template for your programming and your parameters include maps and models.

Get your manager with the Program template and data_manager

  • The data_manager automatically provides access to a map, model, and two half maps if the user supplies them
  • Normally just get a map_model_manager. This will be ready to use:
    mmm = self.data_manager.get_map_model_manager(from_phil=True)
  • If you have just a model, get a model manager:
    m = self.data_manager.get_model(self.params.input_files.model_file)

Use the methods of a manager to do what you want to do

  • Mask all the maps around atoms in the model:
    map_model_manager.mask_all_maps_around_atoms()
  • Get a model-building object that knows about the map and model:
    mb = map_model_manager.model_building()
  • Get the model and map:
    m = map_model_manager.model()
    mm = map_model_manager.map_manager()
  • You can see all the features of these managers here.
  • You can also see many specific ways to use these managers here and here.

The model or map you get from a map_model_manager is the same model or map that is still in the manager

  • If you change a model or map you are changing the ones in the map_model_manager too
  • If you want an unattached copy, you need to make a deep_copy().
  • Some attributes of a model you might reasonably change: sites_cart, occ, b, u. These do not affect compatibility with other objects in a map_model_manager
  • Some attributes of a model you should not change if it is part of a map_model_manager: crystal_symmetry, unit_cell_crystal_symmetry, shift_cart. These are handled in the map_model_manager and affect all objects in the manager.
  • You can check that your map_model_manager is in sync with the method map_model_manager.check_consistency(), which will stop if anything is inconsistent.

Boxed maps and origin shifts

Some maps have their origin at xyz = (0,0,0), but others may have an origin at some other location, usually because they have been cut out of a larger map (they have been boxed).

You can read all about this here.

  • If you have a model all by itself, or a map that has its origin at xyz = (0,0,0), the origin shift is (0,0,0). All coordinates are the same as in an mmCIF or PDB file.
  • The origin shift comes up if you have a map that does not have its origin at xyz = (0,0,0). If the map has an origin at xyz=(100.,0,0) then the origin shift is (-100.0,0,0). The origin shift is applied to all coordinates in all models and to the map.
    In effect, the models and maps are all shifted to place their new origin at xyz=(0,0,0). Then you can work with them and the maps and models will have the right relationships. Origin shifts in Phenix always correspond to an integral number of grid units in the map.
  • The map_model_manager, model manager, and map_manager all keep track of the origin shift. You can get it with mmm.shift_cart(), model.shift_cart(), etc.
  • If you write out a model or map it will be shifted back to its original place when written to your output file unless you specify otherwise.
  • The hierarchy object does not know about origin shifts. If you extract a hierarchy from a model and write it out like this (not recommended):
    text = model.get_hierarchy().as_pdb_or_mmcif_string()
    then it will write the internal coordinates without shifting them back to their original location. It is better to make this explicit by using the model object, which does know about origin shifts, and specifying whether you want to shift back to the original location (this example does not shift back):
    text = model.as_pdb_or_mmcif_string(do_not_shift_back=True)
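
The origin-shift arithmetic described above can be sketched in plain Python. The helper names shift_cart_from_origin and apply_shift are hypothetical and are not part of cctbx. The sketch only mirrors the example in the text, where a map origin at (100., 0, 0) gives an origin shift of (-100.0, 0, 0).

```python
def shift_cart_from_origin(origin_xyz):
  """The origin shift is the negative of the map origin (Cartesian)."""
  return tuple(0.0 - x for x in origin_xyz)

def apply_shift(sites, shift):
  """Add the same shift to every (x, y, z) coordinate."""
  return [tuple(c + s for c, s in zip(site, shift)) for site in sites]

# Map origin and one atom, as they would appear in the input files.
origin = (100.0, 0.0, 0.0)
sites_in_file = [(105.0, 10.0, 20.0)]

# Working coordinates: everything is shifted so the origin is at (0, 0, 0).
shift_cart = shift_cart_from_origin(origin)
working_sites = apply_shift(sites_in_file, shift_cart)
assert shift_cart == (-100.0, 0.0, 0.0)
assert working_sites == [(5.0, 10.0, 20.0)]

# Shifting back (what happens by default on writing) restores the originals.
shift_back = tuple(0.0 - s for s in shift_cart)
assert apply_shift(working_sites, shift_back) == sites_in_file
```

Writing internal coordinates without shifting back corresponds to writing working_sites directly, which is why output written from a bare hierarchy can end up displaced relative to the original map.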