Creating a program in Phenix with the Program Template

Learn how to use the Phenix Program Template to write a new Phenix tool.

Introduction

Let’s create a new Phenix program. To illustrate the steps involved in this process, we will create an example program that reads in a map and a model and calculates the map-model correlation. In particular, we will use the Program Template (see CCN newsletter article), which standardizes the interfaces between a user or the GUI and a program. This approach provides some consistent functionality for basic tasks, like file and parameter handling. It also helps ensure that the same code is executed regardless of the user interface.

The name of the program will be map_cc. While creating the program, we'll set up a unit test.

Here are the steps we are going to follow to create our program. Each step will be discussed in more detail further below.
First we make a prototype and a test that shows that the prototype works:

  1. Create a prototype of the program that runs from the command line. The file for the new program will be created in the directory modules/phenix/phenix/model_building/map_cc.py. Normally you will put the code to do real work in a directory that corresponds to the type of work to be done.

  2. Create a unit test to make sure that the program works now and that it continues to work later, when we do some modifications to it. The unit test will be put in modules/phenix_regression/model_building/tst_map_cc.py. The directory phenix_regression is a special directory in which any file that starts with “tst” will be considered a test to be run automatically. The subdirectory model_building matches the location of our program (phenix/model_building).

Next we create the high-level files that allow easy communication between the user, the program and the GUI:

  1. Create a file modules/phenix/phenix/programs/map_cc.py that is an edited version of the Program Template. This file is going to specify the inputs to our program that does the work and it is going to call our program to do the work
  2. Create the file modules/phenix/phenix/command_line/map_cc.py in the command_line directory. This is a special directory where the Phenix installer always looks and finds programs to run.
  3. Run bootstrap.py in the top-level phenix directory to rebuild Phenix, creating all the necessary relationships so that we can type phenix.map_cc on the command-line to run our program.

Then we’ll be set and we can run our program with the command phenix.map_cc, and the automatic regression testing will test our program using the tst_map_cc.py regression test.

When you write your own program, you can follow these same steps, substituting your code and locations for the ones shown here.

Let’s go through each step in detail now.

Set up the Phenix environment

We are going refer to files in the Phenix directory, and the name of this directory depends on your setup. However, you can always use the environment variable PHENIX to refer to this top-level directory. First let’s make sure your environment is set up. Open a terminal window and type:

which phenix

This should print out something that ends in “build/bin/phenix”. Now let’s check and see if the PHENIX environmental variable is set. Type:

echo “$PHENIX”

This should print out a directory name corresponding to your top-level Phenix directory.

If either of these does not work, please see installation and set up for how to install Phenix and set up the environment.

At the end of this procedure, we use bootstrap.py to rebuild your Phenix installation. If you do not have bootstrap.py, you can do everything in the directory of your new tool. You just won’t be able to type “phenix.map_cc”, so you will have to type “phenix.python $PHENIX/modules/phenix/phenix/command_line/map_cc.py” to get the same result.

Creating the prototype for the tool

We create our program as a new file. We put the code that does the real work in a directory that corresponds to the type of work to be done. Our program will use the cctbx DataManager to read in files and have keywords that we are going to set later with the Program Template.

Here is our program:

# -*- coding: utf-8 -*-

from __future__ import division, print_function
import sys
from libtbx import group_args

def run(
    map_model_manager= None,
    resolution = None,
    log = sys.stdout):
  '''
  Demo program to calculate map-model correlation

  Parameters:
    map_model_manager:   iotbx.map_model_manager holds map and model
    resolution:          nominal resolution of map
    log:                 output stream

  Returns:
    group_args object containing value of map_model correlation
  '''

  print ("Getting map-model correlation for the map_model_manager:\n %s" %(
     map_model_manager), file = log)
  print ("Resolution will be: %.3f A " %(resolution), file = log)

  # use the function map_model_cc in map_model_manager to calculate the cc
  cc = map_model_manager.map_model_cc(resolution = resolution)

  print ("Map-model correlation is: %.3f " %(cc), file = log)

  return group_args(
     map_model_cc = cc,
   )

if __name__ == "__main__":
  from iotbx.data_manager import DataManager
  dm = DataManager()
  args = sys.argv[1:]
  if (len(args)) != 3:  # print how to run program and quit
    print ("Usage: 'phenix.map_cc <model_file> <map_file> <resolution> '")
  else:
    model_file = args[0]
    map_file = args[1]
    resolution = float(args[2])
    log = sys.stdout
    mmm = dm.get_map_model_manager(model_file=model_file,
      map_files = map_file,
      log = log)
    result = run(map_model_manager = mmm, resolution = resolution)
    print ("Result was: %.3f" %(result.map_model_cc))

Put this program in a file called “map_cc.py” in the directory modules/phenix/phenix/model_building/ (where modules is the modules subdirectory in your top-level Phenix directory).

We can run our program right away if we have a map and model. We can get them by setting up a tutorials directory. Get into a working directory in a terminal window (doesn’t matter where) and type:

phenix.setup_tutorial tutorial_name=model-building-scripting

This creates a subdirectory “model-building-scripting” that contains files we can use. Get into this subdirectory and type:

phenix.python $PHENIX/modules/phenix/phenix/model_building/map_cc.py \
 short_model_main_chain_box.pdb short_model_box.ccp4 3

This should run our program and produce output ending with:

Resolution will be: 3.000 A
Map-model correlation is: 0.742
Result was: 0.742

Creating a unit-test

Now we have a program that seems to work. Let’s make a test to make sure it continues to work as we modify it.

Our unit test should be quick and it should test the important aspects of our program. It should give a hard fail if the test does not succeed (i.e., the test should crash on failure with an AssertionError or similar error).

When creating a test, keep in mind that later on, we and other programmers are going to do things that affect our program. We want our test to fail if anyone else does something that causes our program to not work the way we want it to. Our test is our protection against us or anyone else doing something that makes the program fail.

In our example, we can use the data files we just read in to our program as test data, and the result we just got as the expected result. Below is our testing program. Notice that it looks a lot like the code at the bottom of our original program that read in files and ran the “run” method of our program. The imports at the top are typical for a test program and set up functions that we are going to need:

# -*- coding: utf-8 -*-
from __future__ import division, print_function

import time, sys, os
from libtbx.test_utils import approx_equal
import libtbx.load_env

def tst_01():

  # Identify where the data are to come from
  data_dir = libtbx.env.under_dist(
    module_name="phenix_regression",
    path="model_building",
    test=os.path.isdir)

  # Identify the map and model files
  map_file=os.path.join(data_dir,"short_model_box.ccp4")
  model_file=os.path.join(data_dir,"short_model_main_chain_box.pdb")

  # Import that DataManager
  from iotbx.data_manager import DataManager
  dm = DataManager()

  # Set the log file
  log = sys.stdout

  # Read in the map and model and get a map_model_manager
  mmm = dm.get_map_model_manager(model_file=model_file,
      map_files = map_file,
      log = log)

  # Set the resolution
  resolution = 3

  # Import and test the program map_cc.py:
  from phenix.model_building.map_cc import run

  result = run(map_model_manager = mmm, resolution = resolution)
  print ("Result was: %.3f" %(result.map_model_cc))

  # Make sure we got the right answer. Fails if result.map_model_cc does not
  #   match result.map_model_cc within machine precision
  assert approx_equal (result.map_model_cc, 0.7424188591)

  # We are done...if we had the wrong answer it would have crashed above

if __name__=="__main__":
  t0 = time.time()
  tst_01()
  print("Time: %6.4f"%(time.time()-t0))
  print("OK")

Put this test program in $PHENIX/modules/phenix_regression/model_building/tst_map_cc.py. Anything you put there is going to be automatically run as part of the Phenix regression system.

Let’s run our test:

phenix.python $PHENIX/modules/phenix_regression/model_building/tst_map_cc.py

This should run and print out “OK” at the end.

Creating the Program Template that runs map_cc

We now have a working program and a working unit test that makes sure the program is ok. Now let’s use the Program Template to make it easy for a user to run this program, and also to make it easy to put this program into the Phenix GUI (we are not going to do that here but this does make it easy). Here is our Program Template file, edited to run our map_cc program:

from __future__ import absolute_import, division, print_function
from phenix.program_template import ProgramTemplate
from libtbx import group_args

# =============================================================================

class Program(ProgramTemplate):

  description = """
  Demo program to calculate map-model correlation
  Usage: phenix.map_cc model.pdb map.mrc 3

  """

  # Define the data types that will be used
  datatypes = ['model', 'phil', 'real_map']

  # Input parameters

  # Note: include of map_model_phil_str allows specification of
  #   full_map, half_map, another half_map, and model and Program Template
  #   produces input_files.map_model_manager containing these

  master_phil_str = """

  input_files {
    include scope iotbx.map_model_manager.map_model_phil_str
    }

  crystal_info {
    resolution = None
      .type = float
      .help = Nominal resolution of map
      .short_caption = Resolution
    }

  """

  # Define how to determine if inputs are ok
  def validate(self):

    # Expect exactly one map and one model. Stop if not the case
    self.data_manager.has_real_maps(
      raise_sorry = True,
      expected_n  = 1,
      exact_count = True)
    self.data_manager.has_models(
      raise_sorry = True,
      expected_n  = 1,
      exact_count = True)

  # Set any defaults
  def set_defaults(self):
    pass


  # Run the program
  def run(self):

    from phenix.model_building.map_cc import run as get_map_cc
    self.result = get_map_cc(
      map_model_manager = self.data_manager.get_map_model_manager(
         from_phil=True),
      resolution = self.params.crystal_info.resolution,
      log        = self.logger)

  # Get the results
  def get_results(self):
    return group_args(
      map_model_cc = self.result.map_model_cc,
     )

Put this text in the file “modules/phenix/phenix/programs/map_cc.py” and we are almost ready to go.

Creating a file in the command_line directory

We want to be able to type phenix.map_cc to run our program. The way we do this is to put a file in a special directory (modules/phenix/phenix/command_line/) that tells the installer to set this up. Let’s make a file in this directory and call it “map_cc.py”. Here are the contents of map_cc.py. It defines what the program is going to be called (phenix.map_cc) and what to run when this program is invoked (the Program unit in phenix/programs/map_cc.py)

# LIBTBX_SET_DISPATCHER_NAME phenix.map_cc
from __future__ import absolute_import, division, print_function

from iotbx.cli_parser import run_program
from phenix.programs import map_cc

if __name__ == '__main__':
  run_program(program_class=map_cc.Program)

Running bootstrap to install the program

Now we want to run bootstrap.py in the top-level phenix directory to rebuild Phenix, creating all the necessary relationships so that we can type phenix.map_cc on the command-line to run our program. If you have bootstrap.py in your top-level Phenix directory, you can type:

python bootstrap.py build  --builder=phenix --use_conda 

When it is done, you are ready to go. You will need to open a new terminal window or type “rehash” in an existing one. Then you can type “phenix.map_cc” to run your program. Do it like this after getting in your directory where you put the demo files:

phenix.map_cc short_model_main_chain_box.pdb short_model_box.ccp4 resolution=3

That should run the program and give you the same result as before.

If you do not have bootstrap.py you just won’t be able to type phenix.map_cc to run the program. You will have to type “phenix.python $PHENIX/modules/phenix/phenix/command_line/map_cc.py” instead to get the same result.