``direct_methods_light.py`` =========================== Overview -------- The ``direct_methods_light.py`` `Python`_ example is designed to read two `CIF`_ files from the `Acta Crystallographica Section C`_ web page as inputs: - required: reduced X-ray diffraction data: `vj1132Isup2.hkl`_ - optional: the corresponding refined structure: `vj1132sup1.cif`_ The ``iotbx.cif`` module is used to convert the diffraction data to a ``cctbx.miller.array`` object. Normalized structure factors ("E-values") are computed, and the largest E-values are selected for phase recycling with the Tangent Formula. The Miller indices of the largest E-values are used to construct index triplets ``h = k + h-k`` with the ``cctbx.dmtbx.triplet_generator``. The Tangent Formula is repeatedly applied to recycle a phase set, starting from random phases. After a given number of cycles, the resulting phase set is combined with the E-values. The resulting Fourier coefficients are used in a Fast Fourier Transformation to obtain an "E-map". The E-map is normalized and a symmetry-aware peak search is carried out; i.e. the resulting peak list is unique under symmetry. If the CIF file with the coordinates is given, it is first used to compute structure factors ``f_calc``. The correlation with the diffraction data is shown. Next, the CIF coordinates are compared with the E-map peak list using the Euclidean Model Matching procedure (Emma) implemented in the ``cctbx``. The resulting output can be used to quickly judge if the structure was solved with the simple Tangent Formula recycling procedure. [`Complete example script`_] [`Example output`_] [`cctbx downloads`_] [`cctbx front page`_] [`Python tutorial`_] Recommended reading ------------------- The `unit_cell_refinement.py`_ example introduces some important basis concepts. Processing of ``vj1132Isup2.hkl`` --------------------------------- The first step is to get hold of the file name with the reduced diffraction data. The file name has to be specified as the first command-line argument:: iotbx.python direct_methods_light.py vj1132Isup2.hkl In the example script, the command line argument is extracted from the ``sys.argv`` list provided by Python's standard ``sys`` module (`libtbx.help sys`_):: import sys reflection_file_name = sys.argv[1] The ``reflection_file_name`` is used in the call of the ``reader()`` class provided by the `iotbx.cif`_ module:: import iotbx.cif miller_arrays = iotbx.cif.reader( file_path=reflection_file_name).as_miller_arrays() The ``iotbx.cif`` module is used to read CIF files. The ``iotbx.cif.reader()`` class returns a Python representation of the CIF file where all the data items are plain strings. The ``as_miller_arrays()`` method extracts the appropriate strings from the CIF Python object to construct instances of the ``cctbx.miller.array`` class, which is one of the `central types in the cctbx source tree`_. Since this returns a list of all the miller arrays found in the CIF file, we examine each array in turn to find the measured intensities or amplitudes:: for miller_array in miller_arrays: s = str(miller_array.info()) if '_meas' in s: if miller_array.is_xray_intensity_array(): break elif miller_array.is_xray_amplitude_array(): break The ``miller.array`` class has a very large number of methods (`libtbx.help cctbx.miller.array`_), e.g. the ``show_comprehensive_summary()`` method which is used to obtain this output:: Miller array info: vj1132Isup2.hkl:F_meas,F_sigma Observation type: xray.amplitude Type of data: double, size=422 Type of sigmas: double, size=422 Number of Miller indices: 422 Anomalous flag: False Unit cell: (12.0263, 6.0321, 5.8293, 90, 90, 90) Space group: P n a 21 (No. 33) Systematic absences: 0 Centric reflections: 83 Resolution range: 6.01315 0.83382 Completeness in resolution range: 1 Completeness with d_max=infinity: 1 We can see that the ``miller_array`` contains data and sigmas, both of type double. It also contains Miller indices, an anomalous flag, a unit cell and a space_group object. These are the primary data members. The observation type is an optional annotation which is typically added by the creator of the object, in this case the ``as_miller_arrays()`` function. The information in the last five lines of the output is calculated on the fly based on the primary information and discarded after the ``show_comprehensive_summary()`` call is completed. Two other ``cctbx.miller.array`` methods are used in the following statements in the script:: if (miller_array.is_xray_intensity_array()): miller_array = miller_array.as_amplitude_array() If the ``miller_array`` is an intensity array, it is converted to an amplitude array. The ``as_amplitude_array()`` method returns a new ``cctbx.miller.array`` instance. At some point during the evaluation of the statement the old and the new instance are both present in memory. However, after the ``miller_array = miller_array.as_amplitude_array()`` assignment is completed, the old ``miller_array`` instance is deleted automatically by the Python interpreter since there is no longer a reference to it, and the corresponding memory is released immediately. It is very important to understand that most ``miller.array`` methods do not modify the instance in place, but return new objects. The importance of minimizing the number of methods performing in-place manipulations cannot be overstated. In large systems, in-place manipulations quickly lead to unforeseen side-effects and eventually frustrating, time-consuming debugging sessions. It is much safer to create new objects. In most cases the dynamic memory allocation overhead associated with object creation and deletion is negligible compared to the runtime for the actual core algorithms. It is like putting on seat belts before a long trip with the car. The 10 seconds it takes to buckle up are nothing compared to the hours the seat belts protect you. [`Complete example script`_] [`Example output`_] [`cctbx downloads`_] [`cctbx front page`_] [`Python tutorial`_] Computation of E-values ----------------------- Having said all the things about the dangers of in-place operations, the next statement in the script happens to be just that:: miller_array.setup_binner(auto_binning=True) However, the operation does not affect the primary data members of the ``miller_array`` (unit cell, space group, indices, data, sigmas). The ``setup_binner()`` call initializes or re-initializes a binner object to be used in subsequent calculations. The ``binner`` object is understood to be a secondary data member and its state only affects the results of future calculations. In situations like this in-place operations are perfectly reasonable. The result of the ``setup_binner()`` call is shown with this statement:: miller_array.binner().show_summary() The output is:: unused: - 6.0133 [ 0/0 ] bin 1: 6.0133 - 1.6574 [57/57] bin 2: 1.6574 - 1.3201 [55/55] bin 3: 1.3201 - 1.1546 [55/55] bin 4: 1.1546 - 1.0496 [45/45] bin 5: 1.0496 - 0.9747 [55/55] bin 6: 0.9747 - 0.9175 [55/55] bin 7: 0.9175 - 0.8717 [48/48] bin 8: 0.8717 - 0.8338 [52/52] unused: 0.8338 - [ 0/0 ] This means we are ready to calculate quasi-normalized structure factors by computing ``f_sq / `` in resolution bins:: all_e_values = miller_array.quasi_normalize_structure_factors().sort(by_value="data") This statement performs two steps at once. First, the ``quasi_normalize_structure_factors()`` method creates a new ``cctbx.miller.array`` instance with the same unit cell, space group, anomalous flag and Miller indices as the input ``miller_array``, but with a new data array containing the normalized structure factors. The ``sort()`` method is used immediately on this intermediate instance to sort the E-values by magnitude. By default, the data are sorted in descending order (largest first, smallest last). This is exactly what we want here. To convince yourself it is correct, insert ``all_e_values.show_array()``. [`Complete example script`_] [`Example output`_] [`cctbx downloads`_] [`cctbx front page`_] [`Python tutorial`_] Generation of triplets ---------------------- In direct methods procedures it is typical to generate the ``h = k + h-k`` Miller index triplets only for the largest E-values. In the example script, the largest E-values are selected with this statement:: large_e_values = all_e_values.select(all_e_values.data() > 1.2) Again, this statement combines several steps into one expression. First, we obtain access to the array of all E-values via ``all_e_values.data()``. This array is a ``flex.double`` instance, which in turn has its own methods (`libtbx.help cctbx.array_family.flex.double`_). One of the ``flex.double`` methods is the overloaded ``>`` operator; in the ``libtbx.help`` output look for ``__gt__(...)``. This operator returns a ``flex.bool`` instance, an array with ``bool`` values, ``True`` if the corresponding E-value is greater than ``1.2`` and ``False`` otherwise. The ``flex.bool`` instance becomes the argument to the ``select()`` method of ``cctbx.miller.array``, which finally returns the result of the whole statement. ``large_e_values`` is a new ``cctbx.miller.array`` instance with the same unit cell, space group and anomalous flag as ``all_e_values``, but fewer indices and corresponding data. Of the 422 E-values only 111 are selected, as is shown by this ``print`` statement:: print "number of large_e_values:", large_e_values.size() At this point all the information required to generate the triplets is available:: from cctbx import dmtbx triplets = dmtbx.triplet_generator(large_e_values) The ``triplet_generator`` is based on the `cctbx::dmtbx::triplet_generator`_ C++ class which uses a very fast algorithm to find the Miller index triplets (see the references near the top of `triplet_generator.h`_). The ``triplets`` object manages all internal arrays automatically. It is not necessary to know very much about this object, but is is informative to print out the results of some of its methods, e.g.:: from cctbx.array_family import flex print "triplets per reflection: min,max,mean: %d, %d, %.2f" % ( flex.min(triplets.n_relations()), flex.max(triplets.n_relations()), flex.mean(triplets.n_relations().as_double())) print "total number of triplets:", flex.sum(triplets.n_relations()) Here the general purpose ``flex.min()``, ``flex.max()``, ``flex.mean()`` and ``flex.sum()`` functions are used to obtain summary statistics of the number of triplet phase relations per Miller index. ``triplets.n_relations()`` returns a ``flex.size_t()`` array with unsigned integers corresponding to the ANSI C/C++ ``size_t`` type. However, the ``flex.mean()`` function is only defined for ``flex.double`` arrays. Therefore ``n_relations()`` has to be converted via ``as_double()`` before computing the mean. [`Complete example script`_] [`Example output`_] [`cctbx downloads`_] [`cctbx front page`_] [`Python tutorial`_] Tangent Formula phase recycling ------------------------------- Starting with `RANTAN`_, the predominant method for initiating Tangent Formula phase recycling is to generate random phases. In principle this is very easy. E.g. the ``flex.random_double()`` function could be used:: random_phases_rad = flex.random_double(size=large_e_values.size())-0.5 random_phases_rad *= 2*math.pi However, centric reflections need special attention since the phase angles are restricted to two values, ``phi`` and ``phi+180``, where ``phi`` depends on the space group and the Miller index. A proper treatment of the phase restrictions is implemented in the ``random_phases_compatible_with_phase_restrictions()`` method of ``cctbx.miller.array``:: input_phases = large_e_values.random_phases_compatible_with_phase_restrictions() The underlying random number generator is seeded with the system time, therefore the ``input_phases`` will be different each time the example script is run. The Tangent Formula recycling loop has this simple design:: result = input for i in xrange(10): result = function(result) In the example script the actual corresponding code is:: tangent_formula_phases = input_phases.data() for i in xrange(10): tangent_formula_phases = triplets.apply_tangent_formula( amplitudes=large_e_values.data(), phases_rad=tangent_formula_phases, selection_fixed=None, use_fixed_only=False, reuse_results=True) In this case ``function()`` is the ``apply_tangent_formula()`` method of the ``triplet`` object returned by the ``cctbx.dmtbx.triplet_generator()`` call. The function call looks more complicated than the simplified version because it requires a number of additional arguments customizing the recycling protocol. It may be interesting to try different settings as an exercise. See `cctbx::dmtbx::triplet_generator`_ for details. [`Complete example script`_] [`Example output`_] [`cctbx downloads`_] [`cctbx front page`_] [`Python tutorial`_] E-map calculation ----------------- Another ``cctbx.miller.array`` method is used to combine the ``large_e_values`` with the ``tangent_formula_phases`` obtained through the recycling procedure:: e_map_coeff = large_e_values.phase_transfer( phase_source=tangent_formula_phases) The ``phase_transfer()`` returns a ``flex.complex_double`` array of Fourier coefficients. A general proper treatment of phase restrictions is automatically included, although in this case it just corrects for rounding errors. Given the Fourier coefficients, an E-map could be obtained simply via ``e_map_coeff.fft_map()``. However, we have to think ahead a little to address a technical detail. A subsequent step will be a peak search in the E-map. For this we will use a peak search algorithm implemented in the ``cctbx.maptbx`` module, which imposes certain space-group specific restrictions on the gridding of the map. For all symmetry operations of the given space group, each grid point must be mapped exactly onto another grid point. E.g. in space group P222 the gridding must be a multiple of 2 in all three dimensions. To inform the ``fft_map()`` method about these requirements we use:: from cctbx import maptbx e_map = e_map_coeff.fft_map(symmetry_flags=maptbx.use_space_group_symmetry) The resulting e_map is normalized by first determining the mean and standard deviation ("sigma") of all values in the map, and then dividing by the standard deviation:: e_map.apply_sigma_scaling() Since maps tend to be large and short-lived, this is implemented as an in-place operation to maximize runtime efficiency. The ``statistics()`` method of the ``e_map`` object is used to quickly print a small summary:: e_map.statistics().show_summary(prefix="e_map ") This output is of the form:: e_map max 11.9224 e_map min -2.68763 e_map mean -2.06139e-17 e_map sigma 1 Due to differences in the seed for the random number generator, the ``max`` and ``min`` will be different each time the example script is run. However, the ``mean`` is always very close to ``0`` since the Fourier coefficient with index (0,0,0) is zero, and ``sigma`` is always very close to ``1`` due to the prior use of ``apply_sigma_scaling()``; small deviations are the accumulated result of floating-point rounding errors. [`Complete example script`_] [`Example output`_] [`cctbx downloads`_] [`cctbx front page`_] [`Python tutorial`_] Peak search ----------- Given the normalized ``e_map``, the peak search is initiated with this statement:: peak_search = e_map.peak_search(parameters=maptbx.peak_search_parameters( min_distance_sym_equiv=1.2)) The only purpose of the ``maptbx.peak_search_parameters`` class is to group the fairly large number of parameters (`libtbx.help cctbx.maptbx.peak_search_parameters`_). This approach greatly simplifies the argument list of functions and methods involving peak search parameters. It also accelerates experimentation during the algorithm development process. Parameters can be added, deleted or renamed without having to modify all the functions and methods connected to the peak search. In the example, the minimum distance between symmetry-related sites is set to 1.2 A. This instructs the peak search algorithm to perform a cluster analysis. The underlying distance calculations are performed for symmetry-related pairs and pairs of peaks unique under symmetry ("cross peaks"). If the ``min_cross_distance`` peak search parameter is not specified explicitly (as in the example), it is assumed to be equal to the ``min_distance_sym_equiv`` parameter. The cluster analysis begins by adding the largest peak in the map as the first entry to the peak list. All peaks in a radius of 1.2 A around this peak are eliminated. The largest of the remaining peaks is added to the peak list, and all peaks in a radius of 1.2 A around this peak are eliminated etc., until all peaks in the map are considered or a predefined limit is reached. The example uses:: peaks = peak_search.all(max_clusters=10) to obtain up to 10 peaks in this way. The peaks are printed in this ``for`` loop:: for site,height in zip(peaks.sites(), peaks.heights()): print " (%9.6f, %9.6f, %9.6f)" % site, "%10.3f" % height See the `unit_cell_refinement.py`_ example for comments regarding the standard Python ``zip()`` function. The Python tutorial section on `Fancier Output Formatting`_ is useful to learn more about the ``print`` statement. [`Complete example script`_] [`Example output`_] [`cctbx downloads`_] [`cctbx front page`_] [`Python tutorial`_] Processing of ``vj1132sup1.cif`` -------------------------------- Since it is not easy to quickly judge from the peak list if the structure was solved, the ``vj1132sup1.cif`` file is used for verification purposes. It is processed in very much the same way as the ``vj1132Isup2.hkl`` file before:: coordinate_file_name = sys.argv[2] xray_structure = iotbx.cif.reader( file_path=coordinate_file_name).build_crystal_structures( data_block_name="I") The ``build_crystal_structures()`` call requires the name of the CIF data block name in addition to the file name. This is because Acta C coordinate CIF files may contain multiple structures each within a different data block. The result is an instance of another `central type in the cctbx source tree`_, ``cctbx.xray.structure``. The ``xray_structure`` object is best understood by asking it for a summary:: xray_structure.show_summary() The output is:: Number of scatterers: 13 At special positions: 0 Unit cell: (12.0263, 6.0321, 5.829, 90, 90, 90) Space group: P n a 21 (No. 33) We can also ask it for a list of scatterers:: xray_structure.show_scatterers() The result is:: Label, Scattering, Multiplicity, Coordinates, Occupancy, Uiso O1 O 4 ( 0.5896 0.4862 0.6045) 1.00 0.0356 O2 O 4 ( 0.6861 0.2009 0.7409) 1.00 0.0328 C2 C 4 ( 0.6636 0.2231 0.3380) 1.00 0.0224 H2 H 4 ( 0.7419 0.1804 0.3237) 1.00 0.0270 C1 C 4 ( 0.6443 0.3133 0.5818) 1.00 0.0222 C3 C 4 ( 0.5923 0.0216 0.2916) 1.00 0.0347 H3A H 4 ( 0.6060 -0.0308 0.1387) 1.00 0.0520 H3B H 4 ( 0.5153 0.0605 0.3069) 1.00 0.0520 H3C H 4 ( 0.6104 -0.0930 0.3997) 1.00 0.0520 N1 N 4 ( 0.6395 0.3976 0.1659) 1.00 0.0258 H1A H 4 ( 0.6815 0.5160 0.1942) 1.00 0.0390 H1B H 4 ( 0.5680 0.4351 0.1741) 1.00 0.0390 H1C H 4 ( 0.6544 0.3463 0.0261) 1.00 0.0390 Instead of writing:: xray_structure.show_summary() xray_structure.show_scatterers() we can also write:: xray_structure.show_summary().show_scatterers() This approach is called "chaining". The trick is in fact very simple:: class structure: def show_summary(self): print "something" return self def show_scatterers(): print "more" return self Simply returning ``self`` enables chaining. [`Complete example script`_] [`Example output`_] [`cctbx downloads`_] [`cctbx front page`_] [`Python tutorial`_] Correlation of F-obs and F-calc ------------------------------- Given the amplitudes F-obs as ``miller_array`` and the refined coordinates as ``xray_structure``, F-calc amplitudes are computed with this statement:: f_calc = abs(miller_array.structure_factors_from_scatterers( xray_structure=xray_structure, algorithm="direct").f_calc()) This expression can be broken down into three steps. The first step is:: miller_array.structure_factors_from_scatterers( xray_structure=xray_structure, algorithm="direct") This step performs the structure factor calculation using a direct summation algorithm (as opposed to a FFT algorithm). The result is an object with information about the details of the calculation, e.g. timings, or memory requirements if the FFT algorithm is used. If the details are not needed, they can be discarded immediately by extracting only the item of interest. In this case we use the ``f_calc()`` method to obtain a ``cctbx.miller.array`` instance with the calculated structure factors, stored in a ``flex.complex_double`` array. The outermost ``abs()`` function calls the ``__abs__()`` method of ``cctbx.miller.array`` which returns another new ``cctbx.miller.array`` instance with the structure factor amplitudes, stored in a ``flex.double`` array. The correlation of F-obs and F-calc is computed with this statement:: correlation = flex.linear_correlation(f_calc.data(), miller_array.data()) ``flex.linear_correlation`` is a C++ class (`libtbx.help cctbx.array_family.flex.linear_correlation`_) which offers details about the correlation calculation, similar in idea to the result of the ``structure_factors_from_scatterers()`` above. We could discard all the details again, but the correlation coefficient could be undefined, e.g. if all values are zero, or if all values in one of the two input arrays are equal. We ensure the correlation is well defined via:: assert correlation.is_well_defined() It is good practice to insert ``assert`` statements anywhere a certain assumption is made. The ``cctbx`` sources contain a large number of ``assert`` statements. They prove to be invaluable in flagging errors during algorithm development. In most situations errors are flagged close to the source. Time-consuming debugging sessions to backtrack from the point of a crash to the source of the problem are mostly avoided. Once we are sure the correlation is well defined, we can print the coefficient with confidence:: print "correlation of f_obs and f_calc: %.4f" % correlation.coefficient() It is amazingly high (0.9943) for the ``vj1132`` case. [`Complete example script`_] [`Example output`_] [`cctbx downloads`_] [`cctbx front page`_] [`Python tutorial`_] Euclidean Model Matching (Emma) ------------------------------- Our goal is to match each peak site with a site in the ``vj1132sup1.cif`` file. To make this section less abstract, we start with an example result:: Match summary: Operator: rotation: {{-1, 0, 0}, {0, -1, 0}, {0, 0, -1}} translation: {0.5, 0, -0.136844} rms coordinate differences: 0.85 Pairs: 8 O1 peak01 0.710 O2 peak09 1.000 C2 peak03 0.896 C1 peak02 0.662 C3 peak04 0.954 H3B peak07 0.619 H3C peak08 0.979 H1C peak06 0.900 Singles model 1: 5 H2 H3A N1 H1A H1B Singles model 2: 2 peak00 peak05 This means, ``peak01`` corresponds to ``O1`` in the CIF file with a mismatch of 0.710 A, ``peak09`` corresponds to ``O2`` with a 1.000 A mismatch, etc. The match is obtained after inverting the hand of the ``peaks`` (the ``rotation``) and adding ``{0.5, 0, -0.136844}`` to the coordinates (the ``translation``). Some sites in the CIF files have no matching peaks (e.g. ``N1``) and some peaks have no matching site in the CIF file (e.g. ``peak00``). The overall RMS (root-mean-square) of the mismatches is 0.85. I.e. this match is not very good, except as a bad example. In general, the comparison of two coordinate sets via pair-wise association of sites is quite complex due to the underlying symmetry of the search space. In addition to the space group symmetry, allowed origin shifts and a change of hand have to be taken into consideration. This is described in detail by `Grosse-Kunstleve & Adams (2003)`_. The ``cctbx.euclidean_model_matching`` module is available for computing the pairs of matching sites. The search algorithm operates on specifically designed ``cctbx.euclidean_model_matching.model`` objects. I.e. we have to convert the ``xray_structure`` instance and the ``peaks`` to ``cctbx.euclidean_model_matching.model`` objects. Converting the ``xray_structure`` object is easy because the conversion is pre-defined as the ``as_emma_model()`` method:: reference_model = xray_structure.as_emma_model() Converting the ``peaks`` object is not pre-defined. We have to do it the hard way. We start with assertions, just to be sure:: assert reference_model.unit_cell().is_similar_to(e_map.unit_cell()) assert reference_model.space_group() == e_map.space_group() This gives us the confidence to write:: from cctbx import euclidean_model_matching as emma peak_model = emma.model(special_position_settings=reference_model) ``special_position_settings`` is a third `central type in the cctbx source tree`_. It groups the unit cell, space group, and the ``min_distance_sym_equiv`` parameter which defines the tolerance for the determination of special positions. ``emma.model`` inherits from this type, therefore we can use the ``reference_model`` (which is an ``emma.model`` object) anywhere a ``special_position_settings`` object is required. This is more convenient than constructing a new ``special_position_settings`` objects from scratch. At this stage the ``peak_model`` object does not contain any coordinates. We add them with this loop:: for i,site in enumerate(peaks.sites()): peak_model.add_position(emma.position(label="peak%02d" % i, site=site)) The loop construct is a standard idiom (`libtbx.help enumerate`_, `Looping Techniques`_). ``label="peak%02d" % i`` creates a label of the form ``peak000``, ``peak001``, etc. The label and the ``site`` are used to construct an ``emma.position`` object which is finally added to the ``peak_model`` via the ``add_position()`` method. The ``emma.model_matches()`` function computes a sorted list of possible matches:: matches = emma.model_matches( model1=reference_model, model2=peak_model, tolerance=1., models_are_diffraction_index_equivalent=True) The ``tolerance`` determines the maximum distance for a pair of a site in ``model1`` and a site in ``model2``. The ``models_are_diffraction_index_equivalent`` parameter is used in the determination of the symmetry of the search space and has to do with indexing ambiguities. It is always safe to use ``models_are_diffraction_index_equivalent=False``, but the search may be slower. If it is certain that the models are derived from the same diffraction data ``models_are_diffraction_index_equivalent=True`` can be used to reduce the runtime. In this case we are sure because the correlation between F-obs and F-calc is almost perfect. [`Complete example script`_] [`Example output`_] [`cctbx downloads`_] [`cctbx front page`_] [`Python tutorial`_] Exercise (not very hard) ------------------------ Run the example script several times. Each time the results will be different due to different random seeds. You will observe that Emma is often mislead by the hydrogens in the CIF file. To solve this problem, modify the script to remove the hydrogens from the reference model. Hint: Find the implementation of ``cctbx.xray.structure.as_emma_model()`` (``cctbx/xray/__init__.py``). Note that ``scatterer`` has a ``scattering_type`` attribute. [`Complete example script`_] [`Example output`_] [`cctbx downloads`_] [`cctbx front page`_] [`Python tutorial`_] Exercise (harder) ----------------- Compute the correlation between ``all_e_values`` and a ``cctbx.xray.structure`` constructed with the top 6 peaks, using ``"const"`` as the ``scattering_type``. Hint: Study ``iotbx/cif/builders.py`` to see how the ``xray_structure`` is constructed from the CIF file. However, use ``peak_structure = xray.structure(special_position_settings=xray_structure)``. Study ``cctbx.xray.structure.__init__()`` to see why this works. [`Complete example script`_] [`Example output`_] [`cctbx downloads`_] [`cctbx front page`_] [`Python tutorial`_] Exercise (advanced) ------------------- Refactor the example script by splitting it up into functions and possibly classes. Compute random starting phases a given number of times and repeat the Tangent formula recycling for each. Avoid duplicate work. I.e. don't read the inputs multiple time, don't create the Emma reference model multiple times. [`Complete example script`_] [`Example output`_] [`cctbx downloads`_] [`cctbx front page`_] [`Python tutorial`_] .. _`unit_cell_refinement.py`: http://cctbx.sourceforge.net/siena2005/unit_cell_refinement.html .. _`Complete example script`: http://cci.lbl.gov/cctbx_sources/iotbx/examples/direct_methods_light.py .. _`Example output`: http://cctbx.sourceforge.net/siena2005/direct_methods_light.out .. _`cctbx downloads`: http://cci.lbl.gov/cctbx_build/ .. _`cctbx front page`: http://cctbx.sourceforge.net/ .. _`Fancier Output Formatting`: http://docs.python.org/tut/node9.html#SECTION009100000000000000000 .. _`Looping Techniques`: http://docs.python.org/tut/node7.html#SECTION007600000000000000000 .. _`Python tutorial`: http://docs.python.org/tut/ .. _`Python`: http://www.python.org/ .. _`CIF`: http://www.iucr.org/iucr-top/cif/ .. _`Acta Crystallographica Section C`: http://journals.iucr.org/c/journalhomepage.html .. _`vj1132Isup2.hkl`: http://journals.iucr.org/c/issues/2001/05/00/vj1132/vj1132Isup2.hkl .. _`vj1132sup1.cif`: http://journals.iucr.org/c/issues/2001/05/00/vj1132/vj1132sup1.cif .. _`iotbx.cif`: http://cci.lbl.gov/cctbx_sources/iotbx/cif/__init__.py .. _`cctbx::dmtbx::triplet_generator`: http://cctbx.sourceforge.net/current/c_plus_plus/classcctbx_1_1dmtbx_1_1triplet__generator.html .. _`triplet_generator.h`: http://cci.lbl.gov/cctbx_sources/cctbx/include/cctbx/dmtbx/triplet_generator.h .. _`central type in the cctbx source tree`: http://cctbx.sourceforge.net/siena2005/cctbx_central_types.pdf .. _`central types in the cctbx source tree`: http://cctbx.sourceforge.net/siena2005/cctbx_central_types.pdf .. _`libtbx.help cctbx.array_family.flex.double`: http://cci.lbl.gov/cctbx/cctbx_web.cgi?target_module=libtbx_help&query=cctbx.array_family.flex.double .. _`libtbx.help cctbx.array_family.flex.linear_correlation`: http://cci.lbl.gov/cctbx/cctbx_web.cgi?target_module=libtbx_help&query=cctbx.array_family.flex.linear_correlation .. _`libtbx.help cctbx.maptbx.peak_search_parameters`: http://cci.lbl.gov/cctbx/cctbx_web.cgi?target_module=libtbx_help&query=cctbx.maptbx.peak_search_parameters .. _`libtbx.help cctbx.miller.array`: http://cci.lbl.gov/cctbx/cctbx_web.cgi?target_module=libtbx_help&query=cctbx.miller.array .. _`libtbx.help enumerate`: http://cci.lbl.gov/cctbx/cctbx_web.cgi?target_module=libtbx_help&query=enumerate .. _`libtbx.help sys`: http://cci.lbl.gov/cctbx/cctbx_web.cgi?target_module=libtbx_help&query=sys .. _`RANTAN`: http://www.ccp4.ac.uk/dist/html/rantan.html .. _`Grosse-Kunstleve & Adams (2003)`: http://cci.lbl.gov/~rwgk/my_papers/iucr/ba5051_reprint.pdf