Please see the Newsletter Article for initial details worked out in July 2010. A new multiprocessing Spotfinder for beamlines was released in July 2011 (see announcement), and the purpose of this page is to give current installation instructions.
http://localhost:8125/spotfinder/distl.signal_strength?distl.image=/data/raster_1_0319.cbf&distl.res.outer=4.9&distl.bins.verbose=True

The query string in the above URL is constructed with field=value pairs that exactly match the options available for the command-line version of the program. Therefore the above URL will return the same text as this command:
distl.signal_strength distl.image=/data/raster_1_0319.cbf distl.res.outer=4.9 distl.bins.verbose=True

Of course, there is an important detail here: since it takes 1-2 seconds to analyze any given image, the client must get on with other work while waiting for the http reply; in other words the http requests must be issued asynchronously. This is efficiently handled by making the client multithreaded; each http request is handled by a separate thread in the client process. A straightforward example of how this can be done in Python (with just 40 lines of code) is given here.
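To illustrate the two points above (the URL mirrors the command line, and requests are issued from separate threads), here is a minimal sketch in Python 3. It is not the example client distributed with the package. The helper names build_url, command_line, fetch and analyze_all are hypothetical, and the host, port and URL path are simply copied from the example URL on this page.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urlencode
from urllib.request import urlopen


def _pairs(image_path, options=None):
    """Return the field=value pairs shared by the URL and the command line."""
    pairs = {"distl.image": image_path}
    pairs.update(options or {})
    return pairs


def build_url(image_path, options=None, host="localhost", port=8125):
    """Build a request URL; host, port and path follow the example above."""
    query = urlencode(_pairs(image_path, options), safe="/")
    return "http://%s:%d/spotfinder/distl.signal_strength?%s" % (host, port, query)


def command_line(image_path, options=None):
    """Build the equivalent command-line invocation from the same pairs."""
    args = ["%s=%s" % item for item in _pairs(image_path, options).items()]
    return " ".join(["distl.signal_strength"] + args)


def fetch(url, timeout=30):
    """Blocking HTTP GET; returns the server's text reply."""
    with urlopen(url, timeout=timeout) as reply:
        return reply.read().decode("utf-8", errors="replace")


def analyze_all(image_paths, options=None, max_workers=8, fetcher=fetch):
    """Issue one request per image, each from a worker thread.

    Replies are returned in the same order as image_paths.
    max_workers is an illustrative default, not a recommended value.
    """
    urls = [build_url(path, options) for path in image_paths]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetcher, urls))
```

With a server running, calling analyze_all(["/data/raster_1_0319.cbf"], {"distl.res.outer": 4.9, "distl.bins.verbose": True}) would return a list holding the text reply for that image. The fetcher argument exists so the threading logic can be exercised without a live server.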
Operating system | Processor | Cores | Throughput (Pilatus-6M) |
64-bit Fedora 8 | Intel Xeon (2.93 GHz) | 16 | 8.9 frames/sec |
64-bit Fedora 13 | AMD Opteron (2.20 GHz) | 48 | 25 frames/sec |

These tests were run with both data and code residing on a local disk (not NFS-mounted) on the server machine. Bragg spots were processed out to the corners of the Pilatus detector. For the Fedora 13 server it was only possible to obtain these optimal throughput rates when the multithreaded client was executed on a separate machine on the local subnet, indicating that the client itself uses significant resources.
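As a rough consistency check, the throughput figures above can be converted to time per frame per core. The sketch below assumes every core is kept busy and that throughput scales linearly with core count; neither assumption is stated on this page. Under those assumptions both servers come out at a little under 2 seconds per frame per core, in line with the 1-2 seconds per image quoted earlier.

```python
# (label, cores, frames per second) taken from the benchmark figures above
benchmarks = [
    ("64-bit Fedora 8, Intel Xeon", 16, 8.9),
    ("64-bit Fedora 13, AMD Opteron", 48, 25.0),
]


def seconds_per_frame_per_core(cores, frames_per_sec):
    """Time one core spends per frame, assuming all cores are fully used."""
    return cores / frames_per_sec


for label, cores, rate in benchmarks:
    print("%s: %.2f s per frame per core" % (label, seconds_per_frame_per_core(cores, rate)))
```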
 | Apache/Mod-Python Server | Python Server |
Architecture | The Apache httpd Web server is used to implement a multiprocess server. The mod-python package is dynamically linked in to provide a Python interpreter within the Apache process. The LBL code (Python/C++) is executed within this mod-python/Python interpreter. | Built-in code from Python's BaseHTTPServer module is used to implement a multiprocess server. |
Starting and Stopping the Server |
The server process is started and stopped with the apachectl command distributed with Apache. However, since the CCTBX environment must first be sourced, a compound command is used:
/bin/sh build/env_run httpd/bin/apachectl [start|stop]
The exact form of the command is printed out by the installer. Configuration details such as the port number are predefined by the installer, but can be edited later in httpd/conf/httpd.conf. |
The CCTBX environment is sourced and the server process is started from the command line:
source cctbx_build/setpaths.csh
distl.mp_spotfinder_server_read_file distl.bins.verbose=True distl.port=8125 distl.processors=8
^C stops the server from within this shell; alternatively the command
distl.thin_client EXIT localhost 8125
kills the server from a remote shell. |
Parallel Processes | The Apache server automatically creates new child processes to handle the computational load in response to client requests, so as to fully utilize available CPU cores. Unused child processes are then terminated automatically when the load is reduced. | The number of child processes is fixed and must be specified explicitly at run time with the command line option shown above (e.g., distl.processors=8). |
Data Processing Options | For each client request, special processing directives must be built into the URL, for example an outer resolution limit given as distl.res.outer=2.0. The syntax is explained in the Quick Synopsis section above. |
In addition to giving special processing directives in the URL for each request, global processing directives can be given on the command line at run time. For example, running
distl.mp_spotfinder_server_read_file distl.res.outer=5
will impose a general resolution cutoff of 5 Angstroms, unless an override value is given in the URL of a particular client request. |
Timing and Robustness | The Apache server automatically handles performance tuning. Any number of client requests can be issued, and requests will be queued until the server's CPU resources are available. Client code should use asynchronous requests, as discussed in the Synopsis section above. | Performance tuning is the responsibility of the client. If client requests are generated faster than the server can handle, the server will hang in an unpredictable way, severely impairing the throughput. As a demonstration the example client includes a "sleep()" command whose duration can be tuned. Decreasing the sleep will improve performance up to a point, beyond which the throughput rate is dramatically degraded. Experimentation is required for proper tuning. |
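The client-side throttling described for the Python server can be sketched as follows. This is an illustrative outline in Python 3, not the distributed example client. The function name submit_throttled, its parameters and the default delay of 0.1 seconds are assumptions chosen for the example, and the right delay has to be found by experiment as noted above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def submit_throttled(urls, fetcher, delay=0.1, max_workers=8, sleep=time.sleep):
    """Submit one request per URL, pausing `delay` seconds between submissions.

    fetcher is any callable taking a URL and returning the reply text
    (for example, a wrapper around urllib.request.urlopen).
    The pause limits how fast requests reach the server; the threads let
    the client keep submitting while earlier replies are still pending.
    """
    futures = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for url in urls:
            futures.append(pool.submit(fetcher, url))
            sleep(delay)  # tunable: shorter is faster until the server is overrun
        # Collect replies in submission order.
        return [future.result() for future in futures]
```

Passing sleep as a parameter is a design choice that makes the pacing easy to replace or measure. With the Apache server the pause should be unnecessary, since requests are queued by the server.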
wget http://cci.lbl.gov/cctbx_build/results/last_published/cctbx_python_273_bundle.selfx
perl cctbx_python_273_bundle.selfx

Take note of the message at the end giving the exact command for sourcing the CCTBX environment.
wget http://cci.lbl.gov/cctbx_build/results/last_published/cctbx_python_273_bundle.tar.gz
tar zxf cctbx_python_273_bundle.tar.gz

Second, get the underlying Apache & mod-python services.
wget http://cci.lbl.gov/apache_services/apache_services.tar.gz
tar zxf apache_services.tar.gz
apache/install.csh

Take note of the message at the end giving the exact command for starting and stopping the Apache server, as well as for running the example client. This message is also saved in the file README_customized.