hybrid-36: PDB serial and sequence numbers

Support for PDB files with more than 99999 atoms and/or more than 9999 residues in a chain, for example:

ATOM  99998  SD  MET L9999      48.231 -64.383  -9.257  1.00 11.54           S
ATOM  99999  CE  MET L9999      49.398 -63.242 -10.211  1.00 14.60           C
ATOM  A0000  N   VAL LA000      52.228 -67.689 -12.196  1.00  8.76           N
ATOM  A0001  CA  VAL LA000      53.657 -67.774 -12.458  1.00  3.40           C

The hybrid-36 counting system accommodates atom serial numbers up to 87440031 in the five-column serial field and residue sequence numbers up to 2436111 per chain in the four-column sequence field. Values that fit in plain decimal are written exactly as before, so the distinction between "traditional" and "extended" PDB files becomes evident only if a file stores more than 99999 atoms or more than 9999 residues in a chain.
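The two limits can be checked with a little arithmetic. The following is a minimal sketch, not the distributed reference implementation. It assumes that a field first uses plain decimal, then upper-case base-36 strings whose first character is a letter (as in A0000 above), then the same range again in lower case. The lower-case block is an assumption that is consistent with the quoted limits. The names max_hybrid36 and encode_sketch are hypothetical.

```python
DIGITS_UPPER = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS_LOWER = DIGITS_UPPER.lower()


def max_hybrid36(width):
    """Largest value that fits in a field of the given width."""
    # Largest decimal value, plus two blocks (upper case, then lower case)
    # of base-36 strings whose first character is one of 26 letters.
    return (10**width - 1) + 2 * 26 * 36 ** (width - 1)


def encode_sketch(width, value):
    """Encode a non-negative integer as a fixed-width hybrid-36 string."""
    if value < 0:
        raise ValueError("negative values are not handled in this sketch")
    if value < 10**width:
        # Plain decimal, right-justified, exactly as in a traditional file.
        return "%*d" % (width, value)
    block = 26 * 36 ** (width - 1)  # number of strings per letter block
    offset = value - 10**width
    for digits in (DIGITS_UPPER, DIGITS_LOWER):
        if offset < block:
            # Shift so that the leading base-36 digit is a letter (>= 10).
            n = offset + 10 * 36 ** (width - 1)
            chars = []
            while n:
                n, remainder = divmod(n, 36)
                chars.append(digits[remainder])
            return "".join(reversed(chars))
        offset -= block
    raise ValueError("value out of range for this field width")


print(max_hybrid36(5))           # 87440031
print(max_hybrid36(4))           # 2436111
print(encode_sketch(5, 99999))   # 99999
print(encode_sketch(5, 100000))  # A0000
print(encode_sketch(4, 10000))   # A000
```

Under these assumptions the first value past the decimal range maps to A0000 (or A000 in the four-column field), which matches the example records above.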

Programs that are updated to support the hybrid-36 system will continue to interoperate with programs that are not, as long as the files contain fewer than 100000 atoms and no residue sequence number above 9999. Updating existing programs to support the hybrid-36 system requires very little effort since the decoded numbers are still stored as 4-byte integers. The only change is to replace a few "normal" read and write statements with calls to the hy36decode() and hy36encode() functions.
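To illustrate how small the change on the reading side is, here is a minimal sketch in which the usual int() conversions of the serial and sequence fields are replaced by a decoder call. It is not the distributed hy36decode(). The names decode_sketch and read_atom_ids are hypothetical, the (width, field) argument order is an assumption, and the column positions (7-11 for the serial number, 23-26 for the residue sequence number) are those of the standard ATOM record.

```python
DIGITS_UPPER = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
DIGITS_LOWER = DIGITS_UPPER.lower()


def decode_sketch(width, field):
    """Decode a fixed-width hybrid-36 field into an integer."""
    if len(field) != width:
        raise ValueError("field has the wrong width")
    first = field[0]
    if first in " -" or first.isdigit():
        # Traditional decimal field; int() tolerates leading blanks
        # and raises ValueError for an entirely blank field.
        return int(field)
    # Block 0 is the upper-case range, block 1 the lower-case range.
    for block_index, digits in enumerate((DIGITS_UPPER, DIGITS_LOWER)):
        if first in digits[10:]:
            n = 0
            for ch in field:
                # str.index raises ValueError on a character from the
                # wrong alphabet, e.g. mixed upper and lower case.
                n = n * 36 + digits.index(ch)
            return (n - 10 * 36 ** (width - 1) + 10**width
                    + block_index * 26 * 36 ** (width - 1))
    raise ValueError("invalid number literal")


def read_atom_ids(line):
    """Return (serial, residue sequence number) of an ATOM record."""
    serial = decode_sketch(5, line[6:11])     # columns 7-11
    res_seq = decode_sketch(4, line[22:26])   # columns 23-26
    return serial, res_seq


OLD = "ATOM  99999  CE  MET L9999      49.398 -63.242 -10.211  1.00 14.60           C"
NEW = "ATOM  A0000  N   VAL LA000      52.228 -67.689 -12.196  1.00  8.76           N"

print(read_atom_ids(OLD))  # (99999, 9999)
print(read_atom_ids(NEW))  # (100000, 10000)
```

Both results are ordinary integers, so the rest of a program can keep its existing 4-byte integer storage unchanged.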

Slide show: PDB working format evolution adopted by PHENIX

Portable hy36encode() and hy36decode() implementations

All implementations have NO external dependencies!

Overview and Python reference implementation
approx. 60 lines of real code

Java version of hy36encode() and hy36decode()
approx. 190 lines of real code

ANSI C (1989) version of hy36encode() and hy36decode()
Optional: hybrid_36_c.h
approx. 240 lines of real code

Fortran 77 version of hy36encode() and hy36decode()
approx. 290 lines of real code

Open Source License and Copyright

Contact: cctbx@cci.lbl.gov