Proposal: reducing self.x=x; self.y=y; self.z=z boilerplate code

Hi fellow Python coders,

I often find myself writing:

class grouping:

    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z
        # real code, finally

This becomes a serious nuisance in complex applications with long argument lists, especially if long variable names are essential for managing the complexity. Therefore I propose that Python include built-in support for reducing the self.x=x clutter. Below are arguments for the following approach (please don't get too agitated about the syntax right here; it really is a secondary consideration):

class grouping:

    def __init__(self, .x, .y, .z):
        # real code right here

Emulation using existing syntax:

def __init__(self, x, y, z):
    self.x = x
    del x
    self.y = y
    del y
    self.z = z
    del z

Is it really that important?

For applications of non-trivial size, yes. Here is a real-world example (one of many in that source tree):

http://cvs.sourceforge.net/viewcvs.py/cctbx/cctbx/cctbx/geometry_restraints/manager.py?view=markup

Fragment from this file:

class manager:

  def __init__(self,
        crystal_symmetry=None,
        model_indices=None,
        conformer_indices=None,
        site_symmetry_table=None,
        bond_params_table=None,
        shell_sym_tables=None,
        nonbonded_params=None,
        nonbonded_types=None,
        nonbonded_function=None,
        nonbonded_distance_cutoff=None,
        nonbonded_buffer=1,
        angle_proxies=None,
        dihedral_proxies=None,
        chirality_proxies=None,
        planarity_proxies=None,
        plain_pairs_radius=None):
    self.crystal_symmetry = crystal_symmetry
    self.model_indices = model_indices
    self.conformer_indices = conformer_indices
    self.site_symmetry_table = site_symmetry_table
    self.bond_params_table = bond_params_table
    self.shell_sym_tables = shell_sym_tables
    self.nonbonded_params = nonbonded_params
    self.nonbonded_types = nonbonded_types
    self.nonbonded_function = nonbonded_function
    self.nonbonded_distance_cutoff = nonbonded_distance_cutoff
    self.nonbonded_buffer = nonbonded_buffer
    self.angle_proxies = angle_proxies
    self.dihedral_proxies = dihedral_proxies
    self.chirality_proxies = chirality_proxies
    self.planarity_proxies = planarity_proxies
    self.plain_pairs_radius = plain_pairs_radius
    # real code, finally

Not exactly what you want to see in a high-level language.

Is there a way out with Python as-is?

Yes. If you take the time to look at the file in CVS, you'll find that I was cheating a bit. To reduce the terrible clutter above, I am actually using a simple trick:

adopt_init_args(self, locals())

For completeness, the implementation of adopt_init_args() is here:

http://cvs.sourceforge.net/viewcvs.py/cctbx/scitbx/scitbx/python_utils/misc.py?view=markup
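
For readers who don't want to follow the link, here is a minimal sketch of how such a helper might look. It is not the linked implementation, only an illustration of the idea, and it assumes that the first parameter of __init__() is named "self" and that the call is the first statement in __init__() (so that locals() holds nothing but the arguments):

```python
def adopt_init_args(obj, args):
    """Copy every entry of args (typically locals()) onto obj as an attribute."""
    for name, value in args.items():
        if name == "self":
            continue  # assumption: the instance is bound to the name "self"
        setattr(obj, name, value)


class grouping:

    def __init__(self, x, y, z):
        # Must come first: any local created earlier would be adopted too.
        adopt_init_args(self, locals())
        # real code


g = grouping(1, 2, 3)
# g.x, g.y and g.z are now 1, 2 and 3
```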

While this obviously goes a long way, it has several disadvantages:

  • The solution doesn't come with Python, so everybody has to reinvent it.
  • People are reluctant to use the trick since scripts become dependent on a non-standard feature.
  • It is difficult to remember which import to use for adopt_init_args (since everybody has a local version/variety).
  • The adopt_init_args(self, locals()) incantation is hard to remember and difficult to explain to newcomers.
  • Inside the __init__() method, the same object has two names, e.g. x and self.x. This led to subtle bugs a few times when I accidentally assigned to x instead of self.x or vice versa in the wrong place (the bugs are typically introduced while refactoring).
  • In some cases the adopt_init_args() overhead was found to introduce a significant performance penalty (in particular the enhanced version discussed below).
  • Remember where Python comes from: it goes back to a teaching language, enabling mere mortals to embrace programming. adopt_init_args(self, locals()) definitely doesn't live up to this heritage.
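
The two-names pitfall from the list above is easy to reproduce. The following is only a sketch: the helper is a hypothetical stand-in for adopt_init_args(), not the linked implementation, and the "default to 0" logic is made up for illustration:

```python
def adopt_init_args(obj, args):
    # Hypothetical minimal helper: store each argument on the instance.
    for name, value in args.items():
        if name != "self":
            setattr(obj, name, value)


class grouping:

    def __init__(self, x, y, z):
        adopt_init_args(self, locals())
        if x is None:
            # Bug: this rebinds only the local name x.
            # self.x was set above and still refers to None.
            x = 0


g = grouping(None, 2, 3)
assert g.x is None  # the intended default never reached the instance
```

With the proposed syntax there would be no local x left to assign to, so this class of mistake could not occur.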

Minimal proposal

My minimal proposal is to add an enhanced version of adopt_init_args() as a standard Python built-in function (actual name secondary!):

class grouping:

    def __init__(self, x, y, z):
        adopt_init_args()
        # real code

Here is a reference implementation:

http://cvs.sourceforge.net/viewcvs.py/cctbx/libtbx/libtbx/introspection.py?rev=1.2&view=markup

Implementation of this proposal would remove all the disadvantages listed above. However, there is another problem not mentioned before: It is cumbersome to disable adoption of selected variables. E.g.:

class grouping:

    def __init__(self, keep_this, and_this, but_not_this, but_this_again):
        self.keep_this = keep_this
        self.and_this = and_this
        self.but_this_again = but_this_again
        # real code, finally

would translate into:

class grouping:

    def __init__(self, keep_this, and_this, but_not_this, but_this_again):
        adopt_init_args(exclusions=["but_not_this"])
        # real code
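
Both call forms shown so far, adopt_init_args() and adopt_init_args(exclusions=[...]), could be supported along the following lines. This is a sketch under stated assumptions, not the reference implementation linked above: it relies on sys._getframe(), which is a CPython implementation detail, and it adopts only the ordinary positional parameters of the calling function, treating the first one as the instance:

```python
import sys


def adopt_init_args(exclusions=()):
    """Bind the caller's arguments as attributes of the caller's first argument."""
    frame = sys._getframe(1)  # frame of the calling __init__ (CPython-specific)
    code = frame.f_code
    # The first co_argcount entries of co_varnames are the positional parameters.
    names = code.co_varnames[:code.co_argcount]
    obj = frame.f_locals[names[0]]  # assumption: first parameter is the instance
    for name in names[1:]:
        if name not in exclusions:
            setattr(obj, name, frame.f_locals[name])


class grouping:

    def __init__(self, keep_this, and_this, but_not_this, but_this_again):
        adopt_init_args(exclusions=["but_not_this"])
        # real code


g = grouping(1, 2, 3, 4)
# g has keep_this, and_this and but_this_again, but no but_not_this attribute
```

Because the helper reads the parameter names from the code object rather than from locals(), it does not pick up other local variables, but the frame lookup on every call is exactly the kind of overhead mentioned in the list of disadvantages.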

Enhanced syntax proposal

The exclusion problem suggests these alternatives:

class grouping:

    def __init__(self, self.keep_this, self.and_this, but_not_this, self.but_this_again):
        # real code right here

This is conceptually similar to the existing automatic unpacking of tuples.

A shorter alternative (my personal favorite since minimally redundant):

class grouping:

    def __init__(self, .keep_this, .and_this, but_not_this, .but_this_again):
        # real code right here

I guess both versions could be implemented such that users don't incur a performance penalty compared to the self.x=x alternative. At the risk of being overly optimistic: I can imagine that my favorite alternative would actually be faster than the self.x=x version, and therefore the fastest of all the options.

Enhanced __slots__ semantics proposal

When __slots__ are used (cool feature!) the boilerplate problem becomes even worse:

class grouping:

    __slots__ = ["keep_this", "and_this", "but_this_again"]

    def __init__(self, keep_this, and_this, but_not_this, but_this_again):
        self.keep_this = keep_this
        self.and_this = and_this
        self.but_this_again = but_this_again
        # real code, finally

Each variable name appears four times! Imagine yourself having to do this exercise given the real-world example above. Ouch.

Based on the "Enhanced syntax proposal" above I see this potential improvement:

class grouping:

    __slots__ = True

    def __init__(self, .keep_this, .and_this, but_not_this, .but_this_again):
        # real code right here

Each variable name appears only once. Phew!

Author: rwgk@yahoo.com, July 02, 2005