Git Product home page Git Product logo

simplestruct's Introduction

SimpleStruct

(Supports Python 3.3 and up)

This small library makes it easier to create "struct" classes in Python without writing boilerplate code. Structs are similar to the standard library's collections.namedtuple but are more flexible, relying on an inheritance-based approach instead of eval()ing a code template. If you like using namedtuple classes but wish they were more composable and extensible, this project is for you.

Example

Writing struct classes by hand is tedious and error prone. Consider a simple point class. The bare minimum we can write in Python is

class Point2D:
    def __init__(self, x, y):
        self.x = x
        self.y = y

We'll likely want to compare points for equality and pretty-print them for debugging.

class Point2D:
    def __init__(self, x, y):
        self.x = x
        self.y = y
    def __repr__(self):
        # Separate __str__() would be nice too
        return 'Point2D({!r}, {!r})'.format(self.x, self.y)
    def __eq__(self, other):
        # Should check other's type too
        return self.x == other.x and self.y == other.y
    def __hash__(self):
        # Required because we're overriding __eq__().
        return hash(self.x) ^ hash(self.y)

Already the code is becoming pretty verbose for such a simple concept. Worse, it violates the DRY principle in that the x and y fields appear many times. This isn't very robust. If we decide to turn this into a Point3D class, we'll have to upgrade each method to accommodate a new z coordinate. We could be in for an infuriating bug if we forget to update __eq__() or __hash__(). Adding more utilities like a copy/replace method will exacerbate the situation.

Then there's the added code for consistency checking. Maybe you're the sort of heathen who prefers dynamic type checking over blindly trusting Mama Ducktype. Or maybe you want to disallow overwriting x and y so as to avoid changing its hash value. Either way you'd need to use descriptors or properties to intercept writes.

SimpleStruct provides a simple alternative. Here is a Point2D class that provides everything described above.

from numbers import Number      # standard library abstract base class
from simplestruct import Struct, Field, TypedField

class Point2D(Struct):
    # Note that field declaration order matters.
    x = TypedField(Number)
    y = TypedField(Number)

Of course, customizations are possible. Type checking is by no means required, objects may be mutable so long as they are not hashed, and you can add your own non-Field attributes and properties.

class Point2D(Struct):
    _immutable = False
    x = Field
    y = Field
    
    # magnitude won't be considered when hashing or testing equality
    @property
    def magnitude(self):
        return (self.x**2 + self.y**2) ** .5

For more usage examples, see the sample files:

File Purpose
point.py introduction, basic use
typed.py type-checked fields
vector.py advanced features
abstract.py mixing structs and metaclasses

Comparison and feature matrix

The most important problems mentioned above are solved by using namedtuple, but this approach begins to break down when you start to customize classes. To add a property to a namedtuple, you must define a subclass:

BasePerson = namedtuple('BasePerson', 'fname lname age')
class Person(BasePerson):
    @property
    def full_name(self):
        return self.fname + ' ' + self.lname

If on the other hand you want to extend an existing namedtuple with new fields, it's a bit harder. You need to regenerate (not inherit) the boilerplate methods so they recognize the new fields. This can be done using multiple inheritance:

BaseEmployee = namedtuple('BaseEmployee', Employee._fields + ('salary',))
class Employee(BaseEmployee, Person):
    pass

Implementation wise, namedtuple works by dynamically evaluating a templated class definition based on the built-in tuple type. This gives it a speed advantage, but is also the main reason why it is less extensible (and unable to handle mutable values).

In contrast, SimpleStruct is based on metaclasses, descriptors, and dynamic dispatch. The below matrix summarizes the feature comparison.

Feature Avoids boilerplate for Supported by namedtuple?
easy construction __init__()
extra attributes on self subclasses only
pretty printing __str()__, __repr()__
structural equality __eq__()
easy inheritance
optional mutability
hashing (if immutable) __hash__()
pickling / deep-copying
tuple decomposition __len__, __iter__
optional type checking __init__(), @property
_asdict() / _replace()

MacroPy's case classes provide similar functionality, but is implemented in a very different way. Instead of metaclass hacking or source code templating, it relies on syntactic transformation of the module's AST. This allows for a syntax that's very different from what we've seen above. So different, in fact, that we might view MacroPy as an extension to the Python language rather than as just a library. MacroPy case classes are subject to limitations on inheritance and class members.

Installation

As with most Python packages, SimpleStruct is available on PyPI:

python -m pip install simplestruct

Or grab a development version if you're so inclined:

python -m pip install https://github.com/brandjon/simplestruct/tree/tarball/develop

Python 3.3 and 3.4 are supported. There are no additional dependencies.

Developers

Tests can be run with python setup.py test, or alternatively by installing Tox and running python -m tox in the project root. Tox has the advantage of automatically testing under both Python 3.3 and 3.4. Building a source distribution (python setup.py sdist) requires the setuptools extension package setuptools-git.

References

[1] The standard library's namedtuple feature

[2] Li Haoyi's case classes (part of MacroPy)

[3] Reece Hart's blog post on inheriting from namedtuple

simplestruct's People

Contributors

brandjon avatar sadboy avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.