Creating Abstractions

Creating abstractions

In my current project, called folio I am trying to implement and keep SOLID principles and good Design Patterns. One of the main things to do is to follow the Dependency Inversion Principle (DIP), which states that details should depend on abstractions and high-level modules should not depend on low level modules.

While doing this, I am also trying to follow Domain Driven Design. The idea is to have a series of models that define the core objects of my application, and to do that without any reference to other structures or systems. In the case of folio, this means that Book, Work, Address, and Employment are abstract constructs that hold data about what they represent, but should be blissfully unaware about anything else in the codebase.

Now the first thing is to create an Abstraction to represent all of these objects. Since the code is primarily made to store records, a Record abstraction seems appropriate:

class Record[R: "Record"](ABC):
	...

Now, the base Record class is very useful, since I can use it to keep my code DRY while allowing each subclass to customize its own behavior. To do this, I implemented a few dunder methods and imported the functools total_ordering decorator:

...
from dataclasses import dataclass
from functools import total_ordering
...
 
 
@total_ordering
@dataclass
class Record[R: "Record"](ABC):
    """
    Abstract base class for all records
    """
    @abstractmethod
    def __str__(self) -> str: ...
 
    @abstractmethod
    def _identity_fields(self) -> tuple: ...
 
    @abstractmethod
    def _ordering_fields(self) -> tuple: ...
 
    def __eq__(self) -> bool:
        return (
            isinstance(other, self.__class__)
            and self._identity_fields() == other._identity_fields()
        )
 
    def __lt__(self, other) -> bool:
        if isinstance(other, self.__class__):
            return NotImplemented
        return self._ordering_fields() < other._ordering_fields()
 
    def __hash__(self) -> hash:
        return hash(self._identity_fields())

This structure allows me to be very minimal about the code I write, and the use of abstractmethods ensures that all child classes will implement the required methods for everything to work. Thus, for example, my Work class can be implemented as:

 
class Work(Record["Work"]):
    """
    Represents a literary work (the concept of a book, rather than a specific instance)
    """
 
    title: str
    author: str
    year: int | None
    genre: str | None
    is_read: bool
 
    def __str__(self):
        read = "Read" if self.is_read else "Not read"
        return f"{self.title} by {self.author} (year: {self.year}, {read})"
 
    def _identity_fields(self):
        """
        Equality and hasing use title, author, and year.
        """
        return (
            self.author.lower().strip(),
            self.title.lower().strip(),
            (self.year is None, self.year),  # None years sort last
        )
 
    def _ordering_fields(self):
        """
        Sort by author, then title, then year.
        """
        return self._identity_fields()

Notice how the implementation of equality, hashing, and comparison operators is now accomplished by creating sets of tuples. This, I think, is far more readable than the traditional method of comparing each field individually, such as:

def __lt__(self):
return (
	    self.author.lower().strip(),
	    self.title.lower().strip(),
	    (self.year is None, self.year),
	) < (
	    other.author.lower().strip(),
	    other.title.lower().strip(),
	    (other.year is None, other.year),
	)

By abstracting away the comparisons, I can avoid writing other.property for every single property on every single comparison operator. Additionally, this allows me to maintain the Liskov Substitution Principle (LSP), since every subclass has the same interface.

Dependency Injection

One thing that is not included into the code at all is the use of data validation. Take the definition of Book, for example:

class Book(Record["Book"]):
    """
    Represents a specific book instance
    """
 
    class Format(Enum):
        PRINT = "print"
        AUDIO = "audio"
        EBOOK = "ebook"
 
    work: Work
    pages: int | None
    format: Format
    isbn: str | None
 
    def __str__(self):
        return f"{self.work.title} - {self.work.author}: {self.pages}, {self.isbn}, ({self.format})"
 
    def _identity_fields(self):
        """
        Equality and hashing use Work, format, and ISBN
        """
        return (
            self.work,  # uses Work's identity fields
            self.format.value,
            self.isbn or "",
        )
 
    def _ordering_fields(self):
        """
        Sort by Work, format, ISBN, then pages.
        """
        return (
            *self._identity_fields(),
            (self.pages is None, self.pages),  # None pages sort last
        )

ISBNs are 10- or 13- digit strings, but the current implementation does not do any sort of validation. In theory, a Book could be created with the isbn field set to a novella, for all it cares. I thought about this issue, however, and figured that including validation in each of the model classes would be a bad idea - it would mean that I am mixing layers needlessly, and would cause testing to become far more difficult. Importantly, this would also violate the Single Responsibility Principle (SRP) and the Open-Closed Principle (OCP): to include validation logic in the class would be to add externalities to the very identity of the Book, but validation isn’t part of what it means to be a book.

However, I can use Dependency Injection to make sure that things work: in the Record class I implemented the following:

    @classmethod
    def deserialize(cls: R, data: O, serializer: SerializeStrategy[R]) -> R:
        """
        Deserialize data into a Record using the provided serializer.
        """
        return serializer.deserialize(data, cls)
 
    def serialize(self, serializer: SerializeStrategy[R]) -> O:
        """
        Serialize this Record using the provided Serializer
        """
        return serializer.serialize(self)
 
    def validate(self, validator: Validator) -> ValidationResult:
        """
        Validate this Record using the provided Validator
        """
        return validator.validate(self)

This means that every sublcass of a Record can serialize, validate (and eventually format) its data by using an injected dependency that isn’t part of the object itself. Doing so also means I can test the validators and serializers on their own, and have separate integration tests withot creating extremely complex mock objects.

Finally, the real benefit of this is I can change validators, serializers, or formatters at run time. I will explore this at a different time, but suffice to say that with the implementation of serialize and desearlize I can have logic to export an object into JSON, a Dict, SQL, etc. without having to modify the object itself at all.

That’s pretty cool.

arvb.net

Explorer

Creating Abstractions

Creating abstractions

Dependency Injection

Graph View

Table of Contents

Backlinks