When I was thinking about folio one of the very early decisions I made was to separate a Work from a Book. Conceptually this is a very useful distinction - a Work is the literary effort an author made, while a Book is a concrete edition created by a publisher and then distributed to the public.
The semantics here matter because when I say “I read Hyperion by Dan Simmons” I mean I consumed the Hyperion Work, and not a specific edition of that Work. This facilitates thinking about the content of the literary effort, rather than getting lost in the debate of whether listening to an audiobook counts as reading. Consider the case where you tell a friend about a book (I use book here to demonstrate the colloquial use) you just finished: when you are extolling the virtues of said book, you are likely speaking about the virtues of the work. When your friend buys Hyperion and reads it, you don’t question which edition1 they read, you discuss the work itself.
The result of this decision is that in folio a Work is defined with the following properties, which will remain true of a work despite any changes to specific editions2:
class Work:
"""
Represents a literary work (the concept of a book, rather than a specific instance)
"""
title: str
author: str
year: int
genre: str
is_read: boolThe Work entity is the base around which the rest of folio grows, but I also do care about book editions and some other data. For example, I primarily read using KOReader on my (jailbroken) Kindle; yet for each book I finish, if it meets an entirely arbitrary threshold of a nebulously defined Quality, the work will earn itself an enshrined place as a print edition on my bookshelf.
The result is that a Book entry in the program implies I own that specific edition of a Work. Many different books, then, can reference a singular work (but a work needn’t know anything about the books themselves).
The process of creating a Book object, then, requires a few things: a Work to reference, format data, and, because I like them, an ISBN:
class Book:
"""
Represents a specific book instance
"""
work: Work
format_data: BookFormat
isbn: ISBNEach of the attributes in a Book is of a different type, rather than simple primitives like str or int. This of course complicates the definition and creation of a Book, but also gives me some nice, built-in validation functionality.
An early problem of Book was that I had an attribute called pages. This made perfect sense - a specific edition of a book, in my experience, tends to have a finite, a priori number of pages. This attribute allowed me to compare book lengths, calculate total number of pages read, and other such things.
It didn’t work.
Not because pages isn’t a useful concept, but because I made the horrible (or so my code proclaims) mistake of also consuming audiobooks. And thus began a spiral of trying to figure out how to map audiobook length to pages: use an average number of minutes for each page? It feels dirty and doesn’t really describe the domain properly. Get page counts from a physical edition of that audiobook? Sure, but then I’m contaminating my specific edition Book model!
In the end, I did the reasonable thing: created a FormatType definition that holds print, ebook, and audio, an abstract BookFormat class implemented by a TextFormat class and an AudioFormat class, and then ensured that those two classes use FormatType as part of their definition. A Book, then, references the abstract BookFormat and is therefore able to be a text-format book, or an audio-format book3.
I am told this is a prime example of dependency inversion!
@dataclass
class TextFormat(BookFormat):
"""
A textual book, either print or ebook
"""
pages: int | None
format_type: FormatType
def __post_init__(self):
if self.format_type not in (FormatType.PRINT, FormatType.EBOOK):
raise ValueError("Invalid format type for text")
@property
def length(self) -> int | None:
return self.pages
@property
def format(self):
return self.format_type.value
def __str__(self):
return f"{self.length} ({self.format})"
@dataclass
class AudioFormat(BookFormat):
"""
An audiobook
"""
duration: timedelta
narrator: str | None
format_type: FormatType = FormatType.AUDIO
@property
def length(self) -> int:
return int(self.duration.total_seconds() // 60)
@property
def format(self):
return self.format_type.value
def __str__(self):
return f"{self.length} ({self.format}) (Narrator: {self.narrator})"It is more code than I intended to write for a toy project, but it does come with some niceties: a an audiobook, by definition, has a length property defined in minutes, while a textual book has a length defined in pages. Additionally, because audiobooks are the special case, the use of AudioFormat means I can easily include narrator data only when instantiating an audiobook - a property that really does matter, since an audiobook is as much a literary effort as it is a dramatic performance.
This kind of logic is also very useful for the ISBN field: by using class composition rather than a simple string, I can encapsulate validation and other such things that are relevant only to ISBN without really modifying the signature of my Book model, and therefore without being afraid of breaking the entire program.
One of the big things here is to ensure that data is not tightly coupled - that is, that each piece of code can exist as independently as possible (with exceptions, of course: conceptually a Book cannot exist without a Work, in code or in the real world). Thus, notice that the entire model design has zero references to saving, to specific id fields, or anything like that: they are individual units that will be used by a repository to interact with storage services, be it SQL databases, CSV files, or what have you..
Footnotes
-
There are obviously glaring counter-examples to this. Academic work really does care about specific editions; translated books might have relevant nuances in one translation that are shown differently in another - I’m looking at you Fagles/Lattimore. ↩
-
Again, there are edge-cases: a brand new edition of a classic book that, happily, re-writes the entire book! ↩
-
In fancy, software-speak: “This polymorphism lets me treat all Books uniformly while delegating format-specific behavior to the right class.” Thanks, ChatGPT. ↩