Iterator
The Iterator pattern provides a standard way to access the elements of an aggregate object sequentially without exposing its internal representation. A client can therefore traverse a collection of items without knowing how those items are stored.
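As a minimal sketch of this idea in Python, the hypothetical Playlist class below (not part of the original example; its name and fields are assumptions chosen for illustration) keeps its items in a private list, yet clients reach them only through iter() and next().

```python
from collections.abc import Iterator


class Playlist:
    """Hypothetical aggregate; how it stores titles is an internal detail."""

    def __init__(self, *titles: str) -> None:
        self._titles = list(titles)  # could be swapped for a tuple, deque, ...

    def __iter__(self) -> Iterator[str]:
        # Hand out a fresh iterator; callers never touch self._titles.
        return iter(self._titles)


playlist = Playlist("intro", "verse", "outro")

it = iter(playlist)   # ask the aggregate for an iterator
first = next(it)      # advance one element at a time
rest = list(it)       # consume whatever remains

print(first)  # intro
print(rest)   # ['verse', 'outro']
```

Because the client code depends only on the iteration protocol, the internal list could be replaced by another container without changing any caller.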
TL;DR
Use it when you need multiple iteration behaviors for the same set of objects, or when you want to separate traversal logic from the collection itself.
Problem
Suppose you have a massive log file that you need to go over in different ways: sometimes you want only the lines that contain ERROR, sometimes lines selected by timestamp, and sometimes lines of a given log level.
Because there are multiple ways to traverse the file, you end up with multiple traversal algorithms. To adhere to the Single Responsibility Principle (SRP), you want to avoid implementing each of them inside the LogFile object itself.
Additionally, to respect the Open-Closed Principle (OCP) and establish a proper separation of concerns, the LogFile object should not need to know anything about the structure of the file's contents. Its responsibility is simply to open and close the file, and perhaps a few related tasks.
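The design this section argues against might look like the following hypothetical sketch, in which every traversal lives inside a single class. The MonolithicLogFile name, its methods, and the sample log lines are assumptions made for illustration only.

```python
import os
import tempfile


class MonolithicLogFile:
    """Hypothetical anti-example: one class owns every traversal."""

    def __init__(self, path: str) -> None:
        self.path = path

    def all_lines(self) -> list[str]:
        with open(self.path, "r") as f:
            return [line.strip() for line in f]

    def error_lines(self) -> list[str]:
        with open(self.path, "r") as f:
            return [line.strip() for line in f if "ERROR" in line]

    def lines_by_level(self, level: str) -> list[str]:
        with open(self.path, "r") as f:
            return [line.strip() for line in f if line.startswith(level)]

    # A new requirement (say, filtering by timestamp) means editing this
    # class again, so it keeps changing for unrelated reasons.


# Write a tiny sample log so the sketch can run on its own.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write("INFO service started\nERROR disk full\nWARNING slow response\n")

log = MonolithicLogFile(tmp.name)
errors = log.error_lines()
warning_lines = log.lines_by_level("WARNING")
os.remove(tmp.name)

print(errors)         # ['ERROR disk full']
print(warning_lines)  # ['WARNING slow response']
```

Each method here also reads the whole file into a list, which is a poor fit for a massive log; that is a side effect of this simplified sketch rather than a claim about any particular codebase.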
Solution
The Iterator pattern suggests extracting the traversal behavior into a separate object called an iterator, such as LogFileIterator. This new object encapsulates all the traversal details and, thanks to a unified interface, can be easily extended without modifying the LogFile class itself.
from __future__ import annotations

from collections.abc import Iterable, Iterator


class BaseLogIterator(Iterator):
    """Common plumbing: opens the file and delegates filtering to subclasses."""

    def __init__(self, logfile: LogFile):
        self._logfile = logfile
        self._file = open(self._logfile.path, "r")
        # Calling a generator method only creates the generator; its body runs
        # on the first next(), so subclasses may set attributes after
        # super().__init__().
        self._lines = self._filtered_lines()

    def __iter__(self) -> BaseLogIterator:
        return self

    def __next__(self) -> str:
        return next(self._lines)

    def _filtered_lines(self):
        raise NotImplementedError("Subclasses must implement _filtered_lines()")

    def __del__(self):
        # Best-effort cleanup when the iterator is garbage-collected.
        try:
            self._file.close()
        except Exception:
            pass


class AllLinesIterator(BaseLogIterator):
    """Yields every line of the file."""

    def _filtered_lines(self):
        for line in self._file:
            yield line.strip()


class ErrorLinesIterator(BaseLogIterator):
    """Yields only lines that contain the text ERROR."""

    def _filtered_lines(self):
        for line in self._file:
            if "ERROR" in line:
                yield line.strip()


class LevelLinesIterator(BaseLogIterator):
    """Yields only lines that start with the given level, e.g. WARNING."""

    def __init__(self, logfile: LogFile, level: str):
        super().__init__(logfile)
        self._level = level

    def _filtered_lines(self):
        for line in self._file:
            if line.startswith(self._level):
                yield line.strip()


class LogFile(Iterable):
    """The aggregate: knows its path and hands out iterators."""

    def __init__(self, path: str):
        self.path = path

    def __iter__(self) -> AllLinesIterator:
        return AllLinesIterator(self)

    def iter_errors(self) -> ErrorLinesIterator:
        return ErrorLinesIterator(self)

    def iter_by_level(self, level: str) -> LevelLinesIterator:
        return LevelLinesIterator(self, level)


if __name__ == "__main__":
    log = LogFile("example.log")

    print("All log lines:")
    for line in log:
        print(line)

    print("\nERROR log lines:")
    for line in log.iter_errors():
        print(line)

    print("\nOnly WARNING lines:")
    for line in log.iter_by_level("WARNING"):
        print(line)
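To illustrate the extensibility claim, here is a hedged sketch of adding a timestamp-based traversal without touching LogFile. The SinceIterator class, the "<LEVEL> <ISO timestamp> <message>" line layout, and the sample data are all assumptions, not part of the original example. To keep the block self-contained, LogFile and BaseLogIterator are repeated in trimmed-down form, and the stand-in base class closes the file on exhaustion instead of in __del__.

```python
import os
import tempfile
from collections.abc import Iterator
from datetime import datetime


class LogFile:
    """Trimmed-down stand-in for the LogFile class above."""

    def __init__(self, path: str) -> None:
        self.path = path


class BaseLogIterator(Iterator):
    """Trimmed-down stand-in for the base iterator above."""

    def __init__(self, logfile: LogFile) -> None:
        self._file = open(logfile.path, "r")
        self._lines = self._filtered_lines()  # generator body runs lazily

    def __next__(self) -> str:
        try:
            return next(self._lines)
        except StopIteration:
            self._file.close()  # deviation: release the handle once exhausted
            raise

    def _filtered_lines(self):
        raise NotImplementedError("Subclasses must implement _filtered_lines()")


class SinceIterator(BaseLogIterator):
    """Hypothetical new traversal: lines stamped at or after `since`."""

    def __init__(self, logfile: LogFile, since: datetime) -> None:
        super().__init__(logfile)
        self._since = since

    def _filtered_lines(self):
        for line in self._file:
            # Assumed layout: "<LEVEL> <ISO timestamp> <message>"
            parts = line.split(maxsplit=2)
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            try:
                stamp = datetime.fromisoformat(parts[1])
            except ValueError:
                continue  # second field is not a timestamp
            if stamp >= self._since:
                yield line.strip()


# Write a tiny sample log so the sketch can run on its own.
with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
    tmp.write(
        "INFO 2024-05-01T09:00:00 service started\n"
        "ERROR 2024-05-01T10:30:00 disk full\n"
        "WARNING 2024-05-01T11:15:00 slow response\n"
    )

log = LogFile(tmp.name)
recent = list(SinceIterator(log, datetime(2024, 5, 1, 10, 0)))
os.remove(tmp.name)

print(recent)
```

The new iterator is constructed directly from a LogFile, so the aggregate class stays unchanged; a convenience method on LogFile would be optional.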