Timothy had written his hundredth Book class. Each time, the same tedious pattern: write __init__, define each attribute, write __repr__, write __eq__, write comparison methods. His simple data-holding classes had become exercises in repetitive boilerplate.
class Book:
    def __init__(self, title, author, year, pages):
        self.title = title
        self.author = author
        self.year = year
        self.pages = pages

    def __repr__(self):
        return f'Book(title={self.title!r}, author={self.author!r}, year={self.year}, pages={self.pages})'

    def __eq__(self, other):
        if not isinstance(other, Book):
            return NotImplemented
        return (self.title == other.title and
                self.author == other.author and
                self.year == other.year and
                self.pages == other.pages)

    def __lt__(self, other):
        if not isinstance(other, Book):
            return NotImplemented
        return (self.title, self.author, self.year) < (other.title, other.author, other.year)

# 30 lines of boilerplate for 4 attributes!
Margaret found him copying and pasting __init__ for the fifth time that day. “You’re hand-crafting every blueprint,” she observed. “Come to the Blueprint Factory—where Python generates the boring parts automatically.”
Margaret showed Timothy Python’s modern shortcut:
from dataclasses import dataclass

@dataclass
class Book:
    title: str
    author: str
    year: int
    pages: int

# That's it! Python generates __init__, __repr__, __eq__ automatically
dune = Book("Dune", "Herbert", 1965, 412)
print(dune)  # Book(title='Dune', author='Herbert', year=1965, pages=412)

foundation = Book("Foundation", "Asimov", 1951, 255)
print(dune == foundation)  # False - different books
print(dune == Book("Dune", "Herbert", 1965, 412))  # True - same values
“The @dataclass decorator generates methods automatically,” Margaret explained. “Type hints become the attributes. No more writing self.title = title repeatedly.”
What Dataclasses Generate
Timothy learned what the decorator created:
@dataclass
class Book:
    title: str
    author: str
    year: int
    pages: int

# Python automatically generates:
# __init__(self, title, author, year, pages)
# __repr__(self)
# __eq__(self, other)

# You get this behavior for free:
dune = Book("Dune", "Herbert", 1965, 412)

# Readable representation
print(repr(dune))  # Book(title='Dune', author='Herbert', year=1965, pages=412)

# Value equality
another_dune = Book("Dune", "Herbert", 1965, 412)
print(dune == another_dune)  # True

# But NOT __hash__ - dataclasses are mutable by default
# Can't use as dict keys or in sets without frozen=True
“Dataclasses are optimized for data storage,” Margaret noted. “Python generates the most common methods, saving you from writing boilerplate.”
Type Hints Are Documentation, Not Enforcement
Margaret clarified an important limitation:
@dataclass
class Book:
    title: str
    author: str
    pages: int

# Python doesn't enforce types at runtime!
book = Book("Dune", "Herbert", "not an int")  # No error!
print(book.pages)  # "not an int" - type hint ignored

# Type hints are:
# - Documentation for humans
# - Used by type checkers (mypy, pyright)
# - Used by IDEs for autocomplete
# - NOT enforced at runtime
“Type hints document intent,” Margaret explained. “They help tools catch errors before runtime, but Python won’t stop you from passing wrong types. Use type checkers in your development workflow for safety.”
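If a class really does need runtime checks, one option is to add them by hand. The sketch below is an illustration rather than anything the dataclasses module does for you: it assumes a hypothetical CheckedBook class whose annotations are plain classes such as str and int, and it uses the __post_init__ hook together with dataclasses.fields() to compare each value against its annotation.

```python
from dataclasses import dataclass, fields


@dataclass
class CheckedBook:  # hypothetical class, for illustration only
    title: str
    author: str
    pages: int

    def __post_init__(self):
        # Compare each value against its annotation. Annotations that are
        # not plain classes (for example string annotations) are skipped.
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(f.type, type) and not isinstance(value, f.type):
                raise TypeError(
                    f"{f.name} must be {f.type.__name__}, "
                    f"got {type(value).__name__}"
                )


book = CheckedBook("Dune", "Herbert", 412)  # passes the check

try:
    CheckedBook("Dune", "Herbert", "not an int")
except TypeError as exc:
    print(exc)  # pages must be int, got str
```

For anything beyond simple fields, a static type checker or a dedicated validation library is usually a better fit than hand-written checks like these.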
Default Values
Timothy learned to provide defaults:
@dataclass
class Book:
    title: str
    author: str
    year: int = 2024  # Default value
    pages: int = 0    # Default value
    isbn: str = ""    # Default value

# Can omit fields with defaults
recent_book = Book("New Release", "Modern Author")
print(recent_book.year)  # 2024

# Or provide all values
classic = Book("Dune", "Herbert", 1965, 412, "978-0441013593")
“Fields with defaults must come after fields without defaults,” Margaret cautioned. “Python requires non-default parameters before default parameters.”
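A quick experiment makes the ordering rule concrete. The sketch below uses two hypothetical classes, BadBook and GoodBook, that are not part of the original example; it shows that the mistake is caught when the class is defined, not when an instance is created.

```python
from dataclasses import dataclass

error_message = None
try:
    @dataclass
    class BadBook:  # hypothetical class, for illustration only
        title: str = "Untitled"  # has a default
        author: str              # no default after a default: rejected
except TypeError as exc:
    error_message = str(exc)
    print(error_message)  # the message names the offending field, author


@dataclass
class GoodBook:  # hypothetical fix: required fields come first
    author: str
    title: str = "Untitled"


print(GoodBook("Herbert"))  # GoodBook(author='Herbert', title='Untitled')
```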
The field() Function for Advanced Defaults
Margaret showed Timothy how to customize fields:
from dataclasses import dataclass, field

@dataclass
class Book:
    title: str
    author: str
    year: int
    pages: int
    tags: list = field(default_factory=list)      # Mutable default
    metadata: dict = field(default_factory=dict)

# Each instance gets its own list/dict
book1 = Book("Dune", "Herbert", 1965, 412)
book2 = Book("Foundation", "Asimov", 1951, 255)

book1.tags.append("scifi")
print(book1.tags)  # ['scifi']
print(book2.tags)  # [] - separate list!
“Never use mutable defaults directly,” Margaret warned. “Use default_factory to create a new instance for each object. This avoids the shared mutable default trap.”
The Dangerous Mutable Default
Timothy saw what happens without default_factory:
# WRONG - mutable default
@dataclass
class Book:
    title: str
    tags: list = []
# Raises ValueError as soon as the class is defined:
#   mutable default <class 'list'> for field tags is not allowed: use default_factory
# Dataclasses reject list, dict, and set defaults so that instances
# can never end up sharing one object.

# RIGHT - use default_factory
@dataclass
class Book:
    title: str
    tags: list = field(default_factory=list)

book1 = Book("Dune")
book2 = Book("Foundation")

book1.tags.append("scifi")
print(book2.tags)  # [] - separate lists!
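“This is the same trap as mutable default arguments in regular classes,” Margaret explained. “Dataclasses refuse to fall into it: a bare list, dict, or set default raises ValueError when the class is defined. Always use default_factory for lists, dicts, sets, or any other mutable default.”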
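For comparison, here is the trap in a hand-written class, next to the way a dataclass responds to the same mistake. This is a small sketch; PlainBook, BrokenBook, and SafeBook are hypothetical names chosen for the illustration.

```python
from dataclasses import dataclass, field


class PlainBook:  # hypothetical regular class
    # The classic trap: the default list is created once, when the
    # function is defined, and shared by every call that omits tags.
    def __init__(self, title, tags=[]):
        self.title = title
        self.tags = tags


first = PlainBook("Dune")
second = PlainBook("Foundation")
first.tags.append("scifi")
print(second.tags)  # ['scifi'] - both instances share one list

# The dataclass version of the same mistake fails fast instead.
rejected = False
try:
    @dataclass
    class BrokenBook:  # hypothetical class, never successfully created
        title: str
        tags: list = []
except ValueError:
    rejected = True
print(rejected)  # True


@dataclass
class SafeBook:  # hypothetical class using the recommended pattern
    title: str
    tags: list = field(default_factory=list)


print(SafeBook("Dune").tags is SafeBook("Foundation").tags)  # False
```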
Frozen Dataclasses: Immutability
Margaret showed Timothy immutable dataclasses:
@dataclass(frozen=True)
class Book:
    title: str
    author: str
    year: int
    pages: int

dune = Book("Dune", "Herbert", 1965, 412)

# Can't modify - raises FrozenInstanceError
# dune.pages = 500  # Error!

# But frozen dataclasses are hashable
book_ratings = {
    Book("Dune", "Herbert", 1965, 412): 5,
    Book("Foundation", "Asimov", 1951, 255): 4
}

# Can use in sets
unique_books = {
    Book("Dune", "Herbert", 1965, 412),
    Book("Dune", "Herbert", 1965, 412),  # Duplicate removed
}
print(len(unique_books))  # 1
“Frozen dataclasses are immutable like tuples,” Margaret explained. “They can’t be modified after creation, but they gain __hash__ automatically—enabling use as dict keys and in sets.”
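The difference between the two flavors is easy to check directly. The following sketch uses two hypothetical single-field classes, MutableBook and FrozenBook, and records what happens when each is hashed or assigned to.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass
class MutableBook:  # hypothetical class, default settings
    title: str


@dataclass(frozen=True)
class FrozenBook:  # hypothetical class, frozen
    title: str


# With eq=True and frozen=False, __hash__ is set to None, so hashing fails.
unhashable = False
try:
    hash(MutableBook("Dune"))
except TypeError:
    unhashable = True
print(unhashable)  # True

# Frozen instances hash by value: equal books have equal hashes.
print(hash(FrozenBook("Dune")) == hash(FrozenBook("Dune")))  # True

# Assignment on a frozen instance raises FrozenInstanceError.
blocked = False
try:
    FrozenBook("Dune").title = "Dune Messiah"
except FrozenInstanceError:
    blocked = True
print(blocked)  # True
```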
The Danger of unsafe_hash
Margaret warned Timothy about a treacherous option:
# DANGEROUS - mutable dataclass with hash
@dataclass(unsafe_hash=True)
class Book:
    title: str
    pages: int  # Mutable field!

book = Book("Dune", 412)
books_set = {book}  # Add to set using hash

# Mutation breaks the hash invariant!
book.pages = 500
print(book in books_set)  # May be False - set can't find it!

# The hash was computed with pages=412
# Now pages=500 but the hash is stale
# The set is corrupted!
“Never use unsafe_hash=True with mutable dataclasses,” Margaret cautioned. “Python calls it ‘unsafe’ for good reason. If you hash a mutable object and then mutate it, sets and dictionaries break. Only use hashing with frozen=True, where immutability guarantees the hash stays valid.”
Creating Modified Copies with replace()
Timothy learned to create modified copies of frozen dataclasses:
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Book:
    title: str
    author: str
    year: int
    pages: int

dune = Book("Dune", "Herbert", 1965, 412)

# Can't modify frozen dataclass
# dune.pages = 500  # FrozenInstanceError!

# But can create modified copy
updated = replace(dune, pages=500)
print(updated)  # Book(title='Dune', author='Herbert', year=1965, pages=500)
print(dune.pages)  # 412 - original unchanged

# Can change multiple fields
revised = replace(dune, year=1966, pages=450)
“The replace() function creates a copy with specified fields changed,” Margaret explained. “It’s like string methods that return new strings—the original stays unchanged. This is how you ‘modify’ immutable dataclasses.”
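One detail worth knowing is that replace() builds the copy by calling the class again, so any checks in __post_init__ run on the new values too. The sketch below assumes a trimmed-down, two-field Book with a validation rule added purely for the illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Book:  # simplified two-field version, for illustration only
    title: str
    pages: int

    def __post_init__(self):
        if self.pages < 0:
            raise ValueError("Pages cannot be negative")


dune = Book("Dune", 412)

longer = replace(dune, pages=500)
print(longer)  # Book(title='Dune', pages=500)

# The copy goes through __init__ and __post_init__ again,
# so an invalid replacement value is rejected.
try:
    replace(dune, pages=-1)
except ValueError as exc:
    print(exc)  # Pages cannot be negative
```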
Ordering and Comparison
Timothy learned to make dataclasses sortable:
from dataclasses import dataclass

@dataclass(order=True)
class Book:
    title: str
    author: str
    year: int
    pages: int

books = [
    Book("Foundation", "Asimov", 1951, 255),
    Book("Dune", "Herbert", 1965, 412),
    Book("1984", "Orwell", 1949, 328),
]

# Now sortable!
sorted_books = sorted(books)
for book in sorted_books:
    print(f"{book.title} by {book.author} ({book.year})")
# 1984 by Orwell (1949)
# Dune by Herbert (1965)
# Foundation by Asimov (1951)
“With order=True, Python generates comparison methods,” Margaret noted. “Books compare field-by-field in declaration order: title first, then author, then year, then pages.”
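When a single natural ordering does not fit every use, sorting with a key function is a possible alternative that needs no order=True at all. This sketch reuses the three books from the example and relies only on sorted() and operator.attrgetter from the standard library.

```python
from dataclasses import dataclass
from operator import attrgetter


@dataclass
class Book:
    title: str
    author: str
    year: int
    pages: int


books = [
    Book("Foundation", "Asimov", 1951, 255),
    Book("Dune", "Herbert", 1965, 412),
    Book("1984", "Orwell", 1949, 328),
]

# Pick the sort field at the call site instead of baking it into the class.
by_year = sorted(books, key=attrgetter("year"))
print([b.title for b in by_year])  # ['1984', 'Foundation', 'Dune']

longest_first = sorted(books, key=attrgetter("pages"), reverse=True)
print([b.title for b in longest_first])  # ['Dune', '1984', 'Foundation']
```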
Keyword-Only Arguments for Safety
Margaret showed Timothy how to prevent positional argument mistakes:
from dataclasses import dataclass

@dataclass(kw_only=True)
class Book:
    title: str
    author: str
    year: int
    pages: int

# Must use keyword arguments
book = Book(title="Dune", author="Herbert", year=1965, pages=412)  # OK

# Positional arguments don't work
# book = Book("Dune", "Herbert", 1965, 412)  # TypeError!
“The kw_only=True parameter forces keyword arguments,” Margaret explained. “If you later reorder fields or add new ones, calls won’t break silently. The argument names document what each value means.”
Memory Optimization with slots
Timothy learned about Python 3.10’s major optimization:
# Regular dataclass - each instance carries a __dict__
@dataclass
class Book:
    title: str
    author: str
    year: int
    pages: int

# With slots - no per-instance __dict__, so each instance is smaller
@dataclass(slots=True)
class CompactBook:
    title: str
    author: str
    year: int
    pages: int

# Benefits of slots=True:
# - Noticeably less memory per instance
# - Often slightly faster attribute access
# - Prevents adding attributes dynamically
# Trade-off:
# - Cannot use __dict__-based features

# For thousands of instances, the savings add up
books = [CompactBook(f"Book{i}", "Author", 2024, 300) for i in range(10000)]
# Can use roughly half the memory of the non-slots version;
# the exact saving depends on the Python version and the number of fields
“For classes with many instances,” Margaret advised, “use slots=True. It trades flexibility for efficiency—you can’t add attributes dynamically, but you save memory and gain speed.”
Customizing Comparison Order
Timothy discovered he could control which fields mattered for sorting:
from dataclasses import dataclass, field

@dataclass(order=True)
class Book:
    sort_index: int = field(init=False, repr=False)
    title: str = field(compare=False)
    author: str = field(compare=False)
    year: int
    pages: int = field(compare=False)

    def __post_init__(self):
        # Sort by year only
        self.sort_index = self.year

books = [
    Book("Foundation", "Asimov", 1951, 255),
    Book("Dune", "Herbert", 1965, 412),
    Book("1984", "Orwell", 1949, 328),
]

sorted_books = sorted(books)
for book in sorted_books:
    print(f"{book.title} ({book.year})")
# 1984 (1949)
# Foundation (1951)
# Dune (1965)
“The compare=False parameter excludes fields from comparison,” Margaret explained. “The init=False parameter means the field isn’t part of __init__. The repr=False parameter excludes it from the string representation.”
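One side effect deserves a second look: compare=False removes a field from equality as well as from ordering. The sketch below uses a hypothetical two-field Book, with a made-up second title, to show two different books from the same year comparing as equal.

```python
from dataclasses import dataclass, field


@dataclass(order=True)
class Book:  # simplified two-field version, for illustration only
    title: str = field(compare=False)
    year: int


dune = Book("Dune", 1965)
other = Book("Another 1965 Book", 1965)  # hypothetical title

# Only year takes part in comparisons, so these count as equal.
print(dune == other)  # True

# Ordering uses the same compared fields.
print(dune < Book("Later Book", 1984))  # True
```

If equality should still consider every field, sorting with an explicit key function may be the safer route.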
Post-Init Processing with __post_init__
Margaret showed Timothy validation and computed fields:
@dataclass
class Book:
    title: str
    author: str
    year: int
    pages: int

    def __post_init__(self):
        # Validation after initialization
        if self.pages < 0:
            raise ValueError("Pages cannot be negative")
        if self.year < 1000:
            raise ValueError("Year seems unrealistic")

        # Normalize title
        self.title = self.title.strip()

# Validation runs automatically
try:
    bad_book = Book("Test", "Author", 2024, -100)
except ValueError as e:
    print(e)  # "Pages cannot be negative"

# Normalization happens automatically
book = Book(" Dune ", "Herbert", 1965, 412)
print(book.title)  # "Dune" - whitespace stripped
“The __post_init__ method runs after __init__ completes,” Margaret explained. “Use it for validation, normalization, or computing derived fields.”
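Normalization needs one adjustment when the dataclass is frozen, because plain assignment inside __post_init__ would raise FrozenInstanceError. A common workaround, sketched here on a simplified two-field Book, is to call object.__setattr__ for the one-time fix-up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:  # simplified two-field version, for illustration only
    title: str
    pages: int

    def __post_init__(self):
        if self.pages < 0:
            raise ValueError("Pages cannot be negative")
        # self.title = ... is blocked on a frozen instance, so bypass the
        # frozen __setattr__ for this single normalization step.
        object.__setattr__(self, "title", self.title.strip())


book = Book("  Dune  ", 412)
print(repr(book.title))  # 'Dune'
```

Outside of __post_init__ the instance stays immutable, so this should be reserved for initialization-time clean-up.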
Converting to Dictionaries and Tuples
Margaret showed Timothy how to serialize dataclasses:
from dataclasses import dataclass, asdict, astuple

@dataclass
class Book:
    title: str
    author: str
    year: int
    pages: int

dune = Book("Dune", "Herbert", 1965, 412)

# Convert to dictionary
book_dict = asdict(dune)
print(book_dict)
# {'title': 'Dune', 'author': 'Herbert', 'year': 1965, 'pages': 412}

# Convert to tuple (in field order)
book_tuple = astuple(dune)
print(book_tuple)
# ('Dune', 'Herbert', 1965, 412)

# Useful for:
# - JSON serialization: json.dumps(asdict(book))
# - Database inserts: cursor.execute(sql, astuple(book))
# - CSV writing: writer.writerow(astuple(book))
“The asdict() function creates a dictionary of field names to values,” Margaret explained. “astuple() creates a tuple of values in field order. Both work recursively with nested dataclasses.”
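To see the recursion at work, here is a small round trip through JSON. The nested Author class is an assumption added for the illustration, and the manual rebuild step reflects the fact that dataclasses provides no built-in inverse of asdict().

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Author:  # hypothetical nested dataclass
    name: str
    born: int


@dataclass
class Book:
    title: str
    author: Author
    year: int


dune = Book("Dune", Author("Frank Herbert", 1920), 1965)

# asdict() converts the nested Author into a nested dict as well.
data = asdict(dune)
print(data)
# {'title': 'Dune', 'author': {'name': 'Frank Herbert', 'born': 1920}, 'year': 1965}

payload = json.dumps(data)

# Rebuilding is manual: nested dicts must be turned back into dataclasses.
loaded = json.loads(payload)
restored = Book(
    title=loaded["title"],
    author=Author(**loaded["author"]),
    year=loaded["year"],
)
print(restored == dune)  # True
```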
Computed Fields with __post_init__
Timothy learned to create fields based on other fields:
@dataclass
class Book:
    title: str
    author: str
    year: int
    pages: int
    reading_time_minutes: int = field(init=False)

    def __post_init__(self):
        # Compute reading time based on pages
        self.reading_time_minutes = self.pages * 2

dune = Book("Dune", "Herbert", 1965, 412)
print(dune.reading_time_minutes)  # 824 - computed automatically
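A field computed in __post_init__ is calculated once, so it does not follow later changes to pages on a mutable dataclass. Where that matters, a property is one possible alternative. The sketch below uses a simplified two-field Book and keeps the same two-minutes-per-page estimate.

```python
from dataclasses import dataclass


@dataclass
class Book:  # simplified two-field version, for illustration only
    title: str
    pages: int

    @property
    def reading_time_minutes(self) -> int:
        # Recomputed on every access, so it always reflects current pages.
        return self.pages * 2


dune = Book("Dune", 412)
print(dune.reading_time_minutes)  # 824

dune.pages = 500
print(dune.reading_time_minutes)  # 1000
```

The trade-off is that a property is not a dataclass field, so it does not appear in the generated __repr__ or in asdict().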
Inheritance with Dataclasses
Margaret showed Timothy dataclass inheritance:
@dataclass
class Book:
    title: str
    author: str
    year: int
    pages: int

@dataclass
class Audiobook(Book):
    narrator: str
    duration_minutes: int

# Child inherits parent's fields
audiobook = Audiobook(
    title="Dune",
    author="Herbert",
    year=1965,
    pages=0,
    narrator="Scott Brick",
    duration_minutes=1233
)

print(audiobook)
# Audiobook(title='Dune', author='Herbert', year=1965, pages=0,
#           narrator='Scott Brick', duration_minutes=1233)
“Child dataclasses inherit parent fields,” Margaret noted. “Parent fields come first in __init__, then child fields. All the generated methods work with the combined fields.”
Converting Regular Classes to Dataclasses
Timothy learned when to use dataclasses:
# Before - regular class with boilerplate
class Book:
    def __init__(self, title, author, year, pages):
        self.title = title
        self.author = author
        self.year = year
        self.pages = pages

    def __repr__(self):
        return f'Book(title={self.title!r}, author={self.author!r}, year={self.year}, pages={self.pages})'

    def __eq__(self, other):
        if not isinstance(other, Book):
            return NotImplemented
        return (self.title, self.author, self.year, self.pages) == (other.title, other.author, other.year, other.pages)

    def get_reading_time(self):
        return self.pages * 2

# After - dataclass with method
@dataclass
class Book:
    title: str
    author: str
    year: int
    pages: int

    def get_reading_time(self):
        return self.pages * 2
“Replace classes that are primarily data containers,” Margaret advised. “Keep the dataclass for structure, add methods for behavior.”
When to Use Dataclasses
Margaret clarified when dataclasses made sense:
Use dataclasses when:
- The class primarily holds data
- You need __init__, __repr__, __eq__ automatically
- You want type hints on attributes
- The class is relatively simple (not complex behavior)
Don’t use dataclasses when:
- The class has complex initialization logic
- You need custom __init__ with non-trivial processing
- The class is primarily behavior, not data
- You need fine control over magic methods
Dataclass Options Summary
Margaret showed Timothy the most commonly used options:
@dataclass(
    init=True,          # Generate __init__ (default: True)
    repr=True,          # Generate __repr__ (default: True)
    eq=True,            # Generate __eq__ (default: True)
    order=False,        # Generate comparison methods (default: False)
    unsafe_hash=False,  # Generate __hash__ - DANGEROUS with mutable! (default: False)
    frozen=False,       # Make immutable (default: False)
    slots=False,        # Use __slots__ for memory efficiency (default: False, Python 3.10+)
    kw_only=False,      # Require keyword arguments (default: False, Python 3.10+)
)
class Book:
    title: str
# Less common options not shown: match_args (Python 3.10+) and weakref_slot (Python 3.11+)
Real-World Example: Configuration Class
Margaret demonstrated a practical pattern:
from dataclasses import dataclass, field
from typing import Optional

@dataclass(frozen=True)
class DatabaseConfig:
    host: str
    # No default, so password must come before the defaulted fields;
    # repr=False keeps it out of the printed representation
    password: str = field(repr=False)
    port: int = 5432
    database: str = "library"
    username: str = "admin"
    ssl_enabled: bool = True
    pool_size: int = 10
    timeout: Optional[int] = None

    def __post_init__(self):
        if self.port < 1 or self.port > 65535:
            raise ValueError(f"Invalid port: {self.port}")
        if self.pool_size < 1:
            raise ValueError("Pool size must be positive")

# Create configuration
config = DatabaseConfig(
    host="localhost",
    password="secret123"
)

print(config)
# DatabaseConfig(host='localhost', port=5432, database='library',
#                username='admin', ssl_enabled=True, pool_size=10, timeout=None)
# Notice password is hidden!

# Immutable - can't accidentally modify
# config.port = 3306  # FrozenInstanceError

# Can use as dict key
configs = {
    DatabaseConfig(host="prod.db", password="prod123"): "production",
    DatabaseConfig(host="dev.db", password="dev456"): "development"
}
Timothy’s Dataclass Wisdom
Through exploring the Blueprint Factory, Timothy learned essential principles:
@dataclass generates boilerplate: Automatically creates __init__, __repr__, __eq__.
Type hints define attributes: Each typed attribute becomes a field.
Type hints are not enforced: They’re documentation and tool guidance, not runtime checks.
Default values come after non-defaults: Python requirement for parameters.
Use field(default_factory=…) for mutables: Never use mutable defaults directly—always use default_factory for lists, dicts, sets.
frozen=True makes immutable: Can’t modify after creation, gains __hash__ automatically.
unsafe_hash=True is dangerous: Only use with frozen dataclasses—mutable + hash corrupts sets and dicts.
replace() creates modified copies: The way to “change” frozen dataclasses without mutation.
order=True enables sorting: Generates comparison methods for sorting.
kw_only=True forces keyword arguments: Prevents positional mistakes, makes code clearer (Python 3.10+).
slots=True saves memory: Noticeably less memory per instance and often faster attribute access, but less flexible (Python 3.10+).
compare=False excludes fields: Control which fields matter for equality and ordering.
init=False excludes from __init__: For computed or internal fields.
repr=False hides from string: For sensitive data like passwords.
__post_init__ runs after __init__: Use for validation, normalization, or computed fields.
asdict() converts to dictionary: For JSON serialization, APIs, databases.
astuple() converts to tuple: For CSV writing, database inserts, ordered data.
Dataclasses support inheritance: Child inherits parent fields.
Dataclasses can have methods: Add behavior alongside data.
Use for data-heavy classes: Replace boilerplate-heavy classes with dataclasses.
Don’t use for behavior-heavy classes: Complex logic needs regular classes.
Frozen dataclasses work as dict keys: Immutability enables hashing.
Python’s Blueprint Factory
Timothy had discovered Python’s Blueprint Factory—the @dataclass decorator that eliminated repetitive boilerplate for data-holding classes. By declaring attributes with type hints, Python generated all the standard methods automatically. He learned to use field() for customization, frozen=True for immutability, order=True for sorting, and __post_init__ for validation. Modern Python 3.10+ features like slots=True offered dramatic memory savings, while kw_only=True prevented positional argument mistakes. He discovered asdict() and astuple() for serialization, and replace() for creating modified copies of frozen objects. Yet he also learned the dangers—unsafe_hash=True with mutable data corrupts sets, and type hints are documentation, not enforcement. The Blueprint Factory revealed that modern Python didn’t require hand-crafting every class—for simple data containers, the decorator handled the tedious parts, letting Timothy focus on the unique logic that mattered.
Aaron Rose is a software engineer and technology writer at tech-reader.blog and the author of Think Like a Genius.
Source: DEV Community.
