How to write efficient Python data classes

# Introduction

By default, a data class stores its attributes in a per-instance dictionary, is not hashable unless you opt in, and compares every field for equality. These defaults are sensible, but they are not optimal for applications that create many instances or need objects as cache keys.

Data classes address these limitations through configuration rather than custom code. Decorator parameters change how instances behave and how much memory they use. Field-level settings let you exclude attributes from comparisons, define safe defaults for mutable values, or control how initialization works.

This article focuses on the key data class capabilities that improve efficiency and maintainability without adding complexity.

You can find the code on GitHub.

# 1. Frozen data classes for hashability and safety

Making your data classes immutable also makes them hashable. This lets you use instances as dictionary keys or store them in sets, as shown below:

from dataclasses import dataclass

@dataclass(frozen=True)
class CacheKey:
    user_id: int
    resource_type: str
    timestamp: int
    
cache = {}
key = CacheKey(user_id=42, resource_type="profile", timestamp=1698345600)
cache[key] = {"data": "expensive_computation_result"}

The frozen=True parameter makes all fields immutable after initialization and automatically generates __hash__(). Without it, you get a TypeError when you try to use an instance as a dictionary key.

This pattern is essential for building caching layers, deduplication logic, or any data structure that requires hashable types. Immutability also prevents entire classes of bugs where state is modified unexpectedly.
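
The behavior described above can be checked with a minimal sketch. It reuses the CacheKey shape from the example, while MutableKey and the sample values are hypothetical names added only for contrast.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class CacheKey:
    user_id: int
    resource_type: str
    timestamp: int


@dataclass
class MutableKey:  # hypothetical counterpart without frozen=True
    user_id: int


a = CacheKey(42, "profile", 1698345600)
b = CacheKey(42, "profile", 1698345600)

# Equal field values give equal hashes, so a set keeps a single copy.
unique_keys = {a, b}
same_hash = hash(a) == hash(b)

# Reassigning a field on a frozen instance is rejected.
try:
    a.user_id = 7
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True

# A regular data class defines __eq__ but no __hash__, so it cannot be a key.
try:
    lookup = {MutableKey(1): "value"}
    unhashable = False
except TypeError:
    unhashable = True

print(len(unique_keys), same_hash, mutation_blocked, unhashable)
# 1 True True True
```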

# 2. Slots for memory performance

When you instantiate thousands of objects, the memory overhead compounds quickly. Here is an example:

from dataclasses import dataclass

@dataclass(slots=True)
class Measurement:
    sensor_id: int
    temperature: float
    humidity: float

The slots=True parameter (available from Python 3.10) eliminates the per-instance __dict__ that Python normally creates. Instead of storing attributes in a dictionary, slots use a more compact fixed-size array.

For a simple data class like this, you avoid the cost of a dictionary on every instance and get faster attribute access. The trade-off is that you can’t add new attributes dynamically.

# 3. Custom equality with field parameters

You often don’t need every field to participate in equality checking. This is especially true when dealing with metadata or timestamps, as in the following example:

from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class User:
    user_id: int
    email: str
    last_login: datetime = field(compare=False)
    login_count: int = field(compare=False, default=0)

user1 = User(1, "alice@example.com", datetime.now(), 5)
user2 = User(1, "alice@example.com", datetime.now(), 10)
print(user1 == user2)

Output:

True

The compare=False parameter on a field excludes it from the auto-generated __eq__() method.

Here, two users are considered equal if they share the same ID and email, regardless of when they last logged in or how often. This prevents spurious inequality when comparing objects that represent the same logical entity but carry different tracking metadata.
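
The same setting also carries over to hashing. The sketch below uses a hypothetical Product class to show that, on a frozen data class, a field marked compare=False is left out of the generated __hash__() as well, so logically equal objects collapse to one entry in a set.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Product:  # hypothetical class for illustration
    sku: str
    name: str
    # Tracking metadata: ignored by both __eq__ and __hash__.
    view_count: int = field(default=0, compare=False)


p1 = Product("A-100", "Keyboard", view_count=3)
p2 = Product("A-100", "Keyboard", view_count=250)
p3 = Product("B-200", "Mouse")

equal_despite_views = p1 == p2
same_hash = hash(p1) == hash(p2)
distinct_products = {p1, p2, p3}

print(equal_despite_views, same_hash, len(distinct_products))
# True True 2
```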

# 4. Factory functions with default_factory

Using mutable defaults in function signatures is a well-known Python gotcha. Data classes provide a neat solution:

from dataclasses import dataclass, field

@dataclass
class ShoppingCart:
    user_id: int
    items: list[str] = field(default_factory=list)
    metadata: dict = field(default_factory=dict)

cart1 = ShoppingCart(user_id=1)
cart2 = ShoppingCart(user_id=2)
cart1.items.append("laptop")
print(cart2.items)

The default_factory parameter takes a callable that generates a new default value for each instance. A plain class attribute such as items = [] would be shared by all instances, which is the classic mutable default gotcha. Data classes guard against it: writing items: list = [] raises a ValueError when the class is defined.

This pattern works for lists, dicts, sets, or any mutable type. You can also pass custom factory functions for more complex initialization logic.
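
As a sketch of the custom factory case, the example below assumes a hypothetical Profile class and a hypothetical default_settings() helper. It also shows the data class decorator rejecting a bare list default.

```python
from dataclasses import dataclass, field


def default_settings() -> dict:
    """Hypothetical factory: build a fresh settings dict for each instance."""
    return {"theme": "light", "notifications": True}


@dataclass
class Profile:
    username: str
    settings: dict = field(default_factory=default_settings)
    tags: list[str] = field(default_factory=list)


alice = Profile("alice")
bob = Profile("bob")
alice.settings["theme"] = "dark"
alice.tags.append("admin")

# bob keeps his own untouched defaults.
print(bob.settings["theme"], bob.tags)
# light []

# A bare mutable default is rejected when the class is defined.
try:
    @dataclass
    class Broken:
        items: list = []
    rejected = False
except ValueError:
    rejected = True

print(rejected)
# True
```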

# 5. Post-initialization processing

Sometimes you need to derive fields or validate data after the auto-generated __init__ runs. Here is how you can do that with the __post_init__ hook:

from dataclasses import dataclass, field

@dataclass
class Rectangle:
    width: float
    height: float
    area: float = field(init=False)
    
    def __post_init__(self):
        self.area = self.width * self.height
        if self.width <= 0 or self.height <= 0:
            raise ValueError("Dimensions must be positive")

rect = Rectangle(5.0, 3.0)
print(rect.area)

The __post_init__ method runs immediately after the generated __init__ completes. The init=False parameter on area prevents it from becoming an __init__ parameter.

This pattern is well suited to computed fields, validation logic, or normalizing input data. You can also use it to transform fields or establish invariants that depend on multiple fields.
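
Normalization can be sketched the same way. The Contact class and its deliberately simple email check below are hypothetical, not a recommended validation scheme.

```python
from dataclasses import dataclass


@dataclass
class Contact:  # hypothetical class for illustration
    name: str
    email: str

    def __post_init__(self):
        # Normalize first so that validation sees the cleaned-up values.
        self.name = self.name.strip()
        self.email = self.email.strip().lower()
        if "@" not in self.email:
            raise ValueError(f"Invalid email address: {self.email!r}")


contact = Contact("  Alice ", " Alice@Example.COM ")
print(contact)
# Contact(name='Alice', email='alice@example.com')

try:
    Contact("Bob", "not-an-email")
    invalid_rejected = False
except ValueError:
    invalid_rejected = True
```

On a frozen data class, plain assignment inside __post_init__ raises FrozenInstanceError. The usual workaround is object.__setattr__(self, "email", value).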

# 6. Ordering with the order parameter

Sometimes, you need your data class instances to be sortable. Here is an example:

from dataclasses import dataclass

@dataclass(order=True)
class Task:
    priority: int
    name: str
    
tasks = [
    Task(priority=3, name="Low priority task"),
    Task(priority=1, name="Critical bug fix"),
    Task(priority=2, name="Feature request"),
]

sorted_tasks = sorted(tasks)
for task in sorted_tasks:
    print(f"{task.priority}: {task.name}")

Output:

1: Critical bug fix
2: Feature request
3: Low priority task

The order=True parameter generates the comparison methods (__lt__, __le__, __gt__, __ge__) based on field order. Fields are compared from left to right, so priority takes precedence over name in this example.

This feature gives you natural sorting without writing custom comparison logic or key functions.
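
Generated ordering also works with the standard library heapq module. The following sketch assumes a hypothetical Job class. Its payload field is excluded with compare=False so that ordering depends on priority alone, which matters because dictionaries do not support the < operator.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Job:  # hypothetical class for illustration
    priority: int
    # Excluded from comparisons: ordering uses priority only.
    payload: dict = field(compare=False, default_factory=dict)


queue = []
heapq.heappush(queue, Job(3, {"task": "rebuild index"}))
heapq.heappush(queue, Job(1, {"task": "rotate keys"}))
heapq.heappush(queue, Job(2, {"task": "send report"}))

# Jobs come off the heap in ascending priority order.
processed = [heapq.heappop(queue).payload["task"] for _ in range(3)]
print(processed)
# ['rotate keys', 'send report', 'rebuild index']
```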

# 7. Field ordering and InitVar

When initialization logic requires values that should not become instance attributes, you can use InitVar, as shown below:

from dataclasses import dataclass, field, InitVar

@dataclass
class DatabaseConnection:
    host: str
    port: int
    ssl: InitVar[bool]  # no default: a default would remain visible as a class attribute
    connection_string: str = field(init=False)
    
    def __post_init__(self, ssl: bool):
        protocol = "https" if ssl else "http"
        self.connection_string = f"{protocol}://{self.host}:{self.port}"

conn = DatabaseConnection("localhost", 5432, ssl=True)
print(conn.connection_string)  
print(hasattr(conn, 'ssl'))

Output:

https://localhost:5432
False

The InitVar type hint marks a parameter that is passed to __init__ and __post_init__ but does not become a field. This keeps your instances clean while still allowing complex initialization logic. The ssl flag affects how we build the connection string but does not need to persist afterwards.
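
One subtlety is worth a short sketch. The two Greeting classes below are hypothetical. The first shows that an InitVar is absent from fields() and from the instance. The second shows that giving an InitVar a default leaves that default behind as a class attribute, so hasattr() reports it even though nothing is stored on the instance.

```python
from dataclasses import dataclass, field, fields, InitVar


@dataclass
class Greeting:  # hypothetical class for illustration
    name: str
    shout: InitVar[bool]  # init-only, no default
    text: str = field(init=False)

    def __post_init__(self, shout: bool):
        message = f"Hello, {self.name}"
        self.text = message.upper() if shout else message


@dataclass
class GreetingWithDefault:  # same idea, but the InitVar has a default
    name: str
    shout: InitVar[bool] = False  # the default stays on the class
    text: str = field(init=False)

    def __post_init__(self, shout: bool):
        message = f"Hello, {self.name}"
        self.text = message.upper() if shout else message


g = Greeting("Ada", shout=True)
field_names = [f.name for f in fields(g)]
print(g.text, field_names, hasattr(g, "shout"))
# HELLO, ADA ['name', 'text'] False

d = GreetingWithDefault("Ada")
print(d.text, hasattr(d, "shout"), "shout" in vars(d))
# Hello, Ada True False
```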

# When not to use a data class

Data classes are not always the right tool. Do not use data classes when:

- You need complex inheritance hierarchies with custom __init__ logic across multiple levels
- You are building classes with significant behavior and methods (use regular classes for domain objects)
- You need the validation, serialization, or parsing features that libraries like Pydantic or attrs provide
- You are working with classes that have complex state management or lifecycle requirements

Data classes work best as lightweight data containers rather than full-featured domain objects.

# Conclusion

Writing efficient data classes is about understanding how their options interact, not memorizing them all. Knowing when and why to use each feature matters more than remembering every parameter.

As discussed in this article, features like immutability, slots, field customization, and post-init hooks let you write Python objects that are lean, predictable, and safe. These patterns help prevent bugs and reduce memory overhead without adding complexity.

With these approaches, data classes let you write clean, efficient, and maintainable code. Happy coding!

Bala Priya C is a developer and technical writer from India. She loves working at the intersection of mathematics, programming, data science, and content creation. Her areas of interest and expertise include DevOps, data science, and natural language processing. She enjoys reading, writing, coding, and coffee! Currently, she is working on learning and sharing her knowledge with the developer community by authoring tutorials, how-to guides, opinion pieces, and more. Bala also creates engaging resource overviews and coding tutorials.
