
All about dataclass in Python

Dataclasses make it possible to create data-holding classes (mutable by default, immutable on request) in a more condensed way than traditional classes.

The @dataclass decorator appeared with Python 3.7 to address a clear need: structuring data simply, without writing repetitive code.

It's a native and elegant solution for creating readable, concise, typed and powerful classes, while significantly reducing the code to write (we'll get to that right after). It's a tool praised by both beginner and expert developers.

When the decorator is applied to a class, Python automatically generates several special methods such as:

  • __init__() for attribute initialization;
  • __repr__() for a readable representation;
  • __eq__() for object comparison.

and even other methods if desired (__lt__, __le__, etc).

This allows you to focus on the data, while having a fully functional object.

Compared to namedtuple, dataclasses are more flexible: they accept typing, can be mutable or immutable, and can contain custom methods.

 

Basic syntax and example

To use a dataclass, simply add @dataclass (which is a decorator) above a class to activate its features.

Here's a small dataclass example to illustrate all its power:

PYTHON
from dataclasses import dataclass

@dataclass
class Product:
    name: str
    price: float

item = Product("Keyboard", 49.99)

print(item.name)   # Keyboard
print(item.price)  # 49.99
print(item)        # Product(name='Keyboard', price=49.99)

In just 4 lines, we have a typed class with an automatic constructor. Magical, isn't it? 😋

No need to write __init__, __repr__ or __eq__ by hand, or to repeat the field names in each method: @dataclass does it for us.
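
To see the generated methods at work, here is a minimal sketch that reuses the Product class from above. The extra instances (same and other) are only there for illustration:

```python
from dataclasses import dataclass

@dataclass
class Product:
    name: str
    price: float

item = Product("Keyboard", 49.99)
same = Product("Keyboard", 49.99)

# The generated __eq__() compares field by field,
# so two distinct objects with equal fields are equal
print(item == same)  # True
print(item is same)  # False

# The generated __init__() also accepts keyword arguments named after the fields
other = Product(name="Mouse", price=19.99)
print(repr(other))   # Product(name='Mouse', price=19.99)
```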

 

Key parameters of the @dataclass decorator

The @dataclass decorator can be configured using optional parameters, according to the desired behavior:

Parameter | Description
init=True | Automatically generates the __init__() method
repr=True | Generates __repr__() to create a readable representation
eq=True | Generates __eq__() to make equality comparisons
order=False | Generates __lt__(), __le__(), __gt__() and __ge__() for ordering comparisons (if True)
frozen=False | Makes the object immutable (if True)
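
As an illustration of how these parameters change the generated code, here is a small sketch where eq is turned off. The Session class is a hypothetical example, not something from the standard library:

```python
from dataclasses import dataclass

@dataclass(eq=False)  # do not generate __eq__()
class Session:        # hypothetical class, for illustration only
    token: str

a = Session("abc")
b = Session("abc")

# Without a generated __eq__(), Python falls back to identity comparison
print(a == b)  # False
print(a == a)  # True

# __repr__() is still generated, because repr=True is the default
print(a)       # Session(token='abc')
```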

 

Default values and optional fields

With dataclass, you can easily specify default values for certain fields, like in a classic function:

PYTHON
from dataclasses import dataclass

@dataclass
class Article:
    name: str
    price: float = 0.0

In this example, if we create an Article("USB Cable"), the price will automatically be 0.0.
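
Here is a short sketch of that behavior. It also shows a rule worth knowing: fields without a default must come before fields with one, exactly like function parameters. The Broken class is a deliberately invalid example:

```python
from dataclasses import dataclass

@dataclass
class Article:
    name: str
    price: float = 0.0

cable = Article("USB Cable")
print(cable)      # Article(name='USB Cable', price=0.0)

hub = Article("USB Hub", price=24.5)
print(hub.price)  # 24.5

# A field without a default cannot follow a field with a default:
# the error is raised as soon as the class is defined
definition_error = None
try:
    @dataclass
    class Broken:
        price: float = 0.0
        name: str
except TypeError as exc:
    definition_error = exc

print(type(definition_error).__name__)  # TypeError
```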

The dataclasses module also provides the field() function to handle advanced cases:

PYTHON
from dataclasses import dataclass, field

@dataclass
class Order:
    items: list = field(default_factory=list)

Here, default_factory is very useful to avoid pitfalls related to shared mutable values like lists or dictionaries.
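
The pitfall can be made concrete with a small sketch. BrokenOrder is a deliberately invalid class, written only to show that @dataclass refuses a bare list as a default:

```python
from dataclasses import dataclass, field

@dataclass
class Order:
    items: list = field(default_factory=list)

first = Order()
second = Order()
first.items.append("Keyboard")

# Each instance received its own list
print(first.items)   # ['Keyboard']
print(second.items)  # []

# A bare mutable default is rejected when the class is defined
definition_error = None
try:
    @dataclass
    class BrokenOrder:
        items: list = []
except ValueError as exc:
    definition_error = exc

print(type(definition_error).__name__)  # ValueError
```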

 

Making a dataclass immutable (frozen=True)

When we add frozen=True, the object becomes immutable: its attributes can no longer be modified after creation.

PYTHON
from dataclasses import dataclass

@dataclass(frozen=True)
class Client:
    name: str
    age: int

c = Client("John", 30)
# c.age = 31  ❌ Causes an error: cannot assign to field

Moreover, frozen dataclasses are automatically hashable if their fields are. You can thus use them as dictionary keys or as members of a set.
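
Here is a minimal sketch of both properties. The visits dictionary is made up for the example, and replace() is a helper from the dataclasses module that the text above does not cover: it builds a modified copy instead of mutating the original.

```python
from dataclasses import dataclass, FrozenInstanceError, replace

@dataclass(frozen=True)
class Client:
    name: str
    age: int

c = Client("John", 30)

# Assignment raises FrozenInstanceError (a subclass of AttributeError)
blocked = False
try:
    c.age = 31
except FrozenInstanceError:
    blocked = True
print(blocked)  # True

# Frozen instances with hashable fields can be used as dictionary keys
visits = {c: 3}
print(visits[Client("John", 30)])  # 3

# To "modify" a frozen object, create a copy with some fields changed
older = replace(c, age=31)
print(older)  # Client(name='John', age=31)
print(c.age)  # 30
```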

 

Comparison and sorting with order=True

By default, a dataclass cannot be sorted or compared with <, >, <=, >=.

To activate these operations, we can specify the order=True parameter.

PYTHON
from dataclasses import dataclass

@dataclass(order=True)
class Product:
    price: float
    name: str

The sort order is based on the field order in the declaration: here, instances will be sorted according to price, then according to name if prices are equal.
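
To check this ordering, here is a short sketch with a made-up catalog list:

```python
from dataclasses import dataclass

@dataclass(order=True)
class Product:
    price: float
    name: str

# Hypothetical catalog, for illustration only
catalog = [
    Product(49.99, "Keyboard"),
    Product(19.99, "Mouse"),
    Product(19.99, "Cable"),
]

# Instances are compared like tuples of their fields: (price, name)
for product in sorted(catalog):
    print(product)
# Product(price=19.99, name='Cable')
# Product(price=19.99, name='Mouse')
# Product(price=49.99, name='Keyboard')

print(Product(19.99, "Mouse") < Product(49.99, "Keyboard"))  # True
```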

 

The special __post_init__() method

The __post_init__() method is called after the execution of __init__() automatically generated by @dataclass. It allows you to perform custom processing or validations on attributes.

PYTHON
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

    def __post_init__(self):
        if self.age < 0:
            raise ValueError("Age cannot be negative.")

In this example, even though __init__() is automatic, we add specific business logic without having to rewrite it.

We often use __post_init__() to validate, round or transform input data. 😉
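
As a sketch of the "round or transform" case, here is a hypothetical Invoice class (the class, its fields and the 20% rate are assumptions made for the example). The total field is declared with field(init=False), so it is not a constructor argument and is computed in __post_init__():

```python
from dataclasses import dataclass, field

@dataclass
class Invoice:  # hypothetical class, for illustration only
    amount: float
    vat_rate: float = 0.2
    total: float = field(init=False)  # computed, not passed to __init__()

    def __post_init__(self):
        if self.amount < 0:
            raise ValueError("Amount cannot be negative.")
        # Normalize the input, then derive the computed field
        self.amount = round(self.amount, 2)
        self.total = round(self.amount * (1 + self.vat_rate), 2)

inv = Invoice(99.999)
print(inv.amount)  # 100.0
print(inv.total)   # 120.0
```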

 

Strong typing and type hints

One of the great advantages of dataclasses is their native integration with type annotations.

Each field is typed with a standard annotation, which offers several benefits:

  • Clear documentation;
  • Better IDE support (auto-completion, verification);
  • Integration with static verification tools (MyPy, Pyright).

Here's how to do it:

PYTHON
from dataclasses import dataclass

@dataclass
class Account:
    identifier: int
    balance: float
    active: bool

Thanks to these types, tools can detect if a wrong value is passed at instantiation:

PYTHON
c = Account("abc", 50.0, True)  # 🚫 Error detectable with MyPy

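Annotations are what tell @dataclass which attributes are fields: an attribute without an annotation is ignored and will not appear in the generated __init__(), __repr__() or __eq__(). The annotated types themselves are not checked at runtime; if you don't want to commit to a precise type, you can annotate with typing.Any.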
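
A quick sketch makes the role of annotations visible. The currency attribute is a hypothetical addition to the Account class, left without an annotation on purpose:

```python
from dataclasses import dataclass, fields

@dataclass
class Account:
    identifier: int
    balance: float
    active: bool
    currency = "EUR"  # no annotation: a plain class attribute, not a field

# Only annotated attributes are registered as fields
print([f.name for f in fields(Account)])  # ['identifier', 'balance', 'active']

# Types are not enforced at runtime: this call succeeds,
# only a static checker such as MyPy would report it
c = Account("abc", 50.0, True)
print(c.identifier)  # abc
```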

 

dataclass vs classic class: what are the differences?

Let's now see a concrete comparison between a classic class and a dataclass.

The objective is to highlight the code reduction and readability gain.

Classic class:

PYTHON
class Product:
    def __init__(self, name, price):
        self.name = name
        self.price = price

    def __repr__(self):
        return f"Product(name={self.name!r}, price={self.price!r})"

    def __eq__(self, other):
        return isinstance(other, Product) and self.name == other.name and self.price == other.price

 

Dataclass version:

PYTHON
from dataclasses import dataclass

@dataclass
class Product:
    name: str
    price: float

Result: same functionality, roughly half the code, and a class that is clearer and cleaner.

 

dataclass vs namedtuple

Before dataclass, we often used namedtuple to create lightweight objects with field naming. Let's compare them:

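Criteria | dataclass | namedtuple
Requires import | ✅ Yes (dataclasses) | ✅ Yes (collections)
Mutable? | ✅ Yes (by default) | ❌ No (immutable)
Typed? | ✅ Yes | ❌ No (typing.NamedTuple adds annotations)
Custom methods? | ✅ Yes | ❌ Not really (requires subclassing)
Inheritance | ✅ Yes | ❌ Complex
Sortable? | ✅ Yes (with order=True) | ✅ Yes (by default)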

In summary: dataclass is more modern and more flexible, while namedtuple remains useful if you want simple immutability without overhead.
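
The difference is easy to see side by side. In this sketch, PointT and PointD are hypothetical names chosen for the comparison:

```python
from collections import namedtuple
from dataclasses import dataclass

# Hypothetical point types, one per approach
PointT = namedtuple("PointT", ["x", "y"])

@dataclass
class PointD:
    x: int
    y: int

t = PointT(1, 2)
d = PointD(1, 2)

# A namedtuple is a real tuple: it unpacks and compares equal to plain tuples
x, y = t
print(t == (1, 2))  # True

# A dataclass is mutable by default...
d.x = 10
print(d)            # PointD(x=10, y=2)

# ...whereas namedtuple fields are read-only
read_only = False
try:
    t.x = 10
except AttributeError:
    read_only = True
print(read_only)    # True

# A dataclass instance is never equal to a plain tuple
print(d == (10, 2))  # False
```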

 

Advanced options with field()

As we've seen, the dataclasses module provides the field() utility to finely customize the behavior of each attribute.

Let's see its parameters together:

Option | Description
default= | Default value
default_factory= | Dynamically generates a default value (great for lists and dictionaries)
init=False | Don't include in __init__()
repr=False | Exclude from __repr__()
compare=False | Exclude from __eq__() and __lt__()

Let's take this example:

PYTHON
from dataclasses import dataclass, field

@dataclass
class Counter:
    name: str
    history: list = field(default_factory=list, repr=False, compare=False)

In this example:

  • history is invisible in __repr__();
  • It is not used for comparison between two objects;
  • It receives a new list for each instance, without reference sharing (avoids pitfalls).

As we can see, field() is a very powerful tool to refine the rules of our dataclass objects, especially in API-oriented contexts, serialization or business logic.
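
These three points can be checked with a few lines. The instance names and values below are only illustrative:

```python
from dataclasses import dataclass, field

@dataclass
class Counter:
    name: str
    history: list = field(default_factory=list, repr=False, compare=False)

a = Counter("clicks")
b = Counter("clicks")
a.history.append(1)

print(a)          # Counter(name='clicks')  -> history is hidden
print(a == b)     # True  -> history is ignored by the comparison
print(b.history)  # []    -> b has its own list
```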

 

Using asdict() and astuple()

A dataclass instance can be easily converted to a dictionary or a tuple thanks to the utility functions asdict() and astuple() from the dataclasses module.

PYTHON
from dataclasses import dataclass, asdict, astuple

@dataclass
class Product:
    name: str
    price: float

p = Product("Pen", 2.50)

print(asdict(p))   # {'name': 'Pen', 'price': 2.5}
print(astuple(p))  # ('Pen', 2.5)

These conversions are useful:

  • For serialization to JSON;
  • For debug display;
  • Or to send data via an API.

asdict() performs recursive conversion: if a field contains another dataclass, it will also be converted.
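
Here is a sketch of that recursive conversion, followed by a JSON round trip. It uses the Address and Client classes from the nested dataclasses example of this article; rebuilding the object by hand at the end is one simple approach among others:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class Address:
    city: str
    postal_code: str

@dataclass
class Client:
    name: str
    address: Address

c = Client("Chloe", Address("Paris", "75001"))

# The nested Address is converted to a dictionary too
data = asdict(c)
print(data)
# {'name': 'Chloe', 'address': {'city': 'Paris', 'postal_code': '75001'}}

# The resulting dictionary is directly serializable
print(json.dumps(data))
# {"name": "Chloe", "address": {"city": "Paris", "postal_code": "75001"}}

# asdict() has no built-in inverse: nested objects are rebuilt manually
restored = Client(name=data["name"], address=Address(**data["address"]))
print(restored == c)  # True
```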

 

Inheritance and nested dataclasses

Dataclass inheritance

Like "classic" Python classes, dataclasses support inheritance. This makes it possible to factor out common attributes or to enrich specialized classes.

PYTHON
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

@dataclass
class Employee(Person):
    position: str

Here, Employee inherits fields from Person and adds a position field.
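
A short sketch shows the resulting field order: inherited fields come first, then the fields of the subclass. The sample values are made up:

```python
from dataclasses import dataclass, fields

@dataclass
class Person:
    name: str
    age: int

@dataclass
class Employee(Person):
    position: str

e = Employee("Ana", 41, "Engineer")
print(e)  # Employee(name='Ana', age=41, position='Engineer')

# Parent fields first, then the fields added by the subclass
print([f.name for f in fields(Employee)])  # ['name', 'age', 'position']

print(isinstance(e, Person))  # True
```

One consequence of this ordering: if a field of the parent class has a default value, every field added by the subclass needs a default value as well.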

 

Nested dataclasses

Nested dataclasses are ideal for representing complex or hierarchical objects.

PYTHON
from dataclasses import dataclass

@dataclass
class Address:
    city: str
    postal_code: str

@dataclass
class Client:
    name: str
    address: Address

c = Client("Chloe", Address("Paris", "75001"))
print(c.address.city)  # Paris

 

Performance and limitations of dataclasses

Performance

Dataclasses are as fast as classic classes for most uses, since the generated methods are ordinary methods. The decorator only adds a small one-time cost when the class is defined, which is really negligible.

Also, frozen=True objects are a bit slower to create, because the generated __init__() cannot use plain assignment and must go through object.__setattr__().

 

Limitations

Dataclasses do not replace a complete ORM or business model. They are also poorly suited to very dynamic objects and to multiple inheritance.

Finally, they are only available from Python 3.7 onwards.

 

Practical use cases

Dataclasses are used in many concrete contexts. Let's take a quick tour with some examples.

With simple data

PYTHON
from dataclasses import dataclass

@dataclass
class User:
    name: str
    email: str
    active: bool = True

Ideal for handling user objects in an API.

 

With configurations

PYTHON
from dataclasses import dataclass

@dataclass
class Config:
    debug: bool
    path: str
    version: float
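
One possible way to fill such a configuration, sketched here with a made-up raw dictionary standing in for parsed settings, is to unpack it into the generated constructor:

```python
from dataclasses import dataclass

@dataclass
class Config:
    debug: bool
    path: str
    version: float

# Hypothetical settings, e.g. as they could come out of a parsed file
raw = {"debug": True, "path": "/tmp/app", "version": 1.2}

# Keys must match the field names exactly
config = Config(**raw)
print(config)  # Config(debug=True, path='/tmp/app', version=1.2)

# An unknown key is rejected by the generated __init__()
unknown_key_rejected = False
try:
    Config(**raw, color="blue")
except TypeError:
    unknown_key_rejected = True
print(unknown_key_rejected)  # True
```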

 

With business data

PYTHON
from dataclasses import dataclass

@dataclass
class Order:
    product: str
    quantity: int
    unit_price: float

    def total(self) -> float:
        return self.quantity * self.unit_price
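
A quick usage sketch of this class, with made-up values:

```python
from dataclasses import dataclass

@dataclass
class Order:
    product: str
    quantity: int
    unit_price: float

    def total(self) -> float:
        return self.quantity * self.unit_price

order = Order("Pen", 4, 2.50)
print(order)          # Order(product='Pen', quantity=4, unit_price=2.5)
print(order.total())  # 10.0
```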

 

Frequently asked questions about dataclasses

Let's quickly go over the most frequently asked questions about dataclasses in Python!

Can I add methods in a dataclass?

Yes! Dataclasses are classic Python classes, so you can add methods like in any other class.

 

Can I modify the attributes of a dataclass?

Yes, unless you have defined frozen=True. In this case, instances become immutable.

 

Is it compatible with versions < 3.7?

No. The dataclasses module is part of the standard library from Python 3.7. For Python 3.6 only, a backport exists: pip install dataclasses.

 

