Datafiles

Datafiles is a bidirectional serialization library for Python dataclasses that synchronizes objects to the filesystem using type annotations. It supports a variety of file formats and preserves formatting and comments on round trips where possible. Object changes are automatically saved to disk, and each file includes only the minimum data needed to restore its object.

Popular use cases include:

  • Coercing user-editable files into the proper Python types
  • Storing program configuration and data in version control
  • Loading data fixtures for demonstration or testing purposes
  • Prototyping data models agnostic of persistence backends

Installation

Install it directly into an activated virtual environment:

$ pip install datafiles

or add it to your Poetry project:

$ poetry add datafiles
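
To confirm that the package landed in the active environment, something like the following can be run. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of datafiles.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if the install succeeded, otherwise None.
print(installed_version("datafiles"))
```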

Quick Start

Decorate a type-annotated class with a directory pattern to synchronize instances:

from datafiles import datafile

@datafile("samples/{self.key}.yml")
class Sample:

    key: int
    name: str
    value: float = 0.0
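
The pattern reads like a standard Python format string with `self` bound to the instance. As a rough illustration of how such a pattern could expand into a path, the sketch below uses a plain dataclass and `str.format`. It assumes the pattern behaves this way and does not use the library itself; `PATTERN` is a hypothetical stand-in for the decorator argument.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    key: int
    name: str
    value: float = 0.0


# Hypothetical stand-in for the pattern passed to the decorator.
PATTERN = "samples/{self.key}.yml"

sample = Sample(42, "Widget")

# str.format supports attribute access on named arguments,
# so {self.key} is replaced by the instance's key.
path = PATTERN.format(self=sample)
print(path)  # samples/42.yml
```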

By default, all member variables will be included in the serialized file except for those:

  • Included in the directory pattern
  • Set to default values

So, the following instantiation:

>>> sample = Sample(42, "Widget")

produces samples/42.yml containing:

name: Widget

and the following instantiation restores the object:

>>> from datafiles import Missing
>>> sample = Sample(42, Missing)
>>> sample.name
'Widget'
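
The two exclusion rules described above can be approximated with the standard library alone. The following is a sketch of the idea, not the library's implementation: `minimal_data` is a hypothetical helper, it only recognizes `{self.<name>}` placeholders, and it ignores fields declared with `default_factory`.

```python
import re
from dataclasses import MISSING, dataclass, fields


@dataclass
class Sample:
    key: int
    name: str
    value: float = 0.0


def minimal_data(instance, pattern):
    """Return only the fields a file would need under the two rules above."""
    # Attribute names referenced as {self.<name>} in the pattern.
    in_pattern = set(re.findall(r"\{self\.(\w+)\}", pattern))
    data = {}
    for field in fields(instance):
        value = getattr(instance, field.name)
        if field.name in in_pattern:
            continue  # already encoded in the file path
        if field.default is not MISSING and value == field.default:
            continue  # still at its default value
        data[field.name] = value
    return data


print(minimal_data(Sample(42, "Widget"), "samples/{self.key}.yml"))
# {'name': 'Widget'}
print(minimal_data(Sample(42, "Widget", 1.5), "samples/{self.key}.yml"))
# {'name': 'Widget', 'value': 1.5}
```

Only `name` survives in the first case because `key` appears in the pattern and `value` still equals its default; changing `value` brings it back into the output.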

Type Checking

If using mypy, enable the plugin via the mypy.ini configuration file:

[mypy]

plugins = datafiles.plugins:mypy

Resources