v0.8.2

API Reference

This page documents the public API for the Luminara framework. All asynchronous methods must be awaited; async generators (such as Source.read) are consumed with async for rather than awaited directly.


Module: luminara.pipeline

class Pipeline

The main orchestrator for data flows.

class Pipeline(source: Source, stages: List[Stage] = [], sink: Sink | List[Sink] | None = None, **kwargs)

Parameters:

source: Source
The data source object. Must inherit from luminara.stages.Source.
stages: List[Stage]
A list of transformation stages to apply in sequence.
sink: Sink | List[Sink]
The destination for processed data. Can be a single Sink or a list (for fan-out).
**kwargs
Additional configuration (e.g., buffer_size, concurrency).

Methods:

async run() -> None
Starts the pipeline and runs until the source is exhausted and all internal buffers are drained.
async stop() -> None
Gracefully shuts down the pipeline, allowing in-flight messages to complete.
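As a rough illustration of the run() flow, the sketch below wires a source, one stage, and a sink together. Numbers, Doubler, Collector, and the inlined orchestration loop are hypothetical stand-ins, not Luminara classes; in real code you would pass the equivalent objects to Pipeline and await pipeline.run().

```python
import asyncio

# Hypothetical stand-ins sketching what Pipeline.run() does conceptually:
# records flow source -> stages -> sink, and a stage returning None drops
# the record before it reaches the sink.
class Numbers:
    async def read(self):
        for i in range(4):
            yield {"n": i}

class Doubler:
    async def process(self, record):
        # Drop zero, double everything else.
        return None if record["n"] == 0 else {"n": record["n"] * 2}

class Collector:
    def __init__(self):
        self.out = []

    async def write(self, record):
        self.out.append(record)

async def run(source, stages, sink):
    async for record in source.read():
        for stage in stages:
            record = await stage.process(record)
            if record is None:
                break  # record was dropped by a stage
        else:
            await sink.write(record)

sink = Collector()
asyncio.run(run(Numbers(), [Doubler()], sink))
```

With the stand-ins above, sink.out ends up holding the doubled records for n = 1, 2, 3; the n = 0 record is filtered out by the stage.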

Module: luminara.stages

class Source(abc.ABC)

Base class for all data inputs.

Methods:

async read() -> AsyncGenerator[Dict, None]
Abstract. Must yield dictionary records. This is the entry point for data into the pipeline.
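A minimal source is an async generator over in-memory items. ListSource below is a hypothetical example (in real code it would subclass luminara.stages.Source); the collect() helper just shows how a caller consumes it with async for.

```python
import asyncio
from typing import AsyncGenerator, Dict

# Hypothetical source; a real one would subclass luminara.stages.Source.
class ListSource:
    def __init__(self, items):
        self.items = items

    async def read(self) -> AsyncGenerator[Dict, None]:
        # Yield one dictionary record per input item.
        for item in self.items:
            yield {"value": item}

async def collect():
    return [record async for record in ListSource([1, 2]).read()]

records = asyncio.run(collect())
```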

class Transform(abc.ABC)

Base class for data manipulation.

Methods:

async process(record: Dict) -> Dict | None
Abstract. Receives a record and returns a modified record. Return None to drop the record (filtering).
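The return-None contract makes a Transform double as a filter. DropNegatives below is a hypothetical example (a real one would subclass luminara.stages.Transform): it discards records with a negative "value" and doubles the rest.

```python
import asyncio
from typing import Dict, Optional

# Hypothetical transform; a real one would subclass luminara.stages.Transform.
class DropNegatives:
    async def process(self, record: Dict) -> Optional[Dict]:
        # Returning None drops the record from the pipeline (filtering).
        if record.get("value", 0) < 0:
            return None
        record["value"] *= 2
        return record

kept = asyncio.run(DropNegatives().process({"value": 3}))
dropped = asyncio.run(DropNegatives().process({"value": -1}))
```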

class Sink(abc.ABC)

Base class for data outputs.

Methods:

async write(record: Dict) -> None
Abstract. Receives a fully processed record and handles side effects (e.g., writing to a database or an external API).
async write_batch(records: List[Dict]) -> None
Optional. Override for batch processing. Default implementation calls write() for each record.
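The sketch below collects records in memory and spells out the documented default write_batch() behavior. MemorySink is hypothetical (a real sink would subclass luminara.stages.Sink and typically override write_batch only when the backend supports bulk writes).

```python
import asyncio
from typing import Dict, List

# Hypothetical sink; a real one would subclass luminara.stages.Sink.
class MemorySink:
    def __init__(self):
        self.records: List[Dict] = []

    async def write(self, record: Dict) -> None:
        self.records.append(record)

    async def write_batch(self, records: List[Dict]) -> None:
        # Mirrors the documented default: call write() once per record.
        for record in records:
            await self.write(record)

sink = MemorySink()
asyncio.run(sink.write_batch([{"a": 1}, {"a": 2}]))
```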

Module: luminara.schema

class Schema

Data validation utility built on Pydantic.

class Schema(model: Type[pydantic.BaseModel], on_error: str = "raise")

Parameters:

model
A Pydantic model class defining the expected data structure.
on_error
Action to take on validation failure: "raise" (default), "drop", or "log".
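To illustrate the on_error modes without depending on Pydantic, the stand-in below checks a single required key. SchemaSketch is hypothetical and only mimics the dispatch logic: "raise" re-raises the validation error, "drop" returns None, and "log" is assumed to report the failure as well.

```python
from typing import Dict, Optional

# Hypothetical stand-in for the on_error dispatch; the real Schema
# validates records against a Pydantic model class instead.
class SchemaSketch:
    def __init__(self, required_key: str, on_error: str = "raise"):
        self.required_key = required_key
        self.on_error = on_error

    def validate(self, record: Dict) -> Optional[Dict]:
        if self.required_key in record:
            return record
        if self.on_error == "raise":
            raise ValueError(f"missing field: {self.required_key}")
        # "drop" silently discards the record; "log" (assumption) would
        # also report the failure before discarding or passing it on.
        return None

ok = SchemaSketch("id", on_error="drop").validate({"id": 1})
dropped = SchemaSketch("id", on_error="drop").validate({})
```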