API Reference
This page documents the public API for the Luminara framework. All asynchronous methods must be awaited.
Module: luminara.pipeline
class Pipeline
The main orchestrator for data flows.
class Pipeline(source: Source, stages: List[Stage] = [], sink: Sink = None, **kwargs)
Parameters:
- source: Source
- The data source object. Must inherit from luminara.stages.Source.
- stages: List[Stage]
- A list of transformation stages to apply in sequence.
- sink: Sink | List[Sink]
- The destination for processed data. Can be a single Sink or a list (for fan-out).
- **kwargs
- Additional configuration (e.g., buffer_size, concurrency).
Methods:
- async run() -> None
- Starts pipeline execution. The awaited call does not return until the source is exhausted and all buffers are drained.
- async stop() -> None
- Gracefully shuts down the pipeline, allowing in-flight messages to complete.
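The record flow that run() implies can be sketched as a plain loop: read from the source, pass each record through the stages in order, and hand survivors to the sink. This is a minimal, hand-rolled sketch only; the real Pipeline also buffers and applies concurrency via **kwargs, which are omitted here, and run_flow is a hypothetical name, not part of the API.

```python
from typing import Dict, Optional

# Hypothetical sketch of the control flow behind Pipeline.run():
# records move source -> stages (in order) -> sink. Buffering and
# concurrency, which the real Pipeline provides, are omitted.
async def run_flow(source, stages, sink) -> None:
    async for record in source.read():
        out: Optional[Dict] = record
        for stage in stages:
            out = await stage.process(out)
            if out is None:  # a stage filtered the record out
                break
        if out is not None:
            await sink.write(out)
```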
Module: luminara.stages
class Source(abc.ABC)
Base class for all data inputs.
Methods:
- async read() -> AsyncGenerator[Dict, None]
- Abstract. Must yield dictionary records. This is the entry point for data into the pipeline.
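A minimal concrete Source might replay an in-memory list. ListSource is a hypothetical name shown standalone here; in practice it would subclass luminara.stages.Source.

```python
from typing import AsyncGenerator, Dict, List

# Hypothetical Source that replays an in-memory list of records.
# Standalone sketch; in practice it would subclass
# luminara.stages.Source.
class ListSource:
    def __init__(self, records: List[Dict]) -> None:
        self._records = records

    async def read(self) -> AsyncGenerator[Dict, None]:
        for record in self._records:
            yield record  # each yielded dict enters the pipeline
```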
class Transform(abc.ABC)
Base class for data manipulation.
Methods:
- async process(record: Dict) -> Dict | None
- Abstract. Receives a record and returns a modified record. Return None to drop the record (filtering).
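For example, a filtering stage returns None for records it wants to discard. RequireUserId is a hypothetical name, shown standalone as a sketch of the process() contract; in practice it would subclass luminara.stages.Transform.

```python
from typing import Dict, Optional

# Hypothetical Transform that drops records missing a "user_id"
# key. Standalone sketch; in practice it would subclass
# luminara.stages.Transform.
class RequireUserId:
    async def process(self, record: Dict) -> Optional[Dict]:
        if "user_id" not in record:
            return None  # returning None filters the record out
        return record
```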
class Sink(abc.ABC)
Base class for data outputs.
Methods:
- async write(record: Dict) -> None
- Abstract. Receives a fully processed record and handles side effects (e.g., writing to a database or an external API).
- async write_batch(records: List[Dict]) -> None
- Optional. Override for batch processing. The default implementation calls write() for each record.
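A simple concrete Sink might accumulate records in memory; its write_batch() override mirrors the documented default of calling write() once per record. MemorySink is a hypothetical name shown standalone; in practice it would subclass luminara.stages.Sink.

```python
from typing import Dict, List

# Hypothetical Sink that collects records in memory. The
# write_batch() override mirrors the documented default behaviour:
# call write() once per record. Standalone sketch; in practice it
# would subclass luminara.stages.Sink.
class MemorySink:
    def __init__(self) -> None:
        self.records: List[Dict] = []

    async def write(self, record: Dict) -> None:
        self.records.append(record)

    async def write_batch(self, records: List[Dict]) -> None:
        for record in records:
            await self.write(record)
```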
Module: luminara.schema
class Schema
Data validation utility built on Pydantic.
class Schema(model: Type[pydantic.BaseModel], on_error: str = "raise")
Parameters:
- model
- A Pydantic model class defining the expected data structure.
- on_error
- Action to take on validation failure: "raise" (default), "drop", or "log".
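To illustrate the validation semantics, here is a hedged sketch of what Schema(Event, on_error="drop") would do per record: invalid input is discarded instead of raising. The Event model and the validate_or_drop helper are hypothetical names for illustration, not part of the Luminara API.

```python
from typing import Optional
from pydantic import BaseModel, ValidationError

# Hypothetical Pydantic model a Schema instance might wrap.
class Event(BaseModel):
    user_id: int
    action: str

# Sketch of on_error="drop" semantics: validation failures yield
# None (record dropped) rather than raising. Hypothetical helper,
# not a Luminara API.
def validate_or_drop(record: dict) -> Optional[Event]:
    try:
        return Event(**record)
    except ValidationError:
        return None
```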