Introduction
Krishiv — one engine for batch SQL, streaming pipelines, and incremental processing.
What is Krishiv?
Krishiv is a Rust-native compute framework that unifies batch SQL, streaming pipelines, and incremental view maintenance under a single execution model. It uses Apache Arrow RecordBatch as the internal columnar data model and DataFusion for SQL parsing, planning, expressions, and local execution.
The same session, plan, and scheduler/executor runtime works across embedded (in-process), single-node daemon, and distributed cluster deployments.
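To make the "single execution model" idea concrete, here is a minimal sketch in plain Rust using only the standard library. It does not use Krishiv's API: `Batch`, `filter_min_amount`, `run_pipeline`, `run_bounded`, and `run_streaming` are hypothetical names, and `Batch` is only a stand-in for an Arrow `RecordBatch`. The point it illustrates is that one operator can serve both a bounded source (batch) and a channel-fed source (streaming) when both are modeled as a sequence of columnar batches.

```rust
use std::sync::mpsc;
use std::thread;

/// Stand-in for an Arrow RecordBatch: one Vec per column, all the same length.
/// (Hypothetical type for illustration; not part of Krishiv.)
#[derive(Debug, Clone, PartialEq)]
pub struct Batch {
    pub ids: Vec<i64>,
    pub amounts: Vec<f64>,
}

/// A single operator: keep rows whose amount is at least `min`.
pub fn filter_min_amount(batch: &Batch, min: f64) -> Batch {
    let mut out = Batch { ids: Vec::new(), amounts: Vec::new() };
    for (id, amount) in batch.ids.iter().zip(batch.amounts.iter()) {
        if *amount >= min {
            out.ids.push(*id);
            out.amounts.push(*amount);
        }
    }
    out
}

/// The same pipeline runs over any source of batches, bounded or not.
pub fn run_pipeline<I>(source: I, min: f64) -> impl Iterator<Item = Batch>
where
    I: Iterator<Item = Batch>,
{
    source.map(move |b| filter_min_amount(&b, min))
}

/// Batch mode: the source is a finite, already materialized list of batches.
pub fn run_bounded(batches: Vec<Batch>, min: f64) -> Vec<Batch> {
    run_pipeline(batches.into_iter(), min).collect()
}

/// Streaming mode: the source is a channel fed by a producer thread.
/// The pipeline ends when the producer finishes and drops its sender.
pub fn run_streaming(batches: Vec<Batch>, min: f64) -> Vec<Batch> {
    let (tx, rx) = mpsc::channel();
    let producer = thread::spawn(move || {
        for b in batches {
            tx.send(b).expect("receiver is alive while the pipeline runs");
        }
    });
    let out: Vec<Batch> = run_pipeline(rx.into_iter(), min).collect();
    producer.join().expect("producer thread panicked");
    out
}
```

Calling `run_bounded` and `run_streaming` with the same input and threshold yields the same output batches, because both paths go through `run_pipeline`. A real engine would add scheduling, backpressure, and checkpoints around this core, which the sketch leaves out.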
Key Properties
- Unified execution: batch and streaming share Arrow batches, planning, runtime routing, and scheduler/executor boundaries.
- Rust-native: Rust 2024 + Tokio; typed IDs, typed plans, typed errors, explicit durability profiles.
- Three interfaces: SQL, Rust API (`krishiv-api`), Python bindings (`krishiv-python` via PyO3).
- Iceberg-first lakehouse: Apache Iceberg is the primary certified lakehouse platform.
- Incremental processing: `DeltaBatch` (weighted Arrow rows) and `IncrementalFlow` for incremental view maintenance.
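
The weighted-row idea behind incremental view maintenance can be sketched in plain Rust. This is an illustration under stated assumptions, not Krishiv's `DeltaBatch` or `IncrementalFlow` API: `WeightedRow` and `SumByKey` are hypothetical names, and the convention that weight `+1` inserts a row and weight `-1` retracts it is assumed here rather than taken from the source. The view is the equivalent of a grouped `COUNT(*)` and `SUM(value)`, updated from deltas without rescanning the base data.

```rust
use std::collections::HashMap;

/// One weighted row: weight +1 inserts the row, weight -1 retracts it.
/// (Hypothetical stand-in for a row of a delta batch.)
#[derive(Debug, Clone)]
pub struct WeightedRow {
    pub key: String,
    pub value: i64,
    pub weight: i64,
}

/// Materialized view holding a row count and a sum of `value` per key.
#[derive(Debug, Default)]
pub struct SumByKey {
    groups: HashMap<String, (i64, i64)>, // key -> (row count, sum)
}

impl SumByKey {
    /// Apply a delta in place; cost is proportional to the delta size.
    pub fn apply(&mut self, delta: &[WeightedRow]) {
        for row in delta {
            let entry = self.groups.entry(row.key.clone()).or_insert((0, 0));
            entry.0 += row.weight;
            entry.1 += row.weight * row.value;
            // Drop groups whose rows have all been retracted.
            let now_empty = entry.0 == 0;
            if now_empty {
                self.groups.remove(&row.key);
            }
        }
    }

    /// Current sum for a key, or None if the group has no rows.
    pub fn sum(&self, key: &str) -> Option<i64> {
        self.groups.get(key).map(|(_, sum)| *sum)
    }
}
```

In this model an update is expressed as a retraction of the old row followed by an insertion of the new one, so the view only ever needs addition over weights.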
Architecture at a Glance
    SQL / Rust API / Python API
      └─ Session + catalog
          └─ DataFusion + Krishiv plan + optimizer
              └─ ExecutionRuntime
                   Embedded    → in-process
                   SingleNode  → local Flight/gRPC daemon
                   Distributed → remote Flight/gRPC cluster
                  └─ Coordinator
                      └─ ExecutorTaskRunner
                          └─ Arrow/DataFusion ops, shuffle, state, checkpoints, connectors
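
The routing step in the diagram can be pictured as a small dispatch over deployment modes. The following is a hedged sketch only: `RuntimeMode`, `Dispatch`, and `route` are hypothetical names chosen for illustration, the `endpoint` field is an assumption, and Krishiv's actual `ExecutionRuntime` types may look quite different.

```rust
/// Deployment modes named in the architecture diagram.
/// (Hypothetical enum; the `endpoint` field is an assumption.)
#[derive(Debug, Clone, PartialEq)]
pub enum RuntimeMode {
    Embedded,
    SingleNode { endpoint: String },
    Distributed { endpoint: String },
}

/// Where a plan ends up running. Hypothetical, for illustration only.
#[derive(Debug, PartialEq)]
pub enum Dispatch {
    /// Run inside the caller's process.
    InProcess,
    /// Send to a Flight/gRPC endpoint (local daemon or remote cluster).
    Remote { endpoint: String },
}

/// Pick a dispatch target from the mode; the plan itself is unchanged.
pub fn route(mode: &RuntimeMode) -> Dispatch {
    match mode {
        RuntimeMode::Embedded => Dispatch::InProcess,
        RuntimeMode::SingleNode { endpoint } | RuntimeMode::Distributed { endpoint } => {
            Dispatch::Remote { endpoint: endpoint.clone() }
        }
    }
}
```

Keeping the mode as data rather than as separate code paths is one way the same session and plan could be reused across all three deployments, since only the final dispatch differs.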
Workspace Crate Map
| Crate | Responsibility |
|---|---|
| `krishiv` | User-facing facade and CLI binary. |
| `krishiv-api` | Session, DataFrame, Stream, IncrementalFlow, and all public Rust API types. |
| `krishiv-sql` | DataFusion integration, SQL execution, catalog and table-provider abstractions. |
| `krishiv-plan` | Logical/physical plans, expression AST, UDF contracts, governance/policy, CEP. |
| `krishiv-runtime` | Embedded, single-node, and remote runtime routing. |
| `krishiv-dataflow` | Arrow operator runtime, queues, barriers, windows, joins, stateful ops. |
| `krishiv-scheduler` | Coordinator, job/task lifecycle, metadata stores, leadership, gRPC server. |
| `krishiv-executor` | Executor process, task runner, shuffle/checkpoint hooks. |
| `krishiv-state` | In-memory and RocksDB-backed keyed state, TTL, migration, checkpoint/savepoint. |
| `krishiv-connectors` | Source/sink contracts, Parquet/Kafka/S3 paths, Iceberg-first lakehouse helpers. |
| `krishiv-python` | PyO3 Python bindings. |
| `krishiv-shuffle` | In-memory, local disk, object-store, and Flight-oriented shuffle support. |