Feature Maturity

Capability status, backed by codebase evidence.

Each status below is derived from inspecting Rust sources, tests, examples, and public APIs in the Krishiv workspace. Statuses reflect what is implemented, not what is intended.

Available

Implemented, tested, and used in core workflows. APIs are stable within minor versions.

Batch SQL

DataFusion-backed SQL over Apache Arrow RecordBatches and registered sources.

Apache Arrow data model

RecordBatch is the internal and IPC columnar format across all runtime paths.

Rust Session / DataFrame API

Session, DataFrame, and Stream types are the primary Rust-facing API surface.

DataFusion SQL planning

SQL parsing, logical planning, expression evaluation, and local execution via DataFusion.

Embedded runtime mode

Runs all components in-process; no network endpoints required. Used in tests and local API calls.

Single-node runtime mode

Runs coordinator, executor, and Flight/gRPC endpoints on one host with local filesystem and RocksDB.

Python bindings (core)

PyO3 bindings expose Session, DataFrame, and streaming APIs. Optional connector features are feature-gated.

Explicit durability profiles

dev-local, single-node-durable, and distributed-durable profiles control metadata, shuffle, state, and checkpoint storage.
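As a rough illustration of what "explicit" can mean here, the sketch below parses the three profile names into an enum and rejects anything else instead of picking a default. The type name `DurabilityProfile` and the error text are hypothetical and are not taken from the Krishiv codebase; only the three profile strings come from this page.

```rust
use std::str::FromStr;

/// Hypothetical enum mirroring the three profile names listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DurabilityProfile {
    DevLocal,
    SingleNodeDurable,
    DistributedDurable,
}

impl FromStr for DurabilityProfile {
    type Err = String;

    /// Accepts only the exact profile names; there is no implicit default.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "dev-local" => Ok(Self::DevLocal),
            "single-node-durable" => Ok(Self::SingleNodeDurable),
            "distributed-durable" => Ok(Self::DistributedDurable),
            other => Err(format!("unknown durability profile: {other}")),
        }
    }
}
```

With this shape, a misspelled profile name surfaces as an error at configuration time rather than silently selecting a weaker storage setup.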

Experimental

Implemented and functional. APIs and semantics may change. Not certified for production use.

Delta Batch / IVM

DeltaBatch (weighted Arrow rows) and IncrementalFlow (view maintenance across ticks) are implemented with partitioning, snapshots, and checkpoint hooks. Distributed executor-side IVM execution is deferred.
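The idea of weighted rows and per-tick view maintenance can be sketched in plain Rust. This is a conceptual illustration only: `Delta` and `CountView` are hypothetical names, `String` keys stand in for Arrow rows, and nothing here reflects the actual DeltaBatch or IncrementalFlow APIs. A weight of +1 inserts a row and -1 retracts it; the view maintains a `COUNT(*) GROUP BY key` and emits its own changes as weighted rows.

```rust
use std::collections::HashMap;

/// One tick of input: (row, weight) pairs. +1 inserts a row, -1 retracts it.
/// Plain `String` keys stand in for Arrow rows (an assumption for brevity).
type Delta = Vec<(String, i64)>;

/// Hypothetical incrementally maintained `SELECT key, COUNT(*) ... GROUP BY key`.
#[derive(Default)]
struct CountView {
    counts: HashMap<String, i64>,
}

impl CountView {
    /// Applies one tick's delta and returns the change to the view,
    /// itself expressed as weighted `(key, count)` rows.
    fn apply(&mut self, delta: &Delta) -> Vec<((String, i64), i64)> {
        // Consolidate weights per key so inserts and retractions
        // within the same tick cancel out.
        let mut net: HashMap<&str, i64> = HashMap::new();
        for (key, weight) in delta {
            *net.entry(key.as_str()).or_insert(0) += *weight;
        }

        // Sort by key so the output order is deterministic.
        let mut net: Vec<(&str, i64)> = net.into_iter().collect();
        net.sort();

        let mut out = Vec::new();
        for (key, weight) in net {
            if weight == 0 {
                continue; // no net change for this key
            }
            let old = self.counts.get(key).copied().unwrap_or(0);
            let new = old + weight;
            if old != 0 {
                out.push(((key.to_string(), old), -1)); // retract the old result row
            }
            if new != 0 {
                out.push(((key.to_string(), new), 1)); // insert the new result row
                self.counts.insert(key.to_string(), new);
            } else {
                self.counts.remove(key);
            }
        }
        out
    }
}
```

Each call to `apply` touches only the keys present in that tick's delta, which is the property that makes incremental maintenance cheaper than recomputing the view from scratch.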

Python connector features

Kafka, Iceberg, and vector sink bindings exist as optional Cargo features. API surface is not yet stable.

Preview

Scaffolding and initial implementation exist. End-to-end certification work is ongoing. Use with caution.

Distributed runtime mode

Remote coordinator and executor transport with bearer-token auth. Requires an explicit Flight endpoint; there is no silent fallback to local execution.
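The "no silent local fallback" rule can be pictured as a configuration check that fails loudly. The sketch below is an assumption-laden illustration: `RuntimeTarget`, `ConfigError`, and `resolve_target` are hypothetical names, and treating a missing bearer token as an error is also an assumption rather than documented Krishiv behavior.

```rust
/// Hypothetical resolved runtime target; not a Krishiv type.
#[derive(Debug, PartialEq)]
enum RuntimeTarget {
    Embedded,
    Remote { endpoint: String, bearer_token: String },
}

/// Hypothetical configuration errors.
#[derive(Debug, PartialEq)]
enum ConfigError {
    MissingEndpoint,
    MissingToken,
}

/// Resolves where work should run. When distributed mode is requested,
/// a missing endpoint is an error, never a downgrade to in-process execution.
fn resolve_target(
    distributed: bool,
    endpoint: Option<&str>,
    bearer_token: Option<&str>,
) -> Result<RuntimeTarget, ConfigError> {
    if !distributed {
        return Ok(RuntimeTarget::Embedded);
    }
    let endpoint = endpoint
        .filter(|e| !e.trim().is_empty())
        .ok_or(ConfigError::MissingEndpoint)?;
    // Assumption: the bearer token is mandatory for remote transport.
    let token = bearer_token
        .filter(|t| !t.is_empty())
        .ok_or(ConfigError::MissingToken)?;
    Ok(RuntimeTarget::Remote {
        endpoint: endpoint.to_string(),
        bearer_token: token.to_string(),
    })
}
```

The design choice is that a job submitted as distributed either reaches a remote coordinator or fails at startup; it never runs locally by accident and reports success.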

Iceberg catalog integration

REST, Hive, and Glue catalog paths. Iceberg is the primary lakehouse target; certification work continues.

Kafka connector

Source and transactional sink via rdkafka. End-to-end exactly-once depends on certified checkpoint combinations.

Parquet / S3 / ADLS connectors

Connector contracts and implementations exist; end-to-end guarantees depend on certified combinations.

Shuffle service

In-memory, local disk, object-store, and Flight-oriented shuffle paths behind the krishiv-shuffle crate API.

Checkpoint storage

Async checkpoint primitives with sync compatibility wrappers. Scheduler gRPC checkpoint acks use the async path.

State management

In-memory and RocksDB-backed keyed state, TTL, migration, and incremental state behind the krishiv-state crate API.
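For readers unfamiliar with keyed state and TTL, a minimal in-memory sketch follows. `TtlState` and its methods are hypothetical and do not represent the krishiv-state API; the caller passes the current time explicitly so expiry is easy to reason about, and expired entries are dropped lazily on read (one possible policy among several).

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical in-memory keyed state with a per-entry time-to-live.
struct TtlState<V> {
    ttl: Duration,
    /// Each value is stored with the instant at which it expires.
    entries: HashMap<String, (V, Instant)>,
}

impl<V> TtlState<V> {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    /// Writes a value and (re)starts its TTL from `now`.
    fn put(&mut self, key: &str, value: V, now: Instant) {
        self.entries.insert(key.to_string(), (value, now + self.ttl));
    }

    /// Reads a value, lazily evicting it if its deadline has passed.
    fn get(&mut self, key: &str, now: Instant) -> Option<&V> {
        let expired =
            matches!(self.entries.get(key), Some((_, deadline)) if *deadline <= now);
        if expired {
            self.entries.remove(key);
        }
        self.entries.get(key).map(|(value, _)| value)
    }
}
```

A persistent backend would keep the same read and write contract while storing entries outside process memory; that part is not shown here.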

Kubernetes operator / CRD

CRD and operator integration in the krishiv-operator crate. Manifests live in k8s/.

Scheduler fault tolerance

Job/task lifecycle, metadata stores, and leadership coordination via krishiv-scheduler. Failure-handling foundations are in place.

Planned

On the roadmap but not yet implemented. Do not rely on these without maintainer confirmation.

Distributed IVM

Executor-side incremental view maintenance across a distributed cluster. Requires distributed IVM protocol design.

Full exactly-once guarantees

End-to-end exactly-once across arbitrary source/sink/checkpoint combinations. Currently scoped to certified combinations only.

Krishiv Cloud

Managed compute offering. Not yet implemented.

Maintainer note: Statuses marked Preview or Planned require maintainer confirmation before use in documentation or marketing materials. Capability descriptions are based on codebase inspection and may not reflect in-flight work on development branches.