Architecture
Crate boundaries, runtime routing, and design invariants.
Architecture Invariants
- Do not build separate engines for batch and streaming.
- One active job coordinator per job; executors are replaceable data-plane workers.
- Shuffle, state, checkpoint, metadata, and connector behavior live behind crate APIs.
- Prefer typed IDs, typed fragments, typed errors, and capability flags over stringly-routed public contracts.
Plan Flow
SQL / API input
└─ Session.sql() / session.dataframe()
└─ SqlEngine (DataFusion parse + optimize + plan)
└─ krishiv-plan: LogicalPlan / PhysicalPlan
└─ ExecutionRuntime.accept_plan()
└─ Coordinator.submit_job()
└─ Scheduler → tasks → ExecutorTaskRunner
└─ Dataflow operators (Arrow, windowing, joins, state)
└─ Shuffle, checkpoints, connectors
Session and Catalog
Each Session owns a SqlEngine backed by a DataFusion SessionContext. The catalog bridges Krishiv's InMemoryCatalog (or an Iceberg KrishivCatalog) into DataFusion's catalog provider interface, making registered tables available to SQL queries.
Scheduler / Executor Boundary
The coordinator is the single authoritative owner of job state. Executors receive task assignments and execute Arrow/DataFusion operators. Task retries may replay records; the coordinator fence prevents stale task completions from being accepted. Executors are stateless between task assignments except for state retrieved from checkpoint/state stores.
Distributed Auth
Production coordinator and executor task-control gRPC require bearer-token auth:
KRISHIV_COORDINATOR_BEARER_TOKEN— client token sent on every call.KRISHIV_COORDINATOR_BEARER_TOKENS— comma/newline-separated accepted server tokens for rotation windows.KRISHIV_COORDINATOR_BEARER_TOKEN_FILE— file-based token for long-lived servers.KRISHIV_COORDINATOR_AUTH_RELOAD_INTERVAL_SECS— live reload interval for token files.KRISHIV_EXECUTOR_TASK_BEARER_TOKEN— token for executor gRPC.