Lakehouse
LiveTable, MemoryLakehouseTable, IcebergRestCatalog, and HudiWriteResult.
LiveTable
Obtained via session.live_table(name). Provides row-level ingestion into a live SQL-queryable table.
| Method | Returns | Description |
|---|---|---|
name() -> str | str | Return the table name. |
ingest_row(row: dict) | None | Append a single row (dict of column → value). |
refresh() | None | Flush pending inserts into the queryable snapshot. |
change_feed() -> ChangeFeedIter | ChangeFeedIter | Get an async iterator of change records. |
drop() | None | Drop and unregister the live table. |
MemoryLakehouseTable
An in-memory Iceberg-like table that supports snapshot-based DML. Useful for testing lakehouse patterns without a real Iceberg catalog.
MemoryLakehouseTable(schema: Schema, name: str = "")
| Method | Returns | Description |
|---|---|---|
append(batches) | None | Append Arrow batches as a new snapshot. |
overwrite(batches) | None | Replace all data with new batches. |
delete_where(predicate: str) | int | Delete rows matching a SQL predicate. Returns deleted count. |
update_where(predicate, assignments) | int | Update matching rows. Returns updated count. |
merge(source_batches, condition, actions) | None | Apply MERGE logic (insert/update/delete) from a source. |
evolve_schema(new_schema) | None | Evolve the table schema (add nullable columns). |
current_snapshot_id() -> int | int | Return the current snapshot ID. |
IcebergRestCatalog
IcebergRestCatalog(uri: str, warehouse: str = None, token: str = None)
| Method | Returns | Description |
|---|---|---|
list_tables(namespace: str) -> list[str] | list[str] | List all table names in a namespace. |
load_table_metadata(namespace, table) -> dict | dict | Load raw Iceberg table metadata JSON. |
Top-Level Lakehouse Functions
| Function | Description |
|---|---|
read_iceberg(uri, catalog_uri=None) -> DataFrame | Read an Iceberg table (requires iceberg feature). |
read_delta(path, version=None) -> DataFrame | Read a Delta Lake table directory (requires delta feature). |
read_hudi(path, query_type='snapshot') -> DataFrame | Read a Hudi table. |
write_hudi_append(df, path) -> HudiWriteResult | Append a DataFrame to a Hudi table. |
write_hudi_upsert(df, path, key_col) -> HudiWriteResult | Upsert a DataFrame into a Hudi table by key column. |
HudiWriteResult
| Method | Returns | Description |
|---|---|---|
instant() -> str | str | Hudi commit instant timestamp. |
rows_inserted() -> int | int | Number of rows inserted. |
rows_updated() -> int | int | Number of rows updated. |
snapshot_rows() -> int | int | Total rows in the table after the write. |