Preview

Lakehouse

Reference for LiveTable, MemoryLakehouseTable, IcebergRestCatalog, and HudiWriteResult, plus the top-level lakehouse read and write functions.

LiveTable

Obtained via session.live_table(name). Provides row-level ingestion into a live SQL-queryable table.

| Method | Returns | Description |
| --- | --- | --- |
| name() -> str | str | Return the table name. |
| ingest_row(row: dict) | None | Append a single row (dict of column → value). |
| refresh() | None | Flush pending inserts into the queryable snapshot. |
| change_feed() -> ChangeFeedIter | ChangeFeedIter | Get an async iterator of change records. |
| drop() | None | Drop and unregister the live table. |
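
The split between ingest_row and refresh is easier to see in code. The sketch below does not use the real library: SketchLiveTable, visible_rows, collect, and the shape of the change records are all hypothetical stand-ins, written only to illustrate the documented behavior that ingested rows become queryable after refresh() and that change_feed() is consumed as an async iterator.

```python
import asyncio


class SketchLiveTable:
    """Hypothetical stand-in that mimics the documented LiveTable surface."""

    def __init__(self, name):
        self._name = name
        self._pending = []    # rows ingested since the last refresh
        self._snapshot = []   # rows visible to queries
        self._changes = []    # change records produced by refresh (assumed shape)

    def name(self):
        return self._name

    def ingest_row(self, row):
        # Copy so later mutation of the caller's dict does not leak in.
        self._pending.append(dict(row))

    def refresh(self):
        # Move pending rows into the queryable snapshot and record each one.
        for row in self._pending:
            self._snapshot.append(row)
            self._changes.append({"op": "insert", "row": row})
        self._pending.clear()

    def visible_rows(self):
        # Stand-in for running a SQL query against the snapshot.
        return list(self._snapshot)

    async def change_feed(self):
        # Async generator: yields the change records recorded so far.
        for change in self._changes:
            yield change


async def collect(feed):
    # Drain an async iterator into a list.
    return [change async for change in feed]


table = SketchLiveTable("events")
table.ingest_row({"id": 1, "kind": "click"})
table.ingest_row({"id": 2, "kind": "view"})
rows_before = table.visible_rows()   # taken before refresh()
table.refresh()
rows_after = table.visible_rows()    # taken after refresh()
changes = asyncio.run(collect(table.change_feed()))
```

In this sketch rows_before is empty and rows_after holds both rows. Whether the real change feed replays earlier records or only streams new ones is not stated in this reference.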

MemoryLakehouseTable

An in-memory Iceberg-like table that supports snapshot-based DML. Useful for testing lakehouse patterns without a real Iceberg catalog.

MemoryLakehouseTable(schema: Schema, name: str = "")

| Method | Returns | Description |
| --- | --- | --- |
| append(batches) | None | Append Arrow batches as a new snapshot. |
| overwrite(batches) | None | Replace all data with new batches. |
| delete_where(predicate: str) | int | Delete rows matching a SQL predicate. Returns deleted count. |
| update_where(predicate, assignments) | int | Update matching rows. Returns updated count. |
| merge(source_batches, condition, actions) | None | Apply MERGE logic (insert/update/delete) from a source. |
| evolve_schema(new_schema) | None | Evolve the table schema (add nullable columns). |
| current_snapshot_id() -> int | int | Return the current snapshot ID. |
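
To make "snapshot-based DML" concrete, here is a minimal pure-Python sketch of the idea. It is not the real class: SketchSnapshotTable is hypothetical, rows are plain dicts rather than Arrow batches, predicates are Python callables rather than SQL strings, and the sequential snapshot IDs and retained history are assumptions made for illustration.

```python
class SketchSnapshotTable:
    """Hypothetical stand-in: every write commits a new immutable snapshot."""

    def __init__(self):
        self._snapshots = {0: ()}   # snapshot id -> tuple of row dicts
        self._current = 0

    def _commit(self, rows):
        # Assumed ID scheme: a simple counter, one new ID per write.
        self._current += 1
        self._snapshots[self._current] = tuple(rows)

    def current_snapshot_id(self):
        return self._current

    def rows(self, snapshot_id=None):
        # Read the current snapshot, or an older one by ID.
        sid = self._current if snapshot_id is None else snapshot_id
        return [dict(r) for r in self._snapshots[sid]]

    def append(self, rows):
        existing = list(self._snapshots[self._current])
        self._commit(existing + [dict(r) for r in rows])

    def overwrite(self, rows):
        self._commit([dict(r) for r in rows])

    def delete_where(self, predicate):
        current = self._snapshots[self._current]
        kept = [r for r in current if not predicate(r)]
        self._commit(kept)
        return len(current) - len(kept)   # deleted count

    def update_where(self, predicate, assignments):
        updated = 0
        new_rows = []
        for r in self._snapshots[self._current]:
            if predicate(r):
                new_rows.append({**r, **assignments})
                updated += 1
            else:
                new_rows.append(r)
        self._commit(new_rows)
        return updated                    # updated count


t = SketchSnapshotTable()
t.append([
    {"id": 1, "status": "new"},
    {"id": 2, "status": "new"},
    {"id": 3, "status": "old"},
])
first_id = t.current_snapshot_id()
deleted = t.delete_where(lambda r: r["status"] == "old")
updated = t.update_where(lambda r: r["id"] == 2, {"status": "done"})
```

Because each write produces a fresh snapshot instead of mutating rows in place, t.rows(first_id) in this sketch still returns the three rows from the first append even after the delete and update.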

IcebergRestCatalog

IcebergRestCatalog(uri: str, warehouse: str = None, token: str = None)

| Method | Returns | Description |
| --- | --- | --- |
| list_tables(namespace: str) -> list[str] | list[str] | List all table names in a namespace. |
| load_table_metadata(namespace, table) -> dict | dict | Load raw Iceberg table metadata JSON. |
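
For orientation, the Iceberg REST catalog specification exposes table listing at GET /v1/{prefix}/namespaces/{namespace}/tables, where the prefix is optional, and the response carries an "identifiers" array. The sketch below builds that URL and parses a response of that shape without touching the network. The helper names tables_url and table_names, the sample payload, and the localhost address are hypothetical, and this reference does not confirm that IcebergRestCatalog issues exactly these requests.

```python
from urllib.parse import quote


def tables_url(uri, namespace, prefix=None):
    """Build the REST catalog URL that lists tables in a single-level namespace."""
    parts = [uri.rstrip("/"), "v1"]
    if prefix:
        parts.append(quote(prefix, safe=""))
    parts += ["namespaces", quote(namespace, safe=""), "tables"]
    return "/".join(parts)


def table_names(payload):
    """Extract table names from a list-tables response dict."""
    return [ident["name"] for ident in payload.get("identifiers", [])]


# Hand-written sample in the shape of a list-tables response.
sample_response = {
    "identifiers": [
        {"namespace": ["analytics"], "name": "events"},
        {"namespace": ["analytics"], "name": "users"},
    ]
}

url = tables_url("http://localhost:8181/", "analytics")
names = table_names(sample_response)
```

The flat list[str] returned by list_tables(namespace) would correspond to names here, with the namespace component of each identifier dropped.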

Top-Level Lakehouse Functions

| Function | Description |
| --- | --- |
| read_iceberg(uri, catalog_uri=None) -> DataFrame | Read an Iceberg table (requires iceberg feature). |
| read_delta(path, version=None) -> DataFrame | Read a Delta Lake table directory (requires delta feature). |
| read_hudi(path, query_type='snapshot') -> DataFrame | Read a Hudi table. |
| write_hudi_append(df, path) -> HudiWriteResult | Append a DataFrame to a Hudi table. |
| write_hudi_upsert(df, path, key_col) -> HudiWriteResult | Upsert a DataFrame into a Hudi table by key column. |

HudiWriteResult

| Method | Returns | Description |
| --- | --- | --- |
| instant() -> str | str | Hudi commit instant timestamp. |
| rows_inserted() -> int | int | Number of rows inserted. |
| rows_updated() -> int | int | Number of rows updated. |
| snapshot_rows() -> int | int | Total rows in the table after the write. |
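
One way to read the three counters is through a key-based upsert: an incoming row whose key already exists counts as an update, any other row counts as an insert, and snapshot_rows is the table size afterwards. The sketch below illustrates that reading with plain dicts. The function upsert_by_key is hypothetical and is not how write_hudi_upsert is implemented; the exact counting rules of the real writer are an assumption.

```python
def upsert_by_key(existing, incoming, key_col):
    """Merge incoming rows into existing rows by key and count the outcomes."""
    merged = {row[key_col]: dict(row) for row in existing}
    inserted = 0
    updated = 0
    for row in incoming:
        key = row[key_col]
        if key in merged:
            updated += 1      # key already present: replace the row
        else:
            inserted += 1     # new key: add the row
        merged[key] = dict(row)
    counts = {
        "rows_inserted": inserted,
        "rows_updated": updated,
        "snapshot_rows": len(merged),
    }
    return list(merged.values()), counts


existing_rows = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
incoming_rows = [{"id": 2, "v": "B"}, {"id": 3, "v": "c"}]
table_rows, counts = upsert_by_key(existing_rows, incoming_rows, "id")
```

Here one key (2) already exists and one (3) is new, so the sketch reports one update, one insert, and three rows in total.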