File append helpers
DuckPlus wraps DuckDB’s file-writer APIs so append workflows stay safe and predictable. Because the helpers operate on immutable relations, you can rehearse the exact rows that will be written, validate schemas, and even dry-run the transforms using the sampling utilities described in Immutable relation helpers.
CSV appends
:meth:duckplus.relation.Relation.append_csv appends rows to an existing CSV
file. The helper enforces schema compatibility and offers opt-in deduplication:
relation.append_csv(
    "reports/output.csv",
    mode="append",
    match_all_columns=True,
)
Enabling match_all_columns ensures duplicate rows are removed based on all
columns before writing. When mode="overwrite" the helper truncates the file
before writing new rows. Pair the helper with
:meth:Relation.null_ratios <duckplus.relation.Relation.null_ratios> to detect
unexpected null patterns prior to exporting.
if relation.null_ratios()["total"] > 0:
    raise ValueError("Totals cannot contain nulls before export")
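To make the two checks above concrete, here is a minimal sketch of an all-column deduplicating append and a null-ratio check, using only Python's standard `csv` module. It is not DuckPlus's implementation: `null_ratios` and `append_unique_rows` are hypothetical stand-ins, and the header comparison is only an assumption about what schema compatibility involves.

```python
import csv
from pathlib import Path


def null_ratios(rows, columns):
    """Return the fraction of empty values per column (hypothetical helper)."""
    if not rows:
        return {column: 0.0 for column in columns}
    return {
        column: sum(1 for row in rows if row[column] in ("", None)) / len(rows)
        for column in columns
    }


def append_unique_rows(path, rows, columns):
    """Append rows to a CSV file, skipping rows that are already present.

    Rows are compared on every column, which mirrors the idea behind
    match_all_columns=True. Returns the number of rows written.
    """
    path = Path(path)
    columns = list(columns)
    existing = set()
    write_header = True
    if path.exists() and path.stat().st_size > 0:
        write_header = False
        with path.open(newline="") as handle:
            reader = csv.DictReader(handle)
            # Refuse to append when the file header does not match the new rows.
            if reader.fieldnames != columns:
                raise ValueError(
                    f"Column mismatch: file has {reader.fieldnames}, "
                    f"new rows have {columns}"
                )
            existing = {tuple(row[c] for c in columns) for row in reader}

    written = 0
    with path.open("a", newline="") as handle:
        writer = csv.writer(handle)
        if write_header:
            writer.writerow(columns)
        for row in rows:
            # CSV stores text, so compare on the string form of each value.
            key = tuple("" if row[c] is None else str(row[c]) for c in columns)
            if key in existing:
                continue
            existing.add(key)
            writer.writerow(key)
            written += 1
    return written
```

Calling `append_unique_rows` twice with overlapping batches writes each distinct row once, so a rerun of the same export does not duplicate rows.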
Parquet appends
A Parquet file cannot be extended in place, so appending means rewriting the whole file. DuckPlus does this safely by writing the combined rows to a temporary file and then atomically replacing the target.
relation.append_parquet(
    "warehouse/sales.parquet",
    mode="append",
    unique_id_column="id",
)
The helper supports both append and overwrite modes, validates columns
up front, and surfaces helpful errors when the relation originates from a closed
connection. When unique_id_column is provided, DuckPlus performs an anti-join
against the target file to skip rows that already exist—ideal when you are
reprocessing incremental extracts. Set match_all_columns=True to compare the
entire row structure instead.
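The anti-join and the write-then-replace step described in this section can be sketched with the standard library alone. This is an illustration under stated assumptions, not DuckPlus's code: `append_with_anti_join` is a hypothetical function, and JSON Lines stands in for Parquet because the standard library has no Parquet writer.

```python
import json
import os
import tempfile
from pathlib import Path


def append_with_anti_join(target, new_rows, unique_id_column):
    """Rewrite target with new_rows added, skipping ids already present.

    JSON Lines is a stand-in for Parquet; the rewrite-then-replace shape
    is the point of the sketch. Returns the number of rows actually added.
    """
    target = Path(target)
    existing_rows = []
    if target.exists():
        with target.open() as handle:
            existing_rows = [json.loads(line) for line in handle if line.strip()]

    # Anti-join: keep only new rows whose id is absent from the target.
    seen_ids = {row[unique_id_column] for row in existing_rows}
    fresh_rows = []
    for row in new_rows:
        if row[unique_id_column] not in seen_ids:
            seen_ids.add(row[unique_id_column])
            fresh_rows.append(row)

    # Write the full result to a temporary file in the target's directory
    # so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=str(target.parent), suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            for row in existing_rows + fresh_rows:
                handle.write(json.dumps(row) + "\n")
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the file in one step (atomic on POSIX), so readers
        # see either the old file or the new one, never a partial write.
        os.replace(tmp_name, target)
    except BaseException:
        # Remove the temporary file if anything failed before the swap.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return len(fresh_rows)
```

Reprocessing an extract that overlaps with earlier runs adds only the rows whose ids are new, and a failure partway through leaves the original target untouched.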
Table utilities
Managed tables share the same API guarantees. See duckplus.table for
helpers that create or overwrite DuckDB tables while verifying schema metadata.
These utilities lean on the same immutability principles, keeping the API open
for future extension without breaking callers.
Incremental refresh pattern
Combine the file append helpers with :class:duckplus.table.Table to create an
end-to-end incremental refresh pipeline:
1. Ingest raw files with duckplus.io readers.
2. Apply typed-expression transforms and validations.
3. Write the curated slice to Parquet with append_parquet.
4. Insert the same rows into a DuckDB table using :meth:duckplus.table.Table.insert, enabling fast local analytics.
Because each step works with immutable relations, you can add assertions after every operation without worrying about accidental mutation.
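The assertion-after-every-step idea can be sketched as follows. This is a rough illustration, not the DuckPlus pipeline: every function name here is hypothetical, frozen dataclasses and tuples stand in for immutable relations, the standard library's `sqlite3` stands in for a DuckDB table, and the Parquet write is omitted.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    """One curated row; frozen so later steps cannot mutate it."""
    id: int
    amount: float


def ingest(raw_lines):
    """Parse raw 'id,amount' lines into an immutable tuple of string tuples."""
    return tuple(tuple(line.split(",")) for line in raw_lines)


def transform(raw_rows):
    """Apply typed conversions and return a new immutable tuple of Sale rows."""
    return tuple(Sale(int(raw_id), float(raw_amount)) for raw_id, raw_amount in raw_rows)


def load(connection, sales):
    """Insert rows whose id the table does not hold yet; return how many were added."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER PRIMARY KEY, amount REAL)"
    )
    before = connection.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
    connection.executemany(
        "INSERT OR IGNORE INTO sales (id, amount) VALUES (?, ?)",
        [(sale.id, sale.amount) for sale in sales],
    )
    after = connection.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
    return after - before


def refresh(connection, raw_lines):
    """Run the pipeline, asserting on each intermediate result."""
    raw_rows = ingest(raw_lines)
    assert all(len(row) == 2 for row in raw_rows), "every raw line needs two fields"
    sales = transform(raw_rows)
    assert len(sales) == len(raw_rows), "transform must not drop rows"
    assert all(sale.amount >= 0 for sale in sales), "amounts must be non-negative"
    return load(connection, sales)
```

Because `ingest` and `transform` return new tuples instead of modifying their inputs, each assertion checks a value that no later step can change.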