duckplus.io

File-based relation helpers built on top of duckplus.DuckCon.

Functions

duckcon_helper(→ HelperFunction)

Attach a DuckDB I/O helper to DuckCon.

read_csv(→ duckplus.relation.Relation)

Load a CSV file into a Relation.

read_parquet(→ duckplus.relation.Relation)

Load a Parquet file into a Relation.

read_json(→ duckplus.relation.Relation)

Load a JSON document or JSON Lines file into a Relation.

read_odbc_query(→ duckplus.relation.Relation)

Execute an ODBC query via nano-ODBC and return a relation.

read_odbc_table(→ duckplus.relation.Relation)

Scan an ODBC table via nano-ODBC and return a relation.

read_excel(→ duckplus.relation.Relation)

Load an Excel workbook via DuckDB's excel extension.

Package Contents

duckplus.io.duckcon_helper(helper: HelperFunction) → HelperFunction

Attach a DuckDB I/O helper to DuckCon.
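The signature accepts a helper and returns a helper, which is the shape of a registering decorator. The sketch below illustrates that general pattern only. `ToyCon`, `toy_helper`, and `read_text` are hypothetical stand-ins invented for illustration; they are not duckplus classes or functions, and the real registration mechanism may differ.

```python
from typing import Any, Callable

HelperFunction = Callable[..., Any]


class ToyCon:
    """Hypothetical stand-in for DuckCon; not the real class."""

    # Registry shared by all instances, keyed by helper name.
    _helpers: dict[str, HelperFunction] = {}

    def call_helper(self, name: str, *args: Any, **kwargs: Any) -> Any:
        # Pass the connection itself as the helper's first argument.
        return self._helpers[name](self, *args, **kwargs)


def toy_helper(helper: HelperFunction) -> HelperFunction:
    """Register `helper` under its function name and return it unchanged."""
    ToyCon._helpers[helper.__name__] = helper
    return helper


@toy_helper
def read_text(con: ToyCon, source: str) -> str:
    # A real helper would build a relation; this one only echoes its input.
    return f"would read {source}"


result = ToyCon().call_helper("read_text", "data.txt")
```

Because the decorator returns the function unchanged, `read_text` remains callable directly as well as through the registry.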

duckplus.io.read_csv(duckcon: duckplus.duckcon.DuckCon, source: pathlib.Path | collections.abc.Sequence[pathlib.Path], *, header: bool | None = None, delimiter: str | None = None, delim: str | None = None, quotechar: str | None = None, quote: str | None = None, escapechar: str | None = None, escape: str | None = None, sample_size: int | None = None, auto_detect: bool | None = None, columns: object | None = None, dtype: object | None = None, names: collections.abc.Sequence[str] | None = None, na_values: collections.abc.Sequence[str] | None = None, null_padding: bool | None = None, force_not_null: collections.abc.Sequence[str] | None = None, files_to_sniff: int | None = None, decimal: str | None = None, decimal_separator: str | None = None, date_format: str | None = None, dateformat: str | None = None, timestamp_format: str | None = None, timestampformat: str | None = None, encoding: str | None = None, compression: str | None = None, hive_types_autocast: bool | None = None, all_varchar: bool | None = None, hive_partitioning: bool | None = None, comment: str | None = None, max_line_size: int | None = None, maximum_line_size: int | None = None, store_rejects: bool | None = None, rejects_table: str | None = None, rejects_limit: int | None = None, rejects_scan: str | None = None, union_by_name: bool | None = None, filename: bool | None = None, normalize_names: bool | None = None, ignore_errors: bool | None = None, allow_quoted_nulls: bool | None = None, auto_type_candidates: collections.abc.Sequence[str] | str | None = None, parallel: bool | None = None, skiprows: int | None = None, skip: int | None = None) → duckplus.relation.Relation

Load a CSV file into a Relation.
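A minimal usage sketch, using only keyword names that appear in the signature above. It assumes `duckcon` is a DuckCon instance that is already open and ready for use; how to create and open one is not covered on this page. The file name, column names, and the helper functions `write_sample_csv` and `load_orders` are hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def write_sample_csv(directory: Path) -> Path:
    """Write a small semicolon-delimited CSV with a header row."""
    path = directory / "orders.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter=";")
        writer.writerow(["order_id", "amount"])
        writer.writerow([1, "9.50"])
        writer.writerow([2, "NA"])
    return path


def load_orders(duckcon, path: Path):
    """Call read_csv with keyword options taken from the documented signature."""
    # Imported lazily so the file-writing half runs without duckplus installed.
    from duckplus.io import read_csv

    return read_csv(
        duckcon,
        path,
        header=True,
        delimiter=";",
        na_values=["NA"],  # assumed to mark "NA" cells as NULL
    )


sample_path = write_sample_csv(Path(tempfile.mkdtemp()))
```

Note that `source` is typed as `pathlib.Path` (or a sequence of paths), so the sketch passes a `Path` rather than a plain string.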

duckplus.io.read_parquet(duckcon: duckplus.duckcon.DuckCon, source: pathlib.Path | collections.abc.Sequence[pathlib.Path], *, binary_as_string: bool | None = None, file_row_number: bool | None = None, filename: bool | None = None, hive_partitioning: bool | None = None, union_by_name: bool | None = None, compression: str | None = None, directory: bool = False, partition_id_column: str | None = None, partition_glob: str | collections.abc.Sequence[str] = '*.parquet') → duckplus.relation.Relation

Load a Parquet file into a Relation.
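The signature includes `directory` and `partition_glob`, whose behavior this page does not describe. The sketch below assumes that `directory=True` tells the reader to treat `source` as a folder and match files inside it with `partition_glob`; that reading is an assumption, and `parquet_options` and `load_parquet` are hypothetical helper names.

```python
import tempfile
from pathlib import Path


def parquet_options(source: Path) -> dict[str, object]:
    """Choose keyword arguments based on whether `source` is a directory."""
    if source.is_dir():
        # Assumed meaning: scan the folder for files matching the glob.
        return {"directory": True, "partition_glob": "*.parquet"}
    # A single file needs no extra options in this sketch.
    return {}


def load_parquet(duckcon, source: Path):
    """Call read_parquet with options derived from the source path."""
    # Imported lazily so parquet_options can be tried without duckplus.
    from duckplus.io import read_parquet

    return read_parquet(duckcon, source, **parquet_options(source))


options_for_dir = parquet_options(Path(tempfile.mkdtemp()))
options_for_file = parquet_options(Path("single.parquet"))
```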

duckplus.io.read_json(duckcon: duckplus.duckcon.DuckCon, source: pathlib.Path | collections.abc.Sequence[pathlib.Path], *, columns: object | None = None, sample_size: object | None = None, maximum_depth: object | None = None, records: str | None = None, format: str | None = None, date_format: object | None = None, timestamp_format: object | None = None, compression: object | None = None, maximum_object_size: object | None = None, ignore_errors: object | None = None, convert_strings_to_integers: object | None = None, field_appearance_threshold: object | None = None, map_inference_threshold: object | None = None, maximum_sample_files: object | None = None, filename: object | None = None, hive_partitioning: object | None = None, union_by_name: object | None = None, hive_types: object | None = None, hive_types_autocast: object | None = None) → duckplus.relation.Relation

Load a JSON document or JSON Lines file into a Relation.
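As a rough illustration of the JSON Lines case, the sketch below writes one JSON object per line and then passes `format="newline_delimited"`. That value is one DuckDB's own `read_json` accepts; whether duckplus forwards `format` to DuckDB unchanged is an assumption here. `write_json_lines`, `load_events`, and the sample rows are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_json_lines(directory: Path, rows: list[dict]) -> Path:
    """Write one JSON object per line (the JSON Lines layout)."""
    path = directory / "events.jsonl"
    with path.open("w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(json.dumps(row) + "\n")
    return path


def load_events(duckcon, path: Path):
    """Call read_json, naming the layout explicitly instead of auto-detecting."""
    # Imported lazily so the file-writing half runs without duckplus installed.
    from duckplus.io import read_json

    return read_json(duckcon, path, format="newline_delimited")


events_path = write_json_lines(
    Path(tempfile.mkdtemp()),
    [{"id": 1, "kind": "click"}, {"id": 2, "kind": "view"}],
)
```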

duckplus.io.read_odbc_query(duckcon: duckplus.duckcon.DuckCon, connection_string: str, query: str, *, parameters: collections.abc.Iterable[Any] | None = None) → duckplus.relation.Relation

Execute an ODBC query via nano-ODBC and return a relation.
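The `parameters` argument suggests that values can be bound separately from the query text. The sketch below keeps the SQL and its values together and uses `?` markers, the standard ODBC parameter marker; that duckplus binds `parameters` positionally to those markers is an assumption. The table, columns, and the helpers `orders_since_query` and `load_orders_over_odbc` are hypothetical.

```python
from typing import Any


def orders_since_query(customer_id: int, since: str) -> tuple[str, list[Any]]:
    """Return the query text and its positional parameters as a pair."""
    query = (
        "SELECT order_id, amount FROM orders "
        "WHERE customer_id = ? AND order_date >= ?"
    )
    # One value per "?" marker, in the order the markers appear.
    return query, [customer_id, since]


def load_orders_over_odbc(duckcon, connection_string: str, customer_id: int, since: str):
    """Call read_odbc_query with values passed separately from the SQL."""
    # Imported lazily so the query builder can be tried without duckplus.
    from duckplus.io import read_odbc_query

    query, parameters = orders_since_query(customer_id, since)
    return read_odbc_query(duckcon, connection_string, query, parameters=parameters)


example_query, example_parameters = orders_since_query(7, "2024-01-01")
```

Passing values through `parameters` instead of formatting them into the SQL string avoids quoting mistakes and injection risks.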

duckplus.io.read_odbc_table(duckcon: duckplus.duckcon.DuckCon, connection_string: str, table: str) → duckplus.relation.Relation

Scan an ODBC table via nano-ODBC and return a relation.

duckplus.io.read_excel(duckcon: duckplus.duckcon.DuckCon, source: str | os.PathLike[str], *, sheet: str | int | None = None, header: bool | None = None, skip: int | None = None, skiprows: int | None = None, limit: int | None = None, names: collections.abc.Sequence[str] | None = None, dtype: collections.abc.Mapping[str, str] | collections.abc.Sequence[str] | None = None, all_varchar: bool | None = None) → duckplus.relation.Relation

Load an Excel workbook via DuckDB’s excel extension.
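The signature exposes both `skip` and `skiprows`, and this page does not say what happens when both are supplied. One cautious option is to reconcile them on the caller's side and pass only one. The sketch below does that; `pick_one` and `load_sheet` are hypothetical helpers, and treating the two names as aliases for the same setting is an assumption.

```python
from typing import Optional


def pick_one(
    primary_name: str,
    primary: Optional[int],
    alias_name: str,
    alias: Optional[int],
) -> Optional[int]:
    """Return whichever value is set; reject two conflicting values."""
    if primary is not None and alias is not None and primary != alias:
        raise ValueError(
            f"{primary_name}={primary} conflicts with {alias_name}={alias}"
        )
    return primary if primary is not None else alias


def load_sheet(duckcon, source: str, *, sheet=None, skip=None, skiprows=None):
    """Call read_excel with a single, reconciled row-skip setting."""
    # Imported lazily so pick_one can be tried without duckplus installed.
    from duckplus.io import read_excel

    rows_to_skip = pick_one("skip", skip, "skiprows", skiprows)
    return read_excel(duckcon, source, sheet=sheet, header=True, skip=rows_to_skip)


from_primary = pick_one("skip", 2, "skiprows", None)
from_alias = pick_one("skip", None, "skiprows", 3)
neither = pick_one("skip", None, "skiprows", None)
```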