Sales analytics pipeline demo#

The duckplus.examples.sales_pipeline module exercises the core DuckPlus primitives in a realistic reporting scenario. It seeds a managed :class:~duckplus.duckcon.DuckCon with deterministic orders and returns relations, derives enriched metrics, and aggregates the results for leadership reporting. The example returns a :class:~duckplus.examples.sales_pipeline.SalesDemoReport dataclass so that tests and documentation can embed the generated artefacts directly. The module ships with docstrings that mirror this page, making it easy to jump between prose and code while investigating.

Running the walkthrough#

Execute the module to build the in-memory dataset and print the captured summaries::

python -m duckplus.examples.sales_pipeline

The command prints region-level and channel-level results followed by a sample SELECT statement emitted by the typed builder.

Tip

The demo requires no external data sources—the dataset is synthesised from Python literals so it runs identically on every machine. This makes it ideal for onboarding sessions or quick smoke tests when you upgrade DuckDB.

Preview rows#

The helper stores a compact preview to make doc examples reproducible. The first five enriched rows are::

(1, 2024-06-01, 'north', 'acme', 'online', False, 120.0, 18.5, None,
 101.5, 7.105, 94.395, False, 'starter', False)
(2, 2024-06-01, 'north', 'acme', 'field', True, 240.0, 22.0,
 'Damaged packaging', 218.0, 15.26, 202.74, False, 'growth', True)
(3, 2024-06-02, 'west', 'venture', 'field', False, 310.0, 35.0, None,
 275.0, 19.25, 255.75, True, 'growth', False)
(4, 2024-06-02, 'west', 'venture', 'online', False, 180.0, 15.0, None,
 165.0, 11.55, 153.45, False, 'starter', False)
(5, 2024-06-03, 'south', 'nomad', 'online', True, 95.0, 9.0,
 'Late delivery', 86.0, 6.02, 79.98, False, 'starter', True)

The values mirror the tuples stored in :attr:SalesDemoReport.preview_rows <duckplus.examples.sales_pipeline.SalesDemoReport.preview_rows>. Because the dataclass captures both the enriched relation and its metadata, you can assert on report.preview_columns in tests to confirm column order and retain deterministic docs.

Region performance#

SalesDemoReport.region_rows summarises return rates and revenue by sales region. The deterministic output enables the documentation and tests to agree on the same numbers. The aggregation uses typed expressions for sum and count_if to demonstrate how numeric helpers compose:

region

total_orders

net_revenue

high_value_orders

return_rate

east

2

301.0

1

0.50

north

2

319.5

0

0.50

south

2

448.0

1

0.50

west

2

440.0

1

0.00

Channel performance#

The channel summary surfaces repeat behaviour and contribution averages::

('field', 2, 1, 229.245)
('online', 4, 1, 166.12125)
('partner', 2, 1, 139.965)

These rows correspond to :attr:SalesDemoReport.channel_rows <duckplus.examples.sales_pipeline.SalesDemoReport.channel_rows>. Call summarise_by_channel from the module when you need to recompute the relation for exploratory analysis.

Typed projection example#

The demo emits the typed SELECT used to showcase if_exists clauses. It replaces the service_tier column with a computed label while falling back to fulfilled when return_reason is absent::

SELECT * REPLACE (
    CASE WHEN "is_returned" THEN 'service'
         WHEN "is_high_value" THEN 'priority'
         ELSE "service_tier" END AS "service_tier",
    CASE WHEN "return_reason" IS NULL THEN 'fulfilled'
         ELSE "return_reason" END AS "return_reason"
),
sum("net_revenue") AS "cumulative_net"
FROM enriched_orders

Because the SELECT builder is dependency-aware, the optional clauses disappear if an upstream relation omits return_reason or net_revenue. Reuse the build_enriched_orders helper in your own scripts when you want to add new metrics or persist the intermediate relation to disk.