xlfilldown: the complete guide ( Excel Fill Down )

xlfilldown streams an Excel sheet into SQLite or a new Excel sheet, forward-filling selected columns by header name (or letters), optionally keeping original Excel row numbers and computing a stable SHA-256 row hash. It works in constant memory, so it’s great for big workbooks.

0) Install

pip install xlfilldown
# or
pipx install xlfilldown

Python ≥ 3.9. Depends on openpyxl.

1) Why “fill-down” is tricky (and how xlfilldown solves it)

Typical spreadsheets use tiered headers: a high-level “group” column appears once, and lower tiers appear below it. Naïve tutorials copy values down independently, which can accidentally leak stale context.

xlfilldown supports two modes:

hierarchical (default): higher tiers reset lower tiers when they change
(great for “Region > Country > City” style data)
independent: pandas-style ffill; each listed column carries on its own

You choose the mode with --fill-mode hierarchical|independent. Order matters only in hierarchical mode (left = highest tier).

2) Your first 3-minute success

Say Sheet1 looks like this:

Region	Country	City	Value
Europe
	Germany
		Bonn	10
		Berlin	12

Asia
	Japan
		Osaka	7
		Kyoto	8

Goal: fill Region/Country/City so every row has context, then write to SQLite.

xlfilldown db \
  --infile data.xlsx \
  --insheet "Sheet1" \
  --header-row 1 \
  --fill-cols '["Region","Country","City"]' \
  --require-non-null '["Value"]' \
  --db out.db \
  --table fact_locations \
  --row-hash \
  --excel-row-numbers \
  --drop-blank-rows

What you get (conceptually):

row_hash  excel_row  Region  Country  City   Value
...       "4"        Europe  Germany  Bonn    "10"
...       "5"        Europe  Germany  Berlin  "12"
...       "9"        Asia    Japan    Osaka   "7"
...       "10"       Asia    Japan    Kyoto   "8"

row_hash	excel_row	Region	Country	City	Value
aa93f736faca44b3734f2784f5beac08eb2a8c0049cda70c6ec5a736159848e6	4	Europe	Germany	Bonn	10
2ef0162949e29c587b1071c6435be56e1cb338986edc541974f353e1d7b69f56	5	Europe	Germany	Berlin	12
9c9f43c630a1a3ecf50ff26a8b07dc2cd005adfeb2d179bd2c32ee8e0e232fd7	9	Asia	Japan	Osaka	7
55d85bf23152d147c61fef8349a386c4e62edc847797e072909d3d2313164044	10	Asia	Japan	Kyoto	8

Notes:

All destination columns are stored as TEXT (including excel_row).
Numbers and dates are written as canonical strings (e.g., 1.0 → "1").
row_hash is a SHA-256 over the data columns (in header order) after fill.

3) Using column letters (A, C, AE…)

Don’t want to type headers? Point to the header row and pass letters:

xlfilldown db \
  --infile data.xlsx \
  --insheet "Sheet1" \
  --header-row 1 \
  --fill-cols-letters A C \
  --db out.db

Letters are mapped to header names in the header row you provide.
If a letter points to an empty header, xlfilldown will error clearly.
--fill-cols and --fill-cols-letters are mutually exclusive.

4) Hierarchical vs Independent (with examples)

Hierarchical (default)

--fill-cols '["Region","Country","City"]'
--fill-mode hierarchical   # default; this line can be omitted

Behavior:

When Region changes, both Country and City carries reset on that row.
When Country changes, City resets on that row.

Independent

--fill-cols '["Region","Country","City"]'
--fill-mode independent

Behavior:

Each of the three columns carries forward independently (like pandas ffill).
Order does not matter.

Tip: If you saw “leaking” of old City names into a new Country, you probably wanted hierarchical.

5) Dropping rows you don’t want

Two powerful filters apply after padding:

Spacer rows (drop rows blank across your fill columns):

--drop-blank-rows

Useful when you have visual separators in your source sheet.

Required non-null (drop if any listed headers are blank after fill):

--require-non-null '["Region","City","Value"]'
# or by letters, resolved to names via header row:
--require-non-null-letters A D

You can combine name- and letter-based lists; they’re merged and de-duplicated.

6) Writing to Excel instead of SQLite

xlfilldown xlsx \
  --infile data.xlsx \
  --insheet "Sheet1" \
  --header-row 1 \
  --fill-cols '["Region","Country","City"]' \
  --outfile processed.xlsx \
  --outsheet "Filled" \
  --if-exists replace \
  --row-hash \
  --excel-row-numbers

Sheet-level --if-exists:

fail (default) – error if the sheet exists
replace – recreate the sheet fresh
append – append after validating headers match exactly (including optional columns like excel_row and row_hash).

Pro move: When creating a new workbook, xlfilldown uses write_only=True for streaming speed.

7) Append semantics (exact-schema match)

Appending only works if destination columns match exactly and in order:

SQLite:
[row_hash?] [excel_row?] + headers… as TEXT, exact order.
Excel:
Header row must equal the expected list (including optional columns if enabled).

This rigor prevents silent data drift.

8) Verifying results (quick queries)

SQLite:

-- Check first 5 rows
SELECT excel_row, Region, Country, City, Value FROM fact_locations LIMIT 5;

-- Find potentially duplicated logical rows
SELECT row_hash, COUNT(*) c FROM fact_locations GROUP BY 1 HAVING c > 1;

-- Spot rows missing required context (should be none if you used --require-non-null)
SELECT * FROM fact_locations WHERE Region IS NULL OR Country IS NULL;

Excel:

Open processed.xlsx, inspect the header row.
If you used --row-hash/--excel-row-numbers, they appear before data headers.

9) Python API (same power, full control)

from xlfilldown.api import ingest_excel_to_sqlite, ingest_excel_to_excel

## Python API

```python
from xlfilldown.api import ingest_excel_to_sqlite, ingest_excel_to_excel

# → SQLite
summary = ingest_excel_to_sqlite(
    file="data.xlsx",
    sheet="Sheet1",
    header_row=1,
    # choose one:
    fill_cols=["columnname1", "columnname2", "anothercolumn,3"],   # by header names
    # fill_cols_letters=["A", "B", "C"],                           # or by Excel letters
    db="out.db",
    table=None,
    drop_blank_rows=True,
    # choose one (or both, merged & de-duped):
    require_non_null=["columnname1", "columnname2"],               # by header names
    # require_non_null_letters=["A", "B"],                         # or by Excel letters
    row_hash=True,
    excel_row_numbers=True,
    if_exists="replace",
    batch_size=1000,
    fill_mode="hierarchical",    # default hierarchical fill
    # fill_mode="independent",   # independent (pandas-style) fill
)

# → Excel
summary = ingest_excel_to_excel(
    file="data.xlsx",
    sheet="Sheet1",
    header_row=1,
    fill_cols=["columnname1", "columnname2", "anothercolumn,3"],
    # or: fill_cols_letters=["A", "B", "C"],
    outfile="out.xlsx",
    outsheet=None,
    drop_blank_rows=True,
    require_non_null=["columnname1", "columnname2"],
    # or: require_non_null_letters=["A", "B"],
    row_hash=True,
    excel_row_numbers=True,
    if_exists="replace",
    fill_mode="independent",     # independent (pandas-style) fill
    # fill_mode="hierarchical",  # hierarchical (default)
)

Return values include row counts and flags indicating whether hash/row-numbers were written.

10) Recipes you’ll actually use

A. Group header rows + detail rows → keep only detail

Your sheet uses “group header” rows with just Region, then detail rows with Value.

xlfilldown db \
  --infile data.xlsx --insheet "Sheet1" --header-row 1 \
  --fill-cols '["Region","Country","City"]' \
  --require-non-null '["Value"]' \  # drop group headers lacking Value
  --drop-blank-rows \
  --db out.db --table facts

B. Only certain columns matter

You have many headers but only want a subset in the destination?
Create a view (SQLite) or a second pass to slim the data. (xlfilldown ingests all non-empty headers by design; this keeps hashing stable and predictable.)

C. Use letters when header names are messy

Headers like "anothercolumn,3" are awkward to type:

xlfilldown xlsx \
  --infile messy.xlsx --insheet "Raw" --header-row 3 \
  --fill-cols-letters A C AE \
  --outfile clean.xlsx --outsheet "Filled"

D. Strict filtering after fill

Drop any row where Region or City is missing after padding:

--require-non-null '["Region","City"]'

11) Performance tips

Batching (SQLite): --batch-size 1000 is a good default. Increase for very wide tables or slow disks.
Indexes: When --row-hash or --excel-row-numbers are enabled, helpful indexes are created automatically.
Streaming I/O: Source is opened with read_only=True, data_only=True. New Excel workbooks use write_only=True.
Memory: Processing is row-streamed; works well on large sheets.

12) Troubleshooting (messages you’ll see and what to do)

“Use only one of –fill-cols or –fill-cols-letters”
Pick one input style.
“Column letter X refers to an empty header cell”
Your header cell is blank (or whitespace/“nan”). Either populate it in Excel or switch to --fill-cols with explicit names.
“Duplicate header names found”
Rename duplicates in your sheet; xlfilldown requires unique headers (after trim/“nan” normalization).
“Header row N exceeds sheet max row M”
You pointed at the wrong header row.
Append failed with schema mismatch
You changed optional columns (e.g., added row_hash) or header order. Use --if-exists replace or make the destination match.

13) When to use hierarchical vs independent

Hierarchical

Data has tiers of context: Region → Country → City
When higher tier changes, you want lower tiers to reset (avoid stale carry)

Independent

Columns are unrelated (e.g., “Last Seen Date” and “Owner”)
You explicitly want each column to “carry forward” separately

14) FAQ

Q: Do you fill completely empty rows?
A: No. Rows that are empty across all headers are preserved (or dropped if --drop-blank-rows). The carry persists past them.

Q: Why is everything TEXT in SQLite/Excel output?
A: Determinism. Values are canonicalized to stable text for storage and hashing. You can cast in downstream tools.

Q: Does hashing treat 1, 1.0, and Decimal('1.00') the same?
A: Yes. They canonicalize to "1".

Q: Can I keep Excel’s original row numbers?
A: Yes: --excel-row-numbers adds an excel_row TEXT column (1-based).

Q: Can I preview before writing?
A: Not built-in yet. A common pattern is writing to a temp SQLite and inspecting with sqlite3/DB Browser.

15) Copy-paste cheat sheets

CLI (SQLite)

# basic
xlfilldown db --infile in.xlsx --insheet S --header-row 1 \
  --fill-cols '["A","B"]' \
  --db out.db

# hierarchical tiers + filters + extras
xlfilldown db --infile in.xlsx --insheet S --header-row 1 \
  --fill-cols '["Region","Country","City"]' \
  --drop-blank-rows \
  --require-non-null '["City","Value"]' \
  --row-hash --excel-row-numbers \
  --db out.db --table facts --if-exists replace

CLI (Excel)

xlfilldown xlsx --infile in.xlsx --insheet S --header-row 1 \
  --fill-cols-letters A C AE \
  --outfile processed.xlsx --outsheet Filled \
  --if-exists replace --row-hash --excel-row-numbers

Python

from xlfilldown.api import ingest_excel_to_sqlite

summary = ingest_excel_to_sqlite(
    file="in.xlsx",
    sheet="S",
    header_row=1,
    # choose one:
    fill_cols=["Tier1","Tier2","Tier3"],
    # fill_cols_letters=["A","B","C"],
    db="out.db",
    drop_blank_rows=True,
    require_non_null=["Tier3"],
    row_hash=True,
    excel_row_numbers=True,
    if_exists="replace",
    fill_mode="hierarchical",    # or "independent"
)
print(summary)

Raw Ingest Examples

Sometimes you may want to ingest a sheet exactly as-is, without applying any fill-down logic.
This is useful if you just want to load the data into SQLite or Excel with audit columns (row_hash, excel_row) and handle nulls/blanks later.

CLI: Raw ingest to SQLite

xlfilldown db \
  --infile data.xlsx --insheet Sheet1 --header-row 1 \
  --db out.db --table raw_ingest \
  --ingest-mode raw \
  --row-hash --excel-row-numbers --if-exists replace

CLI: Raw ingest to Excel

xlfilldown xlsx \
  --infile data.xlsx --insheet Sheet1 --header-row 1 \
  --outfile out.xlsx --outsheet RawSheet \
  --ingest-mode raw \
  --row-hash --excel-row-numbers --if-exists replace

Python API: Raw ingest to SQLite

from xlfilldown.api import ingest_excel_to_sqlite

summary = ingest_excel_to_sqlite(
    file="data.xlsx", sheet="Sheet1", header_row=1,
    ingest_mode="raw",                      # skip fill-down
    db="out.db", table="raw_ingest",
    row_hash=True, excel_row_numbers=True,
    if_exists="replace",
)
print(summary)

Python API: Raw ingest to Excel

from xlfilldown.api import ingest_excel_to_excel

summary = ingest_excel_to_excel(
    file="data.xlsx", sheet="Sheet1", header_row=1,
    ingest_mode="raw",                      # skip fill-down
    outfile="out.xlsx", outsheet="RawSheet",
    row_hash=True, excel_row_numbers=True,
    if_exists="replace",
)
print(summary)

One thought on “xlfilldown: the complete guide ( Excel Fill Down )”

The Secret Hierarchies of Excel: Why Managers Merge, and Engineers Cry - Rex Bytes says:

October 13, 2025 at 9:47 am

[…] xlfilldown: the complete guide ( Excel Fill Down ) One Million Rows: Testing xlfilldown Performance and Constant-Memory Design […]

Loading...

Log in to Reply

ByRex Bytes

0) Install

1) Why “fill-down” is tricky (and how xlfilldown solves it)

2) Your first 3-minute success

3) Using column letters (A, C, AE…)

4) Hierarchical vs Independent (with examples)

Hierarchical (default)

Independent

5) Dropping rows you don’t want

Spacer rows (drop rows blank across your fill columns):

Required non-null (drop if any listed headers are blank after fill):

6) Writing to Excel instead of SQLite

7) Append semantics (exact-schema match)

8) Verifying results (quick queries)

9) Python API (same power, full control)

10) Recipes you’ll actually use

A. Group header rows + detail rows → keep only detail

B. Only certain columns matter

C. Use letters when header names are messy

D. Strict filtering after fill

11) Performance tips

12) Troubleshooting (messages you’ll see and what to do)

13) When to use hierarchical vs independent

Hierarchical

Independent

14) FAQ

15) Copy-paste cheat sheets

CLI (SQLite)

CLI (Excel)

Python

Raw Ingest Examples

CLI: Raw ingest to SQLite

CLI: Raw ingest to Excel

Python API: Raw ingest to SQLite

Python API: Raw ingest to Excel

Share this:

Like this:

ByRex Bytes

Related Post

One thought on “xlfilldown: the complete guide ( Excel Fill Down )”

Leave a Reply Cancel reply

You missed