How do I populate spec_example_transactions from my core banking system?¶
Engineering walkthrough — Data Integration Team. Foundational.
The story¶
You've got an upstream core banking system with a gl_postings
table (or its local equivalent — general_ledger.entry,
accounting.posting_detail, etc.). It carries one row per posting
leg already, which is the natural granularity of our
spec_example_transactions table. You need to write the ETL job that
lands it in our two-table schema by the morning cut so the four
L2-fed dashboards (L1 Reconciliation, L2 Flow Tracing,
Investigation, Executives) work.
The good news: it's mostly a column-rename. The contract is small (11 mandatory columns + a handful of conditional ones — see Schema_v6.md → ETL contract / minimum viable feed). The bad news: skip the wrong column and a downstream check goes silent. So this walkthrough covers the canonical projection plus the per-column failure modes.
The question¶
"For my core banking system's gl_postings table, what's the
canonical projection that maps it to spec_example_transactions? What
columns must I populate, and what columns can wait until v2?"
Where to look¶
Two reference points:
docs/Schema_v6.md— column-level contract and per-column failure modes ("If you skip this, what dashboard breaks?").common/l2/schema.py::emit_schema(l2_instance)— the source of truth for the prefixed DDL. Every base table, view, and matview your dashboards read is emitted here, all under the L2 instance prefix. Call it from Python (or apply directly viaquicksight-gen schema apply --execute) to see the rendered output for your L2 instance.
The spec_example in this walkthrough's SQL is your L2 instance name
(e.g., spec_example); your ETL substitutes it once when wiring
the projection.
What you'll see in the demo¶
Run (substitute your own L2 path for the bundled fixture below):
from quicksight_gen.common.l2.loader import load_instance
from quicksight_gen.common.l2.schema import emit_schema
l2 = load_instance("tests/l2/spec_example.yaml")
print(emit_schema(l2)[:4000])
The first CREATE TABLE block is spec_example_transactions itself
— the column list, types, and constraints your projection has to
satisfy. The second is spec_example_daily_balances. Read both
end-to-end before writing the projection; they're the contract.
For an end-to-end mapping from core_banking.gl_postings →
spec_example_transactions, see Example 1 in docs/Schema_v6.md
(the SQL block under "Populating customer DDA postings from core
banking"). It's a real INSERT INTO ... SELECT FROM against a
hypothetical core-banking source schema.
What it means¶
For every row your ETL writes, you're committing to a contract:
- The 11 mandatory columns (per Schema_v6.md → minimum viable feed) get the row visible on the dashboards at all.
transfer_parent_idpopulated only for chained transfers (e.g.,external_txn → payment → settlement → saleor any reversal chain your L2 declares). Skip it and pipeline-traversal walkthroughs (Money Trail, Account Network) silently return nothing for the affected rows.origin = 'ExternalForcePosted'on Fed / processor force-posts. Skip it and the L1 drift split between bank-initiated activity and force-posted activity collapses (rows look like normal operator postings, drift checks under-fire).metadataJSON — the universal extras container. Skip it on day 1 if your downstream consumer doesn't need it; populate it in priority order (sourcefirst, then per-transfer_typekeys per the catalog). The catalog tables in Schema_v6 list the keys + what each one drives.
Everything else (memo, external_system, account_parent_role)
is conditional — populate when the downstream consumer demands it.
Drilling in¶
The mapping pattern looks like this for a customer-DDA posting (from Schema_v6 Example 1, abbreviated):
INSERT INTO spec_example_transactions (
transaction_id, transfer_id, transfer_type, origin,
account_id, account_name, account_parent_role, account_role,
account_scope, amount_money, amount, status,
posting, business_day_start, memo, metadata
)
SELECT
p.posting_id, -- your PK
p.transfer_id, -- your transfer grouping
p.transfer_type, -- map your enum to ours
'InternalInitiated' AS origin, -- or ExternalForcePosted for Fed
p.account_number AS account_id,
a.account_name,
a.parent_role AS account_parent_role,
a.account_role,
a.account_scope, -- Internal / External
p.signed_amount AS amount_money,
ABS(p.signed_amount) AS amount,
CASE WHEN p.posting_status = 'P' THEN 'Posted' ELSE 'Failed' END,
p.posting_timestamp AS posting,
p.posting_timestamp::date AS business_day_start,
p.memo,
JSON_OBJECT('source' VALUE 'core_banking')
FROM core_banking.gl_postings p
JOIN core_banking.accounts a ON a.account_number = p.account_number
WHERE p.posting_date >= CURRENT_DATE - INTERVAL '7 days';
A few things to note about this projection:
business_day_startis denormalized frompostingdeliberately — fast date filters in the dashboard datasets need a column they can range-scan without an expression cast. It's redundancy for query speed; the cost is your ETL writes one extra column.statusmaps from your status enum to ours. Anything that's notPostedMUST bePendingorFailed(no fourth state) — the drift check and net-zero check bothWHERE status = 'Posted'to exclude in-flight or rejected legs.amount_moneyis+for money flowing INTO the account (adebitin bank's-bookkeeping terms),−for money flowing OUT (acredit).spec_example_daily_balances.moneyfor any account-day equalsSUM(amount_money)up to that day, so getting this sign right is what makes the drift check honest. If your upstream uses the opposite sign convention, flip it here, not later in a view. Every check assumes our sign convention.metadatacarriessourceon every row from this projection (driven by theJSON_OBJECT(... VALUE 'core_banking')literal). That single key is enough to satisfy the Investigation provenance walks on day 1.
Next step¶
Once your projection is wired up:
- Populate a small slice — one day, one source system. Don't try to backfill 90 days on the first run.
- Run the validation walkthrough (How do I prove my ETL is working before going live?) — it walks you through the net-zero, drift-recompute, and parent-chain integrity checks you should run before declaring the load complete.
- Open the L1 Reconciliation Dashboard's Today's Exceptions sheet — if the KPI reads 0 with no detail rows, your feed landed and the contract holds. If KPIs spike unexpectedly, see What do I do when the demo passes but my prod data fails? for the symptom-organized debug recipes.
- Iterate on metadata — once the minimum feed is stable,
layer in
transfer_parent_idand the per-transfer_typemetadata keys per the Metadata JSON columns contract.
If your upstream source isn't a gl_postings table — say it's a
processor report, a Fed statement file, or a sweep-engine log —
the same projection shape applies, but the inbound columns differ.
Schema_v6.md's examples cover Fed-statement and processor-feed
ingest; the same INSERT INTO spec_example_transactions pattern
applies regardless of source.
Related walkthroughs¶
- How do I prove my ETL is working before going live? — the next step after writing the projection. Validates the invariants the dashboards depend on.
- How do I tag a force-posted external transfer correctly? —
the canonical pattern for Fed-statement ingest, which is the one
case where
origin = 'ExternalForcePosted'. - How do I add a metadata key without breaking the dashboards? — the extension contract for when your team needs a new metadata field.
- Schema_v6 → ETL contract / minimum viable feed — the column-by-column day-1 minimum the projection must satisfy.
- Investigation: Where did this transfer originate? —
a downstream consumer walkthrough: what an analyst does with
the
spec_example_transactionsrows your projection lands.