Fiscal Receipts

Methodology

Last updated: 2026-06-12

1. What this is

Fiscal Receipts connects four things that live in separate government silos: what agencies said money was for (budget documents), what was actually obligated and to whom (federal and state spending records), who ultimately received it (contractors, grantees, and their corporate parents), and what auditors said about it (GAO findings, improper-payment estimates). Every number on this site carries a citation to the exact source document, page, or API endpoint it came from. If we cannot cite it, we do not publish it.

2. Where every number comes from

Federal awards — USAspending.gov

The official federal award database (contracts, grants, loans, and subawards), mandated by the DATA Act. We download bulk archive ZIP files from files.usaspending.gov/award_data_archive/, convert them to compressed Parquet, and record the exact file name, URL, and SHA-256 hash of every file. Current scope: Department of Defense agencies, FY2017 onward. Update cadence: monthly.
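The hash-and-record step can be sketched as follows. This is a minimal illustration, not the site's actual ingest code: the function names, the manifest field names, and the example file name are all assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a large archive never has to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_entry(path: Path, url: str) -> dict:
    """Build one manifest row. Field names are hypothetical."""
    return {
        "file_name": path.name,
        "url": url,
        "sha256": sha256_of_file(path),
    }
```

Recording the hash at download time is what lets a later build confirm that the file on disk is byte-for-byte the file that was originally fetched.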

DoD budget justification books (“J-books”)

The detailed budget submissions the Pentagon sends to Congress each spring, published at comptroller.defense.gov. These cover every R&D program (R-2 exhibits) and procurement budget line (P-40 exhibits) with program narratives, project-level cost tables, and congressional justifications.

The Pentagon's budget system embeds its own database inside these PDFs: the full structured XML is attached inside the PDF itself. We extract that XML directly. For the rare document that lacks an XML attachment, we use a deterministic PDF parser as a fallback, with an accuracy gate requiring ≥98% numeric-field agreement against XML-backed ground truth. We also download the official R-1 and P-1 Excel rollups, which serve as independent control totals. Update cadence: annual.
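As a rough sketch, the accuracy gate for the fallback parser could be computed like this. The field keys, the per-field tolerance, and the function names are assumptions; only the 98% threshold comes from the text above.

```python
def numeric_agreement(parsed: dict, truth: dict, tol: float = 0.0005) -> float:
    """Share of ground-truth numeric fields the parser reproduced within tol.

    `truth` holds values taken from XML-backed documents; `parsed` holds the
    fallback PDF parser's output for the same fields. `tol` is an assumed
    rounding allowance, not a documented value.
    """
    if not truth:
        raise ValueError("ground truth is empty; agreement is undefined")
    hits = sum(
        1
        for key, value in truth.items()
        if key in parsed and abs(parsed[key] - value) <= tol
    )
    return hits / len(truth)


def passes_accuracy_gate(parsed: dict, truth: dict, threshold: float = 0.98) -> bool:
    """True when agreement meets the stated >= 98% requirement."""
    return numeric_agreement(parsed, truth) >= threshold
```

A field missing from the parser output counts as a disagreement, so a parser cannot pass the gate by silently skipping hard fields.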

Improper-payment estimates — paymentaccuracy.gov

Agencies are legally required to estimate and report payment-error rates. The federal government reported approximately $186 billion in improper payments in FY2025. The per-agency dollar exposure figure we show is derived by multiplying the published rate by the published outlay figure; because it is a derived estimate, it is labeled as such. Update cadence: annual.

GAO high-risk list — gao.gov/high-risk-list

The GAO's biennial list of federal programs at high risk for fraud, waste, or mismanagement. Update cadence: biennial.

Senate lobbying disclosures — lda.senate.gov

The Senate Lobbying Disclosure Act database (lda.senate.gov/api/v1) contains filings for 2025 and prior years, each with a permanent UUID, registrant, client company, dollar amounts, agencies lobbied, and issue text that frequently names specific programs. We have linked LDA client names to our company-family database: 32,780 program mentions across 245 programs connect filings to budget lines. Lobbying income and expenditure by year are shown alongside federal obligations received — influence is presented side by side with outcomes, never as a causal claim.

State checkbooks — California and Connecticut (pilot)

California's Open Fi$Cal and Connecticut's OpenCheckbook publish transaction-level government spending. We aggregate by department, spending category, and fiscal year, and use Census Bureau population estimates (NST-EST series) for per-capita comparisons. The category mappings that bridge both states' classification systems are published alongside the data.

California data covers FY2025 only: CA Open Fi$Cal updates on a lag, and prior years are not yet ingested.

3. How we verify

Reconciliation

Every figure extracted from a J-book clears two arithmetic checks. Check A: project-level amounts within an exhibit must sum to the program-element total in that same exhibit (tolerance: ±$0.001M). Check B: that program-element total must match the corresponding row in the official R-1 or P-1 Excel rollup. Failures do not get published — they go to a human review queue.
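The two checks can be sketched as below. Amounts are in $M and use `Decimal` to avoid float rounding at the ±0.001 boundary. The function names are illustrative, and applying the same tolerance to Check B is an assumption, since the text only says the totals must match.

```python
from decimal import Decimal
from typing import Iterable

TOLERANCE_M = Decimal("0.001")  # $M, the stated tolerance for Check A


def check_a(project_amounts: Iterable[Decimal], pe_total: Decimal) -> bool:
    """Project-level amounts must sum to the program-element total."""
    return abs(sum(project_amounts, Decimal("0")) - pe_total) <= TOLERANCE_M


def check_b(pe_total: Decimal, rollup_total: Decimal,
            tolerance: Decimal = TOLERANCE_M) -> bool:
    """Program-element total must match the R-1 or P-1 rollup row.

    Reusing TOLERANCE_M here is an assumption made for this sketch.
    """
    return abs(pe_total - rollup_total) <= tolerance


def reconcile(project_amounts: Iterable[Decimal], pe_total: Decimal,
              rollup_total: Decimal) -> str:
    """Return 'publish' only when both checks pass; otherwise route to review."""
    amounts = list(project_amounts)
    if check_a(amounts, pe_total) and check_b(pe_total, rollup_total):
        return "publish"
    return "human_review"
```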

Zero-absent rule

When a program has no funding for a given fiscal year, the official rollup workbook simply omits the row. Our reconciliation code recognizes this so zero-funded programs are not incorrectly flagged as failures.
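One way to express the rule, assuming a missing rollup row is represented as `None` (a representation chosen for this sketch, along with the function name and tolerance):

```python
from decimal import Decimal
from typing import Optional


def rollup_matches(pe_total: Decimal, rollup_total: Optional[Decimal],
                   tolerance: Decimal = Decimal("0.001")) -> bool:
    """Compare an extracted total ($M) with its rollup row.

    `rollup_total is None` means the workbook omitted the row entirely.
    That is acceptable only when the extracted total is itself zero.
    """
    if rollup_total is None:
        return abs(pe_total) <= tolerance
    return abs(pe_total - rollup_total) <= tolerance
```

A nonzero extracted total with no rollup row still fails, so the rule cannot hide a genuinely missing control total.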

Coverage gate

At least 99% of R-1 program elements must have either extracted detail or an explicit gap record. Silent holes are a build failure.
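A minimal sketch of the gate, assuming program elements are identified by string codes and that a build failure is signalled by raising an exception (both are assumptions):

```python
def coverage_ratio(r1_elements: set, detailed: set, gap_records: set) -> float:
    """Fraction of R-1 program elements with extracted detail or a gap record."""
    if not r1_elements:
        raise ValueError("no R-1 program elements supplied")
    accounted = r1_elements & (detailed | gap_records)
    return len(accounted) / len(r1_elements)


def enforce_coverage_gate(r1_elements: set, detailed: set, gap_records: set,
                          minimum: float = 0.99) -> float:
    """Fail the build when coverage drops below the stated 99% floor."""
    ratio = coverage_ratio(r1_elements, detailed, gap_records)
    if ratio < minimum:
        silent = sorted(r1_elements - detailed - gap_records)
        raise RuntimeError(
            f"coverage {ratio:.2%} is below {minimum:.0%}; silent holes: {silent}"
        )
    return ratio
```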

Provenance spot-check

Each build randomly samples 50 served facts and mechanically verifies that the cited source document exists on disk, its SHA-256 matches the download manifest, and the XML element path resolves to a real node. A number whose citation chain breaks does not render.
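The three mechanical checks can be sketched with the standard library. This illustration assumes each fact record points at an extracted XML file on disk and carries an ElementTree-style path; the record keys, the manifest shape (file name to hash), and the function names are hypothetical.

```python
import hashlib
import random
import xml.etree.ElementTree as ET
from pathlib import Path


def citation_resolves(fact: dict, manifest: dict) -> bool:
    """Check one fact: file exists, hash matches manifest, XML path resolves."""
    path = Path(fact["source_path"])
    if not path.is_file():
        return False
    if hashlib.sha256(path.read_bytes()).hexdigest() != manifest.get(path.name):
        return False
    try:
        root = ET.parse(path).getroot()
    except ET.ParseError:
        return False
    return root.find(fact["xml_path"]) is not None


def spot_check(facts: list, manifest: dict, sample_size: int = 50,
               seed=None) -> list:
    """Sample up to sample_size facts and return those whose citation breaks."""
    rng = random.Random(seed)
    sample = rng.sample(facts, min(sample_size, len(facts)))
    return [fact for fact in sample if not citation_resolves(fact, manifest)]
```

An empty return value means every sampled citation chain held. Passing a fixed `seed` makes a failing sample reproducible for debugging.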

Per-build automated checks

Every build runs 197 automated test functions across 42 test modules, phase-level verification gates, and 21 dbt data-model assertions. The analyst-agent evaluation set (45 question-answer pairs) requires ≥41 correct answers and 100% citation resolution before shipping.

4. How confident to be

Citation tiers

Every rendered figure is in one of three states:

  • Cited (underlined, clickable): a fact_id resolves to a source document in our citation index, either a J-book PDF page-and-bbox or a specific workbook cell.
  • XML-path chip: a zero-dollar budget line that exists in the structured XML but has no corresponding citation row; the XML element path is displayed.
  • Citation tier pending (⁂): the figure is from a dataset (USAspending, LDA filings, trajectory workbooks) for which row-level citation linkage is not yet complete. This methodology work is in progress.

Company families — registry fact vs. name inference

High confidence (registry fact): SAM.gov records a common registered parent name for the subsidiaries. Medium confidence (name inference): slightly different legal-name variants normalize to the same string. Both tiers appear on screen; the method is always disclosed.
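To illustrate the name-inference tier, here is one plausible normalization. The actual rules are not specified above, so the suffix list and the steps are assumptions, and the company names in the usage note are made up.

```python
import re

# Hypothetical list of legal-form suffixes to ignore when comparing names.
LEGAL_SUFFIXES = {"INC", "INCORPORATED", "LLC", "CORP", "CORPORATION", "CO", "LTD"}


def normalize_name(name: str) -> str:
    """Uppercase, strip punctuation, and drop trailing legal-form suffixes."""
    tokens = re.sub(r"[^A-Z0-9 ]+", " ", name.upper()).split()
    while tokens and tokens[-1] in LEGAL_SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def same_family_by_name(a: str, b: str) -> bool:
    """Medium-confidence match: both names normalize to the same non-empty string."""
    norm_a, norm_b = normalize_name(a), normalize_name(b)
    return bool(norm_a) and norm_a == norm_b
```

Under these rules, `same_family_by_name("Acme Dynamics, Inc.", "ACME DYNAMICS INC")` is true. The same looseness is why this tier can misgroup unrelated firms with similar names, which is the reason the method is always disclosed.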

Budget-to-contract links

Connecting a budget program element to the contracts that funded it is an inference. We use three tiers. High: the award's federal account code matches the budget line's appropriation, and program-title keywords overlap substantially. Medium: account matches and the contracting sub-agency matches the budget organization. Low: only the account matches. Low-tier links are useful for exploration but are not evidence of a program-to-program connection.
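The tier logic can be sketched as a single function. The record keys, the use of Jaccard similarity for keyword overlap, and the 0.5 cutoff standing in for "overlap substantially" are all assumptions made for illustration.

```python
import re
from typing import Optional


def title_keywords(title: str) -> set:
    """Lowercased alphanumeric tokens longer than two characters."""
    return {t for t in re.findall(r"[a-z0-9]+", title.lower()) if len(t) > 2}


def keyword_overlap(a: str, b: str) -> float:
    """Jaccard similarity of the two titles' keyword sets (assumed metric)."""
    ka, kb = title_keywords(a), title_keywords(b)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 0.0


def link_tier(award: dict, budget_line: dict,
              min_overlap: float = 0.5) -> Optional[str]:
    """Return 'high', 'medium', 'low', or None when accounts do not match."""
    if award["account"] != budget_line["account"]:
        return None  # no account match means no link at any tier
    if keyword_overlap(award["title"], budget_line["title"]) >= min_overlap:
        return "high"
    if award["sub_agency"] == budget_line["organization"]:
        return "medium"
    return "low"
```

Every tier requires the account match first, which is why a low-tier link says only that the money came from the same appropriation.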

Derived figures are labeled derived

Any figure computed from published rates or published subtotals — rather than directly reported in a source document — is labeled as derived wherever it appears.

5. Known limitations

  • FY attribution is approximate. Contracts execute across multiple fiscal years; our current method assigns links based on which fiscal years' award transactions share the same federal account code.
  • Losing bidders are not in federal data. FPDS records offer counts and competition type; it does not name unsuccessful bidders.
  • Company family grouping by name inference can err. Acquired, divested, or renamed subsidiaries may be grouped incorrectly. Corrections create superseding records; the original is retained.
  • Improper-payment dollar figures are derived from OMB-published rates and carry the same uncertainty as the underlying rate estimates.
  • State comparables depend on category mappings. Treat cross-state comparisons as directional.
  • Classified programs are absent. DoD classified budget lines are not in public J-books or USAspending.
  • Data-as-of dates vary by source. USAspending updates monthly; GAO high-risk is biennial; J-books are annual. Numbers on the same page may reflect different time periods.
  • Lobbying-obligations correlation is not causation. Lobbying expenditure appearing alongside federal obligations is presented for transparency, not to imply that lobbying caused any particular award.

Coverage & limits

Several surfaces on this site are deliberately partial: we show a link only when we can defend it, and we say so where the data renders instead of burying the caveat here. Each block below is the “why” behind one of those inline scope notes.

Follow the dollar — 17 of 326 programs

The follow-the-dollar view draws a budget line's path to specific awards, recipient families, and districts. That link is an inference (§4): we render the flow only for the high-confidence crosswalk tier, where the award's federal account matches the budget line's appropriation and program-title keywords overlap substantially. Today that covers 17 of 326 programs, concentrated in DARPA lines whose account structure makes matching reliable. Program pages outside the crosswalk say so in place of the flow — absence of a diagram means we could not defend the link, not that no money moved.

Research dossiers — 50 of 326 programs

Dossiers exist for 50 of 326 programs, selected by ranking FY2026 requested dollars: the top 50 by money at stake, not by editorial judgment. Every dossier sentence must carry a resolvable citation or the build fails (“cited-or-absent”), so programs without a dossier show a one-line note instead of unsourced prose. Coverage grows as the research pipeline is run against more programs.

Company award linkage — 18 of 200 profiled companies

Company profiles cover the top 200 contractor families by DoD obligations. Award rows on those profiles come from the budget→award crosswalk, which currently contains R&D performers rather than primes — so only 18 of 200 profiled companies show linked awards. The remaining profiles still carry obligation totals and lobbying activity; a family-level awards mart is on the roadmap.

District lens — 106 of 435 districts

106 of 435 congressional districts appear in the district lens. A district gets a page only when at least one high-confidence budget→award link places obligated dollars there. That is a consequence of the crosswalk's current 17-program scope, not evidence that other districts receive no defense money. District totals therefore understate true defense spending everywhere they appear.

California — FY2025 only

California figures come from CA Open Fi$Cal and cover FY2025 only — the state publishes on a lag and prior years have not been ingested yet. Connecticut (OpenCheckbook) is the only other state in the pilot. Cross-state comparisons rely on our published category mappings and should be treated as directional, not exact.

FY2026 is a partial year

FY2026 does not close until September 30, 2026, and USAspending reports awards on a rolling basis — any FY2026 award total shown is a partial-year figure that will grow. FY2026 budget figures are the requested amounts from the FY2026 J-books, not enacted appropriations. Comparing partial FY2026 award totals against complete prior years will always understate FY2026.

Anomaly Feed — signal types and thresholds

The /feed page surfaces automated signals computed from the defense budget and award data. Each signal type has a defined threshold; all figures carry citations.

Year-over-Year Swings (yoy_swing)

Programs where FY2025 total is ≥ $50M and the absolute percentage change to FY2026 is ≥ 50%. Budget figures come from the fct_budget_trajectory mart (trajectory pivot of FY2025 and FY2026 enacted/requested budget workbook lines). The figure shown is the percentage change; the delta in dollar terms is the cited trajectory figure.
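The threshold test can be sketched as follows, with amounts in $M. The function names and the choice to skip programs with no FY2026 value are assumptions made for this illustration.

```python
from typing import Optional


def pct_change(fy2025: float, fy2026: float) -> float:
    """Percentage change from FY2025 to FY2026."""
    return (fy2026 - fy2025) / fy2025 * 100.0


def is_yoy_swing(fy2025: float, fy2026: Optional[float],
                 floor_m: float = 50.0, min_abs_pct: float = 50.0) -> bool:
    """True when FY2025 >= $50M and |percentage change| >= 50%."""
    if fy2026 is None or fy2025 < floor_m:
        return False  # null FY2026 is skipped here (an assumed handling)
    return abs(pct_change(fy2025, fy2026)) >= min_abs_pct
```

For example, a program moving from $100M to $160M changes by +60% and qualifies, while a program moving from $40M to $100M does not, because its FY2025 base is under the floor.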

Zeroed in FY2026 (zeroed_fy2026)

Programs that had a positive FY2025 total but show zero or null in FY2026. The figure shown is the last known FY2025 amount. No FY2025 floor — any positive amount qualifies. Programs may be cancelled, transferred, or restructured into another line item.

Award Concentration Shifts (concentration_shift)

Programs with at least $5M in matched obligations whose Herfindahl-Hirschman Index (HHI), computed from high-confidence award transactions grouped by fiscal year, indicates meaningful concentration. The $5M floor applies to matched obligations, not to the index itself. HHI = sum(share² × 10,000) where share = family_obligation / total_obligation; only positive obligations are included (negative/recoupment flows are excluded). An HHI above 1,500 is moderately concentrated; above 2,500 is highly concentrated; 10,000 means a single recipient family received every matched dollar.
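The formula translates directly into code. In this sketch the input is a mapping from recipient family to obligated dollars for one program and fiscal year; the function names and the family labels in the usage note are illustrative.

```python
def hhi(family_obligations: dict) -> float:
    """Herfindahl-Hirschman Index on a 0 to 10,000 scale.

    Only positive obligations count; negative (recoupment) flows are dropped
    before shares are computed.
    """
    positive = [v for v in family_obligations.values() if v > 0]
    total = sum(positive)
    if total <= 0:
        raise ValueError("no positive obligations to compute shares from")
    return sum((v / total) ** 2 * 10_000 for v in positive)


def meets_obligation_floor(family_obligations: dict,
                           floor_usd: float = 5_000_000) -> bool:
    """True when positive matched obligations reach the $5M floor."""
    return sum(v for v in family_obligations.values() if v > 0) >= floor_usd
```

Two families splitting a program evenly give an HHI of 5,000, and a single family gives 10,000. For `{"A": 60, "B": 40, "C": -10}`, family C is excluded and the index is approximately 5,200.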

New Defense Contractors (new_entrant)

Entity families whose first award year in the DoD transaction data is FY2024 or later and whose cumulative positive obligations exceed $1M. “First award year” is determined from the USAspending award archive (FY2017 onward). Entities that received their first award before FY2017 may appear as new entrants due to archive coverage limits — treat as a weak signal. The figure shown is total cumulative obligations (USD).
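A sketch of the selection, assuming transactions arrive as (family, fiscal_year, obligation_usd) tuples; that shape and the function name are assumptions.

```python
from collections import defaultdict


def new_entrants(transactions, first_year_cutoff: int = 2024,
                 min_obligations_usd: float = 1_000_000) -> dict:
    """Map each qualifying family to its cumulative positive obligations.

    A family qualifies when its earliest award year in the data is
    FY2024 or later and its positive obligations exceed $1M.
    """
    first_year = {}
    positive_total = defaultdict(float)
    for family, fiscal_year, amount in transactions:
        first_year[family] = min(fiscal_year, first_year.get(family, fiscal_year))
        if amount > 0:
            positive_total[family] += amount
    return {
        family: positive_total[family]
        for family, year in first_year.items()
        if year >= first_year_cutoff and positive_total[family] > min_obligations_usd
    }
```

Because the earliest year is taken only from the data supplied, a family whose real first award predates the archive would still pass this filter, which is the coverage caveat described above.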

6. Corrections

If you find a number that appears wrong, send us the citation that contradicts it and we will investigate. We follow a supersede-not-delete policy: a corrected record is marked superseded and a new record takes its place. The old record is retained and accessible. Permalinks never stop resolving; they show the current best value alongside the correction history if one exists.
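The supersede-not-delete policy can be sketched with a small in-memory store. The record fields, the function names, and the dictionary store are hypothetical and only illustrate the idea of following a correction chain.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FactRecord:
    fact_id: str
    value: float
    superseded_by: Optional[str] = None  # id of the correcting record, if any


def supersede(store: dict, old_id: str, new_record: FactRecord) -> None:
    """Mark the old record superseded and add the new one; nothing is deleted."""
    store[old_id].superseded_by = new_record.fact_id
    store[new_record.fact_id] = new_record


def resolve(store: dict, fact_id: str):
    """Follow the chain from any permalink id to the current best record.

    Returns (current_record, history), where history lists the superseded
    records in the order they were replaced.
    """
    history = []
    record = store[fact_id]
    while record.superseded_by is not None:
        history.append(record)
        record = store[record.superseded_by]
    return record, history
```

Because the original id stays in the store, a permalink to the old record keeps working and can display both the corrected value and what it replaced.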

7. Cite us / bulk data

When citing a specific figure, include the source citation displayed alongside it: document title, fiscal year, page or XML element path, and the date we retrieved the file. USAspending-derived figures cite the archive file name and SHA-256 hash; J-book figures cite the PDF title, page number, and XML element path.

Bulk data exports (Parquet files with data dictionaries) are available — see Downloads. The export schema includes the same provenance metadata that backs every on-screen figure.