1.300 Public Finance Modeling#

Explainer

Domain Explainer: Public Finance Modeling#

What is Public Finance Modeling?#

Public finance modeling refers to the computational simulation of tax and benefit policies to understand their impacts on individuals, households, and populations. Unlike general-purpose financial modeling (which might analyze corporate cash flows or investment portfolios), public finance modeling specifically deals with government revenue systems and social programs.

Why It Matters#

Policy decisions affect millions of lives. When legislators propose changing tax rates, creating new credits, or modifying benefit eligibility, they need to understand:

Who wins and who loses? Distributional analysis shows which income groups benefit or pay more
How much will it cost? Revenue estimates project fiscal impacts
What are the incentive effects? Marginal tax rate calculations reveal work incentives
Will it reduce poverty? Benefit simulations estimate poverty reduction

Without computational models, policy analysis relies on back-of-the-envelope estimates or small samples that may not represent the full population.

How It Works: Microsimulation#

The core technique is microsimulation:

Start with representative microdata (census, IRS data, surveys)
- Each record represents a household or individual
- Contains income, family structure, state of residence, etc.
Encode tax/benefit rules as code
- Federal income tax: brackets, deductions, credits
- State income tax: rates, conformity with federal rules
- Benefits: eligibility criteria, phase-outs
Apply rules to every record
- Calculate tax liability or benefit eligibility
- Weight by population to get national/state estimates
Compare baseline vs. reform
- Baseline: Current law
- Reform: Proposed policy change
- Difference: Who gains/loses, revenue impact

Example:

Baseline: Child Tax Credit is $2,000 per child
Reform: Increase to $4,000 per child
Model: Apply both rules to 200,000 household records
Output (illustrative figures): “Reform reduces poverty by 1.2M people, costs $100B/year”
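The four steps above can be sketched in a few lines of Python. This is a minimal illustration, assuming a flat, fully refundable per-child credit and a made-up three-record sample; the `Household` class, `child_credit`, and `total_cost` are hypothetical names, not part of any library discussed here.

```python
from dataclasses import dataclass

@dataclass
class Household:
    weight: float    # number of real households this record represents
    children: int    # qualifying children
    income: float    # annual income in dollars (unused by this simple rule)

def child_credit(hh: Household, amount_per_child: float) -> float:
    """Toy rule: a flat, fully refundable credit per child (no phase-in or phase-out)."""
    return amount_per_child * hh.children

def total_cost(records, amount_per_child):
    """Weighted sum of the credit over all records (the aggregation step)."""
    return sum(hh.weight * child_credit(hh, amount_per_child) for hh in records)

# Hypothetical sample; real models use on the order of 10^5 records.
sample = [
    Household(weight=1_000, children=2, income=30_000),
    Household(weight=2_000, children=0, income=55_000),
    Household(weight=500, children=1, income=120_000),
]

baseline = total_cost(sample, 2_000)   # current-law rule
reform = total_cost(sample, 4_000)     # proposed rule
print(f"Added cost of reform: ${reform - baseline:,.0f}")
```

The same pattern scales to real microdata: only the rule function and the record set change, while the weighted comparison of baseline and reform stays the same.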

Who Uses These Tools?#

Government Agencies#

Congressional Budget Office (CBO): Scores legislation for fiscal impact
Treasury Department: Revenue estimates for tax proposals
UK HM Treasury: Has explored PolicyEngine for policy modeling (documented in an algorithmic transparency record)
French Government: Uses OpenFisca for tax-benefit analysis

Think Tanks & Policy Organizations#

Tax Policy Center: Analyzes tax reform proposals
Center on Budget and Policy Priorities: Evaluates anti-poverty programs
American Enterprise Institute: Conservative policy analysis
(Tools are used across the political spectrum)

Academic Researchers#

Study tax incidence (who really pays taxes)
Analyze behavioral responses (do tax changes affect work decisions?)
Evaluate program effectiveness (does EITC reduce poverty?)
Publish in journals like National Tax Journal, Journal of Public Economics

Advocacy Groups#

Labor unions: Analyze wage tax interactions
Business groups: Corporate tax burden analysis
Anti-poverty advocates: Benefit program expansions
State-level organizations: State budget analysis

Technical Challenges#

1. Data Quality#

Microdata is expensive/restricted: IRS Public Use File costs money, has privacy limitations
Survey underreporting: High-income households underrepresented
Imputation needed: Link multiple datasets to get full picture

2. Rule Complexity#

Federal tax code: 6,000+ pages of rules
Interactions: EITC + Child Tax Credit + SNAP + Medicaid + housing assistance all interact
State variations: 50 different state tax codes
Temporal changes: Rules change every year, sometimes multiple times per year

3. Validation Difficulty#

Aggregate statistics: Can compare model output to published IRS totals
Individual accuracy: Hard to validate individual-level calculations
Behavioral responses: Models are often “static” (assume no behavior change)

Why Open Source Matters#

Transparency in policymaking:

Tax laws affect everyone - models should be publicly auditable
Proprietary models (like ITEP) are black boxes - can’t verify methodology
Open-source models can be peer-reviewed by academics

Reproducibility:

Academic papers should provide replication code
Policy organizations should show their work
Open models enable independent validation

Ecosystem effects:

Open tools lower barriers to entry for new researchers
Collaboration improves quality (many eyes on the code)
Government adoption is easier (no licensing fees, vendor lock-in)

Current State of Open Source#

Strong foundation:

Tax-Calculator: US federal, public domain, widely used
PolicyEngine: US + UK, all 50 states, web interface
OpenFisca: Multi-country, government-adopted (France)

Gaps:

Property tax: No open-source solution
Sales tax: Commercial APIs exist (TaxJar, Avalara) but not research tools
Local income taxes: NYC, Philadelphia underserved
Benefits: SNAP, Medicaid less mature than tax modeling

Comparison to Corporate Finance#

Aspect	Public Finance	Corporate Finance
Focus	Taxes, benefits, equity	Investment, valuation, risk
Users	Government, think tanks	Companies, investors
Data	Microdata (census, IRS)	Financial statements
Rules	Tax code, eligibility criteria	Accounting standards
Goals	Equity, revenue, poverty	Profit, shareholder value
Tools	Tax-Calculator, OpenFisca	Excel, Bloomberg Terminal

Why separate? Public finance modeling differs qualitatively from corporate financial modeling:

Discontinuities (phase-outs, cliffs) not smooth curves
Legal complexity (6,000-page tax code vs. GAAP)
Distributional focus (who pays) vs. aggregate focus (total profit)
Public interest (open source, transparency) vs. competitive advantage (proprietary)

Example: Child Tax Credit Reform#

Context: Suppose the Child Tax Credit (CTC) is $2,000 per child under 17, its value under the TCJA. Proposal: increase to $4,000.

Questions policymakers ask:

How much does it cost?
Who benefits?
Does it reduce poverty?
What are the work incentive effects?

How a model answers:

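Output (illustrative figures):

Cost: $120B per year
Poverty reduction: 1.4M people (especially children)
Gains: Up to $2,000/year more per child for families with children, $0 for childless households
Work incentives: Depend on refundability (the refundable portion of the CTC has phased in with earnings above $2,500, so a larger credit can change incentives for low earners)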
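The mechanics behind answers like these can be sketched on a toy dataset. The example below assumes a fully refundable credit and a simple resources-versus-threshold poverty test; the records, thresholds, and function names are hypothetical, and the figures in the list above come from the text, not from this code.

```python
# Each record: (weight, people, children, resources_before_credit, poverty_line)
records = [
    (1_000, 3, 2, 19_000, 25_000),
    (1_500, 4, 2, 20_000, 30_000),
    (2_000, 1, 0, 12_000, 15_000),
    (800, 4, 2, 90_000, 30_000),
]

def people_in_poverty(records, credit_per_child):
    """Weighted count of people whose household resources fall below the line."""
    total = 0
    for weight, people, children, resources, line in records:
        if resources + credit_per_child * children < line:
            total += weight * people
    return total

def credit_cost(records, credit_per_child):
    """Weighted total credit paid out (assumes full refundability)."""
    return sum(w * children * credit_per_child for w, _, children, _, _ in records)

poor_baseline = people_in_poverty(records, 2_000)
poor_reform = people_in_poverty(records, 4_000)
added_cost = credit_cost(records, 4_000) - credit_cost(records, 2_000)

print("People lifted out of poverty:", poor_baseline - poor_reform)
print("Added cost:", added_cost)
```

Note that the second record gains $4,000 but stays below its threshold, which is why cost and poverty reduction do not move in lockstep.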

Policy debate:

Proponents: Reduces child poverty significantly
Critics: Expensive, could prioritize other anti-poverty programs
Model doesn’t resolve debate, but quantifies trade-offs

Key Terms#

Microsimulation: Applying rules to individual records and aggregating
Marginal Tax Rate (MTR): Extra tax paid on next dollar earned
Effective Tax Rate (ETR): Total tax / total income
Distributional Analysis: Who pays / who benefits by income group
Revenue Estimate: Projected government revenue under a policy
Baseline vs. Reform: Current law vs. proposed change
Static Model: Assumes no behavioral response to policy changes
Dynamic Model: Estimates behavioral responses (labor supply, saving)
Incidence: Who ultimately bears the economic burden of a tax

S1: Problem Overview#

The Core Problem#

How do we know what a tax or benefit policy will do before we enact it?

Policymakers face a dilemma:

Pass legislation → wait months/years → see actual effects
OR: Model the policy beforehand → identify problems → adjust

Traditional approach (pre-2000s):

Government agencies build proprietary models
Black-box calculations, no public access
Researchers can’t verify claims
Each state/country duplicates effort

Problem: No transparency, no reproducibility, wasteful duplication

Why This Is Hard#

1. Complexity of Tax Codes#

US federal income tax alone:

~100+ forms and schedules
Hundreds of interacting provisions
Phase-ins, phase-outs, cliffs, kinks
Different definitions of income (AGI, MAGI, earned income)
Credits vs. deductions vs. exemptions
Alternative Minimum Tax (parallel tax system)

Example interaction complexity:

EITC (Earned Income Tax Credit) depends on:
  → Earned income (wages, self-employment)
  → AGI (for phase-out)
  → Number of qualifying children
  → Filing status
  → Investment income limit ($11,000 in tax year 2023)

Change an above-the-line deduction (e.g., student loan interest) →
  → Changes AGI →
  → Changes EITC phase-out (which uses the larger of earned income and AGI) →
  → Changes net refund
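A rough sketch of how an EITC-style credit depends on both earned income and AGI. The default parameters approximate the 2023 schedule for a single filer with one child and should be treated as illustrative assumptions; the function name and signature are hypothetical.

```python
def eitc(earned, agi, investment_income=0.0, *,
         phase_in_rate=0.34, max_credit=3_995.0,
         phase_out_start=21_560.0, phase_out_rate=0.1598,
         investment_limit=11_000.0):
    """Toy EITC: phase-in on earnings, phase-out on the larger of earnings and AGI.

    Parameter defaults are illustrative (roughly 2023, one child, single filer).
    """
    if investment_income > investment_limit:
        return 0.0  # too much investment income: no credit at all (a cliff)
    # Phase-in: credit grows with earned income up to the maximum.
    phase_in = min(phase_in_rate * earned, max_credit)
    # Phase-out: the maximum shrinks once max(earned, AGI) passes the threshold.
    excess = max(0.0, max(earned, agi) - phase_out_start)
    cap = max_credit - phase_out_rate * excess
    return max(0.0, min(phase_in, cap))

same_agi = eitc(earned=25_000, agi=25_000)
higher_agi = eitc(earned=25_000, agi=30_000)   # e.g., a lost deduction raises AGI
print(round(same_agi - higher_agi, 2))          # credit lost to the higher AGI
```

Because the phase-out keys off AGI, any provision that moves AGI changes the credit even when earnings are untouched, which is the interaction the diagram describes.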

2. Multi-Level Government#

In the US:

Federal income tax
State income tax (43 states)
Local income tax (cities: NYC, Philadelphia)
Property tax (3,000+ counties)
Sales tax (11,000+ jurisdictions)

Interactions:

State taxes deductible on federal return (SALT cap)
Some states conform to federal rules, others don’t
Credits for taxes paid to other states
Reciprocity agreements (commuter states)

3. Data Requirements#

To model policies accurately, a model needs:

Microdata: Representative sample of population (Census, IRS)
Tax units: Convert individuals to filing units (married/single)
Income components: Wages, capital gains, dividends, etc.
Demographics: Age, kids, disability status
Weights: Scale sample to full population

Problem: Privacy laws limit access to detailed data

4. Behavioral Responses#

Tax changes affect behavior:

Higher marginal rates → work less (labor supply response)
Tax credits for kids → more childbearing? (demographic response)
Corporate rate changes → investment decisions (capital response)

Static vs. dynamic modeling:

Static: Assume behavior doesn’t change (simpler)
Dynamic: Model behavioral responses (complex, uncertain)

Most libraries are static (documented limitation)

What Public Finance Modeling Solves#

1. Revenue Estimation#

Question: “Will this policy pay for itself?”

Process:

Load representative sample (e.g., 200,000 tax returns representing 150M filers)
Apply current law rules → aggregate revenue
Apply reformed rules → aggregate revenue
Difference = revenue impact

Output:

Baseline revenue: $2.1 trillion
Reformed revenue: $1.9 trillion
Cost: $200 billion
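The revenue-estimation process can be illustrated with a toy bracket schedule. The brackets, rates, and records below are invented for the example, and `bracket_tax` and `revenue` are hypothetical helper names.

```python
def bracket_tax(taxable_income, brackets):
    """Progressive tax. brackets: list of (lower_threshold, rate), ascending."""
    tax = 0.0
    for i, (lower, rate) in enumerate(brackets):
        upper = brackets[i + 1][0] if i + 1 < len(brackets) else float("inf")
        if taxable_income > lower:
            # Only the slice of income inside this bracket is taxed at its rate.
            tax += rate * (min(taxable_income, upper) - lower)
    return tax

def revenue(records, brackets):
    """Weighted aggregate revenue. records: list of (weight, taxable_income)."""
    return sum(w * bracket_tax(y, brackets) for w, y in records)

baseline_law = [(0, 0.10), (20_000, 0.20), (100_000, 0.30)]
reform_law = [(0, 0.10), (20_000, 0.20), (100_000, 0.25)]   # cut the top rate
records = [(1_000, 15_000), (800, 60_000), (50, 300_000)]

baseline_revenue = revenue(records, baseline_law)
reform_revenue = revenue(records, reform_law)
print(f"Revenue impact: {reform_revenue - baseline_revenue:,.0f}")
```

A negative difference means the reform loses revenue relative to current law, matching the "Cost" line in the output above.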

2. Distributional Analysis#

Question: “Who wins and loses?”

Process:

Calculate current tax for each household
Calculate reformed tax for each household
Group by income quintile (bottom 20%, next 20%, …)
Average change by group

Output:

Income Group       Avg. Change in After-Tax Income    % Change in After-Tax Income
Bottom 20%         +$2,000                            +15%
Second 20%         +$500                              +2%
Middle 20%         $0                                 0%
Fourth 20%         -$300                              -1%
Top 20%            -$5,000                            -2%

Conclusion: Progressive (helps lower incomes)
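The grouping step needs weighted quintiles, since each record stands for a different number of households. Below is a minimal sketch, assuming each record is assigned to a quintile by the midpoint of its cumulative weight span (a simplification: records are not split across quintile boundaries). The function name and data are hypothetical.

```python
def weighted_quintile_averages(records):
    """records: list of (weight, income, change). Returns mean change per quintile."""
    ordered = sorted(records, key=lambda r: r[1])       # sort by income
    total_weight = sum(w for w, _, _ in ordered)
    sums = [0.0] * 5
    weights = [0.0] * 5
    cumulative = 0.0
    for w, _, change in ordered:
        # Position of this record's midpoint in the weighted population (0..1).
        midpoint = (cumulative + w / 2) / total_weight
        q = min(int(midpoint * 5), 4)
        sums[q] += w * change
        weights[q] += w
        cumulative += w
    return [s / w if w else 0.0 for s, w in zip(sums, weights)]

toy = [
    (100, 10_000, 2_000), (100, 30_000, 500), (100, 50_000, 0),
    (100, 80_000, -300), (100, 200_000, -5_000),
]
averages = weighted_quintile_averages(toy)
for label, avg in zip(["Bottom", "Second", "Middle", "Fourth", "Top"], averages):
    print(f"{label} 20%: {avg:+,.0f}")
```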

3. Policy Reform Testing#

Question: “What if we change X?”

Scenarios:

Increase standard deduction by $5,000
Expand EITC to childless workers
Add new child allowance
Change capital gains rate
Phase out deductions for high earners

Output: Revenue cost, distributional impact, administrative complexity

4. Marginal Tax Rate Analysis#

Question: “What’s my incentive to earn one more dollar?”

Why it matters:

High MTRs discourage work
Phase-outs can create 50%+ MTRs
Benefit phase-outs stack on top of statutory tax rates

Output:

Income Level    Federal MTR    State MTR    EITC Phase-out    Effective MTR
$25,000         12%           5%           21%               38%
$50,000         22%           5%           0%                27%
$500,000        37%           10%          0%                47%
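Effective marginal rates are usually computed numerically: raise earnings by a small amount, recompute everything, and see how much net income changes. The sketch below assumes a toy household with flat 12% federal and 5% state rates and a credit that phases out at 21%, chosen to mirror the first table row; all names and parameters are hypothetical.

```python
def net_income(earnings):
    """Toy budget: earnings minus flat taxes plus a credit that phases out."""
    federal = 0.12 * earnings
    state = 0.05 * earnings
    # Credit of $4,000, reduced by 21 cents per dollar earned above $20,000.
    credit = max(0.0, 4_000 - 0.21 * max(0.0, earnings - 20_000))
    return earnings - federal - state + credit

def effective_mtr(earnings, delta=100.0):
    """Share of an extra `delta` dollars of earnings that the household loses."""
    gain = net_income(earnings + delta) - net_income(earnings)
    return 1 - gain / delta

print(f"{effective_mtr(25_000):.0%}")   # inside the phase-out range
print(f"{effective_mtr(50_000):.0%}")   # credit fully phased out
```

The finite-difference approach needs no knowledge of which provisions interact; it picks up every phase-out that the underlying calculation includes.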

Scope of This Research#

This survey covers:

In Scope#

Microsimulation engines: Apply rules to population samples
Tax-benefit systems: Income tax, payroll tax, refundable credits
Country-specific models: US, UK, France (major implementations)
Open-source libraries: Publicly available, reproducible

Out of Scope#

Proprietary models: Government/think tank internal tools
Commercial tax software: TurboTax, H&R Block (compliance, not modeling)
Sales tax APIs: TaxJar, Avalara (transaction-level, not policy analysis)
Spreadsheet models: Ad-hoc Excel/Google Sheets calculators
Dynamic scoring: Behavioral response models (future research)

Target Users#

Primary Users#

Policy analysts (government, think tanks)
Academic researchers (economics, public finance)
Advocacy organizations (evaluating proposals)
Journalists (fact-checking, explainers)

User Needs#

Transparency: See how calculations work
Reproducibility: Others can verify results
Flexibility: Test novel policy ideas
Accuracy: Match official government projections
Performance: Simulate 150M+ people in reasonable time

Required Skills#

Python programming (intermediate)
Tax policy knowledge (understand terms like AGI, MAGI, credits)
Statistics (survey weighting, sampling)
Microdata experience (Census, IRS data)

Learning curve: 2-3 months to become productive

Why Existing Solutions Fall Short#

Problem 1: Multi-State Complexity#

Most tools focus on federal taxes
State income taxes have 43 different rule sets
Federal-state interactions (SALT deduction)
Cross-border workers (NY resident, NJ job)

Gap: No comprehensive open-source multi-state model (until PolicyEngine 2024)

Problem 2: Property Tax#

Roughly 30% of state and local tax revenue
3,000+ counties with unique rules
Complex exemptions (homestead, senior, agricultural)
Gap: No open-source property tax library exists

Problem 3: Sales Tax Research#

11,000+ jurisdictions
Product-specific exemptions
Gap: Commercial APIs exist (TaxJar), but not for policy research
- Too expensive for academics
- Not designed for counterfactual analysis

Problem 4: Integration#

Comprehensive tax burden = income + payroll + property + sales
Each tax type has separate tools (if any)
Gap: No unified household tax burden calculator

Success Criteria#

A successful public finance modeling library should:

Encode official rules accurately (validation against published examples)
Handle edge cases (AMT, child tax credit phase-outs, NIIT)
Scale to full population (150M+ tax units in US)
Support counterfactual reforms (easy to modify rules)
Provide distributional outputs (by income, age, geography)
Be maintainable (annual tax law changes)
Have comprehensive tests (known correct answers)
Offer good documentation (examples, not just API reference)

Cross-Cutting Concerns#

Data Privacy#

Microdata contains sensitive info (income, family structure)
Public Use Files (PUFs) have reduced detail
Some models use synthetic data (algorithmically generated)

Computational Performance#

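150M tax units × 100+ calculations each = 15B operations at full-population scale
Need efficient vectorized operations (NumPy)
In practice, models run on a weighted sample (e.g., ~200,000 records), typically taking 5-30 seconds per full simulation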

Version Control#

Tax laws change every year (e.g., the TCJA's individual provisions were written to expire after 2025)
Need to model historical years (for research)
Parameters vs. structure (rate changes vs. new provisions)
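One common way to handle yearly changes is to store each parameter as a dated series and look up the value in force for the year being simulated. The sketch below is a hypothetical helper, not the parameter system of any library covered here, and the values are illustrative.

```python
from bisect import bisect_right

class DatedParameter:
    """A parameter whose value changes over time: {start_year: value}."""

    def __init__(self, values):
        self._years = sorted(values)
        self._values = [values[y] for y in self._years]

    def at(self, year):
        """Return the value in force in `year` (the latest start_year <= year)."""
        i = bisect_right(self._years, year) - 1
        if i < 0:
            raise ValueError(f"no value defined for {year}")
        return self._values[i]

# Illustrative history: the amount doubles starting in 2018.
history = {2015: 1_000, 2018: 2_000}
credit_per_child = DatedParameter(history)

# A parametric reform is just a new dated value layered on the history.
reformed = DatedParameter({**history, 2027: 4_000})

print(credit_per_child.at(2017), credit_per_child.at(2020), reformed.at(2028))
```

Rate changes fit this pattern directly. Structural changes (a new provision) need new rule code, which is why the two are tracked separately.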

International Portability#

OpenFisca approach: Core engine + country packages
Challenge: Each country has unique concepts (France: “quotient familial”, US: “filing status”)

These research areas intersect with public finance modeling:

1.094 Constraint Solving: Budget optimization (maximize benefits, minimize tax)
1.101 PDF Processing: Extract tables from tax forms (IRS instructions)
1.301 Government Data Access: APIs for Census, IRS, BLS data
1.302 Budget Document Parsing: Extract spending data from CAFRs
1.303 Civic Entity Resolution: Match taxpayers across datasets

Why This Matters#

Quote from Tax Policy Center:

“Microsimulation models have become the standard for analyzing tax proposals. Without them, policy debates would rely on guesswork and ideology rather than evidence.”

Real impact:

UK Treasury uses PolicyEngine for Universal Credit analysis (8M households)
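In the US, think tanks and academic researchers use Tax-Calculator to produce estimates that can be compared with official CBO and JCT figures
France uses OpenFisca to power public-facing benefit eligibility simulators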

Benefit: Better-informed policy, fewer unintended consequences, transparent debate.

S2: Comprehensive

S2: Prior Art - Existing Tools#

Overview#

This section catalogs existing public finance modeling libraries, their capabilities, limitations, and adoption patterns.

1. OpenFisca#

Links: Website | GitHub | Documentation

Language: Python
License: AGPL-3.0
Maintenance: Active (latest commit: December 2024, 207 stars, 5,153 commits)
Python Support: 3.9+

Description#

OpenFisca is a versatile microsimulation engine that models tax and benefit systems as code. It originated in France in 2011 and has been adopted by multiple governments internationally. The architecture separates the core engine (openfisca-core) from country-specific packages (OpenFisca-France, OpenFisca-Italy, etc.).

Design philosophy:

Tax legislation should be expressed as executable code
Same code used for simulation, administration, and compliance
Web API enables non-programmers to run simulations
International: Core engine works for any country’s rules

Key Features#

Rules as code: Tax legislation expressed as Python classes, one per variable, each with its own formula
Web API: REST interface for simulations without Python
Survey data integration: Analyze reforms using census/administrative data
Multi-country support: France, Italy, UK (via PolicyEngine fork), Tunisia, Senegal
Interactive reforms: Calculate effects on single situations or entire populations
Formula versioning: Track how rules change over time
Period handling: Daily, monthly, yearly calculations
Entity modeling: Persons, families, households, tax units
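The rules-as-code idea can be illustrated with a toy engine in which each variable has a formula, formulas request other variables by name, and results are cached. This is a conceptual sketch only: it does not reproduce OpenFisca's actual classes or API, and every name in it is hypothetical.

```python
class ToySimulation:
    """Computes variables on demand from formulas and caches the results."""

    def __init__(self, formulas, inputs):
        self.formulas = formulas
        self.cache = dict(inputs)          # input variables are pre-filled

    def calculate(self, name):
        if name not in self.cache:
            self.cache[name] = self.formulas[name](self)
        return self.cache[name]

def taxable_income(sim):
    return max(0.0, sim.calculate("salary") - sim.calculate("allowance"))

def income_tax(sim):
    return 0.2 * sim.calculate("taxable_income")

def net_income(sim):
    return sim.calculate("salary") - sim.calculate("income_tax")

FORMULAS = {"taxable_income": taxable_income, "income_tax": income_tax,
            "net_income": net_income}
inputs = {"salary": 30_000.0, "allowance": 10_000.0}

baseline_net = ToySimulation(FORMULAS, inputs).calculate("net_income")

# A reform swaps one formula; everything downstream is recomputed.
reform_formulas = {**FORMULAS,
                   "income_tax": lambda sim: 0.25 * sim.calculate("taxable_income")}
reform_net = ToySimulation(reform_formulas, inputs).calculate("net_income")

print(baseline_net, reform_net)
```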

Architecture#

Installation#

Example Usage#

Notable Users#

French government: Official tax-benefit modeling, used by Direction générale du Trésor
Tunisian government: Social benefit eligibility
Academic researchers: Worldwide tax policy studies
International organizations: World Bank, ILO for policy analysis

Strengths#

Mature codebase: 10+ years of development
Battle-tested: Used for official government calculations
International adoption: Proven in multiple countries
Web API: Non-programmers can use it
Formula versioning: Historical analysis possible

Limitations#

Country package quality varies: France is comprehensive, others less so
AGPL-3.0 license: Requires derivative works to be open-sourced (may restrict commercial use)
Steep learning curve: Complex entity modeling and period handling
Documentation inconsistency: Varies by country package
Performance: Python loops can be slow for large datasets (improving)

Governance#

Core maintained by OpenFisca team (nonprofit)
Country packages maintained by respective governments/communities
Monthly contributor calls
RFC process for major changes

Sources: OpenFisca Documentation, GitHub, France Digital Service

2. PolicyEngine#

Links: Website | GitHub | Core Docs

Language: Python
License: AGPL-3.0
Maintenance: Active (16 stars on core, 5,139 commits, 11 open PRs)
Python Support: 3.10-3.13

Description#

PolicyEngine is a nonprofit platform offering free, open-source tax-benefit microsimulation for the US and UK. Built on a fork of OpenFisca-Core, it provides both Python libraries and web applications for policy analysis. Major milestone: As of 2024, PolicyEngine covers all 50 US states plus DC for comprehensive state income tax modeling.

Design philosophy:

Make policy analysis accessible to everyone (web app)
Combine traditional microsimulation with machine learning
Use official government microdata + synthetic enhancement
Free, no paywalls or usage limits

Key Features#

Web application: No-code interface for reform design and analysis
US coverage: Federal + all 50 states (launched 2024)
UK model: Full UK tax-benefit system with official government use
Machine learning: Addresses undersampling and measurement errors in survey data
Individual & population analysis: Household impacts + distributional effects
API access: Programmatic access to all functionality
Real-time computation: Immediate results (< 1 second for individual, ~30 seconds for population)
Household calculator: Input your situation, see your tax

Architecture#

PolicyEngine-Core → PolicyEngine-US / PolicyEngine-UK

Installation#

Example Usage#

Notable Users#

UK HM Treasury: Official government use documented in algorithmic transparency records
US policy researchers: Think tanks, advocacy organizations
Academic institutions: Teaching and research
Journalists: Fact-checking policy claims

Strengths#

Web app accessibility: Non-coders can design reforms
All 50 US states: Most comprehensive open-source US model
Official UK adoption: Validated by government use
Active development: 50+ commits/month
ML enhancement: Better population representation
Free API: No usage quotas

Limitations#

US state models new: Launched in 2024, need more validation
Requires domain knowledge: Understanding tax concepts still needed
Web app limitations: Complex reforms may need Python
AGPL-3.0 license: Same restrictions as OpenFisca
Data dependencies: Uses proprietary enhancements to public data

Innovation: ML-Enhanced Microsimulation#

Traditional approach: Use survey data as-is (undersampling, measurement error)

PolicyEngine approach:

Start with public microdata (CPS, FRS)
Train ML models on administrative aggregates (IRS totals)
Adjust weights and impute missing values
Validate against published statistics

Result: Better matches to official revenue/poverty numbers
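The weight-adjustment step can be illustrated with the simplest possible calibration: scale weights within each group so that weighted totals hit an administrative target. This ratio adjustment is a stand-in for the idea, not PolicyEngine's actual method, and the data and names are hypothetical.

```python
def calibrate_weights(records, targets):
    """Scale weights per group so weighted income matches each group's target.

    records: list of dicts with 'group', 'weight', 'income'
    targets: {group: administrative total income for that group}
    """
    totals = {}
    for r in records:
        totals[r["group"]] = totals.get(r["group"], 0.0) + r["weight"] * r["income"]
    return [
        {**r, "weight": r["weight"] * targets[r["group"]] / totals[r["group"]]}
        for r in records
    ]

survey = [
    {"group": "low", "weight": 1_000, "income": 30_000},
    {"group": "high", "weight": 100, "income": 500_000},   # undersampled group
]
admin_totals = {"low": 30_000_000, "high": 80_000_000}

calibrated = calibrate_weights(survey, admin_totals)
for r in calibrated:
    print(r["group"], round(r["weight"], 1))
```

Here the survey captures only $50M of the $80M of high-group income reported administratively, so those records are weighted up by a factor of 1.6.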

Governance#

Nonprofit organization (PolicyEngine Inc.)
Public GitHub development
Community contributions welcome
Partnerships with governments (UK Treasury)

Sources: PolicyEngine Website, Core Docs, UK Gov

3. Tax-Calculator#

Links: Website | GitHub

Language: Python
License: CC0 1.0 (Public Domain)
Maintenance: Active (latest release: 6.4.0 on Feb 4, 2026, 292 stars, 124 releases)
Python Support: 3.11-3.13

Description#

Tax-Calculator is the most established US federal income and payroll tax microsimulation model, maintained by the Policy Simulation Library (PSL). It’s widely used in policy analysis, think tank research, and academic studies. The model can estimate aggregate revenue and distributional effects of tax reforms when paired with representative population data.

Design philosophy:

Academic rigor: extensive validation and testing
Public domain: zero restrictions on use
Federal focus: deep not broad (federal only, but comprehensive)
Integration: Works with other PSL models (Cost-of-Capital-Calculator)

Key Features#

Comprehensive US federal tax code: Income tax, payroll tax, refundable credits
Marginal tax rates: Calculate MTRs for 18 different income types
Reform analysis: Compare current law vs. proposed reforms
Integration: Works with Cost-of-Capital-Calculator for business tax analysis
TCJA modeling: Includes 2026 provision expirations (TCJA sunset)
Extensive testing: Complete code coverage with hundreds of tests
Public domain: No license restrictions on use or modification
Historical capability: Model taxes back to 2013

Architecture#

Three main components:

Policy: Tax rules and parameters
Records: Microdata representing tax filers
Calculator: Applies Policy to Records

Installation#

Example Usage#

Notable Users#

American Enterprise Institute: Tax-Calculator originated at AEI's Open Source Policy Center
Official scorekeepers (CBO, JCT) and the Tax Policy Center maintain their own internal models; their published estimates serve as benchmarks (see Validation)
Academic researchers: Cited in hundreds of papers
Policy advocacy organizations: Across the political spectrum

Strengths#

Most cited: Academic gold standard for US federal tax analysis
Public domain: No licensing concerns
Benchmarked against official estimates: Results are compared with JCT and CBO figures (see Validation)
Extensive tests: 100% code coverage
Documentation: Tutorial, cookbook, API reference
Historical analysis: Model past years

Limitations#

Federal only: Does not model state or local taxes
No benefit programs: Focuses on taxes, not SNAP/Medicaid/etc.
Data requirements: Needs microdata (PUF or CPS) for population-level estimates
Behavioral responses: Static model (no labor supply or saving responses)
Learning curve: Requires understanding of tax terminology

Validation#

Tax-Calculator validates against:

IRS Statistics of Income (SOI) aggregates
Tax Policy Center estimates
JCT (Joint Committee on Taxation) revenue scores
CBO baseline projections

Published validation reports: Annual comparisons to official statistics

Governance#

Policy Simulation Library (PSL) project
Community-driven development
Academic advisory board
Annual contributor meetings

Sources: Tax-Calculator Docs, GitHub, PSL

4. Cost-of-Capital-Calculator (CCC)#

Links: Website | GitHub

Language: Python
License: CC0 1.0 (Public Domain)
Maintenance: Active (latest release: 2.1.0 on Aug 25, 2025, 19 stars, 1,781 commits)
Python Support: 3.11-3.13

Description#

Cost-of-Capital-Calculator (CCC) evaluates how US federal taxes affect corporate and non-corporate investment incentives. It computes marginal effective tax rates (METRs) on new investments by combining business asset data with individual tax filer microdata. CCC is part of the Policy Simulation Library ecosystem and integrates with Tax-Calculator.

Design philosophy:

Business tax analysis complement to Tax-Calculator
Academic foundation (Jorgenson-Hall cost of capital framework)
Integration of entity and shareholder taxation
Focus on investment incentives, not revenue estimation

Key Features#

Marginal effective tax rates: Cost of capital by asset type and industry
Corporate & non-corporate: Pass-through entities and C-corporations
Individual integration: Models taxation at both entity and shareholder levels
Depreciation schedules: Handles complex depreciation rules (MACRS, bonus depreciation)
Policy scenarios: Analyze investment incentive effects of tax reforms
Tax-Calculator integration: Combines business and individual tax modeling
Web interface: Available at ccc.pslmodels.org (limited functionality)
Asset-level detail: 96 asset types, 10 financing strategies

Concepts#

Marginal Effective Tax Rate (METR):

Interpretation:

METR = 0%: No tax distortion (tax doesn’t affect investment)
METR = 20%: Tax increases required return by 25% (1/0.8 - 1)
METR < 0%: Tax subsidy (encourages investment)
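A common textbook definition consistent with the interpretation above is METR = (ρ − s) / ρ, where ρ is the pre-tax rate of return an investment must earn and s is the after-tax return received by the saver. The sketch below assumes that definition; it is not CCC's implementation, and the function names are hypothetical.

```python
def metr(pre_tax_return, after_tax_return):
    """Marginal effective tax rate: the tax wedge as a share of the pre-tax return."""
    return (pre_tax_return - after_tax_return) / pre_tax_return

def required_return_increase(metr_value):
    """How much the pre-tax return must rise, relative to the no-tax case."""
    return 1 / (1 - metr_value) - 1

# A saver requires 4% after tax; with a 20% METR the project must earn 5% pre-tax.
rate = metr(pre_tax_return=0.05, after_tax_return=0.04)
print(f"METR: {rate:.0%}, required return up {required_return_increase(rate):.0%}")

# A subsidy: the project can earn less than the saver receives.
print(f"Subsidized METR: {metr(0.03, 0.04):.0%}")
```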

Installation#

Example Usage#

Notable Users#

Academic researchers: Studying corporate taxation and investment
Policy analysts: Evaluating business tax reforms (R&D credits, depreciation)
Think tanks: Combined with Tax-Calculator for comprehensive tax analysis

Strengths#

Unique focus: Only open-source business tax library
Academic rigor: Based on established economic framework
Integration: Works with Tax-Calculator
Asset detail: 96 asset types
Public domain: No licensing restrictions

Limitations#

US federal only: No state/local business taxes
Static model: No dynamic investment or growth responses
Complexity: Requires understanding of corporate tax and financial economics
Data needs: Relies on SOI and other IRS data sources
Integration required: Most useful when combined with Tax-Calculator
Niche use case: Smaller user community than Tax-Calculator

Integration with Tax-Calculator#

Combined analysis:

Tax-Calculator: Individual/household tax impacts
CCC: Business investment incentives
Together: Comprehensive tax reform analysis (household + business)

Governance#

Policy Simulation Library (PSL) project
Maintained by American Enterprise Institute researchers
Community contributions welcome

Sources: CCC Website, GitHub, PSL

Comparison Matrix#

Feature	OpenFisca	PolicyEngine	Tax-Calculator	CCC
Countries	France, Italy, Tunisia, etc.	US, UK	US only	US only
Scope	Income + benefits	Income + benefits + state	Federal income/payroll	Business taxes
Web App	Yes (limited)	Yes (full-featured)	Yes (basic)	Yes (basic)
License	AGPL-3.0	AGPL-3.0	Public Domain	Public Domain
API	REST	REST + Python	Python only	Python only
State Taxes	N/A	All 50 states	No	No
Maturity	10+ years	3 years	15+ years	5+ years
Government Use	France (official)	UK Treasury (official)	Research and think tanks	Research only
Learning Curve	Steep	Moderate	Moderate	Steep
Documentation	Good (varies)	Excellent	Excellent	Good

Ecosystem Tools#

These aren’t microsimulation engines but are commonly used alongside them:

Data Access#

census (Python): Census Bureau API wrapper
tidycensus (R): Tidy interface to US Census data
pandas (Python): DataFrame manipulation

Analysis#

statsmodels (Python): Regression analysis for tax incidence
scikit-learn (Python): ML for imputation, weighting
survey (R): Survey-weighted estimation

Visualization#

matplotlib/seaborn (Python): Charts, distributional plots
plotly (Python): Interactive dashboards
ggplot2 (R): Publication-quality graphics

Historical Context#

Evolution of Public Finance Modeling#

1960s-1980s: Government agencies build proprietary models

No transparency
No reproducibility
Each country/agency duplicates effort

1990s-2000s: First publicly accessible tools

TAXSIM (NBER): Web-based tax calculator (still active)
Early microsimulation models (FORTRAN, SAS)

2010s: Modern Python era

OpenFisca (2011): Rules as code philosophy
Tax-Calculator (2010): PSL ecosystem
PolicyEngine (2021): Web app + open source

2020s: Maturation

Government adoption (France, UK)
All 50 US states (PolicyEngine 2024)
ML-enhanced microsimulation

Key Innovations#

Rules as code (OpenFisca 2011)
Public domain licensing (PSL 2010)
Web accessibility (PolicyEngine 2021)
ML data enhancement (PolicyEngine 2022)
Comprehensive state coverage (PolicyEngine 2024)

Gaps Remain#

Even with these excellent tools, gaps persist:

Property tax: No open-source library (3,000+ counties)
Sales tax research: Commercial APIs exist, not policy modeling tools
Multi-jurisdictional: Cross-border workers, part-year residents
Behavioral responses: Most tools are static by default
Integration: No unified household tax burden calculator (income + property + sales)

See S3 (Solution Space) for approaches to these gaps.

S3: Need-Driven

S3: Solution Space - Approaches to Filling Gaps#

Overview#

This section explores approaches to building new public finance modeling tools, addressing the gaps identified in S1 and S2. We focus on three major gaps:

Property tax calculation libraries
Sales tax modeling for policy research
Multi-jurisdictional integration

Gap 1: Property Tax Calculation Libraries#

The Problem#

Property tax generates ~$600B annually in the US (roughly 30% of state and local tax revenue), yet there’s no open-source calculation library. Current state:

Data availability: Many jurisdictions publish assessment data (open data portals)
No calculation engine: Data exists, but no library to compute taxes
Extreme locality: 3,000+ counties, each with unique rules

Approach 1A: Top-N Metro Areas (Incremental)#

Strategy: Start with largest metro areas, expand incrementally

Steps:

Identify top 50 metro areas by population (~60% of US)
For each metro, encode:
- Assessment methodology (market value, Prop 13, use-based)
- Rate structures (mill levies, voter overrides)
- Exemptions (homestead, senior, veteran, agricultural)
- Special districts (school, fire, library)
Build validation dataset (scrape public tax bills)
Create Python library with pluggable jurisdiction modules

Example API:
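A minimal sketch of what such an API could look like, assuming a jurisdiction is described by an assessment ratio, a mill rate, and a set of flat-dollar exemptions. Every name here (`Jurisdiction`, `tax`, the example county and its numbers) is hypothetical; no such library exists yet.

```python
from dataclasses import dataclass, field

@dataclass
class Jurisdiction:
    name: str
    assessment_ratio: float    # share of market value that is assessed
    mill_rate: float           # dollars of tax per $1,000 of taxable value
    exemptions: dict = field(default_factory=dict)   # exemption name -> dollars

    def tax(self, market_value, claimed=()):
        """Annual tax bill for one property, given the exemptions it claims."""
        assessed = market_value * self.assessment_ratio
        exempt = sum(self.exemptions[name] for name in claimed)
        taxable = max(0.0, assessed - exempt)
        return taxable * self.mill_rate / 1_000

example_county = Jurisdiction(
    name="Example County",
    assessment_ratio=1.0,
    mill_rate=12.5,
    exemptions={"homestead": 50_000, "senior": 25_000},
)

bill = example_county.tax(300_000, claimed=("homestead",))
senior_bill = example_county.tax(300_000, claimed=("homestead", "senior"))
print(bill, senior_bill)
```

A pluggable design would register one such module per jurisdiction, with special districts modeled as additional levies on the same taxable value.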

Pros:

Focused scope (50 metros is achievable)
High impact (covers majority of US population)
Incremental validation (one jurisdiction at a time)

Cons:

Still substantial work (50+ jurisdictions)
Annual maintenance (rate changes, new levies)
Incomplete coverage (rural areas, small cities)

Estimated Effort: 2-3 person-years for initial 50 metros

Approach 1B: Crowdsourced Rule Encoding#

Strategy: Build framework, let community contribute jurisdiction rules

Inspired by: OpenFisca’s country-package model

Steps:

Create core engine (exemption logic, rate application, aggregation)
Define jurisdiction rule DSL (domain-specific language):
Provide tools for validation (compare calculated vs. actual bills)
Gamify contributions (leaderboard, jurisdiction coverage map)
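One way to picture the rule DSL and validation tooling from the steps above: contributors write declarative rule data (shown here as a Python dict, though YAML or JSON would serve equally well), and a shared engine interprets it and checks it against real bills. The schema, field names, and functions are all assumptions made for illustration.

```python
# Hypothetical rule file for one jurisdiction, as a contributor might submit it.
RULES = {
    "jurisdiction": "example-county",
    "assessment_ratio": 0.5,
    "levies": [
        {"name": "county", "mills": 8.0},
        {"name": "school", "mills": 14.0},
    ],
    "exemptions": [
        {"name": "homestead", "amount": 25_000, "requires": "owner_occupied"},
    ],
}

def compute_bill(rules, prop):
    """Core engine: apply exemptions, then each levy, to one property."""
    assessed = prop["market_value"] * rules["assessment_ratio"]
    for ex in rules["exemptions"]:
        if prop.get(ex["requires"], False):
            assessed -= ex["amount"]
    taxable = max(0.0, assessed)
    return {levy["name"]: taxable * levy["mills"] / 1_000 for levy in rules["levies"]}

def validate(rules, cases, tolerance=1.0):
    """Return the cases whose computed total differs from the actual bill."""
    failures = []
    for prop, actual_total in cases:
        computed = sum(compute_bill(rules, prop).values())
        if abs(computed - actual_total) > tolerance:
            failures.append((prop, computed, actual_total))
    return failures

home = {"market_value": 200_000, "owner_occupied": True}
print(compute_bill(RULES, home))
print(validate(RULES, [(home, 1_650.0), (home, 1_700.0)]))
```

Keeping rules as data rather than code makes community review easier, at the price of needing escape hatches for provisions the schema cannot express.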

Pros:

Scales beyond any single team’s capacity
Community ownership → sustainability
Framework reusable across all jurisdictions

Cons:

Slow initial adoption (cold-start problem)
Quality variance (need review process)
Complex rules hard to encode (edge cases)

Estimated Effort: 1 person-year for framework, 3-5 years to 500+ jurisdictions

Approach 1C: ML-Assisted Estimation#

Strategy: Train models on assessment + tax bill data, skip rule encoding

Steps:

Scrape public data (assessments, tax bills)
Features: [assessed_value, location, property_type, size, age, ...]
Target: total_tax
Train gradient boosting model (XGBoost, LightGBM)
Validate against holdout jurisdictions

Pros:

No manual rule encoding needed
Works even where rules are unclear
Handles complex interactions automatically

Cons:

Black box (can’t explain calculations)
Requires substantial data (10k+ tax bills per jurisdiction)
No counterfactual policy analysis (can’t model “what if senior exemption increased”)
Legal/transparency concerns (how was this calculated?)

Use case: Rough estimates for real estate platforms, NOT policy analysis

Estimated Effort: 6 months for prototype, 1 year for production

Recommended Approach: Hybrid 1A + 1B#

Phase 1: Core team builds top 20 metros (Approach 1A)

Proves viability
Establishes patterns
Creates validation methodology

Phase 2: Open to community (Approach 1B)

Core team provides framework + examples
Community contributes remaining metros
Maintains quality through validation tools

Why not 1C? ML approach unsuitable for policy analysis (primary use case)

Gap 2: Sales Tax Modeling for Policy Research#

The Problem#

Commercial APIs (TaxJar, Avalara) exist for e-commerce compliance but:

Expensive for research use ($1000s/month)
Not designed for counterfactual analysis
No access to underlying rate database
Address-level precision unnecessary for policy modeling

Research needs: “What if we exempt groceries?” not “What tax for this exact address?”

Approach 2A: Open Rate Database (Snapshot)#

Strategy: Public database of sales tax rates, quarterly updates

Scope:

State rates (all 50)
County rates (top 200 by population)
City rates (top 100 by population)
Coverage: ~70-80% of US sales transactions

Data sources:

State revenue department websites (public data)
Municipal code databases
Federation of Tax Administrators publications

Database schema:

Example:
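A sketch of one possible schema and lookup, using Python's built-in `sqlite3` module. The table name, column names, category labels, and all rates are assumptions for illustration; the lookup covers state and county rates only and leaves city rates out for brevity.

```python
import sqlite3

SCHEMA = """
CREATE TABLE sales_tax_rates (
    jurisdiction_type TEXT NOT NULL CHECK (jurisdiction_type IN ('state', 'county', 'city')),
    jurisdiction_name TEXT NOT NULL,
    state             TEXT NOT NULL,
    product_category  TEXT NOT NULL,   -- broad categories, not UPC-level
    rate              REAL NOT NULL,   -- 0.06 means 6%
    effective_date    TEXT NOT NULL,   -- ISO 8601, so string order = date order
    source_url        TEXT,
    PRIMARY KEY (jurisdiction_type, jurisdiction_name, state, product_category, effective_date)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO sales_tax_rates VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("state", "Example State", "XX", "general", 0.06, "2024-01-01", None),
        ("state", "Example State", "XX", "general", 0.065, "2024-07-01", None),
        ("state", "Example State", "XX", "groceries", 0.0, "2024-01-01", None),
        ("county", "Example County", "XX", "general", 0.01, "2024-01-01", None),
        ("county", "Example County", "XX", "groceries", 0.01, "2024-01-01", None),
    ],
)


def combined_rate(conn: sqlite3.Connection, state: str, county: str,
                  category: str, as_of: str) -> float:
    """Sum the state and county rates in effect on `as_of` for one category."""
    query = """
        SELECT COALESCE(SUM(r.rate), 0.0)
        FROM sales_tax_rates AS r
        WHERE r.state = ?
          AND r.product_category = ?
          AND (r.jurisdiction_type = 'state'
               OR (r.jurisdiction_type = 'county' AND r.jurisdiction_name = ?))
          AND r.effective_date = (
              SELECT MAX(r2.effective_date)
              FROM sales_tax_rates AS r2
              WHERE r2.jurisdiction_type = r.jurisdiction_type
                AND r2.jurisdiction_name = r.jurisdiction_name
                AND r2.state = r.state
                AND r2.product_category = r.product_category
                AND r2.effective_date <= ?
          )
    """
    return conn.execute(query, (state, category, county, as_of)).fetchone()[0]


print(combined_rate(conn, "XX", "Example County", "general", "2024-03-01"))
print(combined_rate(conn, "XX", "Example County", "general", "2024-08-01"))
```

Storing an `effective_date` per row keeps historical rates queryable, which suits quarterly snapshots and lets a reform be modeled as an extra row or an overridden rate.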

Pros:

Open data (no API fees)
Sufficient for research (don’t need 11,000 jurisdictions)
Quarterly updates adequate
Enables policy modeling

Cons:

Manual maintenance (quarterly scraping)
Not suitable for real-time compliance
Product categories broad (no UPC-level)

Estimated Effort: 3 months for initial build, 1 week/quarter maintenance

Approach 2B: Microsimulation Integration#

Strategy: Combine rate database (2A) with consumer expenditure data

Data source: Consumer Expenditure Survey (CEX) from BLS

Steps:

Load CEX microdata (household expenditures by category)
Map CEX categories to tax categories:
For each household, calculate sales tax:
Aggregate: total revenue, by income quintile, by state
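The steps above can be sketched as follows. The category mapping, rates, and household records are invented for illustration and do not reflect actual CEX variable names or any state's rates.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical mapping from expenditure categories to tax categories.
CEX_TO_TAX_CATEGORY = {
    "food_at_home": "groceries",
    "food_away_from_home": "prepared_food",
    "apparel": "general",
    "household_furnishings": "general",
    "shelter": "exempt",  # rent and mortgage payments are not retail sales
}

# Illustrative combined rates by tax category.
RATES = {"groceries": 0.02, "prepared_food": 0.08, "general": 0.06, "exempt": 0.0}


def household_sales_tax(expenditures: Dict[str, float], rates: Dict[str, float]) -> float:
    """Annual sales tax for one household, given spending by category."""
    return sum(amount * rates[CEX_TO_TAX_CATEGORY[category]]
               for category, amount in expenditures.items())


def aggregate_by_quintile(households: List[dict], rates: Dict[str, float]) -> Dict[int, float]:
    """Population-weighted sales tax revenue by income quintile."""
    totals: Dict[int, float] = defaultdict(float)
    for hh in households:
        totals[hh["quintile"]] += hh["weight"] * household_sales_tax(hh["expenditures"], rates)
    return dict(totals)


households = [
    {"quintile": 1, "weight": 1_000,
     "expenditures": {"food_at_home": 4_000, "apparel": 1_000, "shelter": 9_000}},
    {"quintile": 5, "weight": 500,
     "expenditures": {"food_at_home": 8_000, "food_away_from_home": 6_000, "apparel": 5_000}},
]

by_quintile = aggregate_by_quintile(households, RATES)
print(by_quintile, "total:", sum(by_quintile.values()))
```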

Policy analysis:
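A grocery exemption can be modeled as a baseline-versus-reform comparison. The sketch below is static (no behavioral response), and the rates, households, and function names are assumptions chosen to keep the arithmetic easy to follow.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

BASELINE = {"groceries": 0.04, "general": 0.06}
REFORM = {**BASELINE, "groceries": 0.0}  # "What if we exempt groceries?"

HOUSEHOLDS = [
    {"quintile": 1, "weight": 1_000, "income": 20_000,
     "spending": {"groceries": 5_000, "general": 8_000}},
    {"quintile": 5, "weight": 1_000, "income": 200_000,
     "spending": {"groceries": 10_000, "general": 60_000}},
]


def sales_tax(spending: Dict[str, float], rates: Dict[str, float]) -> float:
    return sum(amount * rates[category] for category, amount in spending.items())


def analyze(households: List[dict], baseline: Dict[str, float],
            reform: Dict[str, float]) -> Tuple[float, Dict[int, float]]:
    """Return the revenue change and the tax cut as a share of income by quintile."""
    revenue_change = 0.0
    tax_cut: Dict[int, float] = defaultdict(float)
    income: Dict[int, float] = defaultdict(float)
    for hh in households:
        delta = sales_tax(hh["spending"], reform) - sales_tax(hh["spending"], baseline)
        revenue_change += hh["weight"] * delta
        tax_cut[hh["quintile"]] += hh["weight"] * -delta
        income[hh["quintile"]] += hh["weight"] * hh["income"]
    relief = {q: tax_cut[q] / income[q] for q in tax_cut}
    return revenue_change, relief


revenue_change, relief = analyze(HOUSEHOLDS, BASELINE, REFORM)
print(f"Revenue change: {revenue_change:,.0f}")
for quintile, share in sorted(relief.items()):
    print(f"Quintile {quintile}: tax cut equals {share:.2%} of income")
```

In this toy data the bottom quintile receives a smaller dollar cut but a larger cut relative to income, which is the kind of regressivity result the distributional analysis is meant to surface.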

Pros:

Answers key policy questions
Distributional analysis (regressivity)
Integration with existing tools (Tax-Calculator)

Cons:

CEX has small sample (~20k households)
Measurement error in expenditures
Doesn’t capture behavioral responses

Estimated Effort: 6 months (assuming 2A completed)

Approach 2C: Collaborate with Commercial Providers#

Strategy: Partner with TaxJar/Avalara for research access

Model:

Commercial providers offer API access for research (reduced rate or free)
Academic/nonprofit researchers use for policy analysis
Providers benefit from research visibility

Precedent:

Google Cloud academic grants
AWS research credits
Qualtrics academic licenses

Pros:

No need to maintain rate database
Access to full 11,000 jurisdictions
Real-time accuracy

Cons:

Dependent on provider cooperation
May have usage limits
Not truly open-source
Commercial entity could terminate access

Estimated Effort: 3 months to negotiate partnership

Recommended Approach: 2A + 2B#

Why not 2C? Dependency on commercial entity undermines open research principles

Implementation:

Build open rate database (2A) - 3 months
Create microsimulation module (2B) - 6 months
Validate against state revenue reports
Publish as library + dataset

Gap 3: Multi-Jurisdictional Integration#

The Problem#

Comprehensive tax burden requires integrating:

Federal income tax (Tax-Calculator ✓)
State income tax (PolicyEngine ✓)
Property tax (Gap 1)
Sales tax (Gap 2)

Challenge: Each tax has different data requirements, computation order, interactions

Approach 3A: Orchestration Layer#

Strategy: Build wrapper that coordinates existing + new tools

Architecture:
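A minimal sketch of the orchestration idea, with stub calculators standing in for the real tools. None of these functions are actual Tax-Calculator or PolicyEngine calls; the flat rates and the `salt_cap` default are illustrative assumptions. The point is the ordering: state and local taxes are computed first because they feed the federal calculation.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Household:
    income: float
    home_value: float
    taxable_spending: float


# Stub calculators. In a real orchestration layer each would wrap an
# existing library behind a common interface.
def property_tax(hh: Household) -> float:
    return hh.home_value * 0.01


def sales_tax(hh: Household) -> float:
    return hh.taxable_spending * 0.06


def state_income_tax(hh: Household) -> float:
    return hh.income * 0.05


def federal_income_tax(hh: Household, salt_paid: float, salt_cap: float = 10_000.0) -> float:
    # Simplified: the only deduction is state and local taxes paid, up to a cap.
    deduction = min(salt_paid, salt_cap)
    return max(0.0, hh.income - deduction) * 0.20


def total_burden(hh: Household) -> Dict[str, float]:
    """Run the components in dependency order and combine the results."""
    prop = property_tax(hh)
    sales = sales_tax(hh)
    state = state_income_tax(hh)
    federal = federal_income_tax(hh, salt_paid=prop + state)
    return {"property": prop, "sales": sales, "state_income": state,
            "federal_income": federal, "total": prop + sales + state + federal}


burden = total_burden(Household(income=100_000, home_value=300_000, taxable_spending=30_000))
print(burden)
```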

Coordination challenges:

Data harmonization (different income definitions)
Computation order (state and local taxes feed into federal liability via the SALT deduction, so they must be computed first)
Entity mapping (tax units vs. households vs. individuals)

Pros:

Leverages existing tools
No duplication of effort
Can start before Gaps 1 & 2 fully solved

Cons:

Brittle (depends on all upstream libraries)
API mismatches (different conventions)
Version incompatibilities

Estimated Effort: 6 months for orchestration layer

Approach 3B: Unified Microsimulation#

Strategy: Build comprehensive model from scratch (like OpenFisca, but US + all taxes)

Scope:

Federal income + payroll
50 state income taxes
Property tax (top metros)
Sales tax (policy-level)

Pros:

Consistent API
Optimized performance
Designed for integration from start

Cons:

Massive duplication of effort
5-10 person-years to match existing tools
Maintenance burden enormous

Verdict: Not recommended (reinventing wheel)

Approach 3C: Incremental Integration into PolicyEngine#

Strategy: Contribute property + sales modules to PolicyEngine-US

Rationale:

PolicyEngine already has federal + 50 states
Active development, responsive maintainers
Would create most comprehensive US model

Steps:

Propose integration to PolicyEngine team
Build property tax module using PolicyEngine’s framework
Build sales tax module
Submit PRs, iterate with maintainers

Pros:

Single comprehensive tool
Maintained by nonprofit
Leverages PolicyEngine’s web app
Community benefits

Cons:

Dependent on PolicyEngine roadmap
Must conform to their architecture
Governance not under your control

Estimated Effort: 1 year (if accepted by PolicyEngine)

Recommended Approach: 3A (Short-term) + 3C (Long-term)#

Phase 1: Build orchestration layer (3A)

Proves value of integration
Works with current tools
6 months

Phase 2: Contribute to PolicyEngine (3C)

Approach PolicyEngine team with working prototype
Discuss integration
If accepted: dedicate resources to contribution
If not: maintain orchestration layer

Implementation Priorities#

Based on impact, feasibility, and user needs:

Priority 1: Sales Tax Research Tools (Gap 2)#

Rationale: Highest impact/effort ratio
Approach: 2A + 2B (open rate database + microsimulation)
Timeline: 9 months
Users: State policy analysts, think tanks, academics

Priority 2: Property Tax (Top 20 Metros) (Gap 1)#

Rationale: Proof of concept for larger effort
Approach: 1A (top metros, then expand)
Timeline: 1 year for top 20
Users: Local governments, real estate platforms, researchers

Priority 3: Multi-Jurisdictional Integration (Gap 3)#

Rationale: Enables comprehensive tax burden analysis
Approach: 3A (orchestration layer)
Timeline: 6 months (after Gaps 1 & 2 progress)
Users: Researchers studying overall tax progressivity

Cross-Cutting Technical Decisions#

Language: Python#

All existing tools use Python
Rich ecosystem (pandas, NumPy, scikit-learn)
Easy integration

License: Public Domain (CC0) or Apache 2.0#

CC0: Like Tax-Calculator (most permissive)
Apache 2.0: If contributor agreements are needed
NOT AGPL: Want to enable commercial use (real estate, compliance)

Data Format: Parquet#

Efficient columnar storage
Fast reads (pandas, Polars)
Cross-language (R, Python, Julia)

Documentation: Jupyter Notebooks#

Examples with real data
Narrative + code
Reproducible

Testing: Extensive Validation#

Known correct answers (published tax bills)
Edge cases (phase-outs, cliffs)
Regression tests (changes don’t break existing results)

Web API: Optional#

Start with Python library (CLI + API)
Add REST API if demand exists
Learn from PolicyEngine’s web app success

Risks and Mitigations#

Risk 1: Maintenance Burden#

Problem: Tax rules change annually, data sources change

Mitigations:

Build automated tests (detect when rules break)
Annual update process (documented, scheduled)
Community contributions (share maintenance)
Funding model (grants, sponsorships for maintenance)

Risk 2: Data Access#

Problem: Some data is proprietary or access-restricted (e.g., the IRS Public Use File)

Mitigations:

Use public data (Census CPS, CEX, open data portals)
Synthetic data generation (PolicyEngine approach)
Partner with universities for data access

Risk 3: Legal/Liability#

Problem: What if calculations are wrong and someone relies on them?

Mitigations:

Disclaimer: “Not tax advice, for research only”
Extensive validation + testing
Public domain license (no warranty)
Insurance (if forming organization)

Risk 4: Adoption#

Problem: Researchers don’t switch from existing tools

Mitigations:

Interoperability with existing tools (don’t require switching)
Demonstrate value (fill gaps, don’t duplicate)
Publish papers using the tool (lead by example)
Partner with influential researchers

Success Metrics#

Adoption#

GitHub stars, forks, downloads (PyPI)
Academic citations (papers using the tool)
Government use (any official adoption)

Quality#

Validation accuracy (within X% of official statistics)
Test coverage (>90%)
Issues opened/closed ratio

Impact#

Policy reforms analyzed using the tool
Legislation influenced by analyses
Improved transparency (vs. black-box models)

These approaches build on patterns from:

OpenFisca: Pluggable country packages, rules as code
PolicyEngine: Web accessibility, ML enhancement
Tax-Calculator: Academic rigor, extensive validation
OpenStreetMap: Crowdsourced geographic data (inspiration for 1B)
Zillow/Redfin: Real estate data platforms (users for property tax tools)

Recommendations Summary#

Gap	Recommended Approach	Timeline	Priority
Sales Tax	Open rate DB + microsimulation	9 months	1 (High)
Property Tax	Top 20 metros → crowdsource	1 year → ongoing	2 (Medium)
Integration	Orchestration layer	6 months	3 (Low)

Long-term vision: Contribute to PolicyEngine for unified US model

Next step: Prototype sales tax research tool (highest impact/effort ratio)

S4: Strategic

S4: Selection Criteria - Evaluating Public Finance Modeling Tools#

Overview#

This section provides criteria for evaluating existing public finance modeling tools and assessing approaches to filling identified gaps.

Evaluation Framework#

1. Functional Requirements#

1.1 Scope & Coverage#

What to evaluate:

Geographic coverage (federal, state, local)
Tax types covered (income, payroll, property, sales, excise)
Benefit programs (SNAP, Medicaid, housing, EITC)
Time period support (current law, historical, future projections)

Scoring:

⭐⭐⭐⭐⭐ Comprehensive (federal + state + local, all major taxes/benefits)
⭐⭐⭐⭐ Broad (federal + state OR all major taxes at one level)
⭐⭐⭐ Moderate (single level, multiple taxes OR single tax, multiple levels)
⭐⭐ Limited (single level, single tax type)
⭐ Narrow (proof of concept only)

Examples:

PolicyEngine US: ⭐⭐⭐⭐ (Federal + 50 states income/payroll/benefits, no property/sales)
Tax-Calculator: ⭐⭐⭐ (Federal only, comprehensive)
Proposed property tax tool (top 20 metros): ⭐⭐ (Limited geographic)

1.2 Accuracy & Validation#

What to evaluate:

Matches official statistics (IRS, Census, state revenue departments)
Validation methodology (published tests, known correct answers)
Edge case handling (phase-outs, AMT, cliffs)
Error rates (% deviation from official aggregates)

Scoring:

⭐⭐⭐⭐⭐ Official use by government (CBO, HM Treasury)
⭐⭐⭐⭐ Validated against official stats (<5% error)
⭐⭐⭐ Some validation (published tests, no official comparison)
⭐⭐ Limited validation (unit tests only)
⭐ No validation

Examples:

Tax-Calculator: ⭐⭐⭐⭐⭐ (CBO uses it)
PolicyEngine UK: ⭐⭐⭐⭐⭐ (HM Treasury uses it)
OpenFisca France: ⭐⭐⭐⭐⭐ (Official French government use)

Validation evidence:
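One simple form of validation evidence is the percent deviation between a weighted model aggregate and an official total. The sketch below uses made-up numbers; the function names and the 5% threshold check (taken from the four-star criterion above) are illustrative.

```python
from typing import Sequence


def weighted_total(values: Sequence[float], weights: Sequence[float]) -> float:
    """Population-weighted aggregate from microdata records."""
    return sum(v * w for v, w in zip(values, weights))


def pct_error(model_total: float, official_total: float) -> float:
    """Absolute deviation from the official aggregate, in percent."""
    return abs(model_total - official_total) / abs(official_total) * 100.0


# Toy microdata: simulated tax per record and its survey weight.
simulated_tax = [1_000.0, 5_000.0, 20_000.0]
weights = [50.0, 30.0, 10.0]
official_total = 410_000.0  # hypothetical published aggregate

model_total = weighted_total(simulated_tax, weights)
error = pct_error(model_total, official_total)
meets_threshold = error < 5.0
print(f"Model: {model_total:,.0f}  Official: {official_total:,.0f}  Error: {error:.2f}%")
```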

1.3 Policy Reform Capability#

What to evaluate:

Ease of specifying reforms (change rates, add provisions, modify phase-outs)
Counterfactual analysis (compare baseline vs. reform)
Interaction effects (how changes ripple through system)
Behavioral modeling (optional: labor supply responses)

Scoring:

⭐⭐⭐⭐⭐ Parameter changes + structural reforms, tested framework
⭐⭐⭐⭐ Parameter changes + some structural reforms
⭐⭐⭐ Parameter changes only (rates, thresholds)
⭐⭐ Limited reform capability (hard-coded scenarios)
⭐ No reform capability (current law only)

Examples:

PolicyEngine: ⭐⭐⭐⭐⭐ (Web app for designing reforms, structural changes possible)
Tax-Calculator: ⭐⭐⭐⭐ (Parameter reforms easy, structural reforms require code)

1.4 Distributional Analysis#

What to evaluate:

Outputs by income quintile/decile/percentile
Winners and losers (% benefit, $ change)
Poverty impacts (SPM, FPL)
Demographic breakdowns (age, race, geography)

Scoring:

⭐⭐⭐⭐⭐ Multiple dimensions (income + age + geography + demographics)
⭐⭐⭐⭐ Income quintiles + one other dimension
⭐⭐⭐ Income quintiles only
⭐⭐ Aggregate statistics only
⭐ No distributional output

Examples:

PolicyEngine: ⭐⭐⭐⭐⭐ (Income, poverty, age, state, race)
Tax-Calculator: ⭐⭐⭐⭐ (Income, age, filing status)

2. Technical Quality#

2.1 Performance#

What to evaluate:

Runtime for full population (150M+ individuals)
Memory usage
Optimization techniques (vectorization, caching)
Scalability (can it handle larger datasets?)

Benchmarks:

⭐⭐⭐⭐⭐ < 10 seconds for full population
⭐⭐⭐⭐ 10-60 seconds
⭐⭐⭐ 1-5 minutes
⭐⭐ 5-30 minutes
⭐ > 30 minutes or doesn’t scale

Examples:

PolicyEngine: ⭐⭐⭐⭐⭐ (~30 seconds for 300M people)
Tax-Calculator: ⭐⭐⭐⭐ (~2 minutes)

Measurement:
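Runtime can be measured with a small best-of-N timer. This sketch uses only the standard library; `toy_tax` is a stand-in for a real model run, and the record count is far smaller than a full population file.

```python
import time
from typing import Any, Callable, List, Tuple


def benchmark(func: Callable[..., Any], *args: Any, repeats: int = 3) -> Tuple[float, Any]:
    """Return the fastest wall-clock time over `repeats` runs and the last result."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = func(*args)
        best = min(best, time.perf_counter() - start)
    return best, result


def toy_tax(incomes: List[float]) -> List[float]:
    # Stand-in for a model: flat 20% rate above a 15,000 deduction.
    return [max(0.0, income - 15_000) * 0.2 for income in incomes]


incomes = [float(i % 200_000) for i in range(100_000)]
seconds, taxes = benchmark(toy_tax, incomes)
records_per_second = len(incomes) / seconds if seconds > 0 else float("inf")
print(f"{seconds:.4f} s for {len(incomes):,} records ({records_per_second:,.0f} records/s)")
```

Taking the best of several runs reduces noise from other processes; reporting records per second makes results comparable across datasets of different sizes.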

2.2 Code Quality#

What to evaluate:

Test coverage (%)
Documentation (API reference, tutorials, examples)
Type hints (Python 3.6+)
Linting (consistent style)
CI/CD (automated testing)

Scoring:

⭐⭐⭐⭐⭐ >90% coverage, comprehensive docs, full CI/CD
⭐⭐⭐⭐ 70-90% coverage, good docs, basic CI
⭐⭐⭐ 50-70% coverage, minimal docs
⭐⭐ <50% coverage, API reference only
⭐ No tests, no docs

Examples:

Tax-Calculator: ⭐⭐⭐⭐⭐ (100% coverage, extensive docs)
OpenFisca: ⭐⭐⭐⭐ (Good coverage, variable docs by country)

2.3 Maintainability#

What to evaluate:

Active development (commits/month, recent release)
Contributor community (number of contributors, responsiveness)
Governance (who maintains, funding model)
Breaking changes (API stability)

Scoring:

⭐⭐⭐⭐⭐ Active (weekly commits), funded, multiple maintainers
⭐⭐⭐⭐ Active (monthly commits), some funding
⭐⭐⭐ Occasional updates (quarterly)
⭐⭐ Minimal updates (annual)
⭐ Abandoned (no updates in 2+ years)

Examples:

PolicyEngine: ⭐⭐⭐⭐⭐ (50+ commits/month, nonprofit funded)
Tax-Calculator: ⭐⭐⭐⭐ (Monthly updates, PSL supported)
tenforty: ⭐⭐ (Last update 2018)

2.4 Integration & Interoperability#

What to evaluate:

APIs (REST, Python, other languages)
Data formats (Parquet, CSV, JSON)
Compatibility with other tools
Extension mechanisms (plugins, custom rules)

Scoring:

⭐⭐⭐⭐⭐ REST API + Python + documented extension framework
⭐⭐⭐⭐ Python API + extension framework
⭐⭐⭐ Python API only, some documentation for extensions
⭐⭐ Python API, hard to extend
⭐ Single-file script, no API

Examples:

OpenFisca: ⭐⭐⭐⭐⭐ (REST API, Python, country packages)
PolicyEngine: ⭐⭐⭐⭐⭐ (REST API, Python, web app)
Tax-Calculator: ⭐⭐⭐⭐ (Python API, parameter system)

3. Usability#

3.1 Learning Curve#

What to evaluate:

Prerequisites (programming, tax knowledge)
Documentation quality (tutorials, examples)
Community support (forums, Stack Overflow, GitHub issues)
Quickstart time (how long to first working example)

Scoring:

⭐⭐⭐⭐⭐ Web interface (no code) OR excellent tutorials (<1 day)
⭐⭐⭐⭐ Good tutorials, active community (1-3 days)
⭐⭐⭐ API reference, some examples (1-2 weeks)
⭐⭐ Minimal docs, expert only (1+ months)
⭐ No docs, read the code

Examples:

PolicyEngine: ⭐⭐⭐⭐⭐ (Web app requires zero coding)
Tax-Calculator: ⭐⭐⭐⭐ (Excellent tutorials)
OpenFisca: ⭐⭐⭐ (Steeper curve, entity modeling complex)

3.2 Accessibility#

What to evaluate:

Cost (free, freemium, paid)
Installation difficulty (pip install, Docker, complex setup)
Data availability (includes sample data? requires proprietary data?)
Target audience (researchers, policymakers, general public)

Scoring:

⭐⭐⭐⭐⭐ Free, easy install, sample data included, web interface
⭐⭐⭐⭐ Free, easy install, sample data
⭐⭐⭐ Free, moderate install, need to source data
⭐⭐ Free but complex setup OR paid but easy
⭐ Expensive and complex

Examples:

PolicyEngine: ⭐⭐⭐⭐⭐ (Free web app, includes data)
Tax-Calculator: ⭐⭐⭐⭐ (Free, pip install, includes CPS data)
TaxJar: ⭐⭐ (Paid API, $1000+/month)

4. Licensing & Governance#

4.1 License#

What to evaluate:

Open source? (OSI-approved license)
Permissiveness (MIT, Apache vs. GPL, AGPL)
Commercial use allowed?
Attribution requirements

Scoring:

⭐⭐⭐⭐⭐ Public domain (CC0) or highly permissive (MIT)
⭐⭐⭐⭐ Permissive with attribution and notice requirements (BSD, Apache 2.0)
⭐⭐⭐ Weak copyleft (LGPL)
⭐⭐ Strong copyleft (GPL, AGPL)
⭐ Proprietary

Examples:

Tax-Calculator: ⭐⭐⭐⭐⭐ (CC0 - Public Domain)
PolicyEngine: ⭐⭐ (AGPL-3.0)
OpenFisca: ⭐⭐ (AGPL-3.0)

Why it matters:

AGPL requires derivative works, including those offered only as network services, to be released under the same license (affects commercial products)
Public domain enables maximum reuse
For public finance: transparency matters, so open-source preferred even if copyleft

4.2 Governance#

What to evaluate:

Who controls direction? (government, nonprofit, company, community)
Funding model (grants, donations, commercial)
Contributor process (easy to contribute? CLA required?)
Decision-making (BDFL, committee, consensus)

Scoring:

⭐⭐⭐⭐⭐ Open governance, multiple funders, easy contributions
⭐⭐⭐⭐ Nonprofit/academic, some funders
⭐⭐⭐ Government-backed, clear roadmap
⭐⭐ Single company/individual, limited input
⭐ Closed governance

Examples:

Tax-Calculator: ⭐⭐⭐⭐⭐ (PSL open governance, community-driven)
PolicyEngine: ⭐⭐⭐⭐ (Nonprofit, transparent, accepts contributions)
OpenFisca: ⭐⭐⭐⭐ (French gov + community)

5. Impact & Adoption#

5.1 User Base#

What to evaluate:

Official government use
Academic citations (Google Scholar)
Industry use (think tanks, media)
GitHub metrics (stars, forks, downloads)

Scoring:

⭐⭐⭐⭐⭐ Official government use + widespread academic/industry
⭐⭐⭐⭐ Government use OR extensive academic citations
⭐⭐⭐ Some academic use, niche adoption
⭐⭐ Small community, few citations
⭐ No known users

Examples:

Tax-Calculator: ⭐⭐⭐⭐⭐ (CBO, TPC, 100+ academic papers)
PolicyEngine: ⭐⭐⭐⭐⭐ (UK HM Treasury, growing US adoption)
OpenFisca: ⭐⭐⭐⭐⭐ (French government, international)

5.2 Influence on Policy#

What to evaluate:

Has it influenced actual legislation?
Used in official government projections?
Cited in policy debates?
Impact on public understanding?

Evidence:

Direct: “CBO used Tool X to score Bill Y”
Indirect: Academic papers using tool cited in Congressional testimony
Public: Media coverage using tool’s analyses

Examples:

Tax-Calculator: Influenced TCJA debate (2017), CBO uses for official scores
PolicyEngine: UK Treasury uses for Universal Credit modeling
OpenFisca: French social benefit system runs on OpenFisca code

Applying the Framework#

Existing Tools Scorecard#

Criterion	OpenFisca	PolicyEngine	Tax-Calculator	CCC
Scope & Coverage	⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐	⭐⭐
Accuracy	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐
Reform Capability	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐
Distributional	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐
Performance	⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐⭐
Code Quality	⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐
Maintainability	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐
Integration	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐
Learning Curve	⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐
Accessibility	⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐
License	⭐⭐	⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐
Governance	⭐⭐⭐⭐	⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐
User Base	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐
Policy Influence	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐⭐⭐	⭐⭐⭐
TOTAL	59/70	65/70	61/70	45/70

Interpretation#

PolicyEngine (65/70): Best overall, especially for accessibility (web app) and comprehensive US coverage

Tax-Calculator (61/70): Gold standard for academic use, public domain license

OpenFisca (59/70): Most international, mature, but steeper learning curve

CCC, the Cost-of-Capital-Calculator (45/70): Specialized tool, limited audience, but fills unique niche

Recommendations for Tool Selection#

Use Case 1: US Federal Tax Policy Analysis#

Recommendation: Tax-Calculator

Why:

CBO uses it (official validation)
Public domain license
Most cited in academic literature
Comprehensive federal tax modeling

When to use PolicyEngine instead:

Need state taxes
Want web interface
Non-programmer audience

Use Case 2: US State + Federal Integration#

Recommendation: PolicyEngine US

Why:

Only tool with all 50 states (as of 2024)
Active development
Web app for accessibility

Limitations:

AGPL license (if building commercial product)
State models new (need more validation)

Use Case 3: International (Non-US)#

Recommendation: OpenFisca

Why:

Proven in multiple countries
Core engine country-agnostic (rules live in separate country packages)
Government adoption (France, Tunisia)

Check: Does country package exist? Quality varies.

Use Case 4: Business Tax / Investment Analysis#

Recommendation: Cost-of-Capital-Calculator

Why:

Only open-source business tax tool
Integrates with Tax-Calculator
Based on established economic framework

Combine with: Tax-Calculator for comprehensive analysis

Use Case 5: Property Tax (Future)#

Recommendation: Build new tool (none exist)

Approach: See S3 (Approach 1A: Top metros, incremental)

Evaluation criteria when choosing approach:

Incremental (1A) scores high on feasibility, moderate on coverage
Crowdsourced (1B) scores high on coverage, moderate on feasibility
ML (1C) scores high on automation, low on explainability

For policy research: Avoid ML (1C), prefer rule-based (1A/1B)

Use Case 6: Sales Tax Research (Future)#

Recommendation: Build new tool (commercial APIs not suitable)

Approach: See S3 (Approach 2A + 2B: Open rate database + microsimulation)

Why not TaxJar/Avalara:

Expensive ($1000+/month)
Not designed for counterfactual analysis
No access to underlying data

Decision Matrix for New Tool Development#

Should You Build a New Tool?#

Question	If YES →	If NO →
Does existing tool cover this?	Use existing	Consider building
Can you contribute to existing?	Contribute	Build standalone
Do you have 2+ person-years?	Maybe build	Probably don’t
Is there ongoing funding?	Maybe build	Probably don’t
Are users waiting?	Build	Don’t build

Example: Property Tax Tool#

Existing tool? NO → Consider building
Contribute to existing? NO (none exist) → Build standalone
Resources? YES (grant-funded) → Build
Funding? YES (3-year grant) → Build
Users? YES (real estate platforms, local govs) → BUILD

Example: Better OpenFisca Web UI#

Existing tool? YES (OpenFisca has web API) → Use existing
Contribute to existing? YES (open-source) → CONTRIBUTE, don’t fork
Resources? Doesn’t matter → Contribute
Funding? Doesn’t matter → Contribute
Users? Doesn’t matter → Contribute

Red Flags: When NOT to Build#

Duplicating Tax-Calculator: Federal US taxes are solved problem
Country-specific without maintenance plan: Will bitrot with annual tax changes
Proprietary data requirements: Users can’t access = tool unusable
No validation strategy: How will you know if it’s correct?
One-person project, no succession: Bus factor = 1
“Better OpenFisca” without distinct value: Just contribute instead

Green Lights: When TO Build#

Clear gap: No existing open-source solution (property tax, sales tax research)
User demand: Researchers/policymakers asking for it
Validation path: Can compare to official statistics
Maintenance plan: Funding for 3+ years
Interoperability: Works with existing tools
Unique value: Can’t be achieved by contributing to existing tool

Summary: Choosing Wisely#

For using existing tools:

US federal: Tax-Calculator (academic) or PolicyEngine (accessible)
US state: PolicyEngine US (only comprehensive option)
International: OpenFisca (if country package exists)
Business tax: Cost-of-Capital-Calculator

For building new tools:

Sales tax research: High priority (best impact/effort ratio), clear gap ✅
Property tax: Medium priority, clear gap, user demand ✅
Multi-jurisdictional: Build orchestration, contribute to PolicyEngine ✅
Duplicate existing: Don’t do it ❌

The golden rule: Contribute to existing tools when possible, build new tools when necessary.

Published: 2026-03-06 Updated: 2026-03-06
