EnerGaze

Advanced Energy Program Evaluation Toolkit

EnerGaze is a Python package for evaluating energy efficiency programs with statistical models. It implements methodologies from the Uniform Methods Project (UMP) and provides tools for energy savings estimation, treatment effect analysis, and program impact assessment.

Python 3.8+ | MIT License | v0.1.20 | Active Development

Quick Start

# Install EnerGaze
pip install energaze

# Import and use (df is a pandas DataFrame of meter data, e.g. loaded with pd.read_csv)
from energaze.models import PostRegression

model = PostRegression(
    data=df,
    consumption_var='consumption',
    treatment_var='treatment',
    time_var='date',
    site_var='site_id'
)

model.fit()
results = model.get_treatment_effect()

Key Features

Advanced Modeling

Multiple statistical models including TWFE, Post Regression, Conditional Savings, and Time Period Savings with standardized APIs

Flexible Grouping

State-level, wave-based, and combined grouping capabilities for complex treatment designs

Intelligent Treatment

Automatic treatment date inference and robust pre/post period validation

Developer Friendly

Fully type annotated, comprehensive testing, and extensive documentation

Standardized API

Consistent interface across all models with standardized result formats

Diagnostics Schema

Unified diagnostic records with pre-fit summaries, severities, and parallel-trend metrics

Statistical Models

Implementation of Uniform Methods Project (UMP) specifications

PostRegression

UMP 4.4.8

Enhanced treatment logic with automatic treatment date inference. Perfect for before-after analysis with control groups.

  • Automatic treatment date detection
  • Pre/post period validation
  • Control group comparison

TWFE

UMP 4.4.6

Two-Way Fixed Effects model with entity and time fixed effects. Ideal for panel data with multiple periods.

  • Entity fixed effects
  • Time fixed effects
  • Heterogeneous treatment
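
To make the entity and time fixed effects concrete, here is a minimal standard-library sketch of a two-way fixed effects estimate on a balanced panel. It is not EnerGaze's implementation: the helper names (`two_way_demean`, `twfe_effect`) and the toy panel are assumptions made for illustration. On a balanced panel, subtracting site means and period means (and adding back the grand mean) removes both sets of fixed effects, so a single slope on the demeaned treatment indicator recovers the treatment coefficient.

```python
from collections import defaultdict


def two_way_demean(values):
    """Subtract unit and period means, add back the grand mean (balanced panel only)."""
    unit_vals, time_vals = defaultdict(list), defaultdict(list)
    for (unit, period), val in values.items():
        unit_vals[unit].append(val)
        time_vals[period].append(val)
    unit_mean = {u: sum(v) / len(v) for u, v in unit_vals.items()}
    time_mean = {t: sum(v) / len(v) for t, v in time_vals.items()}
    grand = sum(values.values()) / len(values)
    return {
        (u, t): val - unit_mean[u] - time_mean[t] + grand
        for (u, t), val in values.items()
    }


def twfe_effect(panel):
    """panel: list of (unit, period, consumption, treated_now) tuples."""
    y = two_way_demean({(u, t): cons for u, t, cons, _ in panel})
    d = two_way_demean({(u, t): float(flag) for u, t, _, flag in panel})
    # Slope of demeaned consumption on the demeaned treatment indicator
    return sum(y[k] * d[k] for k in y) / sum(v * v for v in d.values())


# Hypothetical noise-free panel: sites A and B are treated from month 2 onward,
# every site has its own level, every month adds a common shift, and the
# built-in treatment effect is -5.
site_level = {"A": 100.0, "B": 120.0, "C": 90.0, "D": 110.0}
panel = []
for site, level in site_level.items():
    for month in range(4):
        treated_now = int(site in ("A", "B") and month >= 2)
        panel.append((site, month, level + 2.0 * month - 5.0 * treated_now, treated_now))

beta = twfe_effect(panel)
print(f"TWFE estimate: {beta:.2f}")
```

The shortcut relies on the panel being balanced; with missing site-months the demeaning is no longer exact and a regression with explicit dummy variables is needed.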

TimePeriodSavings

UMP 4.4.7

Interval-level analysis with panel data methods. Designed for measuring savings across different time periods.

  • Period-specific effects
  • Flexible time windows
  • Trend analysis

ConditionalSavings

UMP 4.4.5

Regression with weather/occupancy interactions for weather-sensitive programs.

  • Weather normalization
  • Interaction effects
  • Flexible baselines
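
As an illustration of what a weather interaction buys, the sketch below fits a separate consumption-versus-weather line before and after treatment for one site, which is equivalent to a single regression with a full post-period interaction. This is not the package's estimator: `fit_line`, `conditional_change`, the heating-degree-day variable, and the toy data are all assumptions. The change in consumption then depends on the weather condition it is evaluated at.

```python
def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def conditional_change(pre, post, hdd):
    """Change in predicted consumption at a given heating-degree-day value."""
    a_pre, b_pre = fit_line(pre)
    a_post, b_post = fit_line(post)
    # Intercept shift plus the weather-interaction term evaluated at hdd
    return (a_post - a_pre) + (b_post - b_pre) * hdd


# Hypothetical noise-free data as (heating degree days, consumption) pairs
pre = [(hdd, 50.0 + 2.0 * hdd) for hdd in (5, 10, 20, 30)]
post = [(hdd, 45.0 + 1.5 * hdd) for hdd in (5, 10, 20, 30)]

change_at_20 = conditional_change(pre, post, hdd=20)
print(f"Change at 20 HDD: {change_at_20:.1f}")
```

A negative value means lower consumption after treatment; evaluating at a colder condition (larger `hdd`) gives a larger reduction in this toy example because the slope also fell.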

SimpleDifferences

UMP 4.4.3

Basic difference-in-differences estimation for straightforward treatment effect analysis.

  • Simple DiD estimation
  • Clear interpretation
  • Few modeling assumptions beyond parallel trends
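
The arithmetic behind a basic difference-in-differences estimate can be sketched in a few lines of standard-library Python. This is an illustration of the technique, not the SimpleDifferences class itself; the function name `did_estimate` and the toy readings are assumptions.

```python
from statistics import mean


def did_estimate(rows):
    """2x2 difference-in-differences from (treated, post, consumption) rows."""
    cells = {(t, p): [] for t in (0, 1) for p in (0, 1)}
    for treated, post, consumption in rows:
        cells[(treated, post)].append(consumption)
    # mean() raises StatisticsError if any of the four cells is empty
    m = {cell: mean(values) for cell, values in cells.items()}
    treated_change = m[(1, 1)] - m[(1, 0)]
    control_change = m[(0, 1)] - m[(0, 0)]
    return treated_change - control_change


# Hypothetical readings
rows = [
    (1, 0, 100.0), (1, 0, 102.0),  # treated sites, pre-period
    (1, 1, 90.0), (1, 1, 92.0),    # treated sites, post-period
    (0, 0, 80.0), (0, 0, 82.0),    # control sites, pre-period
    (0, 1, 78.0), (0, 1, 80.0),    # control sites, post-period
]

effect = did_estimate(rows)
print(f"DiD estimate: {effect:.1f}")
```

Here the treated group drops by 10 and the control group by 2, so the estimate attributes a change of -8 to the program, provided both groups would otherwise have followed the same trend.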

HeterogeneousSavings

UMP 4.4.4

Models varying treatment effects across different units or characteristics.

  • Unit-specific effects
  • Characteristic-based analysis
  • Effect heterogeneity with registry-ready outputs

Outputs & Artifacts

Every workflow run writes a complete evidence bundle for engineers, analysts, and stakeholders.

Structured Tables

  • notebooks/artifacts/model_results_summary.csv – consolidated treatment effects for every model.
  • notebooks/artifacts/model_comparison.csv – registry-level benchmarking table.
  • notebooks/artifacts/<model>_results.csv – per-model diagnostics ready for BI tools.
  • notebooks/artifacts/diagnostics_summary.json – machine-readable metadata for automation.

Interactive Visuals

  • eda_load_shape.html & eda_treatment_balance.html – ready-to-share Plotly dashboards.
  • <model>_counterfactual.html – counterfactual vs. actual consumption bands.
  • <model>_parallel_trend.html – automated pre-trend validation charts.
  • assets/parallel_trends.png – static preview generated from the ETWFE pre-treatment effects on the 48-month constant dataset (or from the simple parallel-trends diagnostic when ETWFE pre-periods are unavailable).

Notebooks & Reports

  • notebooks/post_regression_workflow.ipynb – full UMP 4.4.8 walkthrough.
  • notebooks/model_selection_showcase.ipynb – registry orchestration demo.
  • notebooks/artifacts/model_results_summary.csv feeds Markdown reports via ReportFormatter.
  • Every run ships a precheck log so data issues can be remediated quickly.

Workflow Automation

End-to-End Evaluation Pipeline

graph LR
  subgraph Ingestion
    A[Raw Meter Data] -->|infer_treatment_columns| B(Preprocessing)
    B -->|precheck_and_clean| C{Diagnostics}
  end
  subgraph Modeling
    C -->|Pass| D[Model Selection]
    C -->|Fail| E[Error Report]
    D --> F[PostRegression]
    D --> G[TWFE]
    D --> H[TimePeriodSavings]
    D --> I[ConditionalSavings]
  end
  subgraph Output
    F & G & H & I --> J[Treatment Effect Estimation]
    J --> K[HTML Report]
    J --> L[CSV Artifacts]
  end
  style A fill:#f7fafc,stroke:#2d3748,stroke-width:2px,rx:5
  style K fill:#ebf8ff,stroke:#3182ce,stroke-width:2px,rx:5
  style L fill:#ebf8ff,stroke:#3182ce,stroke-width:2px,rx:5
  style C fill:#fff5f5,stroke:#e53e3e,stroke-width:2px,rx:5
  style J fill:#f0fff4,stroke:#38a169,stroke-width:2px,rx:5

Intelligent Data Inference

`infer_treatment_columns()` inspects arbitrary CSV schemas, maps aliases such as `d` and `after` to the canonical `treated` and `post_treatment` fields, and records the inferred treatment date, so every model starts from a clean, standardized DataFrame.

  • Detects cohort vs. exposure indicators automatically
  • Builds a post column for ETWFE diagnostics
  • Preserves the original cohort flag as treatment_group_indicator
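
The idea of alias mapping plus treatment-date inference can be sketched as follows. This is a hypothetical stand-in, not the body of `infer_treatment_columns()`: the alias table, the `standardize_rows` name, the plain-dict row format, and the rule "earliest post-period date among treated rows" are all assumptions chosen for illustration.

```python
# Illustrative alias table; the real helper may recognise different names
ALIASES = {"d": "treated", "after": "post_treatment"}


def standardize_rows(rows):
    """Rename aliased columns and infer a treatment date from the data."""
    renamed = [{ALIASES.get(key, key): value for key, value in row.items()} for row in rows]
    # Assumed rule: the treatment date is the earliest date on which a
    # treated site is flagged as post-treatment.
    post_dates = [
        row["date"]
        for row in renamed
        if row.get("treated") == 1 and row.get("post_treatment") == 1
    ]
    treatment_date = min(post_dates) if post_dates else None
    return renamed, treatment_date


# Hypothetical input; ISO-formatted date strings sort chronologically
raw = [
    {"site_id": "S1", "date": "2023-01-15", "d": 1, "after": 0},
    {"site_id": "S1", "date": "2023-02-15", "d": 1, "after": 1},
    {"site_id": "S1", "date": "2023-03-15", "d": 1, "after": 1},
    {"site_id": "S2", "date": "2023-02-15", "d": 0, "after": 1},
]

clean, inferred_date = standardize_rows(raw)
print(sorted(clean[0]), inferred_date)
```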

Prechecks & Cleaning

`precheck_and_clean()` runs before every fit: it validates timestamps and energy values, ensures treated sites have both pre- and post-period observations, and removes rows that would otherwise break diagnostics while logging every action for transparency.

  • Reports dropped rows, invalid post indicators, and missing cohorts
  • Guards against ETWFE failures caused by NA event indices
  • Feeds structured issues into the CLI output stream
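
A rough picture of this kind of precheck, written against plain dictionaries rather than a DataFrame, is shown below. It is a sketch under stated assumptions, not the logic of `precheck_and_clean()`: the function name `precheck_rows`, the column names, and the log message wording are invented for the example, and only two of the checks described above are covered.

```python
def precheck_rows(rows):
    """Drop unusable rows and incomplete treated sites, logging each action."""
    log, valid = [], []
    for index, row in enumerate(rows):
        if row.get("consumption") is None:
            log.append(f"dropped row {index}: missing consumption")
        else:
            valid.append(row)

    # Record which periods (0 = pre, 1 = post) each treated site still covers
    periods_seen = {}
    for row in valid:
        if row["treated"] == 1:
            periods_seen.setdefault(row["site_id"], set()).add(row["post_treatment"])

    incomplete = sorted(site for site, seen in periods_seen.items() if seen != {0, 1})
    for site in incomplete:
        log.append(f"dropped site {site}: lacks pre- or post-period rows")

    kept = [row for row in valid if row["site_id"] not in incomplete]
    return kept, log


# Hypothetical input: S2 is treated but has no pre-period rows
rows = [
    {"site_id": "S1", "treated": 1, "post_treatment": 0, "consumption": 10.0},
    {"site_id": "S1", "treated": 1, "post_treatment": 1, "consumption": 8.0},
    {"site_id": "S2", "treated": 1, "post_treatment": 1, "consumption": 9.0},
    {"site_id": "S3", "treated": 0, "post_treatment": 0, "consumption": 7.0},
    {"site_id": "S3", "treated": 0, "post_treatment": 1, "consumption": 7.0},
    {"site_id": "S1", "treated": 1, "post_treatment": 1, "consumption": None},
]

kept, log = precheck_rows(rows)
for message in log:
    print(message)
```

Returning the log alongside the cleaned rows keeps every removal visible to whoever reviews the run.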

Registry & CLI Orchestration

The scripts/run_model_workflow.py runner wires the preprocessing helpers, model factory, ModelRegistry, and ReportFormatter together. One command ingests data, executes multiple models, and emits artifacts under notebooks/artifacts/.

python scripts/run_model_workflow.py \
  --data-path examples/data/generated_DiD_test_data.csv \
  --compare-models post_regression simple_differences twfe \
  --output-dir notebooks/artifacts/generated_demo

Installation

From PyPI

Recommended for most users

pip install energaze

From Source

For development or latest features

git clone https://github.com/Ecometricx-DataScience/EnerGaze.git
cd EnerGaze
pip install -e .

Usage Examples

import pandas as pd
import numpy as np
from energaze.models import PostRegression

# Load your data
df = pd.read_csv('energy_data.csv')

# Initialize model with data
model = PostRegression(
    data=df,
    consumption_var='consumption',
    treatment_var='treatment',
    time_var='date',
    site_var='site_id',
    treatment_date='2023-01-01'
)

# Fit the model
model.fit()

# Get treatment effects
results = model.get_treatment_effect()
print(f"Treatment Effect: {results['effect']:.2f}")
print(f"P-value: {results['p_value']:.4f}")

# Get detailed summary
summary = model.summary()
print(summary)

# State-level grouping
state_model = PostRegression(
    data=df,
    consumption_var='consumption',
    treatment_var='treatment',
    time_var='date',
    site_var='site_id',
    state_var='state',  # Group by state
    treatment_date='2023-01-01'
)

state_model.fit()
state_results = state_model.get_treatment_effect()

# Results for each state
for state, result in state_results.items():
    print(f"{state}: Effect = {result['effect']:.2f}")

# Combined State × Wave grouping
combined_model = PostRegression(
    data=df,
    consumption_var='consumption',
    treatment_var='treatment',
    time_var='date',
    site_var='site_id',
    state_var='state',
    wave_var='wave',  # Both state and wave
    treatment_date='2023-01-01'
)

combined_model.fit()
combined_results = combined_model.get_treatment_effect()

# Model comparison
from energaze.models import TWFE, TimePeriodSavingsRegression

# Compare different models
models = {
    'PostRegression': PostRegression(
        data=df, consumption_var='consumption',
        treatment_var='treatment', time_var='date',
        site_var='site_id', treatment_date='2023-01-01'
    ),
    'TWFE': TWFE(
        data=df, consumption_var='consumption',
        treatment_var='treatment', time_var='date',
        site_var='site_id', treatment_date='2023-01-01'
    ),
    'TimePeriodSavings': TimePeriodSavingsRegression(
        data=df, consumption_var='consumption',
        treatment_var='treatment', time_var='date',
        site_var='site_id', treatment_date='2023-01-01'
    )
}

# Fit all models and compare
results = {}
for name, model in models.items():
    model.fit()
    results[name] = model.get_treatment_effect()
    print(f"{name}: Effect = {results[name]['effect']:.2f}")

Comprehensive Visualizations

Robust analysis outputs with publication-ready graphics

Advanced Data Visualization

EnerGaze generates the following visualizations and export bundles from its standardized diagnostics and workflow outputs:

  • Treatment effect comparisons across models
  • Time-series analysis with trend lines
  • Multi-dimensional grouping visualizations
  • Pre/post period consumption patterns with parallel-trend insights
  • State and wave-level comparisons
  • Model registry tables (model_comparison.csv) and counterfactual reporting
  • Export-ready bundles such as eda_load_shape.html and model_results_summary.csv

Example Output: Parallel-trends diagnostics from the 48-month constant synthetic dataset, using ETWFE pre-treatment effects when available and otherwise falling back to the simple difference-in-means trend check used in the HTML reports.

[Figure: EnerGaze parallel-trends visualization]
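
One way to read a simple difference-in-means trend check is sketched below: compute the treated-minus-control gap in mean consumption for each pre-treatment period, then look at how that gap moves over time. This is an interpretation for illustration, not the diagnostic used in the HTML reports; `pre_period_gaps`, `gap_slope`, and the toy data are assumptions.

```python
from statistics import mean


def pre_period_gaps(rows):
    """rows: (period, treated, consumption) tuples from pre-treatment periods only."""
    cells = {}
    for period, treated, consumption in rows:
        cells.setdefault((period, treated), []).append(consumption)
    periods = sorted({period for period, _ in cells})
    # Gap between group means in each period
    return [(p, mean(cells[(p, 1)]) - mean(cells[(p, 0)])) for p in periods]


def gap_slope(gaps):
    """Least-squares slope of the gap over time; near zero suggests parallel pre-trends."""
    xs = [p for p, _ in gaps]
    ys = [g for _, g in gaps]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx


# Hypothetical pre-period data: both groups rise by 2 per period
pre_rows = []
for period in range(4):
    pre_rows += [(period, 1, 100.0 + 2 * period), (period, 1, 102.0 + 2 * period)]
    pre_rows += [(period, 0, 80.0 + 2 * period), (period, 0, 82.0 + 2 * period)]

gaps = pre_period_gaps(pre_rows)
slope = gap_slope(gaps)
print(f"Pre-period gap slope: {slope:.3f}")
```

A constant gap, as in this toy data, is compatible with parallel trends; a gap that widens or narrows before treatment is a warning sign for difference-in-differences style estimates.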
