EnerGaze

Advanced Energy Program Evaluation Toolkit

EnerGaze is a Python package for evaluating energy efficiency programs with statistical models. It implements methodologies from the Uniform Methods Project (UMP) and provides tools for energy savings estimation, treatment effect analysis, and program impact assessment.

Python 3.10+ | MIT License | v0.1.20 | Active Development


Quick Start

# Install EnerGaze
pip install energaze

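# Import and use (run in Python after installing)
import pandas as pd
from energaze.models import PostRegression

df = pd.read_csv('energy_data.csv')  # site-level consumption records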

model = PostRegression(
    data=df,
    consumption_var='consumption',
    treatment_var='treatment',
    time_var='date',
    site_var='site_id'
)

model.fit()
results = model.get_treatment_effect()

Key Features

Advanced Modeling

Multiple statistical models including TWFE, Post Regression, Conditional Savings, and Time Period Savings with standardized APIs

Flexible Grouping

State-level, wave-based, and combined grouping capabilities for complex treatment designs

Intelligent Treatment

Automatic treatment date inference and robust pre/post period validation

Developer Friendly

Fully type annotated, comprehensive testing, and extensive documentation

Standardized API

Consistent interface across all models with standardized result formats

Diagnostics Schema

Unified diagnostic records with pre-fit summaries, severities, and parallel-trend metrics

Statistical Models

Implementation of Uniform Methods Project (UMP) specifications

PostRegression

UMP 4.4.8

Enhanced treatment logic with automatic treatment date inference. Perfect for before-after analysis with control groups.

Automatic treatment date detection
Pre/post period validation
Control group comparison
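
The before-after logic with a control group can be illustrated with a small regression. The sketch below regresses post-period consumption on a treatment flag and each site's pre-period mean. This is one common post-period specification, assumed here for illustration; it is not necessarily the exact model EnerGaze fits. The function name `post_period_effect` and the demo data are hypothetical, while the column names follow the Quick Start.

```python
import numpy as np
import pandas as pd


def post_period_effect(df: pd.DataFrame, treatment_date: str) -> float:
    """Coefficient on the treatment flag in a post-period regression.

    Assumes `treatment` is a site-level cohort flag (1 = treated site).
    """
    data = df.copy()
    data["date"] = pd.to_datetime(data["date"])
    is_post = data["date"] >= pd.Timestamp(treatment_date)

    # Each site's average pre-period consumption acts as the baseline control.
    pre_mean = (
        data[~is_post].groupby("site_id")["consumption"].mean().rename("pre_mean")
    )
    post = data[is_post].join(pre_mean, on="site_id").dropna(subset=["pre_mean"])

    # Design matrix: intercept, treatment flag, pre-period baseline.
    X = np.column_stack([
        np.ones(len(post)),
        post["treatment"].to_numpy(dtype=float),
        post["pre_mean"].to_numpy(dtype=float),
    ])
    y = post["consumption"].to_numpy(dtype=float)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(beta[1])


# Hypothetical demo: treated sites use 10 units less from 2023-01-01 onward.
baselines = {"s1": (100, 0), "s2": (120, 0), "s3": (110, 1), "s4": (130, 1)}
rows = []
for site, (base, treated) in baselines.items():
    for date in ["2022-11-01", "2022-12-01", "2023-01-01", "2023-02-01"]:
        drop = 10 if treated and date >= "2023-01-01" else 0
        rows.append({"site_id": site, "date": date, "treatment": treated,
                     "consumption": base - drop})
demo = pd.DataFrame(rows)
effect = post_period_effect(demo, "2023-01-01")
print(f"Estimated effect: {effect:.2f}")
```

A negative coefficient indicates that treated sites consumed less than comparable control sites after the treatment date.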

TWFE

UMP 4.4.6

Two-Way Fixed Effects model with entity and time fixed effects. Ideal for panel data with multiple periods.

Entity fixed effects
Time fixed effects
Heterogeneous treatment
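
To make the entity and time fixed effects concrete, here is a minimal sketch of the textbook within transformation. On a balanced panel, subtracting site means and period means (and adding back the grand mean) removes both sets of fixed effects, so a one-variable regression recovers the TWFE coefficient. The function `twfe_effect`, the `treated_post` exposure column, and the demo panel are assumptions for illustration; EnerGaze's own estimator may differ, and unbalanced panels need a full dummy-variable or iterative approach.

```python
import pandas as pd


def twfe_effect(df: pd.DataFrame, outcome: str, exposure: str,
                unit: str, time: str) -> float:
    """TWFE coefficient via double demeaning (exact for balanced panels only)."""

    def double_demean(col: str) -> pd.Series:
        s = df[col].astype(float)
        return (
            s
            - s.groupby(df[unit]).transform("mean")   # remove entity effects
            - s.groupby(df[time]).transform("mean")   # remove time effects
            + s.mean()                                # restore the grand mean
        )

    y_t = double_demean(outcome)
    d_t = double_demean(exposure)
    # OLS slope of the demeaned outcome on the demeaned exposure indicator.
    return float((d_t * y_t).sum() / (d_t ** 2).sum())


# Hypothetical balanced panel: 4 sites x 4 periods, sites 2 and 3 treated from period 2.
rows = []
for site in range(4):
    for period in range(4):
        exposed = int(site >= 2 and period >= 2)
        consumption = 100 + 5 * site - 2 * period - 5 * exposed
        rows.append({"site_id": site, "period": period,
                     "treated_post": exposed, "consumption": consumption})
panel = pd.DataFrame(rows)
effect = twfe_effect(panel, "consumption", "treated_post", "site_id", "period")
print(f"TWFE estimate: {effect:.2f}")
```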

TimePeriodSavings

UMP 4.4.7

Interval-level analysis with panel data methods. Designed for measuring savings across different time periods.

Period-specific effects
Flexible time windows
Trend analysis

ConditionalSavings

UMP 4.4.5

Regression with weather/occupancy interactions for weather-sensitive programs.

Weather normalization
Interaction effects
Flexible baselines
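
A weather interaction lets estimated savings vary with conditions. The sketch below fits consumption on heating degree days (HDD), a post-period flag, and their product, then evaluates savings at a chosen HDD level. The variable `hdd`, the helper names, and the synthetic data are hypothetical; this shows the general technique, not the ConditionalSavings implementation.

```python
import numpy as np


def fit_conditional_savings(consumption, post, hdd) -> np.ndarray:
    """OLS fit of consumption ~ 1 + hdd + post + post:hdd."""
    post = np.asarray(post, dtype=float)
    hdd = np.asarray(hdd, dtype=float)
    X = np.column_stack([np.ones_like(hdd), hdd, post, post * hdd])
    beta, *_ = np.linalg.lstsq(X, np.asarray(consumption, dtype=float), rcond=None)
    return beta  # [intercept, hdd slope, post shift, post x hdd interaction]


def savings_at(beta: np.ndarray, hdd_value: float) -> float:
    """Savings (positive = reduction) conditional on a given weather level."""
    return float(-(beta[2] + beta[3] * hdd_value))


# Hypothetical data: savings grow with heating demand as 1 + 0.2 * hdd.
hdd = np.array([0, 10, 20, 30, 0, 10, 20, 30], dtype=float)
post = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
consumption = 50 + 2 * hdd - post * (1 + 0.2 * hdd)

beta = fit_conditional_savings(consumption, post, hdd)
print(f"Savings at 20 HDD: {savings_at(beta, 20):.2f}")
```

Evaluating `savings_at` at a typical-year HDD value is one way to express weather-normalized savings.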

SimpleDifferences

UMP 4.4.3

Basic difference-in-differences estimation for straightforward treatment effect analysis.

Simple DiD estimation
Clear interpretation
Minimal assumptions
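
The arithmetic behind a basic difference-in-differences estimate fits in a few lines: take the change in mean consumption for treated sites and subtract the change for control sites. The sketch assumes a `treatment` cohort flag and a `post` period flag; `did_estimate` and the demo numbers are hypothetical. The estimate is only interpretable as a program effect if treated and control sites would have followed parallel trends without the program.

```python
import pandas as pd


def did_estimate(df: pd.DataFrame) -> float:
    """2x2 difference-in-differences from the four group-by-period means."""
    means = df.groupby(["treatment", "post"])["consumption"].mean()
    treated_change = means.loc[(1, 1)] - means.loc[(1, 0)]
    control_change = means.loc[(0, 1)] - means.loc[(0, 0)]
    return float(treated_change - control_change)


# Hypothetical data: `treatment` is the cohort flag, `post` marks post-period rows.
demo = pd.DataFrame({
    "treatment":   [0, 0, 0, 0, 1, 1, 1, 1],
    "post":        [0, 0, 1, 1, 0, 0, 1, 1],
    "consumption": [100, 102, 96, 96, 110, 112, 97, 99],
})
effect = did_estimate(demo)
print(f"DiD estimate: {effect:.1f}")
```

Here the control group falls by 5 and the treated group by 13, so the estimate is a reduction of 8.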

HeterogeneousSavings

UMP 4.4.4

Models varying treatment effects across different units or characteristics.

Unit-specific effects
Characteristic-based analysis
Effect heterogeneity with registry-ready outputs
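
One simple way to look at effect heterogeneity is to repeat a difference-in-differences calculation within each level of a site characteristic. The sketch below does that with a pandas pivot; the `building_type` column, the function `did_by_group`, and the data are hypothetical, and the HeterogeneousSavings model itself may use an interaction-based regression instead.

```python
import pandas as pd


def did_by_group(df: pd.DataFrame, group_col: str) -> pd.Series:
    """Difference-in-differences computed separately within each group."""
    cell = (
        df.groupby([group_col, "treatment", "post"])["consumption"]
        .mean()
        .unstack(["treatment", "post"])  # columns become (treatment, post) pairs
    )
    treated_change = cell[(1, 1)] - cell[(1, 0)]
    control_change = cell[(0, 1)] - cell[(0, 0)]
    return (treated_change - control_change).rename("effect")


# Hypothetical data: one row per (building type, cohort, period) cell.
demo = pd.DataFrame({
    "building_type": ["office"] * 4 + ["retail"] * 4,
    "treatment":     [0, 0, 1, 1] * 2,
    "post":          [0, 1, 0, 1] * 2,
    "consumption":   [100, 98, 105, 95, 80, 79, 82, 78],
})
effects = did_by_group(demo, "building_type")
print(effects)
```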

Outputs & Artifacts

Every workflow run writes a complete evidence bundle for engineers, analysts, and stakeholders.

Structured Tables

notebooks/artifacts/model_results_summary.csv – consolidated treatment effects for every model.
notebooks/artifacts/model_comparison.csv – registry-level benchmarking table.
notebooks/artifacts/<model>_results.csv – per-model diagnostics ready for BI tools.
notebooks/artifacts/diagnostics_summary.json – machine-readable metadata for automation.

Interactive Visuals

eda_load_shape.html & eda_treatment_balance.html – ready-to-share Plotly dashboards.
<model>_counterfactual.html – counterfactual vs. actual consumption bands.
<model>_parallel_trend.html – automated pre-trend validation charts.
assets/parallel_trends.png – static preview generated from the ETWFE pre-treatment effects on the 48-month constant dataset (falls back to the simple parallel-trends diagnostic when ETWFE pre-periods are unavailable).

Notebooks & Reports

notebooks/post_regression_demo.ipynb – full UMP 4.4.8 walkthrough.
notebooks/full_feature_demo.ipynb – multi-model orchestration demo.
notebooks/artifacts/model_results_summary.csv feeds Markdown reports via ReportFormatter.
Every run ships a precheck log so data issues can be remediated quickly.

Workflow Automation

End-to-End Evaluation Pipeline

graph LR
    subgraph Ingestion
        A[Raw Meter Data] -->|infer_treatment_columns| B(Preprocessing)
        B -->|precheck_and_clean| C{Diagnostics}
    end
    subgraph Modeling
        C -->|Pass| D[Model Selection]
        C -->|Fail| E[Error Report]
        D --> F[PostRegression]
        D --> G[TWFE]
        D --> H[TimePeriodSavings]
        D --> I[ConditionalSavings]
    end
    subgraph Output
        F & G & H & I --> J[Treatment Effect Estimation]
        J --> K[HTML Report]
        J --> L[CSV Artifacts]
    end
    style A fill:#f7fafc,stroke:#2d3748,stroke-width:2px,rx:5
    style K fill:#ebf8ff,stroke:#3182ce,stroke-width:2px,rx:5
    style L fill:#ebf8ff,stroke:#3182ce,stroke-width:2px,rx:5
    style C fill:#fff5f5,stroke:#e53e3e,stroke-width:2px,rx:5
    style J fill:#f0fff4,stroke:#38a169,stroke-width:2px,rx:5

Intelligent Data Inference

`infer_treatment_columns()` inspects arbitrary CSV schemas and maps aliases such as `d` and `after` to the canonical `treated` and `post_treatment` fields. It also records the inferred treatment date, so every model starts from a clean, standardized DataFrame.

Detects cohort vs. exposure indicators automatically
Builds a post column for ETWFE diagnostics
Preserves the original cohort flag as treatment_group_indicator
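
The alias mapping and date inference described above can be pictured roughly as follows. This is a minimal sketch, not the actual `infer_treatment_columns()` code: the alias table contains only the two aliases named in this section, and the helper names `standardize_columns` and `infer_treatment_date` are hypothetical.

```python
import pandas as pd

# Assumed alias table: only `d` and `after` are documented; the real list may be longer.
ALIASES = {
    "treated": ("treated", "d"),
    "post_treatment": ("post_treatment", "after"),
}


def standardize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Rename the first matching alias of each canonical field."""
    by_lower = {col.lower(): col for col in df.columns}
    rename = {}
    for canonical, candidates in ALIASES.items():
        for alias in candidates:
            if alias in by_lower:
                rename[by_lower[alias]] = canonical
                break
    return df.rename(columns=rename)


def infer_treatment_date(df: pd.DataFrame, time_col: str = "date"):
    """Earliest timestamp at which a treated site is flagged as post-treatment."""
    dates = pd.to_datetime(df[time_col])
    mask = (df["treated"] == 1) & (df["post_treatment"] == 1)
    return dates[mask].min() if mask.any() else None


# Hypothetical raw file using the short aliases.
raw = pd.DataFrame({
    "site_id": [1, 1, 1, 2, 2, 2],
    "date": ["2023-01-01", "2023-02-01", "2023-03-01"] * 2,
    "d": [1, 1, 1, 0, 0, 0],
    "after": [0, 0, 1, 0, 0, 1],
})
clean = standardize_columns(raw)
treatment_date = infer_treatment_date(clean)
print(list(clean.columns), treatment_date)
```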

Prechecks & Cleaning

`precheck_and_clean()` runs before every fit: it validates timestamps and energy values, ensures treated sites have both pre- and post-period observations, and removes rows that would otherwise break diagnostics while logging every action for transparency.

Reports dropped rows, invalid post indicators, and missing cohorts
Guards against ETWFE failures caused by NA event indices
Feeds structured issues into the CLI output stream
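
For intuition, the checks listed above might look like the sketch below: coerce timestamps and energy values, drop invalid rows, remove treated sites that lack either pre- or post-period data, and record each action. The function `precheck_sketch`, its log format, and the rule that negative consumption is invalid are assumptions for illustration, not the behavior of `precheck_and_clean()` itself.

```python
import pandas as pd


def precheck_sketch(df: pd.DataFrame, treatment_date: str):
    """Return a cleaned copy of `df` and a list of logged actions."""
    log = []
    out = df.copy()
    out["date"] = pd.to_datetime(out["date"], errors="coerce")
    out["consumption"] = pd.to_numeric(out["consumption"], errors="coerce")

    # Assumption: missing or negative consumption counts as invalid.
    bad = out["date"].isna() | out["consumption"].isna() | (out["consumption"] < 0)
    if bad.any():
        log.append(f"dropped {int(bad.sum())} rows with invalid timestamp or consumption")
        out = out[~bad]

    # Treated sites need observations on both sides of the treatment date.
    cutoff = pd.Timestamp(treatment_date)
    treated = out[out["treatment"] == 1]
    has_pre = set(treated.loc[treated["date"] < cutoff, "site_id"])
    has_post = set(treated.loc[treated["date"] >= cutoff, "site_id"])
    incomplete = set(treated["site_id"]) - (has_pre & has_post)
    if incomplete:
        log.append(f"dropped treated sites missing pre or post data: {sorted(incomplete)}")
        out = out[~out["site_id"].isin(incomplete)]

    return out.reset_index(drop=True), log


# Hypothetical input: one unparseable date, and site C has no pre-period rows.
raw = pd.DataFrame({
    "site_id":     ["A", "A", "A", "B", "B", "C"],
    "date":        ["2022-12-01", "2023-02-01", "not a date",
                    "2022-12-01", "2023-02-01", "2023-02-01"],
    "treatment":   [1, 1, 1, 0, 0, 1],
    "consumption": [100, 90, 50, 100, 98, 80],
})
cleaned, actions = precheck_sketch(raw, "2023-01-01")
for action in actions:
    print(action)
```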

Registry & CLI Orchestration

The scripts/run_model_workflow.py runner wires the preprocessing helpers, model factory, ModelRegistry, and ReportFormatter together. One command ingests data, executes multiple models, and emits artifacts under notebooks/artifacts/.

python scripts/run_model_workflow.py \
  --data-path examples/data/synthetic_declining_48.csv \
  --compare-models post_regression simple_differences twfe \
  --output-dir notebooks/artifacts/generated_demo

Installation

From PyPI

Recommended for most users

pip install energaze

From Source

For development or latest features

git clone https://github.com/Ecometricx-DataScience/EnerGaze.git
cd EnerGaze
pip install -e .

Usage Examples

import pandas as pd
from energaze.models import PostRegression

# Load your data
df = pd.read_csv('energy_data.csv')

# Initialize model with data
model = PostRegression(
    data=df,
    consumption_var='consumption',
    treatment_var='treatment',
    time_var='date',
    site_var='site_id',
    treatment_date='2023-01-01'
)

# Fit the model
model.fit()

# Get treatment effects
results = model.get_treatment_effect()
print(f"Treatment Effect: {results['effect']:.2f}")
print(f"P-value: {results['p_value']:.4f}")

# Get detailed summary
summary = model.summary()
print(summary)

# State-level grouping
state_model = PostRegression(
    data=df,
    consumption_var='consumption',
    treatment_var='treatment',
    time_var='date',
    site_var='site_id',
    state_var='state',  # Group by state
    treatment_date='2023-01-01'
)

state_model.fit()
state_results = state_model.get_treatment_effect()

# Results for each state
for state, result in state_results.items():
    print(f"{state}: Effect = {result['effect']:.2f}")

# Combined State × Wave grouping
combined_model = PostRegression(
    data=df,
    consumption_var='consumption',
    treatment_var='treatment',
    time_var='date',
    site_var='site_id',
    state_var='state',
    wave_var='wave',  # Both state and wave
    treatment_date='2023-01-01'
)

combined_model.fit()
combined_results = combined_model.get_treatment_effect()

from energaze.models import TWFE, TimePeriodSavingsRegression

# Compare different models
models = {
    'PostRegression': PostRegression(
        data=df, consumption_var='consumption',
        treatment_var='treatment', time_var='date',
        site_var='site_id', treatment_date='2023-01-01'
    ),
    'TWFE': TWFE(
        data=df, consumption_var='consumption',
        treatment_var='treatment', time_var='date',
        site_var='site_id', treatment_date='2023-01-01'
    ),
    'TimePeriodSavings': TimePeriodSavingsRegression(
        data=df, consumption_var='consumption',
        treatment_var='treatment', time_var='date',
        site_var='site_id', treatment_date='2023-01-01'
    )
}

# Fit all models and compare
results = {}
for name, model in models.items():
    model.fit()
    results[name] = model.get_treatment_effect()
    print(f"{name}: Effect = {results[name]['effect']:.2f}")

Comprehensive Visualizations

Robust analysis outputs with publication-ready graphics

Advanced Data Visualization

EnerGaze generates publication-ready visualizations built on its standardized diagnostics and orchestration utilities:

Treatment effect comparisons across models
Time-series analysis with trend lines
Multi-dimensional grouping visualizations
Pre/post period consumption patterns with parallel-trend insights
State and wave-level comparisons
Model registry tables (model_comparison.csv) and counterfactual reporting
Export-ready bundles such as eda_load_shape.html and model_results_summary.csv

Example Output: Parallel-trends diagnostics from the 48-month constant synthetic dataset, using ETWFE pre-treatment effects when available and otherwise falling back to the simple difference-in-means trend check used in the HTML reports.
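
A simple difference-in-means trend check of the kind mentioned above could be sketched like this: for each pre-treatment period, compute the gap between treated and control mean consumption, then fit a line through those gaps. A slope near zero is consistent with parallel pre-trends. The function `pre_trend_slope` and the demo data are hypothetical; the diagnostic shipped in the HTML reports may be computed differently.

```python
import numpy as np
import pandas as pd


def pre_trend_slope(df: pd.DataFrame, treatment_date: str):
    """Per-period treated-minus-control gap before treatment, and its linear slope."""
    dates = pd.to_datetime(df["date"])
    pre = df.assign(date=dates)[dates < pd.Timestamp(treatment_date)]
    means = (
        pre.groupby(["date", "treatment"])["consumption"].mean().unstack("treatment")
    )
    gap = (means[1] - means[0]).sort_index()
    if len(gap) < 2:
        raise ValueError("need at least two pre-treatment periods to fit a trend")
    # Slope of the gap over the ordered pre-periods (units per period).
    slope = np.polyfit(np.arange(len(gap)), gap.to_numpy(dtype=float), 1)[0]
    return gap, float(slope)


# Hypothetical data: both cohorts decline by 2 per month before 2023-01-01.
demo = pd.DataFrame({
    "date": ["2022-10-01", "2022-11-01", "2022-12-01", "2023-01-01"] * 2,
    "treatment": [0] * 4 + [1] * 4,
    "consumption": [100, 98, 96, 95, 110, 108, 106, 97],
})
gap, slope = pre_trend_slope(demo, "2023-01-01")
print(gap)
print(f"Pre-trend slope: {slope:.3f}")
```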