Overview
The hetid package implements identification through heteroskedasticity methods for models with endogenous regressors. It provides tools for estimation and inference using:
- Lewbel (2012): Identification using continuous heteroskedasticity drivers
  - GMM (Generalized Method of Moments) estimation for both triangular and simultaneous systems
  - Traditional 2SLS (Two-Stage Least Squares) estimation
  - Set identification when point identification assumptions are relaxed
- Klein and Vella (2010): Control function approach using heteroskedasticity
  - Parametric and semiparametric control function methods
  - Heteroskedasticity-based identification without exclusion restrictions
  - Support for triangular systems
- Rigobon (2003): Identification using discrete regime indicators
  - GMM estimation for regime-based heteroskedasticity
  - 2SLS with regime-based instruments
  - Support for both triangular and simultaneous systems
- Prono (2014): GARCH-based heteroskedasticity identification
  - GMM estimation using conditional variance from GARCH models
  - 2SLS with GARCH-based instruments
  - Support for triangular systems
- Comparison tools with other implementations (REndo, Stata)
- Monte Carlo simulation and bootstrap tools for validating theoretical results
- Visualization tools
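To make the Lewbel (2012) idea concrete, the sketch below builds the heteroskedasticity-based instrument by hand in base R. It is not the package's implementation: the data-generating process, the parameter values, and all variable names are assumptions chosen for illustration. The instrument is the mean-centered heteroskedasticity driver multiplied by the first-stage residual, which is valid when the driver is uncorrelated with the product of the two structural errors but correlated with the squared first-stage error.

```r
# Illustrative sketch only; simulated data and names are assumptions,
# not part of hetid.
set.seed(42)
n <- 5000
x  <- rnorm(n)
u  <- rnorm(n)                      # unobserved common factor (source of endogeneity)
eps1 <- u + rnorm(n)                # error of the outcome equation
eps2 <- u + exp(0.5 * x) * rnorm(n) # first-stage error, variance depends on x

gamma1 <- -0.8                      # true coefficient on the endogenous regressor
y2 <- 1.0 - 1.0 * x + eps2          # endogenous regressor
y1 <- 0.5 + 1.5 * x + gamma1 * y2 + eps1

# Step 1: residuals from regressing the endogenous regressor on x
e2_hat <- resid(lm(y2 ~ x))

# Step 2: constructed instrument (Z - mean(Z)) * e2_hat, here with Z = x
z  <- x
iv <- (z - mean(z)) * e2_hat

# Step 3: 2SLS by hand (point estimates only; the second-stage lm()
# standard errors are not valid 2SLS standard errors)
y2_fit <- fitted(lm(y2 ~ x + iv))
tsls <- coef(lm(y1 ~ x + y2_fit))
ols  <- coef(lm(y1 ~ x + y2))

# Compare the coefficient on the endogenous regressor
c(truth = gamma1, ols = ols[["y2"]], tsls = tsls[["y2_fit"]])
```

In this simulated design OLS is biased because both errors share the factor u, while the constructed instrument recovers an estimate much closer to the true coefficient.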
Applications
- Identification in models with endogenous regressors by heteroskedasticity
- Useful when traditional instruments are unavailable
See the package website for complete documentation and examples.
Quick Start Options
Development Environment
For development and contribution, you can use:
Docker (Recommended):
- Use the provided Docker setup for a consistent development environment
- See the Development Guide for Docker setup instructions
Local R Installation:
- Install R 4.5.0 or later
- Follow the local setup instructions in the Development Guide
📦 Local Installation
You can install the development version of hetid from GitHub with:
# install.packages("devtools")
devtools::install_github("fernando-duarte/heteroskedasticity_identification")
Note on Vignettes
If you’re installing from a path containing spaces (e.g., “Dropbox (Personal)”), vignettes may not build correctly during installation. To build and view the vignettes after installation:
# Build vignettes manually if needed
devtools::build_vignettes()
# View available vignettes
browseVignettes("hetid")
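If you are unsure whether your project path is affected, a quick check along these lines may help. The helper `path_has_spaces()` is hypothetical and not part of hetid; it only tests whether a path string contains a space character.

```r
# Hypothetical helper (not part of hetid): TRUE when the path contains a space.
path_has_spaces <- function(path = getwd()) {
  grepl(" ", path, fixed = TRUE)
}

# Check the current working directory
path_has_spaces()

# Check an example path like the one mentioned above
path_has_spaces("~/Dropbox (Personal)/hetid")
```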
Optional Dependencies
For enhanced functionality, you may want to install these optional packages:
# For enhanced table formatting in analysis output
install.packages("knitr")
# For comparison with other Lewbel (2012) implementations
install.packages("REndo") # Version >= 2.4.0 required
install.packages("AER") # For ivreg function
# For Stata comparison (if Stata is available)
install.packages("RStata")
install.packages("haven")
# For Prono GARCH-based identification
install.packages("tsgarch")
The package works without these optional dependencies, but installing them provides:
- knitr: Nicely formatted tables in analysis functions (when verbose = TRUE)
- REndo: Comparison with an alternative R implementation of Lewbel (2012)
- AER: Additional IV regression capabilities
- RStata/haven: Comparison with an alternative Stata implementation of Lewbel (2012)
- tsgarch: GARCH modeling for Prono (2014) time-series identification
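One way to see which optional packages are already available is sketched below using base R's `requireNamespace()`. The variable names are illustrative, and the version check simply mirrors the REndo requirement noted in the install comments.

```r
# Optional packages listed in this README
optional_pkgs <- c("knitr", "REndo", "AER", "RStata", "haven", "tsgarch")

# TRUE for each package whose namespace can be loaded
installed <- vapply(optional_pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Report anything that is missing
missing_pkgs <- names(installed)[!installed]
if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
}

# REndo must be at least version 2.4.0 for the comparison tools
if (installed[["REndo"]] && utils::packageVersion("REndo") < "2.4.0") {
  message("REndo is installed but older than 2.4.0")
}
```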
Quick Start
# Install and load
devtools::install_github("fernando-duarte/heteroskedasticity_identification")
library(hetid)
# Quick demonstration
run_lewbel_demo()
# Basic workflow
config <- create_default_config()
data <- generate_lewbel_data(100, config)
result <- run_single_lewbel_simulation(1, config)
# GMM estimation for Lewbel
gmm_result <- lewbel_gmm(data, system = "triangular")
summary(gmm_result)
# GMM estimation for Rigobon
rigobon_data <- generate_rigobon_data(500, list(
beta1_0 = 0.5, beta1_1 = 1.5, gamma1 = -0.8,
beta2_0 = 1.0, beta2_1 = -1.0,
regime_probs = c(0.4, 0.6),
sigma2_regimes = c(1.0, 3.0)
))
rigobon_result <- rigobon_gmm(rigobon_data)
summary(rigobon_result)
# GMM estimation for Prono
prono_data <- generate_prono_data(500, create_prono_config())
prono_result <- prono_gmm(prono_data)
summary(prono_result)
# Compare GMM with 2SLS
comparison <- compare_gmm_2sls(data)
print(comparison)
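For intuition about what regime-based instruments look like, the base R sketch below reuses the regime probabilities, regime variances, and coefficient values from the Rigobon example above. It does not call or reproduce `generate_rigobon_data()` or `rigobon_gmm()`; the data-generating process and all variable names are assumptions. It treats a centered regime dummy as the heteroskedasticity driver, in the spirit of Lewbel (2012), who notes that discrete regimes are a special case of his approach.

```r
# Illustrative sketch only; not the package's implementation.
set.seed(123)
n <- 5000
regime <- sample(1:2, n, replace = TRUE, prob = c(0.4, 0.6))
sigma2 <- c(1.0, 3.0)[regime]        # first-stage error variance by regime

x  <- rnorm(n)
u  <- rnorm(n)                       # common factor creating endogeneity
eps1 <- u + rnorm(n)
eps2 <- u + sqrt(sigma2) * rnorm(n)  # variance shifts across regimes

gamma1 <- -0.8
y2 <- 1.0 - 1.0 * x + eps2
y1 <- 0.5 + 1.5 * x + gamma1 * y2 + eps1

# Centered regime dummy times the first-stage residual
d      <- as.numeric(regime == 2)
e2_hat <- resid(lm(y2 ~ x))
iv     <- (d - mean(d)) * e2_hat

# 2SLS by hand (point estimates only)
y2_fit <- fitted(lm(y2 ~ x + iv))
tsls <- coef(lm(y1 ~ x + y2_fit))
ols  <- coef(lm(y1 ~ x + y2))

c(truth = gamma1, ols = ols[["y2"]], tsls = tsls[["y2_fit"]])
```

With more than two regimes, the same construction would yield one such instrument per regime dummy (omitting one as the baseline).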
Testing
After cloning this repository:
# Quick tests (recommended for contributors)
# In RStudio or after opening R in project directory:
devtools::test() # Runs CRAN + fast tests
# Or use Make (run these from a shell, not the R console):
make test-fast # CRAN + fast tests
make test-all # All tests (10+ minutes)
# Minimal tests only
Sys.setenv(HETID_TEST_LEVEL = "cran")
devtools::test()
Note: Opening the project in RStudio or starting R in the project directory automatically configures the test environment for fast tests.
Test Organization
The package uses a hierarchical test system:
- CRAN tests: Very fast unit tests (< 1 minute total)
- Fast tests: Quick tests for development (< 2 minutes)
- Integration tests: Multi-component tests (< 5 minutes)
- Comprehensive tests: Full validation including Monte Carlo (10+ minutes)
See the Makefile for all available test commands.
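The hierarchy above can be thought of as an ordered set of levels gated by the `HETID_TEST_LEVEL` environment variable. The sketch below is a hypothetical illustration of that idea, not the package's actual mechanism: the helper `should_run()`, the level names other than "cran", and the default of "fast" are all assumptions.

```r
# Hypothetical level names, ordered from cheapest to most expensive.
# Only "cran" is confirmed by this README; the others are assumed.
test_levels <- c("cran", "fast", "integration", "comprehensive")

# Hypothetical helper: TRUE when the active level includes `required`.
should_run <- function(required,
                       current = Sys.getenv("HETID_TEST_LEVEL", unset = "fast")) {
  stopifnot(required %in% test_levels, current %in% test_levels)
  match(current, test_levels) >= match(required, test_levels)
}

# Example: decide whether an integration-level test would run right now
should_run("integration")

# Inside a testthat file, a check like this could guard slow tests:
# testthat::skip_if_not(should_run("integration"))
```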
References
Klein, R., & Vella, F. (2010). Estimating a class of triangular simultaneous equations models without exclusion restrictions. Journal of Econometrics, 154(2), 154-164. https://doi.org/10.1016/j.jeconom.2009.05.005
Lewbel, A. (2012). Using heteroscedasticity to identify and estimate mismeasured and endogenous regressor models. Journal of Business & Economic Statistics, 30(1), 67-80. https://doi.org/10.1080/07350015.2012.643126
Prono, T. (2014). The role of conditional heteroskedasticity in identifying and estimating linear triangular systems, with applications to asset pricing models that include a mismeasured factor. Journal of Applied Econometrics, 29(5), 800-824. https://doi.org/10.1002/jae.2331
Rigobon, R. (2003). Identification through heteroskedasticity. The Review of Economics and Statistics, 85(4), 777-792. https://doi.org/10.1162/003465303772815727
License
This project is licensed under the MIT License - see the LICENSE file for details.