Estimates Lewbel's (2012) heteroskedasticity-based identification model using the Generalized Method of Moments (GMM).
Usage
lewbel_gmm(
data,
system = c("triangular", "simultaneous"),
y1_var = "Y1",
y2_var = "Y2",
x_vars = "Xk",
z_vars = NULL,
add_intercept = TRUE,
gmm_type = c("twoStep", "iterative", "cue"),
initial_values = NULL,
vcov = c(.hetid_const("VCOV_HAC"), "iid", "cluster"),
cluster_var = NULL,
compute_se = TRUE,
verbose = FALSE,
...
)
Arguments
- data
Data frame containing all required variables. Must include the dependent variables and any exogenous regressors specified in the model.
- system
Character. Type of system: "triangular" or "simultaneous" (default: "triangular"). Note: for numerical stability, simultaneous systems require strong identification conditions: either many regimes (4+) or large variance differences across regimes.
- y1_var
Character. Name of the first dependent variable (default: "Y1").
- y2_var
Character. Name of the second dependent variable/endogenous regressor (default: "Y2").
- x_vars
Character vector. Names of exogenous variables (default: "Xk").
- z_vars
Character vector. Names of heteroskedasticity drivers (default: NULL).
- add_intercept
Logical. Whether to add an intercept to the exogenous variables (default: TRUE).
- gmm_type
Character. GMM type: "twoStep", "iterative", or "cue" (default: "twoStep").
- initial_values
Numeric vector. Initial parameter values (default: NULL, in which case OLS estimates are used as starting values).
- vcov
Character. Type of variance-covariance matrix: "HAC", "iid", or "cluster" (default: "HAC").
- cluster_var
Character. Variable name for clustering if vcov = "cluster" (default: NULL).
- compute_se
Logical. Whether to compute standard errors (default: TRUE). Passed to the gmm() call.
- verbose
Logical. Whether to print progress messages (default: FALSE).
- ...
Additional arguments passed to gmm().
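The Usage block lists system and gmm_type with a vector of choices as their default. A minimal sketch of how such defaults typically resolve, assuming the function uses match.arg() (the signature suggests this but does not confirm it). The helper resolve_choices() is hypothetical and mimics only the argument handling, not the estimator.

```r
# Hypothetical stand-in: reproduces only the choice-vector defaults
# shown in Usage, assuming they are resolved with match.arg().
resolve_choices <- function(system = c("triangular", "simultaneous"),
                            gmm_type = c("twoStep", "iterative", "cue")) {
  system <- match.arg(system)      # no value supplied -> first choice
  gmm_type <- match.arg(gmm_type)  # an invalid value raises an error
  list(system = system, gmm_type = gmm_type)
}

# Defaults resolve to the first element of each choice vector.
resolved_default <- resolve_choices()

# Explicit values must match one of the listed choices.
resolved_custom <- resolve_choices(system = "simultaneous", gmm_type = "cue")

str(resolved_default)
str(resolved_custom)
```

Under that assumption, omitting an argument is equivalent to passing the first listed choice, which agrees with the defaults documented above.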
Details
This function implements Lewbel's (2012) heteroskedasticity-based identification using the GMM framework. The method exploits heteroskedasticity in the error terms to generate valid instruments for endogenous regressors.
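To make the instrument-generation idea concrete, here is a minimal base-R sketch for a triangular system. It is an illustration under stated assumptions, not the package's implementation: the data-generating process, the variable names, and the use of hand-rolled two-stage least squares (rather than GMM) are all choices made for this example.

```r
# Simulated triangular system (all names and values here are illustrative).
set.seed(42)
n  <- 5000
x  <- rnorm(n)
u  <- rnorm(n)                       # shared shock: makes y2 endogenous in the y1 equation
e1 <- u + rnorm(n)
e2 <- u + exp(0.5 * x) * rnorm(n)    # error variance depends on x (heteroskedasticity)
y2 <- 1.0 - 1.0 * x + e2
y1 <- 0.5 + 1.5 * x - 0.8 * y2 + e1  # true coefficient on y2 is -0.8

# Step 1: residuals from regressing the endogenous variable on the exogenous one.
e2_hat <- resid(lm(y2 ~ x))

# Step 2: generated instrument = centered heteroskedasticity driver times residual.
z  <- x - mean(x)
iv <- z * e2_hat

# Step 3: two-stage least squares by hand, using iv as the excluded instrument.
y2_fit    <- fitted(lm(y2 ~ x + iv))
gamma_iv  <- unname(coef(lm(y1 ~ x + y2_fit))["y2_fit"])
gamma_ols <- unname(coef(lm(y1 ~ x + y2))["y2"])

cat("OLS estimate:      ", gamma_ols, "\n")
cat("Lewbel-IV estimate:", gamma_iv, "\n")
```

The point estimate from the manual second stage is the 2SLS estimate, but the standard errors reported by that lm() fit are not valid. Obtaining correct inference is one reason to use a GMM routine instead.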
For simultaneous equation systems, identification becomes more challenging. The system requires sufficient variation in heteroskedasticity patterns to distinguish between the bidirectional effects. In practice, this means you need either many distinct heteroskedasticity regimes or very large differences in variance across existing regimes.
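A short counting argument shows why the simultaneous case is more demanding. The notation below follows Lewbel (2012) and is an assumption made for illustration; it is not necessarily the parameterization used internally by lewbel_gmm. Let X have dimension k (including the intercept), let Z have dimension q, and let \mu = E[Z]:

```latex
\begin{aligned}
Y_1 &= X^\top \beta_1 + \gamma_1 Y_2 + \varepsilon_1, \\
Y_2 &= X^\top \beta_2 + \gamma_2 Y_1 + \varepsilon_2
      \qquad (\gamma_2 = 0 \text{ in the triangular case}), \\[4pt]
&E[X \varepsilon_1] = 0, \quad
 E[X \varepsilon_2] = 0, \quad
 E[Z - \mu] = 0, \quad
 E[(Z - \mu)\,\varepsilon_1 \varepsilon_2] = 0 .
\end{aligned}
```

These are 2k + 2q moment conditions. The triangular system has 2k + 1 + q unknowns, so the order condition requires q >= 1. The simultaneous system adds \gamma_2, giving 2k + 2 + q unknowns, so it requires q >= 2. The order condition is only necessary: Z must also covary with the squared errors (in the triangular case, Cov(Z, \varepsilon_2^2) \neq 0), which is where the need for pronounced variance differences enters.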
References
Lewbel, A. (2012). Using heteroscedasticity to identify and estimate mismeasured and endogenous regressor models. Journal of Business & Economic Statistics, 30(1), 67-80. doi:10.1080/07350015.2012.643126
See also
lewbel_triangular_moments, lewbel_simultaneous_moments for moment condition functions.
rigobon_gmm for regime-based heteroskedasticity identification.
prono_gmm for GARCH-based heteroskedasticity identification.
compare_gmm_2sls for comparing GMM with 2SLS estimates.
run_single_lewbel_simulation for 2SLS implementation.
Examples
if (FALSE) { # \dontrun{
# Generate example data
set.seed(123)
n <- 1000
params <- list(
beta1_0 = 0.5, beta1_1 = 1.5, gamma1 = -0.8,
beta2_0 = 1.0, beta2_1 = -1.0,
alpha1 = -0.5, alpha2 = 1.0, delta_het = 1.2
)
data <- generate_lewbel_data(n, params)
# Estimate triangular system
gmm_tri <- lewbel_gmm(data, system = "triangular")
summary(gmm_tri)
# Estimate simultaneous system
gmm_sim <- lewbel_gmm(data, system = "simultaneous")
summary(gmm_sim)
# Compare with 2SLS
tsls_result <- run_single_lewbel_simulation(1, c(params, sample_size = n))
cat("2SLS estimate:", tsls_result$tsls_gamma1, "\n")
cat("GMM estimate:", coef(gmm_tri)["gamma1"], "\n")
} # }