
Main function for estimating Lewbel's (2012) heteroskedasticity-based identification model by the Generalized Method of Moments (GMM).

Usage

lewbel_gmm(
  data,
  system = c("triangular", "simultaneous"),
  y1_var = "Y1",
  y2_var = "Y2",
  x_vars = "Xk",
  z_vars = NULL,
  add_intercept = TRUE,
  gmm_type = c("twoStep", "iterative", "cue"),
  initial_values = NULL,
  vcov = c(.hetid_const("VCOV_HAC"), "iid", "cluster"),
  cluster_var = NULL,
  compute_se = TRUE,
  verbose = FALSE,
  ...
)

Arguments

data

Data frame containing all required variables. Must include the dependent variables and any exogenous regressors specified in the model.

system

Character. Type of system: "triangular" or "simultaneous" (default: "triangular"). Note: for numerical stability, simultaneous systems require strong identification conditions, meaning either many regimes (4+) or large variance differences across regimes.

y1_var

Character. Name of the first dependent variable (default: "Y1").

y2_var

Character. Name of the second dependent variable/endogenous regressor (default: "Y2").

x_vars

Character vector. Names of exogenous variables (default: "Xk").

z_vars

Character vector. Names of heteroskedasticity drivers (default: NULL).

add_intercept

Logical. Whether to add an intercept to the exogenous variables (default: TRUE).

gmm_type

Character. GMM type: "twoStep", "iterative", or "cue" (default: "twoStep").

initial_values

Numeric vector. Initial parameter values (default: NULL, in which case OLS estimates are used as starting values).

vcov

Character. Type of variance-covariance matrix: "HAC", "iid", or "cluster" (default: "HAC").

cluster_var

Character. Variable name for clustering if vcov = "cluster" (default: NULL).

compute_se

Logical. Whether to compute standard errors (default: TRUE). Passed to the gmm() call.

verbose

Logical. Whether to print progress messages (default: FALSE).

...

Additional arguments passed to gmm().

Value

An object of class "gmm" containing estimation results.

Details

This function implements Lewbel's (2012) heteroskedasticity-based identification using the GMM framework. The method exploits heteroskedasticity in the error terms to generate valid instruments for endogenous regressors.
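
To make the idea concrete, here is a minimal base-R sketch of how a Lewbel-style instrument can be built by hand for a triangular system. It is not the package's internal code: the simulated data, the coefficient values, and the variable names (x, z, e2_hat, iv) are assumptions chosen for illustration, and the final step uses a manual two-stage least squares fit rather than GMM.

```r
# Simulated triangular system (illustrative values, not package defaults)
set.seed(42)
n  <- 5000
x  <- rnorm(n)
z  <- x                               # heteroskedasticity driver (here equal to x)
u  <- rnorm(n)                        # shared shock that makes y2 endogenous
e1 <- u + rnorm(n)
e2 <- u + exp(0.5 * z) * rnorm(n)     # Var(e2 | z) increases with z
y2 <- 1.0 - 1.0 * x + e2
y1 <- 0.5 + 1.5 * x - 0.8 * y2 + e1   # true coefficient on y2 is -0.8

# Step 1: residuals from regressing the endogenous variable on the exogenous one
e2_hat <- resid(lm(y2 ~ x))

# Step 2: generated instrument = centered driver times first-stage residual
iv <- (z - mean(z)) * e2_hat

# Step 3: manual 2SLS using the generated instrument
# (point estimate only; standard errors from this second-stage lm are not valid)
y2_fit       <- fitted(lm(y2 ~ x + iv))
gamma_lewbel <- unname(coef(lm(y1 ~ x + y2_fit))["y2_fit"])

# Naive OLS for comparison, which ignores the endogeneity of y2
gamma_ols <- unname(coef(lm(y1 ~ x + y2))["y2"])

print(c(ols = gamma_ols, lewbel = gamma_lewbel))
```

In this simulated design the OLS coefficient is pulled away from -0.8 by the shared shock, while the estimate based on the generated instrument should land much closer to it. A GMM estimator uses the same moment conditions but weights them efficiently and supplies valid standard errors.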

For simultaneous equation systems, identification becomes more challenging. The system requires sufficient variation in heteroskedasticity patterns to distinguish between the bidirectional effects. In practice, this means you need either many distinct heteroskedasticity regimes or very large differences in variance across existing regimes.

References

Lewbel, A. (2012). Using heteroscedasticity to identify and estimate mismeasured and endogenous regressor models. Journal of Business & Economic Statistics, 30(1), 67-80. doi:10.1080/07350015.2012.643126

See also

lewbel_triangular_moments, lewbel_simultaneous_moments for moment condition functions. rigobon_gmm for regime-based heteroskedasticity identification. prono_gmm for GARCH-based heteroskedasticity identification. compare_gmm_2sls for comparing GMM with 2SLS estimates. run_single_lewbel_simulation for 2SLS implementation.

Examples

if (FALSE) { # \dontrun{
# Generate example data
set.seed(123)
n <- 1000
params <- list(
  beta1_0 = 0.5, beta1_1 = 1.5, gamma1 = -0.8,
  beta2_0 = 1.0, beta2_1 = -1.0,
  alpha1 = -0.5, alpha2 = 1.0, delta_het = 1.2
)
data <- generate_lewbel_data(n, params)

# Estimate triangular system
gmm_tri <- lewbel_gmm(data, system = "triangular")
summary(gmm_tri)

# Estimate simultaneous system
gmm_sim <- lewbel_gmm(data, system = "simultaneous")
summary(gmm_sim)

# Compare with 2SLS
tsls_result <- run_single_lewbel_simulation(1, c(params, sample_size = n))
cat("2SLS estimate:", tsls_result$tsls_gamma1, "\n")
cat("GMM estimate:", coef(gmm_tri)["gamma1"], "\n")
} # }