Conley Spatial HAC standard errors for models with Fixed Effects

When estimating spatial HAC standard errors as discussed in Conley (1999) and Conley (2008), I have usually relied on code by Solomon Hsiang. He and others have made code available that estimates standard errors allowing for spatial correlation along a smooth running variable (distance) as well as for temporal correlation.
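
For orientation, the estimator behind such routines can be sketched as a sandwich formula. The notation is shorthand introduced here for illustration and is not taken from the code: i, j index locations, t, s index periods, d_ij is the distance between locations, \bar d and \bar L are the distance and lag cutoffs, and K_d, K_L are generic kernel weights whose exact shape depends on the options chosen.

```latex
\hat V = (X'X)^{-1}
\Big( \sum_{i,t} \sum_{j,s} w_{it,js}\, x_{it}\, \hat e_{it}\, \hat e_{js}\, x_{js}' \Big)
(X'X)^{-1},
\qquad
w_{it,js} =
\begin{cases}
1 & \text{if } i = j,\ t = s, \\
K_d(d_{ij}/\bar d) & \text{if } t = s,\ i \neq j,\ d_{ij} \le \bar d, \\
K_L(|t-s|/\bar L) & \text{if } i = j,\ 0 < |t-s| \le \bar L, \\
0 & \text{otherwise.}
\end{cases}
```

Only the regressors and the residuals enter the middle term, so the adjustment itself needs nothing beyond those two ingredients.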

The code runs quite smoothly, but when you add more controls, Stata typically becomes extremely slow. The reason is that Sol’s code uses OLS to obtain the residuals, which are then used to adjust the standard errors for spatial and serial correlation. OLS is powerful, but with fixed effects included as dummy variables the matrices grow ever larger, and inverting them and computing the standard errors takes longer and longer.

After all, by including all the regressors in the reg command, you force operations on large matrices. In my US application there are 50 states and 10 years, for a total of 500 state-by-year effects, plus 3000 county fixed effects. Clearly, I do not care about the standard errors of the fixed effects.

The way around this problem is simple: you need to demean the data beforehand and simply pass the residuals to Sol’s code that adjusts the standard errors.
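
The reason this works is the Frisch-Waugh-Lovell result: regressing the demeaned outcome on the demeaned regressor gives the same slope as the regression with all the dummies included. A minimal check on simulated data, using only built-in Stata commands, might look as follows. The panel dimensions, variable names and coefficients are hypothetical and chosen only for illustration.

```stata
* Toy check of the partialling-out logic on simulated data.
* All names and parameter values below are hypothetical.
clear
set seed 12345
set obs 300
* 30 districts, each observed in 10 periods
gen int district = ceil(_n/10)
gen int time     = mod(_n - 1, 10) + 1
gen double logx  = rnormal() + 0.1*district + 0.2*time
gen double logy  = 0.5*logx + 0.3*district - 0.1*time + rnormal()

* (a) Dummy-variable regression: fixed effects enter as regressors
regress logy logx i.time i.district
scalar b_full = _b[logx]

* (b) Partial the fixed effects out of y and x,
*     then regress residual on residual
regress logy i.time i.district
predict double residy, residuals
regress logx i.time i.district
predict double residx, residuals
regress residy residx, noconstant
scalar b_fwl = _b[residx]

display "dummy-variable slope: " b_full "   residual-on-residual slope: " b_fwl
```

The two slopes should agree up to numerical precision, which is why the residuals are stored as double. The conventional standard errors reported in step (b) will not match those in step (a), because the second-stage regression does not count the degrees of freedom used up by the fixed effects.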

I have written an ado file that does this. It piggybacks on the reg2hdfe module, on ols_spatial_HAC of course, and on the TMPDIR extensions.


To get things running, you need to install all three and put them on your Stata ado-path.


The command is called in the same way as ols_spatial_HAC. The difference is that the code first uses reg2hdfe to remove the time variable (the first fixed effect) and the panel variable (the second fixed effect).

reg2hdfespatial logy logx, timevar(time) panelvar(district) lat(y) lon(x) distcutoff(1000) lagcutoff(20)

An alternative is to do the demeaning manually, for example:

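* partial the time and district dummies out of the outcome
reg logy time_* district_*
predict double fitlogy if e(sample)
* residual = actual minus fitted, stored in double precision
gen double residy = logy - fitlogy
* same for the regressor, on the same estimation sample
reg logx time_* district_* if e(sample)
predict double fitx if e(sample)
gen double residx = logx - fitx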

ols_spatial_HAC residy residx, lat(y) lon(x) t(time) p(district) dist(1000) lag(20)

This is still quite slow, as the demeaning step uses plain OLS with the full set of dummies. However, with xtreg and the like it is not so easy to recover the correct residuals.
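
One possible partial shortcut, sketched here under assumptions rather than tested on real data, is to let the built-in areg command absorb one set of fixed effects and keep only the other set as dummies. Its predict option residuals should return the residual net of both. The toy panel and all variable names below are hypothetical.

```stata
* Hypothetical toy panel; names and values are for illustration only.
clear
set seed 2024
set obs 300
gen int district = ceil(_n/10)
gen int time     = mod(_n - 1, 10) + 1
gen double logy  = 0.3*district - 0.1*time + rnormal()

* Absorb the district effects, keep the time effects as dummies
areg logy i.time, absorb(district)
predict double resid_areg, residuals

* Reference: residuals from the full dummy-variable regression
regress logy i.time i.district
predict double resid_ols, residuals

* Largest absolute discrepancy between the two residual series
gen double gap = abs(resid_areg - resid_ols)
summarize gap
```

If the maximum of gap is numerically zero, the areg residuals can stand in for the ones from the full dummy regression. This only removes one dimension of dummies from the matrix, so it helps most when the absorbed dimension is the large one.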