Estimating standard errors in panel data with Python and linearmodels

Python
Econometrics
Linearmodels
Author

Vincent Grégoire, PhD, CFA

Published

January 21, 2024

In this post, I show how to estimate standard errors in panel data with Python and the linearmodels library.

More specifically, I show how to estimate the following class of panel data models with several standard error estimators:

\[ y_{it} = \alpha_i + \alpha_t + x_{it}\beta + \epsilon_{it}. \]

If you just want the code examples with no explanations, jump to the cheat sheet at the end of the post.

For time series models, see my post on estimating standard errors in time series data with Python and statsmodels.

Video tutorial

This post is also available as a video tutorial on YouTube.

Panel data

One of the most common tasks in finance research is to estimate standard errors in panel data. Linear panel data models have the following form:

\[ y_{it} = \alpha_i + \alpha_t + x_{it}\beta + \epsilon_{it}, \]

where \(y_{it}\) is the dependent variable for entity \(i\in [1,I]\) at time \(t\in [1,T]\), \(\alpha_i\) is an entity fixed effect, \(\alpha_t\) is a time fixed effect, \(x_{it}\) is a vector of independent variables, \(\beta\) is a vector of coefficients, and \(\epsilon_{it}\) is the error term. For the sake of simplicity, in the rest of this post, I assume that the entity is a firm and the time is a year. The error term can be correlated across firms and years and can be heteroskedastic, meaning that the variance of the error term is not constant.

When we take the model to the data, we write it in matrix form:

\[ \mathbf{y} = \mathbf{X}\beta + \mathbf{\epsilon}, \]

where \(\mathbf{y}\) is a \(N \times 1\) vector of dependent variables, \(\mathbf{X}\) is a \(N \times K\) matrix of independent variables, \(\beta\) is a \(K \times 1\) vector of coefficients, and \(\mathbf{\epsilon}\) is a \(N \times 1\) vector of error terms. The number of observations is \(N\) and the number of independent variables is \(K\). For a balanced panel in which each firm is observed for the same number of years, \(N=I \times T\).

If the model includes fixed effects, we assume that the vector \(\mathbf{y}\) and the matrix \(\mathbf{X}\) are demeaned by the relevant fixed effect.

It is then common to stack the observations for each entity. For example, if we have \(I\) firms and \(T\) years, the vector \(\mathbf{y}\) of elements \(y_{i,t}\) is defined as:

\[ \mathbf{y} = \left[\begin{array}{c}y_{1,1}\\ y_{1,2}\\ \vdots \\ y_{1, T} \\ y_{2,1}\\ y_{2,2}\\ \vdots \\ y_{2, T}\\ \vdots \\ y_{I,1}\\ \vdots \\ y_{I, T} \end{array} \right]. \]

The matrix \(\mathbf{X}\) formed with vectors \(x_{i,t}\) is defined as:

\[ \mathbf{X} = \left[\begin{array}{ccccc} x_{1, 1}^{(1)} & x_{1, 1}^{(2)} & \cdots & x_{1, 1}^{(K-1)} & x_{1,1}^{(K)} \\ x_{1, 2}^{(1)} & x_{1, 2}^{(2)} & \cdots & x_{1, 2}^{(K-1)} & x_{1,2}^{(K)} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ x_{1, T}^{(1)} & x_{1, T}^{(2)} & \cdots & x_{1, T}^{(K-1)} & x_{1,T}^{(K)} \\ x_{2, 1}^{(1)} & x_{2, 1}^{(2)} & \cdots & x_{2, 1}^{(K-1)} & x_{2,1}^{(K)} \\ x_{2, 2}^{(1)} & x_{2, 2}^{(2)} & \cdots & x_{2, 2}^{(K-1)} & x_{2,2}^{(K)} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ x_{2, T}^{(1)} & x_{2, T}^{(2)} & \cdots & x_{2, T}^{(K-1)} & x_{2,T}^{(K)} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ x_{I, 1}^{(1)} & x_{I, 1}^{(2)} & \cdots & x_{I, 1}^{(K-1)} & x_{I,1}^{(K)} \\ x_{I, 2}^{(1)} & x_{I, 2}^{(2)} & \cdots & x_{I, 2}^{(K-1)} & x_{I,2}^{(K)} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ x_{I, T}^{(1)} & x_{I, T}^{(2)} & \cdots & x_{I, T}^{(K-1)} & x_{I,T}^{(K)} \end{array} \right], \]

where \(x_{i,t}^{(j)}\) is the \(j\)-th element of the vector \(x_{i,t}\).

Finally, the \(\beta\) vector is defined as usual:

\[ \beta = \left[\begin{array}{c}\beta_1 \\ \beta_2 \\ \vdots \\ \beta_{K-1} \\ \beta_{K} \end{array} \right]. \]

The goal is then to estimate the coefficients \(\beta\) and their associated standard errors. The standard errors are important because they allow us to perform statistical inference. For example, we can test whether the coefficients are statistically different from zero.
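
For instance, given an estimated coefficient and its standard error, a t-statistic and a two-sided p-value can be computed as follows (a toy illustration with hypothetical numbers, not results from the data used below):

import numpy as np
from scipy import stats

# Toy illustration with hypothetical numbers: test H0: beta = 0 given an
# estimated coefficient, its standard error, and the residual degrees of freedom.
beta_hat, se, dof = 1.03, 0.05, 4998
t_stat = beta_hat / se
p_value = 2 * stats.t.sf(np.abs(t_stat), df=dof)
print(f"t-stat: {t_stat:.2f}, p-value: {p_value:.4f}")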

In his seminal paper, Petersen (2009) compares the performance of different standard error estimators in panel data. He also provides programming advice on how to estimate standard errors in panel data with Stata and other languages, but not with Python.

In this post, I explain how to estimate standard errors in panel data with Python and the linearmodels library. I do so by replicating the sample results from Petersen’s test data page in Python, and I cover a few more standard error estimators that are useful in finance research.

I use the test data set provided by Mitchell Petersen and available here. The data set contains 500 firms and 10 years.

import pandas as pd

df = pd.read_table(
    "http://www.kellogg.northwestern.edu/faculty/petersen/htm/papers/se/test_data.txt",
    names=["firmid", "year", "x", "y"],
    sep=r"\s+",
)
df
firmid year x y
0 1 1 -1.113973 2.251535
1 1 2 -0.080854 1.242346
2 1 3 -0.237607 -1.426376
3 1 4 -0.152486 -1.109394
4 1 5 -0.001426 0.914686
... ... ... ... ...
4995 500 6 -0.077057 3.720502
4996 500 7 0.218847 0.559121
4997 500 8 -0.155530 -3.766785
4998 500 9 -0.040172 0.903354
4999 500 10 -0.001172 -0.529761

5000 rows × 4 columns

linearmodels

The linearmodels library is a Python package written by Kevin Sheppard at the University of Oxford that extends the statsmodels library with functions commonly used in financial econometrics. Of note, the documentation is quite good and the library is actively maintained.

Installation

You can install the library with your package manager of choice:

poetry:

poetry add linearmodels

pip:

pip install linearmodels

conda (linearmodels is not available in the default conda channel, but it is available in the conda-forge channel):

conda install -c conda-forge linearmodels

Usage

The library is easy to use, and intuitive for anyone familiar with statsmodels. In this post, I will use two model types: PanelOLS and FamaMacBeth. As their names imply, the PanelOLS class is used to estimate panel data models, and the FamaMacBeth class is used to estimate Fama-MacBeth regressions. The library also provides implementations for random effects and first differences models, and for between estimation and pooled OLS estimators. In addition to panel data models, the library also provides implementations for instrumental variables models (including 2SLS, 3SLS, GMM, and SUR) and for estimating linear factor models.

A key particularity of linearmodels is that it uses the indexing capabilities of pandas to identify the panel data structure. In particular, the DataFrame that contains your panel data must have a MultiIndex with the entity dimension in the first level and the time dimension in the second level:

df = df.set_index(["firmid", "year"])
df
x y
firmid year
1 1 -1.113973 2.251535
2 -0.080854 1.242346
3 -0.237607 -1.426376
4 -0.152486 -1.109394
5 -0.001426 0.914686
... ... ... ...
500 6 -0.077057 3.720502
7 0.218847 0.559121
8 -0.155530 -3.766785
9 -0.040172 0.903354
10 -0.001172 -0.529761

5000 rows × 2 columns

You define models by passing the dependent variable and the exogenous variables to the model class. Alternatively, following statsmodels, linearmodels leverages the patsy formula language to provide a more intuitive way to define models. For example, the following two lines of code are equivalent:

mod = PanelOLS(data["y"], data[["x1", "x2"]])
mod = PanelOLS.from_formula(formula="y ~ x1 + x2", data)

You then estimate models by calling the fit() method. The fit() method returns a linearmodels regression results object, which contains the estimated coefficients, standard errors, and other statistics. The linearmodels results object is similar to the statsmodels results object, but it contains additional methods and attributes that are useful for panel data models.
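
For example, the most commonly used attributes are shown below (a minimal sketch, where res stands for any of the fitted results objects in the examples that follow):

# Minimal sketch: common attributes of a fitted linearmodels results object.
print(res.params)       # estimated coefficients
print(res.std_errors)   # standard errors
print(res.tstats)       # t-statistics
print(res.pvalues)      # p-values
print(res.cov)          # estimated covariance matrix of the coefficients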

OLS coefficients and standard errors

The first model I estimate is a vanilla OLS model.

The OLS estimator is the most common estimator in finance research. It is also the most efficient estimator when the errors are homoskedastic and uncorrelated across entities and time.

The OLS estimator is defined as:

\[ \hat{\beta}_{OLS} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}, \]

where \(\mathbf{X}\) is the matrix of exogenous variables and \(\mathbf{y}\) is the vector of dependent variables. The covariance matrix \(\Sigma\) of the estimator is defined as:

\[ \Sigma_{OLS} =\sigma_{\mathbf{\epsilon}}^{2}\cdot\left(\mathbf{X}^{\prime}\mathbf{X}\right)^{-1}, \]

where \(\sigma_{\mathbf{\epsilon}}^{2}\) is usually estimated with the residuals \(\mathbf{\hat{e}}=( \mathbf{y}-\mathbf{X}\hat{\beta}_{OLS})\) from the OLS regression:

\[ \hat{\sigma}_{\mathbf{\epsilon}}^{2} =\frac{1}{N-K}\mathbf{\hat{e}}'\mathbf{\hat{e}}. \]

The estimate of the covariance matrix is then:

\[ \hat{\Sigma}_{OLS} =\frac{1}{N-K}\mathbf{\hat{e}}'\mathbf{\hat{e}}\cdot\left(\mathbf{X}^{\prime}\mathbf{X}\right)^{-1}, \]

\[ \hat{\Sigma}_{OLS} =\left[\begin{array}{ccccc} \hat{\sigma}_{\beta_1}^2 & \hat{\sigma}_{\beta_1,\beta_2} & \cdots & \hat{\sigma}_{\beta_1,\beta_{K-1}} & \hat{\sigma}_{\beta_1,\beta_K} \\ \hat{\sigma}_{\beta_2,\beta_1} & \hat{\sigma}_{\beta_2}^2 & \cdots & \hat{\sigma}_{\beta_2,\beta_{K-1}} & \hat{\sigma}_{\beta_2,\beta_K} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ \hat{\sigma}_{\beta_{K-1},\beta_1} & \hat{\sigma}_{\beta_{K-1},\beta_2} & \cdots & \hat{\sigma}_{\beta_{K-1}}^2 & \hat{\sigma}_{\beta_{K-1},\beta_K} \\ \hat{\sigma}_{\beta_K,\beta_1} & \hat{\sigma}_{\beta_K,\beta_2} & \cdots & \hat{\sigma}_{\beta_K,\beta_{K-1}} & \hat{\sigma}_{\beta_K}^2 \end{array} \right]. \]
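
As a sanity check, these formulas are easy to compute by hand with numpy. The following sketch uses the df defined above (with a constant column added manually) and should closely match the unadjusted PanelOLS output below:

import numpy as np

# Minimal sketch: OLS coefficients and classic standard errors computed by hand.
X = np.column_stack([np.ones(len(df)), df["x"].to_numpy()])  # add a constant
y = np.asarray(df["y"])
N, K = X.shape

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y               # OLS coefficients
resid = y - X @ beta_hat                   # residuals
sigma2_hat = resid @ resid / (N - K)       # residual variance estimate
cov_ols = sigma2_hat * XtX_inv             # classic covariance matrix
se_ols = np.sqrt(np.diag(cov_ols))         # standard errors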

The PanelOLS class can be used to estimate OLS models. The following code estimates the OLS coefficients and standard errors:

from linearmodels import PanelOLS

mod = PanelOLS.from_formula("y ~ 1 + x", df)

res = mod.fit()
res.summary
PanelOLS Estimation Summary
Dep. Variable: y R-squared: 0.2078
Estimator: PanelOLS R-squared (Between): 0.2208
No. Observations: 5000 R-squared (Within): 0.1907
Date: Fri, Jan 19 2024 R-squared (Overall): 0.2078
Time: 08:23:25 Log-likelihood -1.057e+04
Cov. Estimator: Unadjusted
F-statistic: 1310.7
Entities: 500 P-value 0.0000
Avg Obs: 10.0000 Distribution: F(1,4998)
Min Obs: 10.0000
Max Obs: 10.0000 F-statistic (robust): 1310.7
P-value 0.0000
Time periods: 10 Distribution: F(1,4998)
Avg Obs: 500.00
Min Obs: 500.00
Max Obs: 500.00
Parameter Estimates
Parameter Std. Err. T-stat P-value Lower CI Upper CI
Intercept 0.0297 0.0284 1.0466 0.2954 -0.0259 0.0853
x 1.0348 0.0286 36.204 0.0000 0.9788 1.0909


Note that the intercept is excluded by default from panel data models because fixed effects usually render it redundant, so we need to add it manually by including 1 in the formula.

White standard errors

The OLS standard errors are consistent when the errors are homoskedastic and uncorrelated across entities and time; under these assumptions the OLS estimator is also unbiased and efficient. However, this is not usually the case with financial data, where the errors are often heteroskedastic and correlated across entities and/or time. The coefficient estimates remain consistent as long as the regressors are exogenous, but the usual OLS standard errors are incorrect, which invalidates inference.

A common solution to this problem is to use heteroskedasticity-robust standard errors. The most common heteroskedasticity-robust estimator is the White estimator, which is defined as

\[ \hat{\Sigma}_{White} = (\mathbf{X}' \mathbf{X})^{-1}\cdot\left[\mathbf{X}' \cdot \hat{\Omega}_0 \cdot \mathbf{X}\right] \cdot\left(\mathbf{X}' \mathbf{X}\right)^{-1}, \]

where \(\hat{\Omega}_0\) is a diagonal matrix with the squared residuals \(\mathbf{\hat{e}}^{2}\) on the diagonal, i.e., with the square of the j-th residual \(\hat{e}_{j}^{2}\) at position \((j,j)\) and \(0\) elsewhere. The matrix \(\hat{\Omega}_0\) is defined as:

\[ \hat{\Omega}_0= \left[\begin{array}{cccccccc} \hat{e}_{1, 1}^{2} & 0 & \cdots & 0 & 0& \cdots & 0 & 0 \\ 0 & \hat{e}_{1, 2}^{2} & \cdots & 0 & 0 & \cdots & 0 & 0\\ \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0& \cdots &\hat{e}_{2, 1}^{2} & 0 & \cdots & 0 & 0 \\ 0 & 0 & \cdots & 0 & \hat{e}_{2, 2}^{2} & \cdots & 0 & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & \ddots & \vdots & \vdots\\ 0 & 0 & \cdots & 0 & 0 & \cdots & \hat{e}_{I,T-1}^{2} & 0 \\ 0 & 0 & \cdots & 0 & 0 & \cdots & 0 & \hat{e}_{I,T}^{2} \end{array} \right]. \]
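
Continuing the numpy sketch from above, the White covariance matrix can be computed as follows. This illustrates the formula rather than the library's internal code, and it assumes that linearmodels applies a small-sample \(N/(N-K)\) scaling by default (debiased=True) when comparing with the robust output below:

# Minimal sketch of the White sandwich estimator, reusing X, resid, XtX_inv,
# N, and K from the numpy sketch above.
meat = X.T @ (X * resid[:, None] ** 2)     # X' diag(e^2) X, without building
                                           # the full diagonal matrix
cov_white = XtX_inv @ meat @ XtX_inv       # sandwich formula
# Assumption: linearmodels scales by N / (N - K) by default (debiased=True).
se_white = np.sqrt(np.diag(cov_white * N / (N - K)))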

The PanelOLS class can be used to estimate OLS models with White standard errors. The following code estimates the OLS coefficients and White standard errors:

res = mod.fit(cov_type="robust")
res.summary
PanelOLS Estimation Summary
Dep. Variable: y R-squared: 0.2078
Estimator: PanelOLS R-squared (Between): 0.2208
No. Observations: 5000 R-squared (Within): 0.1907
Date: Fri, Jan 19 2024 R-squared (Overall): 0.2078
Time: 08:23:25 Log-likelihood -1.057e+04
Cov. Estimator: Robust
F-statistic: 1310.7
Entities: 500 P-value 0.0000
Avg Obs: 10.0000 Distribution: F(1,4998)
Min Obs: 10.0000
Max Obs: 10.0000 F-statistic (robust): 1328.2
P-value 0.0000
Time periods: 10 Distribution: F(1,4998)
Avg Obs: 500.00
Min Obs: 500.00
Max Obs: 500.00
Parameter Estimates
Parameter Std. Err. T-stat P-value Lower CI Upper CI
Intercept 0.0297 0.0284 1.0465 0.2954 -0.0259 0.0853
x 1.0348 0.0284 36.444 0.0000 0.9792 1.0905


The estimated coefficients are the same as in the previous OLS regression, but the standard errors are different. In turn, all statistics that depend on the standard errors, such as t-statistics and p-values, change as well.

Clustered standard errors

Beyond heteroskedasticity, a common concern with panel models is that the errors are correlated across entities and/or time. In that case, both the usual OLS standard errors and the White standard errors are incorrect, which invalidates inference. Clustering standard errors is a widely used adjustment for this problem.

The intuition behind clustering is that the errors are often correlated within groups, but not across groups. For example, in a panel data set of firms, the errors could be correlated within the observations of a firm, but not across firms. In this case, we can cluster the standard errors by firm to obtain consistent standard errors. Implicitly, we assume that the errors have the following structure, where rows and columns are ordered by firm and year (three firms with three years each in this illustration):1

\[ \left[\begin{array}{ccc|ccc|ccc} \epsilon_{11}^2 & \epsilon_{11}\epsilon_{12} & \epsilon_{11}\epsilon_{13} & 0 & 0 & 0 & 0 & 0 & 0 \\ \epsilon_{12}\epsilon_{11} & \epsilon_{12}^2 & \epsilon_{12}\epsilon_{13} & 0 & 0 & 0 & 0 & 0 & 0 \\ \epsilon_{13}\epsilon_{11} & \epsilon_{13}\epsilon_{12} & \epsilon_{13}^2 & 0 & 0 & 0 & 0 & 0 & 0 \\ \hline 0 & 0 & 0 & \epsilon_{21}^2 & \epsilon_{21}\epsilon_{22} & \epsilon_{21}\epsilon_{23} & 0 & 0 & 0 \\ 0 & 0 & 0 & \epsilon_{22}\epsilon_{21} & \epsilon_{22}^2 & \epsilon_{22}\epsilon_{23} & 0 & 0 & 0 \\ 0 & 0 & 0 & \epsilon_{23}\epsilon_{21} & \epsilon_{23}\epsilon_{22} & \epsilon_{23}^2 & 0 & 0 & 0 \\ \hline 0 & 0 & 0 & 0 & 0 & 0 & \epsilon_{31}^2 & \epsilon_{31}\epsilon_{32} & \epsilon_{31}\epsilon_{33} \\ 0 & 0 & 0 & 0 & 0 & 0 & \epsilon_{32}\epsilon_{31} & \epsilon_{32}^2 & \epsilon_{32}\epsilon_{33} \\ 0 & 0 & 0 & 0 & 0 & 0 & \epsilon_{33}\epsilon_{31} & \epsilon_{33}\epsilon_{32} & \epsilon_{33}^2 \end{array}\right] \]

If you assume that the errors are correlated within the observations of a firm, and across firms within each time period, you can use standard errors clustered by firm and year. In this case, the errors have the following structure:

\[ \left[\begin{array}{ccc|ccc|ccc} \epsilon_{11}^2 & \epsilon_{11}\epsilon_{12} & \epsilon_{11}\epsilon_{13} & \epsilon_{11}\epsilon_{21} & 0 & 0 & \epsilon_{11}\epsilon_{31} & 0 & 0 \\ \epsilon_{12}\epsilon_{11} & \epsilon_{12}^2 & \epsilon_{12}\epsilon_{13} & 0 & \epsilon_{12}\epsilon_{22} & 0 & 0 & \epsilon_{12}\epsilon_{32} & 0 \\ \epsilon_{13}\epsilon_{11} & \epsilon_{13}\epsilon_{12} & \epsilon_{13}^2 & 0 & 0 & \epsilon_{13}\epsilon_{23} & 0 & 0 & \epsilon_{13}\epsilon_{33} \\ \hline \epsilon_{21}\epsilon_{11} & 0 & 0 & \epsilon_{21}^2 & \epsilon_{21}\epsilon_{22} & \epsilon_{21}\epsilon_{23} & \epsilon_{21}\epsilon_{31} & 0 & 0 \\ 0 & \epsilon_{22}\epsilon_{12} & 0 & \epsilon_{22}\epsilon_{21} & \epsilon_{22}^2 & \epsilon_{22}\epsilon_{23} & 0 & \epsilon_{22}\epsilon_{32} & 0 \\ 0 & 0 & \epsilon_{23}\epsilon_{13} & \epsilon_{23}\epsilon_{21} & \epsilon_{23}\epsilon_{22} & \epsilon_{23}^2 & 0 & 0 & \epsilon_{23}\epsilon_{33} \\ \hline \epsilon_{31}\epsilon_{11} & 0 & 0 & \epsilon_{31}\epsilon_{21} & 0 & 0 & \epsilon_{31}^2 & \epsilon_{31}\epsilon_{32} & \epsilon_{31}\epsilon_{33} \\ 0 & \epsilon_{32}\epsilon_{12} & 0 & 0 & \epsilon_{32}\epsilon_{22} & 0 & \epsilon_{32}\epsilon_{31} & \epsilon_{32}^2 & \epsilon_{32}\epsilon_{33} \\ 0 & 0 & \epsilon_{33}\epsilon_{13} & 0 & 0 & \epsilon_{33}\epsilon_{23} & \epsilon_{33}\epsilon_{31} & \epsilon_{33}\epsilon_{32} & \epsilon_{33}^2 \end{array}\right] \]
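
For intuition, here is a sketch of the textbook one-way (firm) clustered sandwich estimator, reusing the numpy objects from the sketches above. It ignores the small-sample adjustments that linearmodels and Stata apply, so the numbers will be close to, but not identical to, the clustered output below:

import numpy as np

# Minimal sketch of firm-clustered standard errors (one-way clustering).
firm_ids = df.index.get_level_values("firmid")
meat = np.zeros((X.shape[1], X.shape[1]))
for firm in np.unique(firm_ids):
    in_firm = np.asarray(firm_ids == firm)
    score = X[in_firm].T @ resid[in_firm]   # sum of x_it * e_it within the firm
    meat += np.outer(score, score)
cov_cluster = XtX_inv @ meat @ XtX_inv      # sandwich with firm-level "meat"
se_cluster = np.sqrt(np.diag(cov_cluster))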

You can easily estimate clustered standard errors with the PanelOLS class by passing cov_type="clustered" together with the cluster_entity and/or cluster_time arguments to the fit() method. The following code estimates the OLS coefficients with standard errors clustered by firm, by year, and by both firm and year:

# Firm only
res = mod.fit(
    cov_type="clustered", cluster_entity=True, cluster_time=False, group_debias=True
)
res.summary
PanelOLS Estimation Summary
Dep. Variable: y R-squared: 0.2078
Estimator: PanelOLS R-squared (Between): 0.2208
No. Observations: 5000 R-squared (Within): 0.1907
Date: Fri, Jan 19 2024 R-squared (Overall): 0.2078
Time: 08:23:25 Log-likelihood -1.057e+04
Cov. Estimator: Clustered
F-statistic: 1310.7
Entities: 500 P-value 0.0000
Avg Obs: 10.0000 Distribution: F(1,4998)
Min Obs: 10.0000
Max Obs: 10.0000 F-statistic (robust): 418.32
P-value 0.0000
Time periods: 10 Distribution: F(1,4998)
Avg Obs: 500.00
Min Obs: 500.00
Max Obs: 500.00
Parameter Estimates
Parameter Std. Err. T-stat P-value Lower CI Upper CI
Intercept 0.0297 0.0670 0.4429 0.6579 -0.1017 0.1611
x 1.0348 0.0506 20.453 0.0000 0.9356 1.1340


# Year only
res = mod.fit(
    cov_type="clustered", cluster_entity=False, cluster_time=True, group_debias=True
)
res.summary
PanelOLS Estimation Summary
Dep. Variable: y R-squared: 0.2078
Estimator: PanelOLS R-squared (Between): 0.2208
No. Observations: 5000 R-squared (Within): 0.1907
Date: Fri, Jan 19 2024 R-squared (Overall): 0.2078
Time: 08:23:25 Log-likelihood -1.057e+04
Cov. Estimator: Clustered
F-statistic: 1310.7
Entities: 500 P-value 0.0000
Avg Obs: 10.0000 Distribution: F(1,4998)
Min Obs: 10.0000
Max Obs: 10.0000 F-statistic (robust): 960.59
P-value 0.0000
Time periods: 10 Distribution: F(1,4998)
Avg Obs: 500.00
Min Obs: 500.00
Max Obs: 500.00
Parameter Estimates
Parameter Std. Err. T-stat P-value Lower CI Upper CI
Intercept 0.0297 0.0234 1.2691 0.2045 -0.0162 0.0755
x 1.0348 0.0334 30.993 0.0000 0.9694 1.1003


# Firm and year
res = mod.fit(
    cov_type="clustered", cluster_entity=True, cluster_time=True, group_debias=True
)
res.summary
PanelOLS Estimation Summary
Dep. Variable: y R-squared: 0.2078
Estimator: PanelOLS R-squared (Between): 0.2208
No. Observations: 5000 R-squared (Within): 0.1907
Date: Fri, Jan 19 2024 R-squared (Overall): 0.2078
Time: 08:23:25 Log-likelihood -1.057e+04
Cov. Estimator: Clustered
F-statistic: 1310.7
Entities: 500 P-value 0.0000
Avg Obs: 10.0000 Distribution: F(1,4998)
Min Obs: 10.0000
Max Obs: 10.0000 F-statistic (robust): 373.33
P-value 0.0000
Time periods: 10 Distribution: F(1,4998)
Avg Obs: 500.00
Min Obs: 500.00
Max Obs: 500.00
Parameter Estimates
Parameter Std. Err. T-stat P-value Lower CI Upper CI
Intercept 0.0297 0.0651 0.4562 0.6483 -0.0979 0.1572
x 1.0348 0.0536 19.322 0.0000 0.9298 1.1398


Difference with Stata and statsmodels

By default, Stata and statsmodels apply a small-sample adjustment based on the number of clusters when estimating clustered standard errors. linearmodels does not do so by default; that is what the group_debias parameter is for. In this case the adjustment makes a noticeable difference because our time dimension is small.

Note that there is a corresponding debiased parameter for the OLS and White estimators, but it is set to True by default, so we do not need to set it explicitly.
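
To see the impact of the adjustment on this data set, you can compare the standard errors with and without it (a quick check, not something you would normally report):

# Ratio of clustered (by year) standard errors with and without the
# small-sample group adjustment; the gap is larger with only 10 time clusters.
res_adj = mod.fit(cov_type="clustered", cluster_time=True, group_debias=True)
res_raw = mod.fit(cov_type="clustered", cluster_time=True, group_debias=False)
print(res_adj.std_errors / res_raw.std_errors)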

Panel regression with fixed effects

It is common in finance research to control for entity fixed effects to capture unobserved heterogeneity across entities and for time fixed effects to capture unobserved heterogeneity across time. For example, in a panel of firms, we can control for firm fixed effects to capture unobserved firm-specific characteristics that are constant over time. These fixed effects correspond to the \(\alpha_i\) and \(\alpha_t\) in our panel regression:

\[ y_{it} = \alpha_i + \alpha_t + x_{it}\beta + \epsilon_{it}, \]

The PanelOLS class can be used to estimate panel regressions with fixed effects by adding EntityEffects or TimeEffects to the formula. In theory, this is equivalent to adding a dummy variable for each entity or time period (i.e., adding C(firmid) or C(year)), but in practice it is much more efficient because linearmodels applies the effect of the dummy variables through a within transformation rather than including them in the regression. linearmodels will therefore not report coefficients for the fixed effects. Also, the intercept is excluded by default because the fixed effects render it redundant, so linearmodels will not report an intercept either.
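
To build intuition for the within transformation, you can demean the variables by firm yourself and check that a pooled regression on the demeaned data recovers the same slope as the EntityEffects specification below (a sketch for illustration; the reported standard errors will differ slightly because the degrees-of-freedom corrections are not the same):

# Sketch: entity fixed effects are equivalent to demeaning y and x by firm.
df_within = df[["y", "x"]] - df[["y", "x"]].groupby(level="firmid").transform("mean")
mod_within = PanelOLS.from_formula("y ~ x", df_within)
print(mod_within.fit().params)  # the slope matches the EntityEffects model below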

The following code estimates the OLS coefficients and standard errors with firm fixed effects, year fixed effects, and firm and year fixed effects, and clustered standard errors along the same dimensions:

# Firm only
mod_ffe = PanelOLS.from_formula("y ~ x + EntityEffects", df)
res = mod_ffe.fit(
    cov_type="clustered", cluster_entity=True, cluster_time=False, group_debias=True
)
res.summary
PanelOLS Estimation Summary
Dep. Variable: y R-squared: 0.1916
Estimator: PanelOLS R-squared (Between): 0.2187
No. Observations: 5000 R-squared (Within): 0.1916
Date: Fri, Jan 19 2024 R-squared (Overall): 0.2070
Time: 08:23:25 Log-likelihood -8532.8
Cov. Estimator: Clustered
F-statistic: 1066.3
Entities: 500 P-value 0.0000
Avg Obs: 10.0000 Distribution: F(1,4499)
Min Obs: 10.0000
Max Obs: 10.0000 F-statistic (robust): 1035.4
P-value 0.0000
Time periods: 10 Distribution: F(1,4499)
Avg Obs: 500.00
Min Obs: 500.00
Max Obs: 500.00
Parameter Estimates
Parameter Std. Err. T-stat P-value Lower CI Upper CI
x 0.9699 0.0301 32.177 0.0000 0.9108 1.0290


F-test for Poolability: 11.372
P-value: 0.0000
Distribution: F(499,4499)

Included effects: Entity
# Year only
mod_yfe = PanelOLS.from_formula("y ~ x + TimeEffects", df)
res = mod_yfe.fit(
    cov_type="clustered", cluster_entity=False, cluster_time=True, group_debias=True
)
res.summary
PanelOLS Estimation Summary
Dep. Variable: y R-squared: 0.2077
Estimator: PanelOLS R-squared (Between): 0.2208
No. Observations: 5000 R-squared (Within): 0.1907
Date: Fri, Jan 19 2024 R-squared (Overall): 0.2078
Time: 08:23:25 Log-likelihood -1.057e+04
Cov. Estimator: Clustered
F-statistic: 1307.5
Entities: 500 P-value 0.0000
Avg Obs: 10.0000 Distribution: F(1,4989)
Min Obs: 10.0000
Max Obs: 10.0000 F-statistic (robust): 961.52
P-value 0.0000
Time periods: 10 Distribution: F(1,4989)
Avg Obs: 500.00
Min Obs: 500.00
Max Obs: 500.00
Parameter Estimates
Parameter Std. Err. T-stat P-value Lower CI Upper CI
x 1.0351 0.0334 31.008 0.0000 0.9696 1.1005


F-test for Poolability: 0.6799
P-value: 0.7279
Distribution: F(9,4989)

Included effects: Time
# Firm and year
mod_fyfe = PanelOLS.from_formula("y ~ x + EntityEffects + TimeEffects", df)
res = mod_fyfe.fit(
    cov_type="clustered", cluster_entity=True, cluster_time=True, group_debias=True
)
res.summary
PanelOLS Estimation Summary
Dep. Variable: y R-squared: 0.1913
Estimator: PanelOLS R-squared (Between): 0.2187
No. Observations: 5000 R-squared (Within): 0.1916
Date: Fri, Jan 19 2024 R-squared (Overall): 0.2070
Time: 08:23:25 Log-likelihood -8525.9
Cov. Estimator: Clustered
F-statistic: 1062.0
Entities: 500 P-value 0.0000
Avg Obs: 10.0000 Distribution: F(1,4490)
Min Obs: 10.0000
Max Obs: 10.0000 F-statistic (robust): 972.96
P-value 0.0000
Time periods: 10 Distribution: F(1,4490)
Avg Obs: 500.00
Min Obs: 500.00
Max Obs: 500.00
Parameter Estimates
Parameter Std. Err. T-stat P-value Lower CI Upper CI
x 0.9700 0.0311 31.192 0.0000 0.9091 1.0310


F-test for Poolability: 11.203
P-value: 0.0000
Distribution: F(508,4490)

Included effects: Entity, Time

Driscoll-Kraay standard errors

The Driscoll-Kraay estimator produces standard errors that are robust to heteroskedasticity, to serial correlation, and to cross-sectional dependence in the errors. This is particularly important in data sets where the assumption that observations are independent across cross-sections may not hold. See Driscoll and Kraay (1998) for more details.

To use the Driscoll-Kraay standard errors estimator, you need to pass the following arguments to the fit() method:

  • cov_type="kernel": To use the Driscoll-Kraay HAC estimator.
  • kernel="bartlett": The kernel to use. Bartlett is the kernel used by the Newey-West estimator. linearmodels also supports the Parzen (parzen) and the Quadratic Spectral (qs) kernels.
  • bandwidth=3: the number of lags to use; automatically computed if not specified.
res = mod.fit(cov_type="kernel", kernel="bartlett", bandwidth=3)
res.summary
PanelOLS Estimation Summary
Dep. Variable: y R-squared: 0.2078
Estimator: PanelOLS R-squared (Between): 0.2208
No. Observations: 5000 R-squared (Within): 0.1907
Date: Sat, Jan 20 2024 R-squared (Overall): 0.2078
Time: 07:01:02 Log-likelihood -1.057e+04
Cov. Estimator: Driscoll-Kraay
F-statistic: 1310.7
Entities: 500 P-value 0.0000
Avg Obs: 10.0000 Distribution: F(1,4998)
Min Obs: 10.0000
Max Obs: 10.0000 F-statistic (robust): 1708.6
P-value 0.0000
Time periods: 10 Distribution: F(1,4998)
Avg Obs: 500.00
Min Obs: 500.00
Max Obs: 500.00
Parameter Estimates
Parameter Std. Err. T-stat P-value Lower CI Upper CI
Intercept 0.0297 0.0218 1.3622 0.1732 -0.0130 0.0724
x 1.0348 0.0250 41.335 0.0000 0.9858 1.0839


Fama-MacBeth coefficients and standard errors

The Fama-MacBeth regression, introduced in Fama and MacBeth (1973), is a widely used method in empirical finance for estimating panel data models that is robust to cross-sectional correlation and heteroskedasticity in the errors. The technique involves two key steps. For our model, we would first estimate the following cross-sectional regression for each time period \(t\):

\[ y_{it} = \alpha_t + x_{it}\beta_t + \epsilon_{it}, \]

and then average the coefficients \(\beta_t\) over all time periods to provide final estimates:

\[ \bar{\beta} = \frac{1}{T} \sum_{t=1}^{T} \beta_t. \]

The covariance matrix can then be computed as:

\[ \hat{\Sigma}_{FM} = \frac{1}{T}\cdot\frac{1}{T-1} \sum_{t=1}^{T} (\beta_t - \bar{\beta})(\beta_t - \bar{\beta})'. \]
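
The two steps are easy to replicate by hand. The following sketch uses statsmodels for the yearly cross-sectional regressions and the formulas above for the final estimates; it ignores the exact small-sample conventions used by linearmodels, so treat it as an illustration:

import numpy as np
import statsmodels.api as sm

# Minimal sketch of the Fama-MacBeth procedure on our data set:
# 1) run one cross-sectional OLS per year,
# 2) average the coefficients over the years,
# 3) use the time-series variation of the yearly estimates for the standard errors.
yearly = []
for year, grp in df.groupby(level="year"):
    X_t = sm.add_constant(grp["x"])
    yearly.append(sm.OLS(grp["y"], X_t).fit().params)
betas = np.vstack(yearly)                                   # T x K estimates
beta_bar = betas.mean(axis=0)                               # Fama-MacBeth coefficients
se_fm = betas.std(axis=0, ddof=1) / np.sqrt(len(betas))     # Fama-MacBeth standard errors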

linearmodels provides a FamaMacBeth class that can be used to estimate Fama-MacBeth regressions. The following code estimates the Fama-MacBeth coefficients and standard errors:

from linearmodels import FamaMacBeth

mod_fm = FamaMacBeth.from_formula("y ~ 1 + x", df)
res = mod_fm.fit()
res.summary
FamaMacBeth Estimation Summary
Dep. Variable: y R-squared: 0.2078
Estimator: FamaMacBeth R-squared (Between): 0.2208
No. Observations: 5000 R-squared (Within): 0.1907
Date: Sat, Jan 20 2024 R-squared (Overall): 0.2078
Time: 06:09:49 Log-likelihood -1.057e+04
Cov. Estimator: Fama-MacBeth Standard Cov
F-statistic: 1310.7
Entities: 500 P-value 0.0000
Avg Obs: 10.0000 Distribution: F(1,4998)
Min Obs: 10.0000
Max Obs: 10.0000 F-statistic (robust): 964.72
P-value 0.0000
Time periods: 10 Distribution: F(1,4998)
Avg Obs: 500.00
Min Obs: 500.00
Max Obs: 500.00
Parameter Estimates
Parameter Std. Err. T-stat P-value Lower CI Upper CI
Intercept 0.0313 0.0234 1.3392 0.1806 -0.0145 0.0771
x 1.0356 0.0333 31.060 0.0000 0.9702 1.1010

Newey-West standard errors

It is common in practice to use Newey-West standard errors (introduced in Newey and West 1987) instead of the standard errors above in Fama-MacBeth regressions to account for heteroskedasticity and autocorrelation in the residuals. For more details on the Newey-West estimator, see my post on the topic.

To use Newey-West standard errors, you can pass the following arguments to the fit() method:

  • cov_type="kernel": To use HAC standard errors. See the documentation on Kernel (HAC) for more details.
  • kernel="bartlett": The Newey-West estimator uses the Bartlett kernel. linearmodels also supports the Parzen (parzen) and the Quadratic Spectral (qs) kernels.
  • bandwidth=3: the number of lags to use; automatically computed if not specified.
fm_res = mod_fm.fit(cov_type="kernel", kernel="bartlett", bandwidth=3)
fm_res.summary
FamaMacBeth Estimation Summary
Dep. Variable: y R-squared: 0.2078
Estimator: FamaMacBeth R-squared (Between): 0.2208
No. Observations: 5000 R-squared (Within): 0.1907
Date: Sat, Jan 20 2024 R-squared (Overall): 0.2078
Time: 06:34:46 Log-likelihood -1.057e+04
Cov. Estimator: Fama-MacBeth Kernel Cov
F-statistic: 1310.7
Entities: 500 P-value 0.0000
Avg Obs: 10.0000 Distribution: F(1,4998)
Min Obs: 10.0000
Max Obs: 10.0000 F-statistic (robust): 1440.8
P-value 0.0000
Time periods: 10 Distribution: F(1,4998)
Avg Obs: 500.00
Min Obs: 500.00
Max Obs: 500.00
Parameter Estimates
Parameter Std. Err. T-stat P-value Lower CI Upper CI
Intercept 0.0313 0.0224 1.3934 0.1636 -0.0127 0.0753
x 1.0356 0.0273 37.958 0.0000 0.9821 1.0891

Cheat sheet

The following code summarizes the standard error estimators available in linearmodels and how to obtain them:

from linearmodels import PanelOLS

# Model with constant
mod_cst = PanelOLS.from_formula("y ~ 1 + x1 + x2", data)
# Non-robust errors (regular OLS)
res = mod_cst.fit()
# White standard errors (heteroskedasticity-robust)
res = mod_cst.fit(cov_type="robust")

# Clustered standard errors
# By entity (firm)
res = mod_cst.fit(
    cov_type="clustered", cluster_entity=True, cluster_time=False, group_debias=True
)
# By time (year)
res = mod_cst.fit(
    cov_type="clustered", cluster_entity=False, cluster_time=True, group_debias=True
)
# By entity and time
res = mod_cst.fit(
    cov_type="clustered", cluster_entity=True, cluster_time=True, group_debias=True
)

# Fixed effects models
# Model with entity (firm) fixed effects
mod_fe = PanelOLS.from_formula("y ~ x1 + x2 + EntityEffects", data)
# Model with time (year) fixed effects
mod_fe = PanelOLS.from_formula("y ~ x1 + x2 + TimeEffects", data)
# Model with entity and time fixed effects
mod_fe = PanelOLS.from_formula("y ~ x1 + x2 + EntityEffects+ TimeEffects", data)

# To cluster standard errors by entity and time, the usage is the same as above,
# but applied to the FE model.

# By entity (firm)
res = mod_fe.fit(
    cov_type="clustered", cluster_entity=True, cluster_time=False, group_debias=True
)
# By time (year)
res = mod_fe.fit(
    cov_type="clustered", cluster_entity=False, cluster_time=True, group_debias=True
)
# By entity and time
res = mod_fe.fit(
    cov_type="clustered", cluster_entity=True, cluster_time=True, group_debias=True
)

# Driscoll-Kraay standard errors
res = mod_cst.fit(cov_type="kernel", kernel="bartlett", bandwidth=3)

# Fama-MacBeth regressions
from linearmodels import FamaMacBeth

mod_fm = FamaMacBeth.from_formula("y ~ 1 + x1 + x2", data)
# Non-adjusted standard errors
res = mod_fm.fit()
# Newey-West standard errors with 3 lags
res = mod_fm.fit(cov_type="kernel", kernel="bartlett", bandwidth=3)

Learn more

To learn more about the linearmodels library, I recommend reading the documentation, especially the section on panel data model estimation.

Using a powerful tool like linearmodels can help you save time and avoid errors when estimating panel data models. However, it is important to understand the underlying econometric theory. If you are interested in learning more about panel data models and their associated standard errors, I recommend reading the Petersen (2009) paper or watching the talk he gave at the 2008 FMA Annual Meeting.

References

Driscoll, John C, and Aart C Kraay. 1998. “Consistent Covariance Matrix Estimation with Spatially Dependent Panel Data.” Review of Economics and Statistics 80 (4): 549–60. https://www.mitpressjournals.org/doi/abs/10.1162/003465398557825.
Fama, Eugene F, and James D MacBeth. 1973. “Risk, Return, and Equilibrium: Empirical Tests.” Journal of Political Economy 81 (3): 607–36. https://www.journals.uchicago.edu/doi/abs/10.1086/260061.
Newey, Whitney K, and Kenneth D West. 1987. “A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix.” Econometrica 55 (3): 703–8. https://www.jstor.org/stable/1913610.
Petersen, Mitchell A. 2009. “Estimating Standard Errors in Finance Panel Data Sets: Comparing Approaches.” The Review of Financial Studies 22 (1): 435–80. https://academic.oup.com/rfs/article-abstract/22/1/435/1585940.

Footnotes

  1. Credit: the two tables are taken from Petersen (2009).↩︎
