Econometrics

The capabilities of the Econometrics menu in the OpenBB Terminal are wrapped into the SDK, so the same functionality can be used from a Python script or notebook. The purpose of the Econometrics menu is to let users perform statistical research on custom datasets. Users can load their own dataset(s), modify the data by adding columns or setting indices, apply statistical tests (e.g. Breusch-Godfrey autocorrelation tests), and run OLS regressions and panel regressions (e.g. Random Effects and Fixed Effects).

How to use

Start a Python script or Notebook file and import the SDK module:

from openbb_terminal.sdk import openbb

Below is a brief description of each function within the Econometrics module:

| Path | Type | Description |
|---|---|---|
| openbb.econometrics.granger | Function | Check time-series for Granger causality (X causes Y) |
| openbb.econometrics.granger_chart | Function | Obtain a nice table of Granger causality |
| openbb.econometrics.fe | Function | Perform a Fixed Effects (fe) regression on Panel data |
| openbb.econometrics.options | Function | Obtain all options that can be used for regression techniques |
| openbb.econometrics.options_chart | Function | Get a nice table of the options |
| openbb.econometrics.dwat | Function | Check for auto-correlation with Durbin Watson |
| openbb.econometrics.dwat_chart | Function | Plot the residuals of the OLS model |
| openbb.econometrics.coint | Function | Check whether time-series are cointegrated |
| openbb.econometrics.coint_chart | Function | Show the error-correction terms plot |
| openbb.econometrics.pols | Function | Perform a Pooled OLS (pols) regression on Panel data |
| openbb.econometrics.root | Function | Check for unit root in the timeseries |
| openbb.econometrics.root_chart | Function | Show a nice table of the unit root test results |
| openbb.econometrics.bgod | Function | Check for autocorrelation with Breusch Godfrey |
| openbb.econometrics.bgod_chart | Function | Show a nice table of the autocorrelation test results |
| openbb.econometrics.re | Function | Perform a Random Effects (re) regression on Panel data |
| openbb.econometrics.bols | Function | Perform a Between OLS (bols) regression on Panel data |
| openbb.econometrics.panel | Function | Obtain the regression results wrapper |
| openbb.econometrics.get_regression_data | Function | Obtain an OLS model wrapper |
| openbb.econometrics.norm | Function | Check whether the time-series is normally distributed |
| openbb.econometrics.norm_chart | Function | Show a histogram of the time-series |
| openbb.econometrics.clean | Function | Apply either a fill or drop method to clean the data |
| openbb.econometrics.bpag | Function | Test for heteroskedasticity with a Breusch-Pagan test |
| openbb.econometrics.bpag_chart | Function | Get a nice table with Breusch-Pagan test results |
| openbb.econometrics.fdols | Function | Perform a First Difference OLS (fdols) regression on Panel data |
| openbb.econometrics.load | Function | Load in a dataset to be used within other functionalities |
| openbb.econometrics.ols | Function | Perform an Ordinary Least Squares (ols) regression on time-series data |
| openbb.econometrics.comparison | Function | Compare different regression models in one table |

Alternatively, you can print the contents of the Econometrics SDK with:

help(openbb.econometrics)

Examples

Loading a dataset

The first step in using this menu is loading a dataset. This can be either one of the example datasets listed below or any locally stored Excel file. Both are loaded with the same function:

example_load = openbb.econometrics.load("anes96")
file_load = openbb.econometrics.load("PATH_TO_FILE/FILE.xlsx")

| File | Description |
|---|---|
| anes96 | American National Election Survey 1996 |
| cancer | Breast Cancer Data |
| ccard | Bill Greene’s credit scoring data. |
| cancer_china | Smoking and lung cancer in eight cities in China. |
| co2 | Mauna Loa Weekly Atmospheric CO2 Data |
| committee | First 100 days of the US House of Representatives 1995 |
| copper | World Copper Market 1951-1975 Dataset |
| cpunish | US Capital Punishment dataset. |
| danish_data | Danish Money Demand Data |
| elnino | El Nino - Sea Surface Temperatures |
| engel | Engel (1857) food expenditure data |
| fair | Affairs dataset |
| fertility | World Bank Fertility Data |
| grunfeld | Grunfeld (1950) Investment Data |
| heart | Transplant Survival Data |
| interest_inflation | (West) German interest and inflation rate 1972-1998 |
| longley | Longley dataset |
| macrodata | United States Macroeconomic data |
| modechoice | Travel Mode Choice |
| nile | Nile River flows at Ashwan 1871-1970 |
| randhie | RAND Health Insurance Experiment Data |
| scotland | Taxation Powers Vote for the Scottish Parliament 1997 |
| spector | Spector and Mazzeo (1980) - Program Effectiveness Data |
| stackloss | Stack loss data |
| star98 | Star98 Educational Dataset |
| statecrim | Statewide Crime Data 2009 |
| strikes | U.S. Strike Duration Data |
| sunspots | Yearly sunspots data 1700-2008 |
| wage_panel | Vella and M. Verbeek (1998): Whose Wages Do Unions Raise? |

Working with Time Series data

To demonstrate the usage of the Econometrics SDK for time series data, the longley dataset is loaded in.

# Load the data
longley = openbb.econometrics.load("longley")

# Show the data
longley
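| | TOTEMP | GNPDEFL | GNP | UNEMP | ARMED | POP | YEAR |
|---|---|---|---|---|---|---|---|
| 0 | 60323 | 83 | 234289 | 2356 | 1590 | 107608 | 1947 |
| 1 | 61122 | 88.5 | 259426 | 2325 | 1456 | 108632 | 1948 |
| 2 | 60171 | 88.2 | 258054 | 3682 | 1616 | 109773 | 1949 |
| 3 | 61187 | 89.5 | 284599 | 3351 | 1650 | 110929 | 1950 |
| 4 | 63221 | 96.2 | 328975 | 2099 | 3099 | 112075 | 1951 |
| 5 | 63639 | 98.1 | 346999 | 1932 | 3594 | 113270 | 1952 |
| 6 | 64989 | 99 | 365385 | 1870 | 3547 | 115094 | 1953 |
| 7 | 63761 | 100 | 363112 | 3578 | 3350 | 116219 | 1954 |
| 8 | 66019 | 101.2 | 397469 | 2904 | 3048 | 117388 | 1955 |
| 9 | 67857 | 104.6 | 419180 | 2822 | 2857 | 118734 | 1956 |
| 10 | 68169 | 108.4 | 442769 | 2936 | 2798 | 120445 | 1957 |
| 11 | 66513 | 110.8 | 444546 | 4681 | 2637 | 121950 | 1958 |
| 12 | 68655 | 112.6 | 482704 | 3813 | 2552 | 123366 | 1959 |
| 13 | 69564 | 114.2 | 502601 | 3931 | 2514 | 125368 | 1960 |
| 14 | 69331 | 115.7 | 518173 | 4806 | 2572 | 127852 | 1961 |
| 15 | 70551 | 116.9 | 554894 | 4007 | 2827 | 130081 | 1962 |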

This can be extended by also showing the descriptive statistics, using the native Pandas describe method:

longley.describe()
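| | TOTEMP | GNPDEFL | GNP | UNEMP | ARMED | POP | YEAR |
|---|---|---|---|---|---|---|---|
| count | 16 | 16 | 16 | 16 | 16 | 16 | 16 |
| mean | 65317 | 101.681 | 387698 | 3193.31 | 2606.69 | 117424 | 1954.5 |
| std | 3511.97 | 10.7916 | 99394.9 | 934.464 | 695.92 | 6956.1 | 4.76095 |
| min | 60171 | 83 | 234289 | 1870 | 1456 | 107608 | 1947 |
| 25% | 62712.5 | 94.525 | 317881 | 2348.25 | 2298 | 111788 | 1950.75 |
| 50% | 65504 | 100.6 | 381427 | 3143.5 | 2717.5 | 116804 | 1954.5 |
| 75% | 68290.5 | 111.25 | 454086 | 3842.5 | 3060.75 | 122304 | 1958.25 |
| max | 70551 | 116.9 | 554894 | 4806 | 3594 | 130081 | 1962 |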

It is possible to check a variety of assumptions, e.g. normality, unit root, Granger causality and co-integration. The functions openbb.econometrics.norm and openbb.econometrics.root are shown below. Note that due to the small size of the dataset (16 observations), many of these tests do not yield statistically significant results.

openbb.econometrics.norm(longley['GNP'])
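| | Kurtosis | Skewness | Jarque-Bera | Shapiro-Wilk | Kolmogorov-Smirnov |
|---|---|---|---|---|---|
| Statistic | -1.1944 | 0.0525317 | 0.835092 | 0.962593 | 1 |
| p-value | 0.23232 | 0.95811 | 0.65866 | 0.709 | 0 |

openbb.econometrics.root(longley['POP'])

| | ADF | KPSS |
|---|---|---|
| Test Statistic | 2.35204 | 0.324887 |
| P-Value | 0.998986 | 0.1 |
| NLags | 6 | 0 |
| Nobs | 9 | 0 |
| ICBest | 113.054 | 0 |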
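
To make the normality output less of a black box, the following is a minimal sketch of how a Jarque-Bera statistic can be computed by hand with NumPy. The GNP values are copied from the longley table shown earlier; the helper name jarque_bera is hypothetical, and the use of biased sample moments is an assumption, so the result is not guaranteed to match the SDK output digit for digit.

```python
import numpy as np

# GNP column of the longley dataset, copied from the table shown earlier
gnp = np.array([
    234289, 259426, 258054, 284599, 328975, 346999, 365385, 363112,
    397469, 419180, 442769, 444546, 482704, 502601, 518173, 554894,
], dtype=float)


def jarque_bera(x):
    """Hypothetical helper: Jarque-Bera statistic from biased sample moments."""
    n = x.size
    centered = x - x.mean()
    m2 = np.mean(centered ** 2)                 # second central moment
    skew = np.mean(centered ** 3) / m2 ** 1.5   # sample skewness
    kurt = np.mean(centered ** 4) / m2 ** 2     # sample kurtosis (normal = 3)
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


jb = jarque_bera(gnp)
print(f"Jarque-Bera statistic: {jb:.4f}")
```

Under the null hypothesis of normality the statistic is asymptotically chi-squared with two degrees of freedom, so small values (as with this series) give no evidence against normality.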

The longley dataset is known for producing an OLS regression with an R-squared of 1.0, because the US macroeconomic variables it contains are highly collinear. This is shown by the following regression, performed with openbb.econometrics.ols:

# Perform the regression technique. TOTEMP is dependent, all others independent
ols_regression = openbb.econometrics.ols(longley['TOTEMP'], longley.drop('TOTEMP', axis=1))

# Show the model summary
ols_regression.summary()
                                 OLS Regression Results
=======================================================================================
Dep. Variable: TOTEMP R-squared (uncentered): 1.000
Model: OLS Adj. R-squared (uncentered): 1.000
Method: Least Squares F-statistic: 5.052e+04
Date: Mon, 21 Nov 2022 Prob (F-statistic): 8.20e-22
Time: 10:54:19 Log-Likelihood: -117.56
No. Observations: 16 AIC: 247.1
Df Residuals: 10 BIC: 251.8
Df Model: 6
Covariance Type: nonrobust
==============================================================================
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
GNPDEFL -52.9936 129.545 -0.409 0.691 -341.638 235.650
GNP 0.0711 0.030 2.356 0.040 0.004 0.138
UNEMP -0.4235 0.418 -1.014 0.335 -1.354 0.507
ARMED -0.5726 0.279 -2.052 0.067 -1.194 0.049
POP -0.4142 0.321 -1.289 0.226 -1.130 0.302
YEAR 48.4179 17.689 2.737 0.021 9.003 87.832
==============================================================================
Omnibus: 1.443 Durbin-Watson: 1.277
Prob(Omnibus): 0.486 Jarque-Bera (JB): 0.605
Skew: 0.476 Prob(JB): 0.739
Kurtosis: 3.031 Cond. No. 4.56e+05
==============================================================================

Notes:
[1] R² is computed without centering (uncentered) since the model does not contain a constant.
[2] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[3] The condition number is large, 4.56e+05. This might indicate that there are
strong multicollinearity or other numerical problems.
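
Note [1] in the summary mentions that R² is computed without centering because the model has no constant. As a rough illustration of what that means, the sketch below fits a no-constant regression on made-up data with NumPy and compares the uncentered and centered definitions. The data and variable names are purely illustrative assumptions and are unrelated to the longley results.

```python
import numpy as np

# Made-up data: y has a large mean that a no-constant model cannot capture
x = np.arange(1, 17, dtype=float)
y = 60000.0 + 600.0 * x

X = x.reshape(-1, 1)  # single regressor, deliberately without a constant column
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ beta
ssr = float(residuals @ residuals)  # sum of squared residuals

# Uncentered R²: compares residuals with the raw sum of squares of y
r2_uncentered = 1.0 - ssr / float(y @ y)

# Centered R²: compares residuals with the variation of y around its mean
r2_centered = 1.0 - ssr / float(((y - y.mean()) ** 2).sum())

print(f"uncentered R²: {r2_uncentered:.3f}")
print(f"centered R²:   {r2_centered:.3f}")
```

Because the uncentered version divides by the raw sum of squares, it can be high even when the fit is poor relative to a simple mean, which is worth keeping in mind when reading an "R-squared (uncentered)" figure.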

After running the regression estimation, it is possible to perform tests on the residuals of the model, for example for autocorrelation and heteroskedasticity, as shown below with the openbb.econometrics.bgod and openbb.econometrics.bpag functions.

# Perform Breusch-Godfrey autocorrelation test
openbb.econometrics.bgod(ols_regression)

| | 0 |
|---|---|
| lm-stat | 10.3471 |
| p-value | 0.0158347 |
| f-stat | 0.0970889 |
| fp-value | 0.958799 |

# Perform Breusch-Pagan heteroskedasticity test
openbb.econometrics.bpag(ols_regression)

| | 0 |
|---|---|
| lm-stat | 7.90331 |
| p-value | 0.161645 |
| f-stat | 1.62686 |
| fp-value | 0.236596 |
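
The OLS summary above also reports a Durbin-Watson statistic, another residual-based check for autocorrelation. The following is a minimal NumPy sketch of the textbook formula on made-up residual series; the helper name durbin_watson is hypothetical, and whether openbb.econometrics.dwat computes it in exactly this way is an assumption.

```python
import numpy as np


def durbin_watson(residuals):
    """Hypothetical helper: sum of squared first differences over sum of squares."""
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))


# Made-up residuals that flip sign every period (negative autocorrelation)
alternating = [1.0, -1.0, 1.0, -1.0]

# Made-up residuals that keep their sign for several periods (positive autocorrelation)
persistent = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]

dw_alternating = durbin_watson(alternating)
dw_persistent = durbin_watson(persistent)

print(dw_alternating)  # above 2
print(dw_persistent)   # below 2
```

Values near 2 suggest little first-order autocorrelation, values towards 0 suggest positive autocorrelation, and values towards 4 suggest negative autocorrelation.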

Working with Panel data

Within the examples there is one panel dataset available, named wage_panel. This is a dataset from the paper by F. Vella and M. Verbeek (1998), “Whose Wages Do Unions Raise? A Dynamic Model of Unionism and Wage Rate Determination for Young Men,” Journal of Applied Econometrics 13, 163-183. This is a well-known dataset that is also used in Chapter 14 of Introductory Econometrics by Jeffrey Wooldridge.

# Load the data
wage_panel = openbb.econometrics.load("wage_panel")

# Show the data
wage_panel
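| | nr | year | black | exper | hisp | hours | married | educ | union | lwage | expersq | occupation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 13 | 1980 | 0 | 1 | 0 | 2672 | 0 | 14 | 0 | 1.19754 | 1 | 9 |
| 1 | 13 | 1981 | 0 | 2 | 0 | 2320 | 0 | 14 | 1 | 1.85306 | 4 | 9 |
| 2 | 13 | 1982 | 0 | 3 | 0 | 2940 | 0 | 14 | 0 | 1.34446 | 9 | 9 |
| 3 | 13 | 1983 | 0 | 4 | 0 | 2960 | 0 | 14 | 0 | 1.43321 | 16 | 9 |
| 4 | 13 | 1984 | 0 | 5 | 0 | 3071 | 0 | 14 | 0 | 1.56812 | 25 | 5 |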

4360 rows × 12 columns

To run panel regressions, both an entity (e.g. company) and a time (e.g. year) dimension must be defined. Running the openbb.econometrics.re function on the data as loaded results in the following:

openbb.econometrics.re(wage_panel['black'], wage_panel.drop('black', axis=1))

Error: Series can only be used with a 2-level MultiIndex

This can be corrected by setting a multi-index as follows:

wage_panel = wage_panel.set_index(['nr', 'year'], drop=False)

The columns nr and year still exist within the dataset because drop=False was passed; they could have been dropped with drop=True if desired. However, in this case the year column is relevant for generating time effects in Pooled OLS, Fixed Effects and Random Effects estimations. For this, the type of the year column needs to be changed to 'category' so that it is treated as categorical data. This can be done with the following:

# Observe the current types
wage_panel.dtypes
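| | 0 |
|---|---|
| nr | int64 |
| year | int64 |
| black | int64 |
| exper | int64 |
| hisp | int64 |
| hours | int64 |
| married | int64 |
| educ | int64 |
| union | int64 |
| lwage | float64 |
| expersq | int64 |
| occupation | int64 |
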
# Change the type of year to categorical
wage_panel['year'] = wage_panel['year'].astype('category')

# Observe the changed types
wage_panel.dtypes
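| | 0 |
|---|---|
| nr | int64 |
| year | category |
| black | int64 |
| exper | int64 |
| hisp | int64 |
| hours | int64 |
| married | int64 |
| educ | int64 |
| union | int64 |
| lwage | float64 |
| expersq | int64 |
| occupation | int64 |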
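
The two preparation steps can be tried in isolation on a tiny frame using only Pandas. The sketch below uses a made-up four-row panel (the values are illustrative, not taken from wage_panel) to show that the result has a 2-level MultiIndex while the nr and year columns are kept.

```python
import pandas as pd

# Made-up mini panel: two entities (nr) observed in two years each
panel = pd.DataFrame({
    "nr": [13, 13, 17, 17],
    "year": [1980, 1981, 1980, 1981],
    "lwage": [1.20, 1.85, 1.68, 1.52],
})

# Entity and time become the two index levels; drop=False keeps them as columns
panel = panel.set_index(["nr", "year"], drop=False)

# Treat year as categorical data rather than as a number
panel["year"] = panel["year"].astype("category")

print(panel.index.nlevels)  # number of index levels
print(panel.dtypes)
```

Passing drop=True instead would remove nr and year from the columns and leave them only in the index.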

The dataset is now configured for panel regressions. The Econometrics SDK supports the following regression techniques:

| Path | Description |
|---|---|
| openbb.econometrics.ols | Perform an Ordinary Least Squares (ols) regression on time-series data |
| openbb.econometrics.pols | Perform a Pooled OLS (pols) regression on Panel data |
| openbb.econometrics.bols | Perform a Between OLS (bols) regression on Panel data |
| openbb.econometrics.fdols | Perform a First Difference OLS (fdols) regression on Panel data |
| openbb.econometrics.fe | Perform a Fixed Effects (fe) regression on Panel data |
| openbb.econometrics.re | Perform a Random Effects (re) regression on Panel data |

As an example, a Random Effects regression is performed. This can be done as follows:

# Perform the Random Effects regression technique
random_effects_regression = openbb.econometrics.re(wage_panel['lwage'], wage_panel[['black', 'hisp', 'exper', 'expersq', 'married', 'educ', 'union','year']])

# Show the results
random_effects_regression.summary
                        RandomEffects Estimation Summary
================================================================================
Dep. Variable: lwage R-squared: 0.1806
Estimator: RandomEffects R-squared (Between): 0.1853
No. Observations: 4360 R-squared (Within): 0.1799
Date: Mon, Nov 21 2022 R-squared (Overall): 0.1828
Time: 11:13:36 Log-likelihood -1622.5
Cov. Estimator: Unadjusted
F-statistic: 68.409
Entities: 545 P-value 0.0000
Avg Obs: 8.0000 Distribution: F(14,4345)
Min Obs: 8.0000
Max Obs: 8.0000 F-statistic (robust): 68.409
P-value 0.0000
Time periods: 8 Distribution: F(14,4345)
Avg Obs: 545.00
Min Obs: 545.00
Max Obs: 545.00

Parameter Estimates
==============================================================================
Parameter Std. Err. T-stat P-value Lower CI Upper CI
------------------------------------------------------------------------------
const 0.0234 0.1514 0.1546 0.8771 -0.2735 0.3203
black -0.1394 0.0480 -2.9054 0.0037 -0.2334 -0.0453
hisp 0.0217 0.0428 0.5078 0.6116 -0.0622 0.1057
exper 0.1058 0.0154 6.8706 0.0000 0.0756 0.1361
expersq -0.0047 0.0007 -6.8623 0.0000 -0.0061 -0.0034
married 0.0638 0.0168 3.8035 0.0001 0.0309 0.0967
educ 0.0919 0.0107 8.5744 0.0000 0.0709 0.1129
union 0.1059 0.0179 5.9289 0.0000 0.0709 0.1409
year.1981 0.0404 0.0247 1.6362 0.1019 -0.0080 0.0889
year.1982 0.0309 0.0324 0.9519 0.3412 -0.0327 0.0944
year.1983 0.0202 0.0417 0.4840 0.6284 -0.0616 0.1020
year.1984 0.0430 0.0515 0.8350 0.4037 -0.0580 0.1440
year.1985 0.0577 0.0615 0.9383 0.3482 -0.0629 0.1782
year.1986 0.0918 0.0716 1.2834 0.1994 -0.0485 0.2321
year.1987 0.1348 0.0817 1.6504 0.0989 -0.0253 0.2950
==============================================================================
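
The year.1981 to year.1987 rows in the parameter estimates suggest that the categorical year column enters the model as one dummy variable per year, with 1980 as the omitted baseline. How the SDK and its underlying libraries perform this expansion internally is an assumption here; the sketch below only illustrates the idea on made-up data with pandas.get_dummies.

```python
import pandas as pd

# Made-up categorical year column covering three periods
years = pd.Series([1980, 1981, 1982, 1980, 1981, 1982], name="year").astype("category")

# One indicator column per year; drop_first omits the first category (1980)
# so that it acts as the baseline against which the other years are measured
dummies = pd.get_dummies(years, prefix="year", prefix_sep=".", drop_first=True)

print(list(dummies.columns))
print(dummies)
```

Each coefficient on such a dummy is then read as the average difference in the dependent variable relative to the baseline year, holding the other regressors fixed.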