Econometrics

The Econometrics functions perform statistical analysis on custom datasets. Multiple datasets can be loaded from local storage and modified with basic DataFrame operations. Statistical tests (e.g. Breusch-Godfrey autocorrelation tests) and OLS or panel regressions (e.g. Random Effects and Fixed Effects) can be run on any of the loaded files.

Usage

Enter the Econometrics menu from the main menu by typing econometrics into the Terminal. The absolute path for the menu is:

/econometrics

Sample Datasets

Sample datasets, most of them bundled with the statsmodels library, are available. List them by adding --examples to the load command, then load one by name. For example, longley:

load longley

Note: Due to the small size of this dataset, many of the tests run on it will not produce statistically significant results.

Load

The first step in using the Econometrics menu is to load in some data. Place files in the paths displayed at the top of the menu, under "Looking for data in:".

The example dataset used on this page, sp500_pe, contains historical monthly levels of the S&P 500 price and P/E ratio. It was populated from Nasdaq Data Link.

After loading a file, refreshing the screen (? or h with no command) updates the information printed under "Loaded files and data columns:".

Loaded files and data columns:
sp500_pe : date, pe, price

Show

Use the show command to inspect a loaded file. If more than one file has been loaded, specify the target's name.

show sp500_pe
date        pe     price
1871-01-31  11.1   4.44
1871-02-28  11.25  4.5
...         ...    ...
2023-10-31  23.94  4193.8

Index

Set the index of a dataset with the index command, using syntax similar to:

index sp500_pe -i date

A confirmation message will print:

Successfully updated 'sp500_pe' index to be 'date'

Type

Format any column as one of:

  • int
  • float
  • str
  • bool
  • category
  • date

To see a column's current type:

type sp500_pe.pe
The type of 'sp500_pe.pe' is 'float64'

Change it by adding the --format argument and one of the choices listed above.

If this column of numbers had been loaded as strings, it could be converted with:

type -n sp500_pe.pe --format float
Update 'sp500_pe.pe' with type 'float'

RET

Add a returns column to the dataset. In this example, the new column is referenced later as sp500_pe.price_returns.

ret -v sp500_pe.price
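
As a rough illustration of what a returns column contains, the sketch below computes period-over-period returns in plain Python. It assumes simple percentage returns, (current / previous) - 1, which may differ from the Terminal's exact formula; the function name simple_returns and the prices are illustrative only.

```python
import math


def simple_returns(prices):
    """Return simple period-over-period returns; the first entry is NaN."""
    if not prices:
        return []
    returns = [math.nan]  # the first row has no previous price to compare with
    for previous, current in zip(prices, prices[1:]):
        returns.append(current / previous - 1.0)
    return returns


# Illustrative prices, not taken from the dataset
prices = [100.0, 110.0, 99.0]
price_returns = simple_returns(prices)
print(price_returns)
```

The leading NaN is why a cleaning step is usually applied after adding returns.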

Clean

If NaN values exist, use the clean command to handle them. The new returns column, for instance, contains a NaN in its first row. The example below removes every row that contains a NaN.

clean sp500_pe -d rdrop
Successfully cleaned 'sp500_pe' dataset
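
The effect of dropping rows with missing values can be sketched in plain Python as follows. This only illustrates the idea behind the rdrop option and is not the Terminal's implementation; the helper name drop_rows_with_nan and the sample rows are hypothetical.

```python
import math


def drop_rows_with_nan(rows):
    """Keep only the rows in which no float value is NaN."""
    def has_nan(row):
        return any(isinstance(v, float) and math.isnan(v) for v in row.values())

    return [row for row in rows if not has_nan(row)]


# Illustrative rows: the first one has no return because no earlier price exists
rows = [
    {"date": "t0", "price": 100.0, "price_returns": math.nan},
    {"date": "t1", "price": 110.0, "price_returns": 0.10},
]
cleaned = drop_rows_with_nan(rows)
print(len(cleaned))  # 1
```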

Plot

Plot columns from a loaded dataset using the plot command.

plot sp500_pe.pe

OLS

Fit an OLS regression model to a loaded data set by defining the dependent and independent variables as column names.

ols sp500_pe.pe -i sp500_pe.price,sp500_pe.price_returns

The bgod and bpag commands require an OLS regression to have been run first.
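
For intuition about what an OLS fit computes, here is a minimal sketch for the single-regressor case using the closed-form least-squares solution. The ols command accepts several independent variables, so this is a simplified illustration rather than the Terminal's method; fit_simple_ols and the data are hypothetical.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) that minimize the sum of squared residuals."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must not be constant")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Illustrative data lying exactly on the line y = 1 + 2x
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_simple_ols(x, y)
print(intercept, slope)  # 1.0 2.0
```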

Norm

The norm command is used to test whether the data in a column is normally distributed.

norm sp500_pe.price_returns
           Kurtosis  Skewness  Jarque-Bera  Shapiro-Wilk  Kolmogorov-Smirnov
Statistic  20.5784   7.20623   20258.4      0.903374      0.454473
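
To show how the first three statistics relate to one another, the sketch below computes skewness, kurtosis and the Jarque-Bera statistic from population moments in plain Python. The conventions are an assumption: the Terminal may report excess kurtosis or apply bias corrections, so values need not match its output. The function name and data are illustrative.

```python
def moments_and_jarque_bera(values):
    """Return (skewness, kurtosis, jarque_bera) using population moments."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("values must not be constant")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    # Jarque-Bera grows as skewness departs from 0 and kurtosis from 3
    jarque_bera = n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)
    return skewness, kurtosis, jarque_bera


# Illustrative symmetric sample
skewness, kurtosis, jarque_bera = moments_and_jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
print(skewness, kurtosis, jarque_bera)
```

A large Jarque-Bera value, like the one in the table above, points away from normality.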

A histogram of the distribution is displayed by adding the -p flag to the command.

norm sp500_pe.price_returns -p

Working With Panel Data

The examples listed by load --examples include one panel dataset, wage_panel. It comes from F. Vella and M. Verbeek (1998), “Whose Wages Do Unions Raise? A Dynamic Model of Unionism and Wage Rate Determination for Young Men,” Journal of Applied Econometrics 13, 163-183. This well-known dataset is also used in Chapter 14 of Introductory Econometrics by Jeffrey Wooldridge.

In the example below, the dataset is loaded and given an alias by adding the -a argument.

/econometrics/load --file wage_panel -a wp

To run panel regressions, both the entity (e.g. company) and the time (e.g. year) must be defined. Running the panel command before doing so results in the following:

panel -d wp.lwage -i wp.black,wp.hisp,wp.exper,wp.expersq,wp.married,wp.educ,wp.union,wp.year
The column 'lwage' from the dataset 'wp' is not a MultiIndex.

Make sure you set the index correctly with the index (e.g. index wp -i lwage,nr) command where the first level is the entity (e.g. Tesla Inc.) and the second level the date (e.g. 2021-03-31)

Within this dataset the nr and year columns define the entity and time. To allow panel regression estimations, they will need to be defined using the index command.

index wp -i nr,year
Successfully updated 'wp' index to be 'nr, year'

The nr and year columns still exist within the dataset; they could have been dropped with the -d argument if desired. In this case, however, the year column is needed to generate time effects in the Pooled OLS, Fixed Effects and Random Effects estimations. For that to work, the type of the year column must be changed to category, which can be done with the following command:

type wp.year --format category
Update 'wp.year' with type 'category'
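
One common way to realize time effects from a categorical column is to expand it into 0/1 indicator (dummy) columns, one per year, leaving one year out as the baseline. The sketch below shows that expansion in plain Python. It illustrates the general technique only and is not taken from the Terminal; one_hot_encode and the sample years are hypothetical.

```python
def one_hot_encode(values, prefix, drop_first=True):
    """Expand categorical values into 0/1 indicator columns."""
    categories = sorted(set(values))
    if drop_first:
        # Omit a baseline category to avoid collinearity with the intercept
        categories = categories[1:]
    columns = [f"{prefix}_{category}" for category in categories]
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return columns, rows


# Illustrative years for four observations
years = [1980, 1981, 1982, 1980]
columns, rows = one_hot_encode(years, prefix="year")
print(columns)  # ['year_1981', 'year_1982']
print(rows)     # [[0, 0], [1, 0], [0, 1], [0, 0]]
```

Treated as a plain number, year would contribute a single linear trend; as a category, each year gets its own coefficient.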

The dataset is now configured for panel regressions. The type of regression is selected with one of:

  • -r pols (Pooled OLS)
  • -r re (Random Effects)
  • -r bols (Between OLS)
  • -r fe (Fixed Effects)
  • -r fdols (First Difference OLS)

For example, to perform a Random Effects regression:

panel -d wp.lwage -i wp.black,wp.hisp,wp.exper,wp.expersq,wp.married,wp.educ,wp.union,wp.year -r re
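
To make the difference between the estimators listed above more concrete, the sketch below shows the textbook "within" transformation that underlies Fixed Effects estimation: each observation has its entity's mean subtracted, which removes anything constant within an entity. This is a general illustration, not the Terminal's code; within_transform and the data are hypothetical.

```python
from collections import defaultdict


def within_transform(entities, values):
    """Subtract each entity's mean from that entity's observations."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for entity, value in zip(entities, values):
        totals[entity] += value
        counts[entity] += 1
    return [
        value - totals[entity] / counts[entity]
        for entity, value in zip(entities, values)
    ]


# Illustrative panel: two entities observed twice each
entities = ["a", "a", "b", "b"]
values = [1.0, 3.0, 10.0, 14.0]
demeaned = within_transform(entities, values)
print(demeaned)  # [-1.0, 1.0, -2.0, 2.0]
```

Because time-invariant columns become all zeros after this transformation, they cannot be estimated in a Fixed Effects model.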

Scripts & Routines

Research, whether done as a student, a professor or a professional, often requires findings to be easily replicated. Because many steps may be involved, it helps to be able to make small adjustments without redoing every step. This is where OpenBB Routines play an important role.

Use the contents below as a demo file, copying and pasting into a file saved to the ~/OpenBBUserData/routines folder.

# Go into the econometrics context
econometrics

# Load the wage_panel dataset and include an alias
load wage_panel -a wp

# Set the MultiIndex, allowing for Panel regressions to be performed
index wp -i nr,year

# Change the type of the year column so it can be included as time effects within the regressions
type wp.year --format category

# Perform a Pooled OLS, Random Effects and Fixed Effects estimation
panel -d wp.lwage -i wp.black,wp.hisp,wp.exper,wp.expersq,wp.married,wp.educ,wp.union,wp.year
panel -d wp.lwage -i wp.black,wp.hisp,wp.exper,wp.expersq,wp.married,wp.educ,wp.union,wp.year -r re
panel -d wp.lwage -i wp.expersq,wp.union,wp.married,wp.year -r fe

# Compare the results obtained from these regressions
compare

Run the routine from the Main menu:

/exe -f name_of_file.openbb
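
Since a routine is plain text, variants of it can be generated rather than edited by hand. The sketch below assembles routine text from the commands shown above for a chosen alias and list of regression types. The helper build_routine is hypothetical and not part of OpenBB; it only formats strings and does not run anything.

```python
def build_routine(alias, regression_types):
    """Return routine text that runs one panel regression per requested type."""
    regressors = ",".join(
        f"{alias}.{column}" for column in ("expersq", "union", "married", "year")
    )
    lines = [
        "# Go into the econometrics context",
        "econometrics",
        f"load wage_panel -a {alias}",
        f"index {alias} -i nr,year",
        f"type {alias}.year --format category",
    ]
    for regression_type in regression_types:
        lines.append(
            f"panel -d {alias}.lwage -i {regressors} -r {regression_type}"
        )
    lines.append("compare")
    return "\n".join(lines) + "\n"


routine_text = build_routine("wp", ["re", "fe"])
print(routine_text)
```

The resulting text could then be saved as a .openbb file in the routines folder and run as shown above.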