hcp

Builds hierarchical clustering based portfolios

Source Code: [link]

openbb.portfolio.po.hcp(symbols: List[str], kwargs: Any)

Parameters

Name | Type | Description | Default | Optional
symbols | List[str] | List of portfolio stocks | None | False
interval | str | Interval to get stock data. The default is "3mo". | None | True
start_date | str | If not using interval, start date string (YYYY-MM-DD) | None | True
end_date | str | If not using interval, end date string (YYYY-MM-DD). If empty, use the last weekday. | None | True
log_returns | bool | If True, calculate log returns; otherwise, calculate arithmetic returns. The default is False. | None | True
freq | str | The frequency used to calculate returns. The default is 'D'. Possible values are:

- 'D' for daily returns.
- 'W' for weekly returns.
- 'M' for monthly returns.
| None | True
maxnan | float | Maximum percentage of NaN values accepted per asset for it to be included in returns. | None | True
threshold | float | Value used to replace outliers that are higher than the threshold. | None | True
method | str | Method used to fill NaN values. The default is 'time'. For more information, see interpolate <https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.interpolate.html>. | None | True
model | str | The hierarchical cluster portfolio model used to optimize the portfolio. The default is 'HRP'. Possible values are:

- 'HRP': Hierarchical Risk Parity.
- 'HERC': Hierarchical Equal Risk Contribution.
- 'NCO': Nested Clustered Optimization.
| None | True
codependence | str | The codependence or similarity matrix used to build the distance metric and clusters. The default is 'pearson'. Possible values are:

- 'pearson': Pearson correlation matrix. Distance formula:
.. math:: D_{i,j} = \sqrt{0.5(1-\rho^{pearson}_{i,j})}
- 'spearman': Spearman correlation matrix. Distance formula:
.. math:: D_{i,j} = \sqrt{0.5(1-\rho^{spearman}_{i,j})}
- 'abs_pearson': absolute value Pearson correlation matrix. Distance formula:
.. math:: D_{i,j} = \sqrt{(1-|\rho^{pearson}_{i,j}|)}
- 'abs_spearman': absolute value Spearman correlation matrix. Distance formula:
.. math:: D_{i,j} = \sqrt{(1-|\rho^{spearman}_{i,j}|)}
covariance | str | The method used to estimate the covariance matrix. The default is 'hist'. Possible values are:

- 'hist': use historical estimates.
- 'ewma1': use EWMA with adjust=True. For more information, see EWM <https://pandas.pydata.org/pandas-docs/stable/user_guide/window.html#exponentially-weighted-window>.
- 'ewma2': use EWMA with adjust=False. For more information, see EWM <https://pandas.pydata.org/pandas-docs/stable/user_guide/window.html#exponentially-weighted-window>.
- 'ledoit': use the Ledoit and Wolf Shrinkage method.
- 'oas': use the Oracle Approximation Shrinkage method.
- 'shrunk': use the basic Shrunk Covariance method.
- 'gl': use the basic Graphical Lasso Covariance method.
- 'jlogo': use the j-LoGo Covariance method. For more information, see c-jLogo.
- 'fixed': denoise using the fixed method. For more information, see chapter 2 of c-MLforAM.
- 'spectral': denoise using the spectral method. For more information, see chapter 2 of c-MLforAM.
- 'shrink': denoise using the shrink method. For more information, see chapter 2 of c-MLforAM.
| None | True
objective | str | Objective function used by the NCO model. The default is 'MinRisk'. Possible values are:

- 'MinRisk': Minimize the selected risk measure.
- 'Utility': Maximize the risk-averse utility function.
- 'Sharpe': Maximize the risk-adjusted return ratio based on the selected risk measure.
- 'ERC': Equal risk contribution portfolio of the selected risk measure.
| None | True
risk_measure | str | The risk measure used to optimize the portfolio. If model is 'NCO', the risk measures available depend on the objective function. The default is 'MV'. Possible values are:

- 'MV': Variance.
- 'MAD': Mean Absolute Deviation.
- 'MSV': Semi Standard Deviation.
- 'FLPM': First Lower Partial Moment (Omega Ratio).
- 'SLPM': Second Lower Partial Moment (Sortino Ratio).
- 'VaR': Value at Risk.
- 'CVaR': Conditional Value at Risk.
- 'TG': Tail Gini.
- 'EVaR': Entropic Value at Risk.
- 'WR': Worst Realization (Minimax).
- 'RG': Range of returns.
- 'CVRG': CVaR range of returns.
- 'TGRG': Tail Gini range of returns.
- 'MDD': Maximum Drawdown of uncompounded cumulative returns (Calmar Ratio).
- 'ADD': Average Drawdown of uncompounded cumulative returns.
- 'DaR': Drawdown at Risk of uncompounded cumulative returns.
- 'CDaR': Conditional Drawdown at Risk of uncompounded cumulative returns.
- 'EDaR': Entropic Drawdown at Risk of uncompounded cumulative returns.
- 'UCI': Ulcer Index of uncompounded cumulative returns.
- 'MDD_Rel': Maximum Drawdown of compounded cumulative returns (Calmar Ratio).
- 'ADD_Rel': Average Drawdown of compounded cumulative returns.
- 'DaR_Rel': Drawdown at Risk of compounded cumulative returns.
- 'CDaR_Rel': Conditional Drawdown at Risk of compounded cumulative returns.
- 'EDaR_Rel': Entropic Drawdown at Risk of compounded cumulative returns.
- 'UCI_Rel': Ulcer Index of compounded cumulative returns.
| None | True
risk_free_rate | float | Risk-free rate, which must be in annual frequency. Used for 'FLPM' and 'SLPM'. The default is 0. | None | True
risk_aversion | float | Risk aversion factor of the 'Utility' objective function. The default is 1. | None | True
alpha | float | Significance level of VaR, CVaR, EVaR, DaR, CDaR, EDaR and Tail Gini of losses. The default is 0.05. | None | True
a_sim | float | Number of CVaRs used to approximate Tail Gini of losses. The default is 100. | None | True
beta | float | Significance level of CVaR and Tail Gini of gains. If None, it duplicates the alpha value. The default is None. | None | True
b_sim | float | Number of CVaRs used to approximate Tail Gini of gains. If None, it duplicates the a_sim value. The default is None. | None | True
linkage | str | Linkage method of hierarchical clustering. For more information, see linkage <https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html?highlight=linkage#scipy.cluster.hierarchy.linkage>. The default is 'single'. Possible values are:

- 'single'.
- 'complete'.
- 'average'.
- 'weighted'.
- 'centroid'.
- 'median'.
- 'ward'.
- 'dbht': Direct Bubble Hierarchical Tree.
| None | True
k | int | Number of clusters. This value is used instead of the optimal number of clusters calculated with the two difference gap statistic. The default is None. | None | True
max_k | int | Maximum number of clusters used by the two difference gap statistic to find the optimal number of clusters. The default is 10. | None | True
bins_info | str | Number of bins used to calculate variation of information. The default is 'KN'. Possible values are:

- 'KN': Knuth's choice method. For more information, see knuth_bin_width <https://docs.astropy.org/en/stable/api/astropy.stats.knuth_bin_width.html>.
- 'FD': Freedman–Diaconis' choice method. For more information, see freedman_bin_width <https://docs.astropy.org/en/stable/api/astropy.stats.freedman_bin_width.html>.
- 'SC': Scott's choice method. For more information, see scott_bin_width <https://docs.astropy.org/en/stable/api/astropy.stats.scott_bin_width.html>.
- 'HGR': Hacine-Gharbi and Ravier's choice method.
| None | True
alpha_tail | float | Significance level for lower tail dependence index. The default is 0.05. | None | True
leaf_order | bool | Indicates whether the clusters are ordered so that the distance between successive leaves is minimal. The default is True. | None | True
d_ewma | float | The smoothing factor of EWMA methods. The default is 0.94. | None | True
value | float | Amount of money to allocate. The default is 1. | None | True
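
The log_returns and codependence parameters describe the first steps of the data preparation: prices become returns, returns become a correlation, and the correlation becomes a distance. The following is a minimal sketch of those steps in plain Python, assuming made-up price series and covering only the correlation-based distance formulas listed above. The helper names (compute_returns, pearson, correlation_to_distance) are illustrative and are not part of the OpenBB API.

```python
import math
from typing import List


def compute_returns(prices: List[float], log_returns: bool = False) -> List[float]:
    """Period-over-period returns; log returns if log_returns is True."""
    pairs = zip(prices, prices[1:])
    if log_returns:
        # Log return: ln(P_t / P_{t-1})
        return [math.log(curr / prev) for prev, curr in pairs]
    # Arithmetic return: P_t / P_{t-1} - 1
    return [curr / prev - 1.0 for prev, curr in pairs]


def pearson(x: List[float], y: List[float]) -> float:
    """Sample Pearson correlation of two equally long series."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_to_distance(rho: float, absolute: bool = False) -> float:
    """Distance for the plain or the absolute-value correlation options."""
    if absolute:
        value = 1.0 - abs(rho)       # D = sqrt(1 - |rho|)
    else:
        value = 0.5 * (1.0 - rho)    # D = sqrt(0.5 * (1 - rho))
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(value, 0.0))


# Made-up price series for two assets (illustration only).
asset_a = [100.0, 102.0, 101.0, 105.0, 107.0]
asset_b = [50.0, 49.0, 49.5, 48.0, 47.0]

returns_a = compute_returns(asset_a, log_returns=True)
returns_b = compute_returns(asset_b, log_returns=True)
rho = pearson(returns_a, returns_b)
distance = correlation_to_distance(rho)
abs_distance = correlation_to_distance(rho, absolute=True)
```

With the plain formula, perfectly correlated assets are at distance 0 and perfectly anti-correlated assets are at distance 1. With the absolute-value formula, both cases are at distance 0, so strongly anti-correlated assets are clustered together.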

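The 'ewma1' and 'ewma2' covariance options differ only in the pandas adjust flag. The sketch below reproduces the two pandas weighting schemes in plain Python. For brevity it smooths a mean rather than a covariance, but the weights are built the same way. The helper ewm_mean is hypothetical, and the mapping alpha = 1 - d_ewma from this page's d_ewma to the pandas smoothing factor is an assumption rather than something this page states.

```python
from typing import List


def ewm_mean(values: List[float], alpha: float, adjust: bool) -> List[float]:
    """Exponentially weighted mean following the pandas definitions.

    Hypothetical helper for illustration; not part of the OpenBB API.
    """
    result: List[float] = []
    if adjust:
        # adjust=True: weighted average with weights (1 - alpha)^i,
        # normalised by the sum of the weights seen so far.
        numerator = 0.0
        denominator = 0.0
        for x in values:
            numerator = x + (1.0 - alpha) * numerator
            denominator = 1.0 + (1.0 - alpha) * denominator
            result.append(numerator / denominator)
    else:
        # adjust=False: recursion y_t = (1 - alpha) * y_{t-1} + alpha * x_t,
        # seeded with the first observation.
        for i, x in enumerate(values):
            if i == 0:
                result.append(x)
            else:
                result.append((1.0 - alpha) * result[-1] + alpha * x)
    return result


d_ewma = 0.94
alpha = 1.0 - d_ewma  # assumed mapping from d_ewma to the pandas alpha
series = [1.0, 2.0, 3.0]
adjusted = ewm_mean(series, alpha, adjust=True)    # weighting as in 'ewma1'
recursive = ewm_mean(series, alpha, adjust=False)  # weighting as in 'ewma2'
```

The two variants agree on the first observation and converge for long series. They differ most at the start of the sample, where adjust=True corrects for the short history.
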
Returns

Type | Description
Tuple[Optional[dict], pd.DataFrame] | Dictionary of portfolio weights, DataFrame of stock returns.
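
A call might look like the sketch below. It assumes the OpenBB SDK is installed and importable as openbb_terminal.sdk, that the optional parameters are passed as keyword arguments, and that market data can be downloaded. The tickers and parameter choices are arbitrary examples. Because the first element of the returned tuple is Optional, the hypothetical helper top_holdings checks for None before using the weights.

```python
from typing import Dict, List, Optional, Tuple


def run_hcp_example():
    """Call hcp with a few of the documented keyword arguments.

    Not executed here: it needs the OpenBB SDK and network access.
    """
    from openbb_terminal.sdk import openbb  # assumed import path

    weights, stock_returns = openbb.portfolio.po.hcp(
        symbols=["AAPL", "MSFT", "JNJ", "XOM"],  # arbitrary example tickers
        interval="3mo",
        model="HERC",
        codependence="pearson",
        covariance="ledoit",
        risk_measure="CVaR",
        linkage="ward",
    )
    return weights, stock_returns


def top_holdings(
    weights: Optional[Dict[str, float]], n: int = 3
) -> List[Tuple[str, float]]:
    """Return the n largest positions, or an empty list if no weights came back."""
    if weights is None:
        return []
    return sorted(weights.items(), key=lambda item: item[1], reverse=True)[:n]


# Hypothetical weights, shaped like the dictionary described above.
sample_weights = {"AAPL": 0.20, "MSFT": 0.25, "JNJ": 0.40, "XOM": 0.15}
largest = top_holdings(sample_weights, n=2)
```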