hcp

Builds hierarchical-clustering-based portfolios.

Source Code: [link]

openbb.portfolio.po.hcp(symbols: List[str], **kwargs: Any)

Parameters

Name	Type	Description	Default	Optional
symbols	List[str]	List of portfolio stocks	None	False
interval	str	Interval over which to get stock data. Default value is "3mo".	None	True
start_date	str	If not using interval, start date string (YYYY-MM-DD)	None	True
end_date	str	If not using interval, end date string (YYYY-MM-DD). If empty use last weekday.	None	True
log_returns	bool	If True calculate log returns, else arithmetic returns. Default value is False	None	True
freq	str	The frequency used to calculate returns. Default value is 'D'. Possible values are: - 'D' for daily returns. - 'W' for weekly returns. - 'M' for monthly returns.	None	True
maxnan	float	Max percentage of nan values accepted per asset to be included in returns.	None	True
threshold	float	Value used to replace outliers that are higher than the threshold.	None	True
method	str	Method used to fill nan values. Default value is 'time'. For more information see `interpolate <https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.interpolate.html>`__.	None	True
model	str	The hierarchical cluster portfolio model used to optimize the portfolio. The default is 'HRP'. Possible values are: - 'HRP': Hierarchical Risk Parity. - 'HERC': Hierarchical Equal Risk Contribution. - 'NCO': Nested Clustered Optimization.	None	True
codependence	str	The codependence or similarity matrix used to build the distance metric and clusters. The default is 'pearson'. Possible values are: - 'pearson': pearson correlation matrix. Distance formula: .. math:: D_{i,j} = \sqrt{0.5(1-\rho^{pearson}_{i,j})} - 'spearman': spearman correlation matrix. Distance formula: .. math:: D_{i,j} = \sqrt{0.5(1-\rho^{spearman}_{i,j})} - 'abspearson': absolute value pearson correlation matrix. Distance formula: .. math:: D_{i,j} = \sqrt{(1-|\rho^{pearson}_{i,j}|)} - 'absspearman': absolute value spearman correlation matrix. Distance formula: .. math:: D_{i,j} = \sqrt{(1-|\rho^{spearman}_{i,j}|)}	None	True
covariance	str	The method used to estimate the covariance matrix: The default is 'hist'. Possible values are: - 'hist': use historical estimates. - 'ewma1': use ewma with adjust=True. For more information see `EWM <https://pandas.pydata.org/pandas-docs/stable/user_guide/window.html#exponentially-weighted-window>`. - 'ewma2': use ewma with adjust=False. For more information see `EWM <https://pandas.pydata.org/pandas-docs/stable/user_guide/window.html#exponentially-weighted-window>`. - 'ledoit': use the Ledoit and Wolf Shrinkage method. - 'oas': use the Oracle Approximation Shrinkage method. - 'shrunk': use the basic Shrunk Covariance method. - 'gl': use the basic Graphical Lasso Covariance method. - 'jlogo': use the j-LoGo Covariance method. For more information see: `c-jLogo`. - 'fixed': denoise using fixed method. For more information see chapter 2 of `c-MLforAM`. - 'spectral': denoise using spectral method. For more information see chapter 2 of `c-MLforAM`. - 'shrink': denoise using shrink method. For more information see chapter 2 of `c-MLforAM`.	None	True
objective	str	Objective function used by the NCO model. The default is 'MinRisk'. Possible values are: - 'MinRisk': Minimize the selected risk measure. - 'Utility': Maximize the risk averse utility function. - 'Sharpe': Maximize the risk adjusted return ratio based on the selected risk measure. - 'ERC': Equal risk contribution portfolio of the selected risk measure.	None	True
risk_measure	str	The risk measure used to optimize the portfolio. If model is 'NCO', the risk measures available depend on the objective function. The default is 'MV'. Possible values are: - 'MV': Variance. - 'MAD': Mean Absolute Deviation. - 'MSV': Semi Standard Deviation. - 'FLPM': First Lower Partial Moment (Omega Ratio). - 'SLPM': Second Lower Partial Moment (Sortino Ratio). - 'VaR': Value at Risk. - 'CVaR': Conditional Value at Risk. - 'TG': Tail Gini. - 'EVaR': Entropic Value at Risk. - 'WR': Worst Realization (Minimax). - 'RG': Range of returns. - 'CVRG': CVaR range of returns. - 'TGRG': Tail Gini range of returns. - 'MDD': Maximum Drawdown of uncompounded cumulative returns (Calmar Ratio). - 'ADD': Average Drawdown of uncompounded cumulative returns. - 'DaR': Drawdown at Risk of uncompounded cumulative returns. - 'CDaR': Conditional Drawdown at Risk of uncompounded cumulative returns. - 'EDaR': Entropic Drawdown at Risk of uncompounded cumulative returns. - 'UCI': Ulcer Index of uncompounded cumulative returns. - 'MDD_Rel': Maximum Drawdown of compounded cumulative returns (Calmar Ratio). - 'ADD_Rel': Average Drawdown of compounded cumulative returns. - 'DaR_Rel': Drawdown at Risk of compounded cumulative returns. - 'CDaR_Rel': Conditional Drawdown at Risk of compounded cumulative returns. - 'EDaR_Rel': Entropic Drawdown at Risk of compounded cumulative returns. - 'UCI_Rel': Ulcer Index of compounded cumulative returns.	None	True
risk_free_rate	float	Risk free rate, must be in annual frequency. Used for 'FLPM' and 'SLPM'. The default is 0.	None	True
risk_aversion	float	Risk aversion factor of the 'Utility' objective function. The default is 1.	None	True
alpha	float	Significance level of VaR, CVaR, EVaR, DaR, CDaR, EDaR and Tail Gini of losses. The default is 0.05.	None	True
a_sim	float	Number of CVaRs used to approximate Tail Gini of losses. The default is 100.	None	True
beta	float	Significance level of CVaR and Tail Gini of gains. If None it duplicates alpha value. The default is None.	None	True
b_sim	float	Number of CVaRs used to approximate Tail Gini of gains. If None it duplicates a_sim value. The default is None.	None	True
linkage	str	Linkage method of hierarchical clustering. For more information see `linkage <https://docs.scipy.org/doc/scipy/reference/generated/scipy.cluster.hierarchy.linkage.html?highlight=linkage#scipy.cluster.hierarchy.linkage>`__. The default is 'single'. Possible values are: - 'single'. - 'complete'. - 'average'. - 'weighted'. - 'centroid'. - 'median'. - 'ward'. - 'dbht': Direct Bubble Hierarchical Tree.	None	True
k	int	Number of clusters. If given, this value is used instead of the optimal number of clusters calculated with the two difference gap statistic. The default is None.	None	True
max_k	int	Max number of clusters used by the two difference gap statistic to find the optimal number of clusters. The default is 10.	None	True
bins_info	str	Number of bins used to calculate variation of information. The default value is 'KN'. Possible values are: - 'KN': Knuth's choice method. For more information see `knuth_bin_width <https://docs.astropy.org/en/stable/api/astropy.stats.knuth_bin_width.html>`__. - 'FD': Freedman–Diaconis' choice method. For more information see `freedman_bin_width <https://docs.astropy.org/en/stable/api/astropy.stats.freedman_bin_width.html>`__. - 'SC': Scott's choice method. For more information see `scott_bin_width <https://docs.astropy.org/en/stable/api/astropy.stats.scott_bin_width.html>`__. - 'HGR': Hacine-Gharbi and Ravier's choice method.	None	True
alpha_tail	float	Significance level for lower tail dependence index. The default is 0.05.	None	True
leaf_order	bool	Indicates whether the clusters are ordered so that the distance between successive leaves is minimal. The default is True.	None	True
d_ewma	float	The smoothing factor of ewma methods. The default is 0.94.	None	True
value	float	Amount of money to allocate. The default is 1.	None	True
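
The distance formulas listed for `codependence` can be illustrated with a small worked sketch. It is not the library's implementation: the helper name `codependence_distance` is hypothetical, and the absolute-value form used for 'abspearson' and 'absspearman' assumes the distance is the square root of one minus the absolute correlation.

```python
import math


def codependence_distance(rho: float, codependence: str = "pearson") -> float:
    """Map a correlation in [-1, 1] to a distance (hypothetical helper)."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    if codependence in ("pearson", "spearman"):
        # D = sqrt(0.5 * (1 - rho)): perfectly anti-correlated assets are farthest apart
        return math.sqrt(0.5 * (1.0 - rho))
    if codependence in ("abspearson", "absspearman"):
        # D = sqrt(1 - |rho|) (assumed form): only the strength of the relation matters
        return math.sqrt(1.0 - abs(rho))
    raise ValueError(f"unsupported codependence: {codependence!r}")


# The same correlation of -1 is treated very differently by the two families.
far = codependence_distance(-1.0, "pearson")      # maximal distance
near = codependence_distance(-1.0, "abspearson")  # zero distance
```

With the signed variants, strongly negatively correlated assets land in different clusters; with the absolute variants they are grouped together.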

Returns

Type	Description
Tuple[Optional[dict], pd.DataFrame]	Dictionary of portfolio weights, DataFrame of stock returns.
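
A minimal usage sketch follows, assuming the OpenBB SDK is installed and importable as `from openbb_terminal.sdk import openbb` (this import path is an assumption about the installed version). The tickers and option values are illustrative only, and `run_hcp` and `main` are hypothetical wrapper names; the keyword names themselves come from the parameter table.

```python
from typing import Any, Dict, List, Optional, Tuple

# Illustrative inputs; every key is a parameter name from the table above.
symbols: List[str] = ["AAPL", "MSFT", "AMZN"]
hcp_options: Dict[str, Any] = {
    "interval": "3mo",
    "model": "HERC",
    "codependence": "spearman",
    "covariance": "ledoit",
    "risk_measure": "CVaR",
    "linkage": "ward",
    "value": 1.0,
}


def run_hcp(tickers: List[str], options: Dict[str, Any]) -> Tuple[Optional[dict], Any]:
    """Call hcp with keyword options (hypothetical wrapper)."""
    # Imported lazily; assumes the OpenBB SDK package is installed.
    from openbb_terminal.sdk import openbb

    weights, returns = openbb.portfolio.po.hcp(symbols=tickers, **options)
    return weights, returns


def main() -> None:
    weights, returns = run_hcp(symbols, hcp_options)
    if weights is None:
        # The return type is Optional[dict], so handle the missing case explicitly.
        print("No weights were returned.")
        return
    for symbol, weight in weights.items():
        print(f"{symbol}: {weight:.4f}")
```

Calling `main()` fetches price data, so it needs network access. Because the weights are returned as `Optional[dict]`, checking for `None` before iterating avoids a failure when no weights are returned.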
