Cryptocurrency Market Risk Factors



In this project, we examined the top 100 cryptocurrencies by market value to identify how these alternative assets cluster based on their investment returns. As of 2022, cryptocurrencies accounted for 0.5% of global liquid assets and had emerged as a new asset class. Analyzing these decentralized financial instruments gives a better understanding of their risk-and-reward profile and inherent market structure. Our main objective is to investigate how crypto assets relate to one another in terms of market risk. The findings can guide asset allocation decisions and support risk attribution for better risk management.


Our raw data set contains daily prices for the top 151 cryptocurrencies by market value, obtained from a crypto asset market data provider. Each cryptocurrency is identified by its ticker, typically a string of three to four characters. The raw data runs from the inception of each cryptocurrency through 9/15/2022. Most cryptocurrencies came to market after 2017; Bitcoin has the longest history, going back to January 2009. Because history length varies, we restricted the sample to 8/31/2020 through 9/15/2022, which left 117 cryptocurrencies with a complete price history over the period. This gave us 747 data points per cryptocurrency, a sample large enough to support statistically meaningful conclusions.
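
The window selection described above can be sketched as follows. This is a minimal illustration, not the code used in the study: it assumes the raw prices sit in a pandas DataFrame indexed by date with one column per ticker, and the function name `select_full_history` and the toy ticker `NEW` are hypothetical.

```python
import numpy as np
import pandas as pd

def select_full_history(prices: pd.DataFrame, start: str, end: str) -> pd.DataFrame:
    """Keep only tickers that have a price on every day in [start, end]."""
    window = prices.loc[start:end]  # label slicing on dates includes both ends
    # Drop any column (ticker) with at least one missing price in the window.
    return window.dropna(axis=1, how="any")

# Toy example: the hypothetical ticker "NEW" starts trading late, so it is dropped.
dates = pd.date_range("2020-08-29", periods=6, freq="D")
prices = pd.DataFrame(
    {
        "BTC": [100.0, 101.0, 102.0, 101.5, 103.0, 104.0],
        "NEW": [np.nan, np.nan, np.nan, 1.0, 1.1, 1.2],
    },
    index=dates,
)
full = select_full_history(prices, "2020-08-31", "2020-09-03")
print(list(full.columns), len(full))
```

Requiring a complete history keeps every return vector the same length, which the clustering and PCA steps depend on.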

In the early days of a new crypto asset, trading activity can be sparse and prices volatile because of a limited user base and the small number of exchanges listing the asset. A couple of cryptocurrencies were removed because they had many missing data points.

We calculated daily log returns from the price data. Cryptocurrencies trade around the clock, seven days a week, so no weekend or holiday adjustment is needed. We then calculated summary statistics for each return series: mean, volatility, median, minimum, maximum, and first and third quartiles (see Appendix A). We also drew box plots to inspect the data visually (see Appendix B). Large outliers are common on both the upside and the downside, and cryptocurrency returns are far more volatile than those of traditional asset classes.
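
As an illustration of the return calculation, the daily log return is r_t = ln(P_t / P_{t-1}). The sketch below assumes a pandas DataFrame of prices with one column per ticker. The helper names and the tickers `AAA` and `BBB` are hypothetical, and volatility is taken to be the sample standard deviation of daily returns.

```python
import numpy as np
import pandas as pd

def daily_log_returns(prices: pd.DataFrame) -> pd.DataFrame:
    """r_t = ln(P_t / P_{t-1}); the first row has no prior price and is dropped."""
    return np.log(prices / prices.shift(1)).iloc[1:]

def summary_stats(returns: pd.DataFrame) -> pd.DataFrame:
    """One row per ticker: mean, volatility, min, quartiles, median, max."""
    stats = returns.describe().T  # count, mean, std, min, 25%, 50%, 75%, max
    return stats.rename(columns={"std": "volatility", "50%": "median"})

# Toy prices for two hypothetical tickers over three consecutive calendar days.
prices = pd.DataFrame(
    {"AAA": [100.0, 110.0, 99.0], "BBB": [1.00, 1.00, 1.01]},
    index=pd.date_range("2022-09-13", periods=3, freq="D"),
)
returns = daily_log_returns(prices)
stats = summary_stats(returns)
print(stats[["mean", "volatility", "median"]])
```

Because the series has an observation for every calendar day, consecutive rows are always one day apart and no gap handling is needed.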

There are two major types of cryptocurrencies: stablecoins and the rest. Stablecoins are created either by tokenizing a fiat currency or through an algorithmic structure. Following the collapse of the Terra/Luna algorithmic stablecoin, academics have questioned the viability of any algorithmic stablecoin. For this project, we exclude algorithmic stablecoins and divide the top 100 cryptocurrencies into tokenized stablecoins and the rest. We have additional metadata that separates stablecoins from other cryptocurrencies in the data set, but this information was not supplied to the clustering methods beforehand.

Our approach relies mainly on cluster analysis, using Hierarchical Clustering and K-means, together with dimension reduction by Principal Component Analysis (PCA) for factor attribution. We want to identify the structure, if any, in cryptocurrency returns, which will help us understand how the cryptocurrencies relate to each other.

First, we inspect the daily return series by cluster analysis. Each cryptocurrency is represented by its vector of daily returns, 747 dimensions in our case. Hierarchical Clustering builds a tree that shows how closely the cryptocurrencies relate to each other, based on the distance between cluster boundary points, and helps identify natural groupings. Second, using K-means, we identified the centers of the cryptocurrency clusters based on distances between the daily return vectors. This gives a view of the market structure with a slightly different way of measuring distance among groups. Lastly, PCA treats the data differently: it finds the linear combinations of cryptocurrencies that point in the directions of maximal variation in the returns data, which can be thought of as reorienting the axes to align with those directions. The data matrix for PCA is the transpose of the one used for cluster analysis. PCA is a dimension reduction technique and can help single out the few most important factors that explain return variation.
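
The clustering setup, with one row per cryptocurrency and one column per day, can be sketched with SciPy on synthetic data. Everything here is an assumption made for illustration: the three synthetic groups, their volatilities, and in particular the `ward` linkage with Euclidean distance, since the text above does not specify the exact linkage rule.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
n_days = 250

def make_group(n_assets, factor_vol, noise_vol):
    """Synthetic returns: one shared factor per group plus asset-specific noise."""
    factor = rng.normal(0.0, factor_vol, size=n_days)
    noise = rng.normal(0.0, noise_vol, size=(n_assets, n_days))
    return factor + noise  # shape (n_assets, n_days)

# Rows are assets, columns are days (the layout used for cluster analysis).
X = np.vstack([
    make_group(4, 0.0002, 0.0005),  # stablecoin-like: returns close to zero
    make_group(5, 0.03, 0.005),     # moderately volatile group
    make_group(6, 0.08, 0.01),      # highly volatile group
])
true_groups = np.repeat([0, 1, 2], [4, 5, 6])

# Build the tree, then cut it so that at most three clusters remain.
Z = linkage(X, method="ward", metric="euclidean")
labels = fcluster(Z, t=3, criterion="maxclust")
print(labels)
```

Passing `Z` to `scipy.cluster.hierarchy.dendrogram` draws the tree diagram. For PCA the same matrix would be transposed so that rows are days and columns are assets.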


Hierarchical Clustering and K-means consistently show that crypto assets separate naturally into the three groups listed below. Separation into finer groups beyond these three is weak, especially within group (2). See Appendix C for the Hierarchical Clustering tree diagram.

(1) well-known cryptos,
(2) the rest, and
(3) stablecoins.

[Figure or table not recovered: Hierarchical Cluster Group]

The exact classification of each crypto under each method can be found in Appendix E. Group 1 is composed of the most liquidly traded cryptos. Group 3 is composed of all stablecoins plus LEO and PAXG; PAXG is a tokenization of gold and tends to be more stable than other cryptos. The remaining group, Group 2, consists of more volatile tokens with shorter histories or less popularity. Interestingly, we did not tag the stablecoins beforehand in any way; the cluster analysis distinguished them on its own.

Another important finding is that idiosyncrasy is high among cryptos. The tree structure does not yield a meaningful bifurcation into large branches beyond the three groups. Within the two non-stablecoin groups, the branches quickly become granular, pairing cryptos with similar characteristics. This is especially pronounced within Group 2, the most volatile and diverse group.

When K-means is set to three groups, its classification is strikingly similar to that of Hierarchical Clustering. Group 3, the stablecoins, is identical under both methods, and the other two groups match almost exactly. When we increase the number of groups to four or more, K-means places a single crypto in a group of its own. This mirrors the Hierarchical Clustering tree diagram, where sub-branches fail to produce a clear division beyond three groups.
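
One way to quantify how closely two classifications agree is the adjusted Rand index, which equals 1 when two partitions are identical regardless of how the groups are numbered. The sketch below is an illustration on synthetic data, not the comparison performed in this study (which is tabulated in Appendix E). The synthetic groups, the `ward` linkage, and the K-means settings (`n_init=10`, `random_state=0`) are all assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(1)
n_days = 250

def make_group(n_assets, factor_vol, noise_vol):
    """Synthetic returns: one shared factor per group plus asset-specific noise."""
    factor = rng.normal(0.0, factor_vol, size=n_days)
    noise = rng.normal(0.0, noise_vol, size=(n_assets, n_days))
    return factor + noise

# Rows are assets, columns are days.
X = np.vstack([
    make_group(4, 0.0002, 0.0005),  # stablecoin-like
    make_group(5, 0.03, 0.005),
    make_group(6, 0.08, 0.01),
])

# Three-group classification from each method.
hier_labels = fcluster(linkage(X, method="ward"), t=3, criterion="maxclust")
km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Agreement between the two partitions, invariant to label numbering.
ari = adjusted_rand_score(hier_labels, km_labels)
print(f"adjusted Rand index: {ari:.3f}")
```

Rerunning K-means with `n_clusters=4` or more and inspecting the cluster sizes (for example with `np.bincount(km_labels)`) is a quick way to spot a group that contains a single asset.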

The PCA results are consistent with the classification analysis above. The first principal component (PC1) explains only 12% of the variation, and the second and third components explain about 3% each. Market risk in crypto assets is dispersed across several risk factors of similar importance, and the decline in importance beyond the first few is flat. The strongest risk factor represents a long position in the largest, most established, and more stable crypto assets such as BTC, USDC, and DAI. Factor loadings of PC1 come mostly from stablecoins, or Group 3 in the classification methods. When subsequent principal components explain little of the variation in the data, classification methods typically also have trouble distinguishing sub-groups.
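
The share of variation explained by each principal component can be computed from a singular value decomposition of the centered returns matrix. The sketch below uses synthetic data with rows as days and columns as assets. The function name `pca_explained_variance` is hypothetical, and whether the returns were standardized before PCA in this study is not stated, so the example works on raw (centered) returns as an assumption.

```python
import numpy as np

def pca_explained_variance(returns: np.ndarray):
    """returns has shape (n_days, n_assets); gives (variance ratios, loadings)."""
    centered = returns - returns.mean(axis=0)
    # Rows of vt are the principal directions, i.e. the factor loadings.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s**2 / (len(returns) - 1)
    return variance / variance.sum(), vt

rng = np.random.default_rng(0)
n_days, n_assets = 500, 8
market = rng.normal(0.0, 0.02, size=(n_days, 1))       # one shared factor
idio = rng.normal(0.0, 0.04, size=(n_days, n_assets))  # asset-specific noise
returns = market + idio                                # rows: days, columns: assets

ratio, loadings = pca_explained_variance(returns)
print(np.round(ratio, 3))    # share of variance per component, largest first
print(np.round(loadings[0], 2))  # PC1 loadings, one weight per asset
```

When asset-specific noise dominates the shared factor, as in this synthetic setup, the ratios fall off slowly after the first component, which is the pattern described above.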

Discussion and Conclusion:


The key takeaways from this research are as follows:
1. Crypto assets can be categorized into three major groups, as mentioned before:

(1) Most liquid cryptos with stable performance
(2) Less known and more volatile crypto assets
(3) Stablecoins
2. Market risk in cryptos is characterized by idiosyncratic risk factors that are hard to distinguish from one another. Managing market risk by factor attribution may therefore be challenging.
3. Findings from cluster analysis and PCA are consistent.

One concern with this research is that our data set has survivorship bias. When a crypto fails, its market trading data stops. Our data covers only cryptocurrencies that existed and survived to 9/15/2022; those that failed earlier are not in the data set. From a market risk perspective, the failure of a cryptocurrency results in a total loss of asset value and is therefore the most severe downside risk. For example, FTT, the native token of the crypto exchange FTX, which collapsed into bankruptcy in early November 2022, is in our data set, but its history covers only the period when FTT was trading normally; the moment of failure and the subsequent performance are not captured. Risk can therefore be underestimated because of the survivorship bias inherent in the data.

Another issue is the observation window. Two years of data is very short relative to economic and business cycles, which tend to last about 10 years. In an expansion phase of the business cycle, asset clustering and classification patterns can be very different from those in other phases. In other words, the relationships among cryptocurrencies may not be robust over time. Unfortunately, classification methods and PCA require every series to have the same history length. If we wait for more data to accumulate, we may not be able to manage risks properly in the meantime.

Appendix A: Summary Statistics

Appendix B: Sample Box Plots of Daily Returns

Appendix C: Hierarchical Clustering Tree Diagram

Appendix D: Variation of Returns Explained by Principal Components

Appendix E: Comparison of Clustering and PCA Factor Loadings