Measuring Cryptocurrency Similarity: A Quantitative Framework for Portfolio Managers


Kevin P.
January 18, 2026 · 14 min read
In brief

Most crypto portfolios are less diversified than they look. Narrative categories — DeFi, Layer-1s, exchange tokens — feel distinct until you measure what actually matters: whether two assets crash together and share a long-run price equilibrium. Our research formalizes this into one question: if you transfer funds from A to B, are you getting different risk exposure, or paying transaction costs to hold a statistical duplicate?

The Transfer Problem

There is a decision that crypto portfolio managers make constantly and almost never quantify properly: whether moving capital from one asset to another actually changes their risk exposure in a meaningful way. (Risk exposure = how sensitive your portfolio is to market movements and losses.)

The instinct is to think about this in narrative terms — different sectors, different use cases, different teams. But narrative diversification and statistical diversification are different things, and in crypto they diverge more dramatically and more consequentially than in any other asset class. (Statistical diversification = diversification measured using actual price behavior, not just project differences.)

Our research has formalized this into a precise operational question: if I transfer funds from asset A to asset B, am I getting genuinely different exposure, or am I paying transaction costs to hold a statistical duplicate? That framing cuts through a great deal of portfolio management folklore. It does not ask whether two assets are different in concept. It asks whether their joint price behavior is distinguishable in the ways that actually determine portfolio performance — particularly during drawdowns, which in crypto regularly reach 40-70% even for blue-chip assets. (Drawdown = the percentage decline from a recent peak to a low.)

The uncomfortable finding that emerges from rigorous similarity analysis is that most crypto portfolios carry far more redundancy than they appear to. A portfolio spread across Layer-1s, DeFi protocols, infrastructure tokens, and exchange tokens sounds diversified. Measured statistically — especially in bear markets — it frequently behaves like a handful of correlated bets dressed in different narratives. The reason is structural: Bitcoin dominates the crypto capital structure in a way with no parallel in traditional markets. Most tokens carry significant beta to BTC, creating a baseline co-movement across the entire asset class that masks genuine diversification. On top of that, crypto-specific contagion mechanisms — exchange counterparty risk (demonstrated by FTX's November 2022 collapse), shared DeFi liquidity pools, simultaneous liquidation cascades — produce crash-time correlation dramatically higher than calm-period averages suggest. (Contagion = risk spreading from one asset or platform to others.)

Measuring this correctly requires a framework that operates at multiple levels simultaneously. Correlation alone captures perhaps 20% of the relevant structure.

The Problem With Correlation

Pearson correlation is not the wrong approach. It is incomplete in ways that are specifically consequential for crypto.

The core limitation is that correlation is unconditional: it pools bull markets, bear markets, sideways consolidation, and liquidity crises into a single average. Two assets might show a moderate pooled correlation of 0.58 while being nearly perfectly correlated (0.90+) in bear markets and barely related (0.20) during bull runs. That regime-dependence is invisible in one number. For risk management — where "will these two assets crash together?" matters far more than "do they move together on average?" — unconditional correlation is the wrong tool.
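A small simulation makes the point concrete. Everything below is synthetic — the regime labels and return processes are assumptions for illustration, not market data: two assets that are independent on calm days but share a common shock on bear days produce a moderate pooled correlation that conceals near-perfect crash co-movement.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000
bear_day = rng.random(n) < 0.3          # ~30% of days tagged "bear"

# Synthetic daily returns: a strong shared shock on bear days,
# independent noise on bull days
common = rng.normal(0, 0.04, n)
a = np.where(bear_day, common + rng.normal(0, 0.01, n), rng.normal(0, 0.02, n))
b = np.where(bear_day, common + rng.normal(0, 0.01, n), rng.normal(0, 0.02, n))

pooled = np.corrcoef(a, b)[0, 1]                        # one blended number
bear = np.corrcoef(a[bear_day], b[bear_day])[0, 1]      # crash co-movement
bull = np.corrcoef(a[~bear_day], b[~bear_day])[0, 1]    # calm co-movement
```

The pooled figure lands between the two regime-conditional values, which is exactly why it is the wrong input for a "will these crash together?" decision.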

The second limitation is linearity. Crypto returns are fat-tailed: the kurtosis of daily returns for major cryptocurrencies is typically 8-20, versus 3 for a normal distribution. This means extreme events happen far more often than standard intuition suggests, and the co-movement structure specifically in those extremes — do these assets crash together more than their average correlation implies? — is simply not captured by Pearson. This is the central issue for portfolio construction in an asset class where 40% drawdowns are routine.
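A quick check with `scipy.stats.kurtosis` illustrates the gap. The Student-t proxy for fat-tailed returns is an assumption, chosen only because its raw kurtosis of about 9 falls in the 8-20 range quoted above:

```python
import numpy as np
from scipy.stats import kurtosis

rng = np.random.default_rng(0)
normal_ret = rng.normal(0, 0.02, 100_000)          # thin-tailed baseline
fat_ret = rng.standard_t(5, size=100_000) * 0.02   # heavy-tailed proxy

# scipy returns *excess* kurtosis (normal = 0); add 3 to get the raw
# kurtosis convention used in the text (normal = 3)
k_normal = kurtosis(normal_ret) + 3
k_fat = kurtosis(fat_ret) + 3
```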

Third, at scale, multiple testing is catastrophic. With 200 coins there are 19,900 unique pairs. At a 5% significance threshold, roughly 1,000 false positives are expected by pure chance. Any analysis that reports "significantly similar" pairs without applying Benjamini-Hochberg False Discovery Rate (FDR) correction — which controls the expected proportion of false discoveries among all declared significant results — is largely reporting noise. (FDR correction = a method to reduce false positives when testing many relationships.) We treat BH-FDR at q = 0.05 as non-negotiable across every statistical test in the pipeline, a standard that most published crypto correlation studies do not meet.
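The correction is one call in `statsmodels`. In the sketch below the split between null pairs and genuinely related pairs is simulated, purely to show how many "discoveries" a raw 5% cutoff produces versus the BH-FDR procedure:

```python
import numpy as np
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(1)
n_pairs = 19_900                 # C(200, 2) unique pairs

# Null pairs: p-values uniform under "no relationship"
p_null = rng.random(n_pairs - 100)
# A hundred genuinely related pairs with very small p-values
p_true = rng.random(100) * 1e-4
pvals = np.concatenate([p_null, p_true])

naive_hits = int((pvals < 0.05).sum())     # raw 5% threshold: mostly noise
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
fdr_hits = int(reject.sum())               # close to the 100 real effects
```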

Before any of this can begin, there is a prerequisite that practitioners routinely skip: liquidity filtering. Illiquid assets — those with thin order books, high Amihud illiquidity ratios, and median daily volume below $500K — show spurious correlation not because they genuinely co-move but because their prices barely move at all. Including them contaminates the entire similarity matrix. Our research estimates that applying a proper Amihud filter alongside a volume floor eliminates 50-70% of a typical 200-coin universe before a single correlation is computed — a figure that should give pause to any study that doesn't disclose its liquidity screening methodology. (Amihud ratio = a measure of how much price moves per unit of trading volume.)
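A minimal version of the screen, on a toy panel of four assets: the $500K volume floor comes from the text, while the Amihud ceiling (0.1 units of price impact per $1M traded) and the asset data are illustrative assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
days = 250

# Toy daily panel per asset: (|return| series, dollar-volume series)
panel = {
    "LIQ1": (np.abs(rng.normal(0, 0.03, days)), rng.uniform(5e6, 2e7, days)),
    "LIQ2": (np.abs(rng.normal(0, 0.04, days)), rng.uniform(1e6, 8e6, days)),
    "THIN": (np.abs(rng.normal(0, 0.05, days)), rng.uniform(1e4, 8e4, days)),
    "DEAD": (np.abs(rng.normal(0, 0.06, days)), rng.uniform(1e3, 2e4, days)),
}

rows = []
for sym, (abs_ret, dollar_vol) in panel.items():
    # Amihud ratio: average |return| per dollar traded, scaled to $1M
    amihud = np.mean(abs_ret / dollar_vol) * 1e6
    rows.append({"symbol": sym,
                 "median_volume": np.median(dollar_vol),
                 "amihud_x1e6": amihud})
universe = pd.DataFrame(rows).set_index("symbol")

# Hypothetical screen: volume floor plus an assumed Amihud ceiling
passed = universe[(universe["median_volume"] >= 500_000)
                  & (universe["amihud_x1e6"] < 0.1)]
```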

Tail Dependence: The Question That Matters

Once a clean, liquid universe is established, the most important analytical layer is not correlation at all. It is tail dependence — the statistical structure of co-movement specifically during extreme market moves.

Copula models are the tool of choice here, and the intuition is straightforward even if the mathematics is not. A copula separates the dependence structure between two assets from their individual return distributions. Sklar's Theorem guarantees this decomposition is always possible: transform each asset's returns to uniform marginals using the empirical CDF, and what remains is pure dependence — untainted by individual volatility levels. Fitted to this transformed data, different copula families characterize different types of dependence.

(Copula = a statistical method that isolates how variables depend on each other.)

The Clayton copula is parametrized specifically to capture lower tail dependence, producing the coefficient λL — the conditional probability that both assets experience simultaneous extreme losses. A pair with λL of 0.70 has a 70% conditional probability of joint extreme loss. They are effectively the same asset in a bear market, regardless of what their average correlation says. The symmetrized Joe-Clayton (SJC) copula extends this by estimating upper and lower tail dependence independently, making explicit what empirical crypto data consistently shows: crash co-movement is substantially stronger than rally co-movement. Two assets with λL of 0.65 and λU of 0.28 are dramatically more similar when markets fall than when they rise — a structure that average correlation completely obscures.
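Fitting parametric copulas is more involved, but λL can be sanity-checked nonparametrically. The sketch below — synthetic data; the simple counting estimator is a standard approximation, not the article's fitted-copula pipeline — transforms returns to pseudo-observations with the empirical CDF and counts joint lower-tail exceedances:

```python
import numpy as np
from scipy.stats import rankdata

def lower_tail_dependence(x, y, q=0.05):
    """Empirical lambda_L: P(both series in their lowest q) / q."""
    u = rankdata(x) / (len(x) + 1)   # pseudo-observations (empirical CDF)
    v = rankdata(y) / (len(y) + 1)
    both = np.mean((u <= q) & (v <= q))
    return both / q

rng = np.random.default_rng(3)
n = 20_000
# Crash-coupled pair: a shared jump hits both assets on "crash" days only
crash = rng.random(n) < 0.05
shock = rng.normal(-0.10, 0.02, n)
x = np.where(crash, shock + rng.normal(0, 0.005, n), rng.normal(0, 0.02, n))
y = np.where(crash, shock + rng.normal(0, 0.005, n), rng.normal(0, 0.02, n))

lam_joint = lower_tail_dependence(x, y)                     # near 1
lam_indep = lower_tail_dependence(x, rng.normal(0, 0.02, n))  # near q
```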

Our research framework treats λL as the single most important pairwise signal in the entire analysis, weighting it highest in the composite score and using it as the primary criterion for "highly redundant" classification. The thresholds that emerge from the research: λL above 0.5 is a strong redundancy signal; below 0.3, the pair is genuinely distinct in bear conditions regardless of their correlation structure. A portfolio manager relying on correlation who holds two assets with λL of 0.68 believes they are diversified. In the conditions that trigger real losses, they are not.

Regimes, Structural Breaks, and the FTX Problem

Tail dependence measured over a two-year window still pools different market environments. The second critical layer explicitly models those environments and computes similarity within each.

Markov Regime Switching models — applied to Bitcoin log returns as the dominant market factor — identify latent states that evolve probabilistically over time: typically a low-volatility bull state and a high-volatility bear/crash state. The Hamilton Filter produces daily state probabilities, allowing each trading day to be classified into its dominant regime. Correlation matrices computed within each regime reveal structure invisible in pooled analysis.

The key diagnostic our research highlights is the regime stability gap: the difference between a pair's bear-regime correlation and its bull-regime correlation. A large gap is the signature of what finance literature calls asymmetric dependence — assets that appear to offer diversification in normal conditions but converge precisely during crises. This is arguably the most dangerous configuration in portfolio construction, and it is extremely common in crypto. The FTX collapse of November 2022 produced regime stability gaps across the market that would have shocked anyone relying on pre-crisis correlation matrices: assets from entirely different sectors that had looked uncorrelated suddenly moved in near-perfect lockstep as forced liquidations and exchange contagion propagated through the ecosystem.

This is why structural break testing is not optional. The Bai-Perron multiple breakpoint test identifies dates where statistical properties of return series change discontinuously — in crypto, November 2022 (FTX) and January 2024 (Bitcoin ETF approval) are the canonical breaks over the recent two-year window. Any similarity estimate that pools across these breaks is averaging incompatible regimes. Our research framework computes all similarity metrics within structurally stable subperiods, then requires that a "highly redundant" classification hold across all subperiods before it is confirmed. A pair that looks redundant only in the post-ETF bull market is not classified as redundant. The bar requires consistency.
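Bai-Perron itself is not in the standard Python stack, but the core idea can be sketched with a simplified single-break variance scan — a stand-in for illustration, not the full test, which handles multiple breaks and supplies formal critical values:

```python
import numpy as np

def variance_break(returns, trim=30):
    """Find the single split point that best divides the series into two
    segments with different variances, by Gaussian log-likelihood.
    (Simplified: Bai-Perron handles multiple breaks with critical values.)"""
    n = len(returns)
    best_t, best_ll = None, -np.inf
    for t in range(trim, n - trim):
        left, right = returns[:t], returns[t:]
        ll = (-0.5 * t * np.log(np.var(left))
              - 0.5 * (n - t) * np.log(np.var(right)))
        if ll > best_ll:
            best_ll, best_t = ll, t
    return best_t

rng = np.random.default_rng(5)
series = np.concatenate([rng.normal(0, 0.01, 300),   # pre-break: calm
                         rng.normal(0, 0.05, 200)])  # post-break: volatile
bp = variance_break(series)   # should land near index 300
```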

Cointegration: When Price Levels Tell a Different Story

Returns analysis — correlation, tail dependence, regime-conditional correlation — operates at the daily frequency. A separate and complementary question is whether two assets share a long-run price level relationship that the market enforces over time.

(Cointegration = two price series move together over the long term even if they drift short term.)

Cointegration, in Engle-Granger terms, asks whether a linear combination of two non-stationary price series is stationary — whether the spread between two drifting price paths is itself mean-reverting. In crypto, cointegration tends to arise from genuine structural relationships: two Layer-1 platforms competing for the same developer and capital flows, a DEX governance token and the chain it operates on, staking derivatives and their underlying assets. The economic intuition is that whenever the spread widens enough, arbitrage-like capital flows are attracted that push it back — creating a statistical attractor around the equilibrium.

Threshold VECM (Vector Error Correction Model) extends standard cointegration by identifying a neutral band within which no reversion occurs — because transaction costs and execution friction make arbitrage unprofitable at small spreads — and separate reversion speeds above and below the band. For portfolio managers, these parameters translate directly: a narrow neutral band with fast reversion speed means the market aggressively enforces the equilibrium. A wide band with slow reversion means the spread can persist for months before correcting, significantly weakening the redundancy case even if the long-run relationship technically exists.
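A full threshold VECM estimator is beyond a blog sketch, but the band-dependent reversion speed translates into a much-simplified regression of spread changes on the lagged spread, split at an assumed neutral band. Everything below is a toy construction:

```python
import numpy as np

def band_reversion(spread, band):
    """Slope of d(spread) on lagged spread, inside vs outside a symmetric
    neutral band (a crude stand-in for threshold VECM reversion speeds)."""
    lag, diff = spread[:-1], np.diff(spread)
    out = np.abs(lag) > band
    speed_out = np.polyfit(lag[out], diff[out], 1)[0]
    speed_in = np.polyfit(lag[~out], diff[~out], 1)[0]
    return speed_in, speed_out

rng = np.random.default_rng(8)
n = 5000
spread = np.zeros(n)
for t in range(1, n):
    prev = spread[t - 1]
    # No reversion inside the band; a pull of 0.2 per step outside it
    pull = -0.2 * prev if abs(prev) > 0.05 else 0.0
    spread[t] = prev + pull + rng.normal(0, 0.02)

speed_in, speed_out = band_reversion(spread, band=0.05)
```

A strongly negative `speed_out` with a near-zero `speed_in` is the "market enforces the equilibrium, but only once the spread clears transaction costs" pattern described above.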

A secondary output of cointegration analysis — often more actionable than the cointegration finding itself — is the identification of Granger leaders within each cluster. Granger causality tests whether lagged returns of asset A improve the prediction of asset B's returns beyond what B's own history explains. The Granger leader is the asset whose price movements propagate to others, not the reverse. Empirically, the leader is almost always the most liquid asset with the highest market cap in a cluster — price discovery concentrates where liquidity is deepest. The portfolio implication is direct: within a cluster of redundant assets, always hold the leader. Moving capital from leader to follower means accepting a lagged version of the same signal with higher slippage. Our research identifies the Granger leader for every cluster as a core deliverable, specifically to enable this transfer decision.

The Network View: Where Each Asset Sits in the Redundancy Landscape

Pairwise analysis — even rigorous pairwise analysis — can miss the broader structure of how an entire asset universe is organized. Network methods address this by treating all assets simultaneously.

(Precision matrix = inverse covariance matrix showing direct relationships.)

GLASSO (Graphical LASSO) estimates a regularized precision matrix — the inverse covariance matrix with an L1 penalty that forces many entries to exactly zero. The non-zero entries correspond to conditional dependencies: pairs that remain related even after controlling for all other assets in the universe. This removes the spurious indirect connections that plague standard correlation networks. If ETH and a Layer-2 token both correlate with BTC, they will appear similar to each other in a standard correlation network even if their direct relationship is negligible. The GLASSO network exposes that the apparent similarity is mediated entirely by the common BTC factor — a critical distinction for portfolio construction.
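The BTC-mediation effect is easy to reproduce with scikit-learn's `GraphicalLasso`. Below, synthetic ETH and L2 returns both load on a common BTC factor with no direct link between them; their marginal correlation is high, while the penalized precision entry is driven toward zero (asset names, loadings, and the penalty are illustrative assumptions):

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(10)
n = 2000

# Common BTC factor; ETH and an L2 token load on it, with no direct link
btc = rng.normal(0, 0.03, n)
eth = 0.8 * btc + rng.normal(0, 0.015, n)
l2 = 0.9 * btc + rng.normal(0, 0.02, n)
X = np.column_stack([btc, eth, l2])

# Marginal view: ETH and L2 look strongly related
marginal = np.corrcoef(X.T)[1, 2]

# L1-penalized precision matrix on standardized returns: only direct
# (conditional) dependencies survive the penalty
Xs = X / X.std(axis=0)
gl = GraphicalLasso(alpha=0.05).fit(Xs)
partial_eth_l2 = gl.precision_[1, 2]   # shrunk toward zero
partial_btc_eth = gl.precision_[0, 1]  # genuine direct link survives
```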

Constructing a Minimum Spanning Tree (MST) on the lower tail dependence matrix — using λL as the edge weight — reveals the backbone of crash-time redundancy across the universe. Assets at the core of the MST, connected to many others through strong crash-time relationships, are the most substitutable. Assets at the periphery are the most unique, offering genuine diversification benefit. Louvain community detection applied to both networks — the GLASSO partial correlation network and the λL-MST — identifies natural asset groupings. Where both network constructions agree on cluster membership, the classification is high-confidence. Where they disagree, the asset sits in a genuinely ambiguous region and warrants manual review rather than automatic classification.
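A minimal MST construction on a toy λL matrix (the values are invented): since MST algorithms minimize total edge weight, high tail dependence is first converted into low distance, and node degree in the resulting tree separates core from periphery.

```python
import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree

# Hypothetical lower tail dependence matrix for five assets
assets = ["BTC", "ETH", "L2A", "L2B", "NICHE"]
lam = np.array([
    [0.00, 0.72, 0.65, 0.60, 0.15],
    [0.72, 0.00, 0.70, 0.62, 0.12],
    [0.65, 0.70, 0.00, 0.68, 0.10],
    [0.60, 0.62, 0.68, 0.00, 0.08],
    [0.15, 0.12, 0.10, 0.08, 0.00],
])

# High tail dependence -> low distance, so the MST keeps crash-time links
dist = 1.0 - lam
np.fill_diagonal(dist, 0.0)
mst = minimum_spanning_tree(dist).toarray()

# Degree in the tree: core assets connect to several others, peripheral
# assets hang off a single edge
edges = (mst + mst.T) > 0
degree = edges.sum(axis=1)
niche_degree = degree[assets.index("NICHE")]
```

The weakly coupled `NICHE` asset ends up as a leaf — the network's way of saying it offers genuine diversification.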

The Composite Score and What to Do With It

Each analytical layer produces a distinct signal. Our research framework combines them into a single pairwise redundancy score with explicit weighting that reflects a deliberate prioritization: crash-time behavior dominates. Lower tail dependence (λL) and bear-regime correlation together account for roughly 45% of the composite score. Rolling 90-day correlation contributes around 20% as a stable baseline. Cointegration results — presence and reversion speed — contribute 15%. The GLASSO partial correlation and BTC-beta-adjusted correlation (correlation on residuals after removing the common BTC factor, separating genuine co-movement from shared market exposure) account for the remainder.

Every component is min-max normalized to [0,1] before combination. The aggregation logic below illustrates how this works in practice. Each component matrix — already computed from the upstream layers — is normalized, weighted, and combined. BH-FDR correction is applied before any pair is declared significant, and the output is a ranked DataFrame with a redundancy tier attached to every pair in the universe.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.multitest import multipletests
from scipy.stats import spearmanr

# ─────────────────────────────────────────────────────────────────
# SB Research — Composite Similarity Score
# Inputs: precomputed pairwise matrices from Phases 2–6
# …
```

The output is a score per pair, classified into four tiers: highly redundant (0.80–1.00), moderately redundant (0.60–0.79), partially similar (0.40–0.59), and genuinely distinct (below 0.40). But our research enforces a hard validation rule before confirming "highly redundant": all four major signal types must agree — high rolling correlation, high λL, high bear-regime correlation, and confirmed cointegration. If only two or three agree, the classification is downgraded to moderately redundant. This prevents any single noisy signal from driving a consequential portfolio decision.
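A compact sketch of the scoring and tiering logic: the exact weights (chosen to mirror the stated 45/20/15/20 priorities), the agreement thresholds, and the example inputs are all assumptions rather than the production configuration, and the BH-FDR step discussed earlier is omitted here.

```python
import pandas as pd

# Illustrative weights mirroring the stated priorities (assumed split)
WEIGHTS = {"lambda_L": 0.25, "bear_corr": 0.20, "roll_corr": 0.20,
           "coint": 0.15, "partial_corr": 0.10, "beta_adj_corr": 0.10}

TIERS = [(0.80, "highly redundant"), (0.60, "moderately redundant"),
         (0.40, "partially similar"), (0.00, "genuinely distinct")]

def classify(row):
    score = sum(WEIGHTS[k] * row[k] for k in WEIGHTS)
    tier = next(name for cut, name in TIERS if score >= cut)
    # Hard validation rule: all four major signals must agree before
    # "highly redundant" is confirmed (thresholds assumed)
    agree = (row["roll_corr"] > 0.7 and row["lambda_L"] > 0.5
             and row["bear_corr"] > 0.7 and row["coint"] > 0.5)
    if tier == "highly redundant" and not agree:
        tier = "moderately redundant"
    return pd.Series({"score": score, "tier": tier})

# Three toy pairs, components already min-max normalized to [0, 1]
pairs = pd.DataFrame({
    "lambda_L":      [0.90, 0.95, 0.20],
    "bear_corr":     [0.92, 0.90, 0.30],
    "roll_corr":     [0.88, 0.60, 0.25],   # pair 1 fails the agreement rule
    "coint":         [1.00, 1.00, 0.00],
    "partial_corr":  [0.80, 0.85, 0.10],
    "beta_adj_corr": [0.75, 0.80, 0.05],
})
result = pairs.apply(classify, axis=1)
```

Pair 1 scores above 0.80 but is downgraded because its rolling correlation disagrees — the downgrade rule doing its job.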

Validation uses a 70/30 in-sample/out-of-sample split, testing whether high in-sample similarity scores actually predict high realized co-movement in the holdout period — measuring this with Spearman rank correlation between the score and out-of-sample realized correlation. Confidence intervals use stationary block bootstrap (circular variant, block length ~20 days) to preserve temporal dependence structure — standard bootstrap is invalid for time series. For the copula layer specifically, surrogate data testing phase-shuffles each return series 500 times to generate a null distribution for λL, confirming that observed tail dependence is genuine rather than an estimation artifact.
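The circular block bootstrap is short enough to show in full. The block length of 20 days follows the text; the statistic and the synthetic return series are placeholders:

```python
import numpy as np

def circular_block_bootstrap(x, block_len=20, n_boot=500, stat=np.mean, seed=0):
    """Resample contiguous blocks (wrapping past the end) so short-range
    temporal dependence survives, then recompute the statistic."""
    rng = np.random.default_rng(seed)
    n = len(x)
    wrapped = np.concatenate([x, x[:block_len]])   # circular extension
    n_blocks = int(np.ceil(n / block_len))
    stats = np.empty(n_boot)
    for b in range(n_boot):
        starts = rng.integers(0, n, n_blocks)
        sample = np.concatenate([wrapped[s:s + block_len] for s in starts])[:n]
        stats[b] = stat(sample)
    return np.percentile(stats, [2.5, 97.5])       # 95% interval

rng = np.random.default_rng(11)
returns = rng.normal(0.001, 0.02, 750)             # ~2 years of daily data
lo, hi = circular_block_bootstrap(returns)
```

An i.i.d. bootstrap would shuffle away autocorrelation and volatility clustering; resampling whole blocks is what keeps the interval honest for time series.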

What This Means for Practice

The framework produces two things that are immediately actionable. First, a ranked list of pairs with a classification and transfer recommendation — avoid transfer (near-zero diversification benefit), marginal benefit only, meaningful difference, or genuine diversification. Second, for each confirmed cluster of redundant assets, a designated representative: the most liquid, longest-tenured, lowest-Amihud, highest-market-cap asset in the cluster — verified to be the Granger leader — that is the asset to hold.

Three principles from this body of work are worth internalizing regardless of whether a portfolio manager implements the full framework.

Regime stability matters more than average correlation. A pair with average correlation 0.65 that is stable across bull and bear regimes is more redundant — and a worse transfer candidate — than a pair with average correlation 0.78 that collapses in bull markets. The regime-conditional structure is the signal. The unconditional average is the noise.

The liquidity filter will surprise you. Applying proper Amihud screening to a 200+ coin universe eliminates the majority of assets before any statistics are computed. Most of what looks like interesting correlation structure in unfiltered crypto data is illiquidity masquerading as co-movement. Any similarity analysis that does not show its liquidity screening methodology should be treated with skepticism.

Hold the Granger leader. Within any identified cluster, the leader is always the right asset to hold. The market tells you which asset processes information first. The portfolio construction decision should follow that signal, not fight it.

The goal of this framework is not academic. It is operational — a decision tool rigorous enough to hold up in both calm conditions and the kind of crisis where the diversification question stops being theoretical and starts costing real money.

Framework developed in collaboration with SB Finance Research. Methodology draws on copula theory, Markov regime switching, threshold cointegration, and network topology applied to the cryptocurrency universe.
