Appendix: Reference Material

17.1 Notation summary

Symbol	Dimensions	Meaning	First used
$N$	scalar	number of assets	Ch. 1
$K$	scalar	number of factors	Ch. 1
$T$	scalar	number of time periods	Ch. 1
$r$	$N \times 1$	asset returns over one period	Ch. 2
$X$	$N \times K$	factor exposure (loading) matrix	Ch. 2
$f$	$K \times 1$	factor returns over one period	Ch. 2
$\epsilon$	$N \times 1$	specific (idiosyncratic) returns	Ch. 2
$F$	$K \times K$	factor covariance matrix	Ch. 2
$\Delta$	$N \times N$ diag.	specific variance matrix, $\Delta_{ii} = \sigma^2_{\epsilon_i}$	Ch. 2
$\Sigma$	$N \times N$	asset covariance, $\Sigma = XFX^\top + \Delta$	Ch. 2
$w, w_p, w_b, w_a$	$N \times 1$	weights: generic, portfolio, benchmark, active ( $w_p - w_b$ )	Ch. 2, Ch. 9
$x, x_p, x_b, x_a$	$K \times 1$	factor exposures of a portfolio, $x = X^\top w$	Ch. 2, Ch. 9
$\sigma_p, \sigma_a$	scalar	portfolio volatility, tracking error	Ch. 2
$d_{ik}$	scalar	raw descriptor value, stock $i$ , factor $k$	Ch. 3
$\mu_k, \sigma_k$	scalar	standardization location (cap-weighted) and scale (equal-weighted)	Ch. 3
$S(f)$	scalar	least-squares objective, (weighted) sum of squared residuals minimized to get $\hat f$	Ch. 6
$W$	$N \times N$ diag.	regression weights (convention: $\propto \sqrt{\text{cap}}$ )	Ch. 6
$C$	$1 \times K$ (or more rows)	constraint matrix, $Cf = 0$	Ch. 6
$R$	$K \times (K - c)$	restriction matrix, feasible $f = Rg$	Ch. 6
$P$	$K \times N$	pure factor portfolio matrix, $\hat f = Pr$	Ch. 6, Ch. 7
$H$	$N \times N$	OLS projection (“hat”) matrix, $H = X(X^\top X)^{-1}X^\top$	§17.2
$\Omega$	$N \times N$	general specific-return covariance in GLS; reduces to $\Delta$ when diagonal	§17.2
$\beta, \beta_{p,h}$	scalar	time-series beta to a factor; portfolio beta to a hedge instrument	Ch. 1, Ch. 12
$\gamma$	scalar / $K \times 1$	Fama–MacBeth risk premium; Lagrange multiplier in the characteristic-portfolio derivation	Ch. 7
$h$	varies	hedge notionals, characteristic portfolio weights	Ch. 7, Ch. 12
$\lambda$	scalar	risk aversion (Ch. 11) or EWMA decay (Ch. 8); also eigenvalues $\lambda_j$ (§17.2) and the constraint multiplier in D1 (§17.3). Context disambiguates	Ch. 8, Ch. 11
$b$	scalar	bias statistic, $\mathrm{std}(r_t/\hat\sigma_{t-1})$	Ch. 8, Ch. 14
$\mathrm{MCR}_i, \mathrm{CTR}_i$	scalar	marginal contribution $(\Sigma w)_i/\sigma$ , contribution $w_i \cdot \mathrm{MCR}_i$	Ch. 9

Conventions: returns are arithmetic and in decimal unless a table is marked %. Risk numbers are annualized unless noted. “Exposure” always means a column-standardized or dummy loading per Chapter 3.

17.2 Refresher: the least-squares family in one place

OLS: $\min_f (r - Xf)^\top(r - Xf) \Rightarrow \hat f = (X^\top X)^{-1} X^\top r$ . Fitted values $X\hat f = Hr$ with the projection (“hat”) matrix $H = X(X^\top X)^{-1}X^\top$ : symmetric, idempotent ( $H^2 = H$ ), projecting onto the column space of $X$ . Residuals $(I - H)r$ are orthogonal to that space: $X^\top \hat\epsilon = 0$ .

WLS: With positive diagonal $W$ : $\min_f (r - Xf)^\top W (r - Xf) \Rightarrow \hat f = (X^\top W X)^{-1} X^\top W r$ . Equivalent to OLS on $\tilde r = W^{1/2} r$ , $\tilde X = W^{1/2} X$ . Best linear unbiased when $W \propto \mathrm{Cov}(\epsilon)^{-1}$ (Aitken/GLS). The $\sqrt{\text{cap}}$ convention approximates this for equities (Ch. 6).

GLS: General $\mathrm{Cov}(\epsilon) = \Omega$ : $\hat f = (X^\top \Omega^{-1} X)^{-1} X^\top \Omega^{-1} r$ . In the factor-model context $\Omega = \Delta$ (diagonal), so GLS = WLS with $W = \Delta^{-1}$ .
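The OLS and WLS statements above can be checked numerically. The sketch below is illustrative only: the 6-asset, 2-factor data and the weights `w` are made-up values, not the mini example.

```python
import numpy as np

# Hypothetical toy data: 6 assets, 2 factors (illustrative values only).
X = np.array([[1.0, 0.5],
              [1.0, -0.3],
              [1.0, 1.2],
              [1.0, -1.0],
              [1.0, 0.1],
              [1.0, -0.5]])
r = np.array([0.021, -0.004, 0.035, -0.018, 0.006, -0.009])
w = np.array([4.0, 1.0, 2.5, 0.5, 3.0, 1.5])   # diagonal of W, all positive

# OLS and the hat matrix H = X (X'X)^{-1} X'
f_ols = np.linalg.solve(X.T @ X, X.T @ r)
H = X @ np.linalg.solve(X.T @ X, X.T)
resid = r - X @ f_ols                           # equals (I - H) r

# WLS via the normal equations; X.T * w is X'W for diagonal W
XtW = X.T * w
f_wls = np.linalg.solve(XtW @ X, XtW @ r)

# Same estimate from OLS on the W^{1/2}-transformed data
sw = np.sqrt(w)
f_wls_transformed, *_ = np.linalg.lstsq(X * sw[:, None], r * sw, rcond=None)
```

With $W = \Delta^{-1}$ the same `f_wls` line is the GLS estimate for diagonal $\Omega$.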

Covariance algebra used throughout: For conformable constant matrices $A, B$ and random vectors $u, v$ : $\mathrm{Cov}(Au) = A\,\mathrm{Cov}(u)\,A^\top$ ; $\mathrm{Cov}(Au + Bv) = A\,\mathrm{Cov}(u)A^\top + B\,\mathrm{Cov}(v)B^\top$ when $\mathrm{Cov}(u,v) = 0$ . These two lines, applied to $r = Xf + \epsilon$ , are the derivation of $\Sigma = XFX^\top + \Delta$ .

EWMA: Weights $\lambda^s$ on lag $s$ . Half-life $h \leftrightarrow \lambda = 2^{-1/h}$ . Recursive update $\hat F_t = \lambda \hat F_{t-1} + (1-\lambda)\tilde f_t \tilde f_t^\top$ (normalized form). Effective sample size $\approx (1+\lambda)/(1-\lambda) \approx 2.89h$ for large $h$ .
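A minimal sketch of the half-life mapping and the normalized recursion. The function names are hypothetical, and seeding the recursion with the first outer product is an assumption, since the text does not specify an initialization.

```python
import numpy as np

def ewma_lambda(half_life):
    """Per-period decay such that the weight lambda**s halves every `half_life` lags."""
    return 2.0 ** (-1.0 / half_life)

def ewma_cov(f, half_life):
    """Normalized recursive EWMA covariance from rows f[t] (T x K array).

    Assumption: rows are already centered as in the text's f-tilde, and the
    recursion is seeded with the first outer product.
    """
    lam = ewma_lambda(half_life)
    F = np.outer(f[0], f[0])
    for f_t in f[1:]:
        F = lam * F + (1.0 - lam) * np.outer(f_t, f_t)
    return F

h = 60.0
lam = ewma_lambda(h)
n_eff = (1.0 + lam) / (1.0 - lam)   # effective sample size
ratio = n_eff / h                   # approaches 2 / ln 2 for large h
```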

Eigendecomposition/PCA: Symmetric $\Sigma = Q\Lambda Q^\top$ , $Q$ orthonormal, $\Lambda$ diagonal with $\lambda_1 \ge \dots \ge \lambda_N \ge 0$ for PSD $\Sigma$ . PCA: factor $j$'s exposures = $q_j$ , factor variance = $\lambda_j$ . The rank- $K$ truncation is the best rank- $K$ approximation in Frobenius norm (Eckart–Young). Marchenko–Pastur noise edge for aspect ratio $N/T$ : $\lambda_{\pm} = \sigma^2 (1 \pm \sqrt{N/T})^2$ .
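The truncation and the noise edges can be sketched as follows. The matrix `B` is a hypothetical loading matrix used only to build a small PSD example, and `mp_edges` is an illustrative helper name.

```python
import numpy as np

# Hypothetical 4 x 4 PSD matrix: rank-2 part plus a flat 0.1 on the diagonal.
B = np.array([[1.0, 0.2],
              [0.8, -0.1],
              [0.9, 0.5],
              [0.7, -0.6]])
Sigma = B @ B.T + 0.1 * np.eye(4)

# eigh returns ascending eigenvalues for symmetric input; flip to descending.
lam, Q = np.linalg.eigh(Sigma)
lam, Q = lam[::-1], Q[:, ::-1]

K = 2
Sigma_K = (Q[:, :K] * lam[:K]) @ Q[:, :K].T      # rank-K truncation

# Frobenius error of the truncation: root of the sum of discarded squared eigenvalues.
err = np.linalg.norm(Sigma - Sigma_K, "fro")

def mp_edges(N, T, sigma2=1.0):
    """Marchenko-Pastur bulk edges sigma^2 (1 -/+ sqrt(N/T))^2."""
    q = N / T
    return sigma2 * (1 - np.sqrt(q)) ** 2, sigma2 * (1 + np.sqrt(q)) ** 2

lo, hi = mp_edges(N=100, T=400)                  # q = 0.25
```

Sample eigenvalues above `hi` are the candidates for signal; those inside the band are hard to distinguish from noise.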

17.3 Derivation collection

D1 - Constrained WLS (Ch. 6): Minimize $S(f) = (r - Xf)^\top W (r - Xf)$ s.t. $Cf = 0$ ( $c$ independent rows). Restriction form: pick $R$ ( $K \times (K-c)$ ) whose columns span the null space of $C$ (so $CR = 0$ and any feasible $f = Rg$ ). Substitute: $S(Rg)$ is unconstrained in $g$ , and by the WLS formula with design $XR$ : $\hat g = (R^\top X^\top W X R)^{-1} R^\top X^\top W r$ , $\hat f = R\hat g$ . Lagrangian form: $\mathcal{L} = S(f) + 2\lambda^\top C f$ . Stationarity gives the bordered system $\begin{pmatrix} X^\top W X & C^\top \\ C & 0\end{pmatrix}\begin{pmatrix}\hat f\\ \lambda\end{pmatrix} = \begin{pmatrix}X^\top W r\\ 0\end{pmatrix}$ . Same solution, and the multiplier $\lambda$ prices the constraint.
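A sketch confirming that the restriction and Lagrangian forms agree. It borrows caps, industries and $r_1$ from §17.5 but uses a market column plus industry dummies only (styles omitted), so the estimates will not match the reported $f_1$. Taking $R$ from the SVD of $C$ is one convenient choice of null-space basis, not necessarily the one used in Ch. 6.

```python
import numpy as np

cap = np.array([150, 80, 40, 10, 120, 60, 20, 90, 30, 15], dtype=float)
ind = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2])            # 0=Tech, 1=Fin, 2=Cons
r = np.array([4.2, 2.8, 0.5, 6.0, 0.8, -0.6, -1.8, 1.2, 2.0, -0.5]) / 100

# Design: market column plus three industry dummies (rank-deficient on its own).
X = np.column_stack([np.ones(10), np.eye(3)[ind]])
w = np.sqrt(cap) / np.sqrt(cap).sum()                      # sqrt-cap weights, normalized
ind_w = np.array([cap[ind == j].sum() for j in range(3)]) / cap.sum()
C = np.concatenate([[0.0], ind_w])[None, :]                # 1 x K constraint row

A = (X.T * w) @ X                                          # X' W X
b = (X.T * w) @ r                                          # X' W r

# Restriction form: columns of R span null(C), read off the SVD of C.
c = C.shape[0]
R = np.linalg.svd(C)[2][c:].T                              # K x (K - c)
g_hat = np.linalg.solve(R.T @ A @ R, R.T @ b)
f_restr = R @ g_hat

# Lagrangian form: bordered system [[A, C'], [C, 0]] [f; lam] = [b; 0].
K = X.shape[1]
M = np.block([[A, C.T], [C, np.zeros((c, c))]])
sol = np.linalg.solve(M, np.concatenate([b, np.zeros(c)]))
f_lagr, lam = sol[:K], sol[K:]
```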

D2 - Pure factor portfolios, $PX = I$ (Ch. 7): Unconstrained: $P = (X^\top W X)^{-1}X^\top W \Rightarrow PX = (X^\top W X)^{-1}(X^\top W X) = I_K$ . Row $k$ of $P$ is a portfolio with exposure vector = row $k$ of $PX$ = $e_k^\top$ : unit own-factor, zero others. Constrained: from D1, $P = R(R^\top A R)^{-1} R^\top X^\top W$ with $A = X^\top W X$ , so $PX = R(R^\top A R)^{-1} R^\top A$ . Then $PXRg = Rg$ for all $g$ : $PX$ is the identity on the feasible subspace. Style rows are exactly pure; market/industry rows carry the constraint’s structure (Ch. 7 table).
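A numerical check of both cases, using the $K \times N$ matrix implied by D1's $\hat f = R\hat g$. Caps and industries come from §17.5; the `style` column is a hypothetical exposure, not the Ch. 3 values.

```python
import numpy as np

cap = np.array([150, 80, 40, 10, 120, 60, 20, 90, 30, 15], dtype=float)
ind = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2])
# Hypothetical style exposure (illustrative numbers only).
style = np.array([-1.2, -0.8, 0.1, 0.6, 0.4, 0.9, 1.5, -0.3, 0.2, 0.7])
w = np.sqrt(cap) / np.sqrt(cap).sum()

# Unconstrained case: industries + style has full column rank.
Xu = np.column_stack([np.eye(3)[ind], style])
Pu = np.linalg.solve((Xu.T * w) @ Xu, Xu.T * w)            # (X'WX)^{-1} X'W, K x N

# Constrained case: market + industries + style, cap-weighted industries sum to 0.
X = np.column_stack([np.ones(10), np.eye(3)[ind], style])
ind_w = np.array([cap[ind == j].sum() for j in range(3)]) / cap.sum()
C = np.concatenate([[0.0], ind_w, [0.0]])[None, :]
R = np.linalg.svd(C)[2][C.shape[0]:].T                     # basis of null(C)
A = (X.T * w) @ X
P = R @ np.linalg.solve(R.T @ A @ R, R.T @ (X.T * w))      # K x N

PX = P @ X   # identity on the feasible subspace, not on all of R^K
```

In this setup the style row of `PX` is the unit vector, while the market row comes out as (1, industry cap weights, 0): a fully invested portfolio with benchmark industry mix, which is one way to read "carry the constraint’s structure".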

D3 - Euler risk decomposition (Ch. 9): $\sigma(w) = (w^\top \Sigma w)^{1/2}$ is positively homogeneous of degree 1. Euler’s theorem for homogeneous functions: $\sigma = \sum_i w_i \partial\sigma/\partial w_i$ . Directly: $\partial \sigma/\partial w = \Sigma w / \sigma$ , so $\sum_i w_i (\Sigma w)_i / \sigma = w^\top \Sigma w / \sigma = \sigma$ . ∎ Factor-space version: with $\sigma(x) = (x^\top F x)^{1/2}$ (factor block), variance contributions $x_k (Fx)_k$ sum to $x^\top F x$ ; dividing each by $\sigma(x)$ gives volatility contributions that sum to $\sigma(x)$ .
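A small check of the decomposition, with the gradient verified by finite differences. The 3-asset covariance and weights are hypothetical.

```python
import numpy as np

# Hypothetical 3-asset covariance (annualized) and weights.
vol = np.array([0.20, 0.15, 0.30])
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.4],
                 [0.1, 0.4, 1.0]])
Sigma = np.outer(vol, vol) * corr
w = np.array([0.5, 0.3, 0.2])

sigma = np.sqrt(w @ Sigma @ w)
mcr = Sigma @ w / sigma        # marginal contributions, d sigma / d w
ctr = w * mcr                  # contributions; by Euler they sum to sigma

# Forward-difference check of the gradient, one coordinate at a time.
eps = 1e-6
fd = np.array([(np.sqrt((w + eps * e) @ Sigma @ (w + eps * e)) - sigma) / eps
               for e in np.eye(3)])
```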

D4 - Characteristic portfolio (Ch. 7): $\min_h h^\top \Sigma h$ s.t. $x^\top h = 1$ . Here $x$ is the $N \times 1$ vector of asset-level values of the characteristic and $h$ the portfolio weights. Lagrangian $h^\top \Sigma h - 2\gamma(x^\top h - 1)$ . Stationarity $\Sigma h = \gamma x \Rightarrow h = \gamma \Sigma^{-1} x$ . The constraint fixes $\gamma = 1/(x^\top \Sigma^{-1} x)$ : $h_x = \Sigma^{-1}x / (x^\top \Sigma^{-1} x)$ , with minimized variance $1/(x^\top \Sigma^{-1}x)$ . The GLS pure portfolio for factor $k$ solves the same problem with the added constraints of zero exposure to the other factors. Stacking those constraints and applying D1’s bordered system shows the GLS ( $W = \Delta^{-1}$ ) regression rows solve it: estimation efficiency = portfolio efficiency.
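A sketch of the closed form with a direct optimality check. `Sigma`, `x` and the perturbation `z` are hypothetical values.

```python
import numpy as np

# Hypothetical 3-asset covariance and asset-level characteristic.
vol = np.array([0.20, 0.15, 0.30])
corr = np.array([[1.0, 0.3, 0.1],
                 [0.3, 1.0, 0.4],
                 [0.1, 0.4, 1.0]])
Sigma = np.outer(vol, vol) * corr
x = np.array([1.0, -0.5, 0.8])

Sinv_x = np.linalg.solve(Sigma, x)
gamma = 1.0 / (x @ Sinv_x)       # multiplier, also the minimized variance
h = gamma * Sinv_x               # characteristic portfolio
var_h = h @ Sigma @ h

# Any other unit-exposure portfolio is h plus an exposure-neutral perturbation.
z = np.array([0.3, -0.2, 0.1])
z = z - (x @ z) / (x @ x) * x    # project so that x'z = 0
h_alt = h + z
var_alt = h_alt @ Sigma @ h_alt  # var_h + z' Sigma z, never smaller than var_h
```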

D5 - Woodbury identity for $\Sigma^{-1}$ (Ch. 11): For $\Sigma = \Delta + XFX^\top$ with $\Delta, F$ invertible: $\Sigma^{-1} = \Delta^{-1} - \Delta^{-1}X(F^{-1} + X^\top \Delta^{-1} X)^{-1} X^\top \Delta^{-1}$ . Verify by multiplication: $\Sigma \cdot [\text{RHS}] = I + X F X^\top \Delta^{-1} - (X + XFX^\top\Delta^{-1}X)(F^{-1} + X^\top \Delta^{-1}X)^{-1}X^\top \Delta^{-1}$ . Factor $XF$ out of the bracket in the last term: $X + XFX^\top \Delta^{-1} X = XF(F^{-1} + X^\top \Delta^{-1} X)$ , so the last term collapses to $XFX^\top \Delta^{-1}$ and cancels the second, leaving $I$ . ∎ Cost: $O(NK^2 + K^3)$ vs. $O(N^3)$ .
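A sketch of applying $\Sigma^{-1}$ through the identity without forming an $N \times N$ inverse. The helper name `woodbury_solve` and all numbers are hypothetical.

```python
import numpy as np

# Hypothetical small model: N = 6 assets, K = 2 factors.
X = np.array([[1.0, 0.5], [1.0, -0.3], [1.0, 1.2],
              [1.0, -1.0], [1.0, 0.1], [1.0, -0.5]])
F = np.array([[0.0256, 0.0010],
              [0.0010, 0.0016]])
spec_var = np.array([0.03, 0.05, 0.09, 0.14, 0.026, 0.04])   # diagonal of Delta

def woodbury_solve(X, F, spec_var, v):
    """Return Sigma^{-1} v for Sigma = X F X' + diag(spec_var)."""
    Dinv_v = v / spec_var                        # Delta^{-1} v, O(N)
    Dinv_X = X / spec_var[:, None]               # Delta^{-1} X, O(NK)
    M = np.linalg.inv(F) + X.T @ Dinv_X          # K x K core: F^{-1} + X' Delta^{-1} X
    return Dinv_v - Dinv_X @ np.linalg.solve(M, X.T @ Dinv_v)

v = np.array([0.1, 0.2, -0.1, 0.05, 0.3, 0.35])
fast = woodbury_solve(X, F, spec_var, v)

# Dense reference, only sensible for small N.
Sigma = X @ F @ X.T + np.diag(spec_var)
dense = np.linalg.solve(Sigma, v)
```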

D6 - Minimum-variance hedge ratio (Ch. 12): $\mathrm{Var}(r_p + h\, r_h) = \sigma_p^2 + 2h\,\mathrm{Cov}(r_p, r_h) + h^2 \sigma_h^2$ . Minimize over $h$ : $h^* = -\mathrm{Cov}(r_p, r_h)/\sigma_h^2 = -\beta_{p,h}$ . Multi-instrument: $h^* = -\mathrm{Cov}(r_H)^{-1}\mathrm{Cov}(r_H, r_p)$ , the negative of the population regression coefficients of the portfolio on the instruments. Through the model, $\mathrm{Cov}(r_H) = X_h F X_h^\top + \Delta_h$ and $\mathrm{Cov}(r_H, r_p) = X_h F x_p (+ \text{specific overlap})$ .
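A sketch of both hedge formulas on a hypothetical joint covariance of the portfolio and two instruments. The numbers are illustrative and are not the Ch. 12 futures.

```python
import numpy as np

# Hypothetical joint covariance of (r_p, r_h1, r_h2), annualized.
vol = np.array([0.18, 0.16, 0.20])
corr = np.array([[1.0, 0.9, 0.7],
                 [0.9, 1.0, 0.6],
                 [0.7, 0.6, 1.0]])
V = np.outer(vol, vol) * corr

# Single instrument: h* = -Cov(r_p, r_h) / Var(r_h) = -beta.
beta = V[0, 1] / V[1, 1]
h_single = -beta
var_single = V[0, 0] + 2 * h_single * V[0, 1] + h_single ** 2 * V[1, 1]

# Two instruments: h* = -Cov(r_H)^{-1} Cov(r_H, r_p).
h_multi = -np.linalg.solve(V[1:, 1:], V[1:, 0])
a = np.concatenate([[1.0], h_multi])     # position vector: portfolio plus hedges
var_multi = a @ V @ a
```

At the optimum the hedged position is uncorrelated with every instrument, which is the first-order condition in covariance form.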

17.4 Glossary

Active return/risk: portfolio minus benchmark return. Volatility thereof (tracking error).
Alpha: expected return not explained by factor exposures. In construction, the forecast vector fed to an optimizer.
Bias statistic: std of realized returns standardized by forecast volatility. The calibration score of a risk model.
Characteristic portfolio: minimum-variance portfolio with unit exposure to a characteristic: $\Sigma^{-1}x / (x^\top\Sigma^{-1}x)$ .
Coverage universe: all assets the model assigns exposures and risk to (cf. estimation universe).
Descriptor: a raw measurable per-stock quantity (B/P, 12-1 return) before standardization. Factors blend one or more.
Estimation universe: the curated asset set on which factor returns are estimated.
Exposure (loading): a stock’s sensitivity to a factor: dummy (industry/country) or standardized z-score (style).
Factor-mimicking/pure factor portfolio: the long–short portfolio (row of $P$ ) whose return is the estimated factor return. Unit own-exposure, zero other-exposure.
Factor return: per-period payoff to unit exposure of a factor, estimated by cross-sectional regression.
Half-life: lag at which an EWMA weight halves. The responsiveness dial of covariance estimation.
Idiosyncratic/specific risk: return variance unique to a stock. Diagonal of $\Delta$ . Diversifiable.
Information coefficient (IC): cross-sectional correlation between a signal and forward returns. In alpha research, the number that matters is the IC of the factor-residualized signal.
Linked assets: multiple listings of one issuer (ADR, share classes) sharing factor exposures and correlated specifics.
Pure factor portfolio: see factor-mimicking portfolio.
Restriction matrix: basis of a constraint’s null space, converting constrained to unconstrained regression.
Style factor: continuous characteristic-based factor (value, momentum, size, quality…).
Tracking error: annualized volatility of active return.
VIF (variance inflation factor): $1/(1-R^2)$ of one exposure regressed on the others. The redundancy gauge for candidate factors.
Winsorization: clipping extreme descriptor values before standardization.
Z-score: standardized exposure: (descriptor − cap-weighted mean) / equal-weighted std.
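For the bias statistic entry above, a minimal sketch of the computation. The function name, the sample-std convention (`ddof=1`) and the return series are assumptions made for illustration.

```python
import numpy as np

def bias_statistic(r, sigma_forecast):
    """Std of realized returns standardized by the prior-period volatility forecast.

    r[t] is the realized return in period t; sigma_forecast[t] is the forecast
    for period t made at t-1, in the same units and horizon as r.
    Assumption: sample standard deviation (ddof=1).
    """
    z = np.asarray(r) / np.asarray(sigma_forecast)
    return z.std(ddof=1)

# Hypothetical return series.
r = np.array([0.012, -0.020, 0.005, 0.017, -0.009, -0.014, 0.022, -0.003])
sigma_hat = np.full(r.shape, r.std(ddof=1))

b_calibrated = bias_statistic(r, sigma_hat)            # forecast matches realized scale
b_underforecast = bias_statistic(r, 0.5 * sigma_hat)   # forecast half the realized scale
```

A value near 1 indicates calibrated forecasts; above 1, risk was under-forecast; below 1, over-forecast.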

17.5 The mini example: complete dataset

The data below are sufficient to reproduce every number in Chapters 2–15 (NumPy, deterministic, no randomness). The full script is reproduced on its own page: Mini Example Source Code.

Universe, descriptors, month-1 data, portfolios:

Stock	Industry	Cap $bn	B/P	Mom (12-1)	Spec vol (ann)	$r_1$ (%)	$w_p$
AXIOM	Tech	150	0.15	+0.32	18%	+4.2	0.10
BINARY	Tech	80	0.25	+0.18	22%	+2.8	0.08
CIPHER	Tech	40	0.45	−0.05	30%	+0.5	0.10
DIGIT	Tech	10	0.60	+0.40	38%	+6.0	0.03
EVERGREEN	Fin	120	0.85	+0.06	16%	+0.8	0.22
FIDELIS	Fin	60	0.95	−0.02	20%	−0.6	0.14
GUARDIAN	Fin	20	1.10	−0.12	28%	−1.8	0.06
HARVEST	Cons	90	0.40	+0.10	17%	+1.2	0.15
INDIGO	Cons	30	0.55	+0.02	26%	+2.0	0.08
JUNIPER	Cons	15	0.70	−0.08	32%	−0.5	0.04

Benchmark $w_b$ = cap weights (total cap 615). SIZE descriptor = ln(cap). Style standardization: cap-weighted mean, equal-weighted std (Ch. 3; the resulting $X$ is tabulated there). Regression weights $\propto \sqrt{\text{cap}}$ , normalized. Constraint: cap-weighted industry factor returns sum to zero, with industry cap weights (0.4553, 0.3252, 0.2195) for Tech, Fin, Cons.
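The quantities that follow directly from the table can be rebuilt in a few lines. This is a sketch, not the book's script; style standardization is left out because its exact convention lives in Ch. 3, and normalizing the regression weights to sum to one is an assumption about what "normalized" means.

```python
import numpy as np

cap = np.array([150, 80, 40, 10, 120, 60, 20, 90, 30, 15], dtype=float)   # $bn
ind = np.array([0, 0, 0, 0, 1, 1, 1, 2, 2, 2])                            # Tech, Fin, Cons
w_p = np.array([0.10, 0.08, 0.10, 0.03, 0.22, 0.14, 0.06, 0.15, 0.08, 0.04])

w_b = cap / cap.sum()                                    # benchmark = cap weights
ind_w = np.array([w_b[ind == j].sum() for j in range(3)])

size_descriptor = np.log(cap)                            # raw SIZE descriptor, pre-standardization
w_reg = np.sqrt(cap) / np.sqrt(cap).sum()                # sqrt-cap weights, scaled to sum to 1

# Active industry exposures are sector sums of w_p - w_b.
active_ind = np.array([(w_p - w_b)[ind == j].sum() for j in range(3)])
# approximately (-0.1453, 0.0948, 0.0505)
```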

Factor covariance $F$ (annualized). Volatilities: MKT 16%, TECH 9%, FIN 7%, CONS 5%, VALUE 4%, MOM 6%, SIZE 4%. Correlations:

	MKT	TECH	FIN	CONS	VALUE	MOM	SIZE
MKT	1	0.10	−0.05	−0.10	−0.20	0.05	0.15
TECH		1	−0.40	−0.30	−0.35	0.30	0.05
FIN			1	−0.10	0.40	−0.15	0.00
CONS				1	0.05	−0.05	−0.05
VALUE					1	−0.45	0.10
MOM						1	0.05
SIZE							1

(Symmetric, eigenvalues all positive: PSD verified in the script.)
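A sketch that assembles $F$ from the volatilities and the upper-triangular correlation table and repeats the eigenvalue check. Variable names are illustrative.

```python
import numpy as np

names = ["MKT", "TECH", "FIN", "CONS", "VALUE", "MOM", "SIZE"]
vol = np.array([16, 9, 7, 5, 4, 6, 4]) / 100.0

# Upper triangle of the correlation table, row by row.
upper = {
    (0, 1): 0.10, (0, 2): -0.05, (0, 3): -0.10, (0, 4): -0.20, (0, 5): 0.05, (0, 6): 0.15,
    (1, 2): -0.40, (1, 3): -0.30, (1, 4): -0.35, (1, 5): 0.30, (1, 6): 0.05,
    (2, 3): -0.10, (2, 4): 0.40, (2, 5): -0.15, (2, 6): 0.00,
    (3, 4): 0.05, (3, 5): -0.05, (3, 6): -0.05,
    (4, 5): -0.45, (4, 6): 0.10,
    (5, 6): 0.05,
}
corr = np.eye(7)
for (i, j), rho in upper.items():
    corr[i, j] = corr[j, i] = rho          # mirror into the lower triangle

F = np.outer(vol, vol) * corr              # F_ij = vol_i * vol_j * corr_ij
eigvals = np.linalg.eigvalsh(F)            # ascending; all should be positive
```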

Stipulated later-month factor returns (Ch. 10; CONS is set by the constraint), in the order MKT, TECH, FIN, CONS, VALUE, MOM, SIZE: $f_2$ = (−2.0, −1.5, +1.0, +1.63, +1.2, −0.8, +0.5)%. $f_3$ = (+3.0, +0.8, −0.5, −0.919, −0.6, +1.0, −0.3)%. Active specific returns months 2–3: +0.30%, −0.10%. Constant exposures assumed across the quarter.
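The CONS entries follow from the zero-sum constraint once TECH and FIN are stipulated. A short check, with `cons_from_constraint` as a hypothetical helper name:

```python
import numpy as np

ind_w = np.array([280, 200, 135]) / 615.0     # Tech, Fin, Cons cap weights

def cons_from_constraint(f_tech, f_fin, w=ind_w):
    """Solve w_tech*f_tech + w_fin*f_fin + w_cons*f_cons = 0 for f_cons."""
    return -(w[0] * f_tech + w[1] * f_fin) / w[2]

f2_cons = cons_from_constraint(-1.5, 1.0)     # percent; 220/135
f3_cons = cons_from_constraint(0.8, -0.5)     # percent; -124/135

# The reported month-1 industry returns satisfy the constraint up to rounding.
f1_ind = np.array([0.768, -1.282, 0.306])
f1_residual = ind_w @ f1_ind
```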

Hedge instruments (Ch. 12): index future = benchmark exposures (1, 0.4553, 0.3252, 0.2195, 0, 0, 0). Small-cap future = (1.05, 0.35, 0.30, 0.35, 0.10, −0.05, −1.20). Both specific-risk-free.

Alpha-research candidate (Ch. 13): raw descriptor = book-to-price − 0.30·(12-1 momentum) + a fixed per-stock idiosyncratic term, standardized to exposure $a$ . Regressing $a$ on $X$ gives the spanning results below.

Key computed results (cross-chapter checkpoints): month-1 factor returns $f_1$ = (1.821, 0.768, −1.282, 0.306, 0.548, 1.962, 0.046)%. Portfolio/benchmark/active vols: 17.55% / 18.14% / 5.42%. Active exposures: (0, −0.145, 0.095, 0.050, 0.385, −0.332, −0.275). Quarter attribution: VALUE +0.46%, MOM −0.72%, specific +0.24%, total −0.102%. Optimization: TE 5.42% → 4.65% at 25.2% turnover. Hedge: (−0.759, −0.229), taking portfolio volatility from 17.55% to 8.14%. Alpha-research candidate: corr with VALUE/MOM/SIZE +0.82/−0.44/−0.50, spanned fraction 0.884, IC vs $r_1$ −0.49 raw / −0.13 residualized, signal long–short −0.63% = factor −0.46% + specific −0.17%.

17.6 Annotated bibliography

Foundations

Sharpe (1964), “Capital Asset Prices,” JF: the one-factor beginning. Beta as the first exposure.
Ross (1976), “The Arbitrage Theory of Capital Asset Pricing,” JET: multi-factor pricing without specifying the factors. The license under which all factor models operate.
Fama & MacBeth (1973), “Risk, Return, and Equilibrium,” JPE: the two-pass cross-sectional methodology (Ch. 7). Still the standard premia test.
Chen, Roll & Ross (1986), “Economic Forces and the Stock Market,” JB: the canonical macroeconomic factor model (Ch. 4).
Fama & French (1992, 1993) JF/JFE; (2015) JFE; Carhart (1997) JF: the style-factor canon: size and value, the three-factor model, profitability/investment, momentum. The sorted-portfolio construction of Ch. 7.
Rosenberg (1974), “Extra-Market Components of Covariance,” JFQA: the founding paper of the fundamental cross-sectional architecture this primer centers on.

Books

Grinold & Kahn, Active Portfolio Management (2nd ed.): the practitioner bible: characteristic portfolios, the fundamental law, IR-based thinking. The source of Ch. 7’s optimization view and Ch. 11’s alpha discipline.
Connor, Goldberg & Korajczyk, Portfolio Risk Analysis: the most rigorous book-length treatment of factor risk models per se, all three families.
Qian, Hua & Sorensen, Quantitative Equity Portfolio Management: cross-sectional modeling and construction with worked detail.
Litterman et al., Modern Investment Management: risk decomposition and budgeting culture (Goldman’s quantitative tradition).

Methods

Ledoit & Wolf (2003, 2004): shrinkage covariance estimation (Ch. 8), including “Honey, I Shrunk the Sample Covariance Matrix.”
Newey & West (1987), Econometrica: autocorrelation-consistent covariance. The horizon-scaling fix of Ch. 8.
Shanken (1992), RFS: errors-in-variables correction for Fama–MacBeth (Ch. 7).
Menchero (2000s, various): multi-period attribution linking (Ch. 10). Carino (1999): the log-linking algorithm.
Black & Litterman (1992), FAJ: equilibrium-anchored expected returns (Ch. 11).
Michaud (1989), “The Markowitz Optimization Enigma,” FAJ: error maximization named and shamed (Ch. 11).
Harvey, Liu & Zhu (2016), RFS, “…and the Cross-Section of Expected Returns”: the factor zoo’s multiple-testing reckoning (Ch. 16). Hou, Xue & Zhang (2020), RFS: the replication audit.
Kelly, Pruitt & Su (2019), JFE: IPCA. Gu, Kelly & Xiu (2020), RFS: ML asset pricing (Ch. 16 directions).

Practitioner references

MSCI Barra model handbooks (USE4, GEM3 and successors): full disclosure of a production fundamental model: descriptor recipes, estimation universes, regression weights, specific-risk blending. The single best way to see every choice in Chapters 3–8 made concretely, with parameters.
Axioma/SimCorp research papers: practical treatments of alpha alignment (Ch. 11), statistical-vs-fundamental hybrids (Ch. 4), and bias-statistic methodology (Ch. 14).
Menchero, Orr & Wang (MSCI), “The Barra US Equity Model (USE4)” research notes: readable bridge between the handbooks and the academic literature.