# Chapter 4: Types of Factor Model

_Previous: [Chapter 3: Factors and Exposures](03-factors-and-exposures.md)_

---

Every factor model must produce the same four ingredients: exposures $X$, factor returns $f$, a factor covariance $F$, and specific risk $\Delta$. $F$ and $\Delta$ are estimated from data in all three families, so what separates the families is which of $X$ and $f$ is _observed_ and which is _estimated_. [Chapter 1](01-introduction.md) previewed that split; this chapter works through each family in turn, then the hybrids (§4.4) that combine them. The choice determines the rest: the data each family needs, its reaction speed, and where it's useful.

## 4.1 Time-series/Macroeconomic models

**The idea:** Pick factors that are directly observable as a time series: the market's excess return, surprises in industrial production and inflation, shifts and twists of the yield curve, credit spread changes, oil prices. The unknowns are each stock's sensitivities, estimated by one time-series regression per stock over a chosen estimation window:

$$r_{it} = \alpha_i + \beta_i^\top f_t + \epsilon_{it}, \qquad t = 1, \dots, T,$$

so stock $i$'s exposure vector is $\hat{\beta}_i = \left(\textstyle\sum_t \tilde f_t \tilde f_t^\top\right)^{-1} \textstyle\sum_t \tilde f_t \tilde r_{it}$ (OLS on demeaned variables $\tilde f, \tilde r$). With factor returns observed, the factor covariance $F$ is just the sample covariance of the $f_t$ series, and $\Delta$ comes from the regression residuals $\epsilon_{it}$.
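The regression above can be sketched in a few lines of NumPy. This is a minimal illustration on simulated data, not a production estimator: the two factors, the `true_beta` values, and the volatility levels are all assumptions made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 60, 2                                     # 60 months, 2 observed factors (illustrative)
f = rng.normal(0.0, 0.04, size=(T, K))           # observed factor returns f_t
true_beta = np.array([1.2, -0.5])                # hypothetical sensitivities
r = 0.002 + f @ true_beta + rng.normal(0.0, 0.06, size=T)   # one stock's returns r_it

# Demean, then apply beta_hat = (sum f~ f~')^{-1} sum f~ r~
f_dm = f - f.mean(axis=0)
r_dm = r - r.mean()
beta_hat = np.linalg.solve(f_dm.T @ f_dm, f_dm.T @ r_dm)

# Intercept and residuals; the residual variance feeds this stock's entry in Delta
alpha_hat = r.mean() - f.mean(axis=0) @ beta_hat
resid = r - alpha_hat - f @ beta_hat
specific_var = resid.var(ddof=K + 1)             # K slopes plus one intercept estimated

# Factor covariance F is the sample covariance of the observed factor series
F = np.cov(f, rowvar=False)
```

Repeating the `beta_hat` step for every stock fills the exposure matrix $X$ one row at a time, which is why a stock without $T$ periods of history has no row.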

**Estimation details that matter.**

- _Window choice_: a rolling window (commonly 60 months) keeps betas current but noisy. An expanding window (all history to date) is the most stable but the slowest to react. Either way, longer windows trade noise for staleness. Exponential weighting (recent observations count more, with some chosen half-life) is the standard compromise.
- _Standard errors_: with 60 monthly observations and one regressor, the standard error of a beta near 1 is typically 0.15–0.25, so a two-standard-error band spans roughly ±0.3 to ±0.5 around the estimate. To combat this noise, beta estimates are routinely shrunk toward 1 (e.g., [Blume or Vasicek adjustment](https://en.wikipedia.org/wiki/Beta_%28finance%29#Improved_estimators): $\hat\beta^{\text{adj}} = \lambda \hat\beta + (1-\lambda)\cdot 1$, with $\lambda$ stock-specific in Vasicek's precision-weighted version).
- _Macro surprises, not levels_: the expected part of a macro release is already reflected in prices, so only the _unexpected_ component should move the period's return. Inputs are therefore innovations from a forecasting model (e.g., the released figure vs. consensus), not raw levels. Estimating that expected component is itself a modeling problem, and errors in it feed straight into the factor returns.
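A short sketch of the first two points, on simulated data. The helper `weighted_beta`, the 24-month half-life, the fixed weight of 2/3, and the prior dispersion `prior_sd` are all assumptions chosen for illustration, not values prescribed by this primer.

```python
import numpy as np

def weighted_beta(x, y, w):
    """Slope from weighted least squares of y on x with an intercept."""
    w = np.asarray(w, dtype=float)
    w = w / w.sum()
    x_dm = x - w @ x                      # weighted demeaning
    y_dm = y - w @ y
    return ((w * x_dm) @ y_dm) / ((w * x_dm) @ x_dm)

rng = np.random.default_rng(1)
T = 60
mkt = rng.normal(0.005, 0.045, size=T)               # market excess return
stock = 1.3 * mkt + rng.normal(0.0, 0.08, size=T)    # hypothetical stock

# Equal weights over the window vs. exponential weights (assumed half-life)
beta_ols = weighted_beta(mkt, stock, np.ones(T))
age = np.arange(T - 1, -1, -1)                       # 0 for the newest observation
beta_ew = weighted_beta(mkt, stock, 0.5 ** (age / 24.0))

# Standard error of the equal-weighted beta (one regressor plus intercept)
m_dm = mkt - mkt.mean()
resid = (stock - stock.mean()) - beta_ols * m_dm
se_beta = np.sqrt((resid @ resid) / (T - 2) / (m_dm @ m_dm))

# Blume-style shrinkage: fixed weight on the estimate (2/3 is assumed here)
lam_blume = 2.0 / 3.0
beta_blume = lam_blume * beta_ols + (1.0 - lam_blume) * 1.0

# Vasicek-style shrinkage: the noisier the estimate, the smaller its weight.
# prior_sd is an assumed cross-sectional dispersion of betas around 1.
prior_sd = 0.30
lam_vasicek = prior_sd**2 / (prior_sd**2 + se_beta**2)
beta_vasicek = lam_vasicek * beta_ols + (1.0 - lam_vasicek) * 1.0
```

Both adjusted betas land between the raw estimate and 1. The Vasicek weight moves toward 1 as `se_beta` falls, so precisely estimated betas are left nearly untouched.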

**Strengths:** Factors are economically meaningful by construction. They directly answer questions like "what is my portfolio's sensitivity to rates?". Minimal data is required beyond returns and the factor series.

**Weaknesses, and why this family lost the risk-model market:**

- _Stale exposures._ A company that doubles its leverage today still carries the beta of its old self for years until the regression window catches up. Measured characteristics (the fundamental model family) update immediately.
- _No history, no model._ IPOs and recent listings cannot be regressed. They need proxies.
- _Low explanatory power._ Macro factors explain single-stock returns poorly: the per-stock time-series regression typically yields an $R^2$ well under 20%. The common variation they miss (industry and style co-movement, for example) stays in the residuals, which are then correlated across stocks, violating the assumption that $\Delta$ is diagonal.
- _Errors-in-variables._ Estimated $\hat\beta$'s carry estimation noise into every downstream use.

Macro models survive in some areas: asset allocation, macro scenario analysis, and as overlays answering rate/oil/inflation sensitivity questions that characteristic-based factors do not address directly.

## 4.2 Cross-sectional/Fundamental models

**The idea:** Flip what is known. Exposures are _measured_ from observable characteristics, fresh every period (see [Chapter 3](03-factors-and-exposures.md)). The unknowns are the factor returns, recovered each period by a cross-sectional regression across stocks:

$$r_t = X_{t-1} f_t + \epsilon_t \quad \xrightarrow{\;\text{WLS across } i\;} \quad \hat f_t.$$

One regression per period, giving a time series $\{\hat f_t\}$ from which $F$ is estimated, with residuals feeding $\Delta$. [Chapter 6](06-estimating-factor-returns.md) is devoted to this regression. [Chapter 7](07-factor-portfolios.md) focuses on the fact that each $\hat f_{kt}$ is the return of an investable long–short portfolio.
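One period of this regression can be sketched as follows. The data are simulated, and the inverse-specific-variance weights are an assumption for the example only (the weighting scheme actually used is the subject of Chapter 6).

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 500, 3                                   # stocks, factors (illustrative)
X = rng.normal(size=(N, K))                     # lagged exposures X_{t-1}
f_true = np.array([0.010, -0.004, 0.002])       # hypothetical factor returns
spec_vol = rng.uniform(0.02, 0.10, size=N)      # per-stock specific volatility
r = X @ f_true + spec_vol * rng.normal(size=N)  # one period of stock returns

# Regression weights: inverse specific variance, assumed for illustration
w = 1.0 / spec_vol**2

# WLS: f_hat = (X' W X)^{-1} X' W r
XtW = X.T * w                                   # K x N, same as X.T @ diag(w)
f_hat = np.linalg.solve(XtW @ X, XtW @ r)
resid = r - X @ f_hat                           # residuals feed the Delta estimate

# Row k of P holds the stock weights of factor k's portfolio: f_hat = P @ r
P = np.linalg.solve(XtW @ X, XtW)
```

Because `P @ X` equals the identity, each row of `P` is a portfolio with unit exposure to its own factor and zero exposure to the others, which is the sense in which each estimated factor return is a portfolio return.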

**Exposures-from-characteristics react faster:** The leverage example again: the company that doubles its debt sees its leverage _descriptor_ jump at the next data update, so its risk profile updates in days. The momentum exposure of a stock with a price jump updates mechanically with the price. No regression window to wait out. Measured exposures also bring high explanatory power (cross-sectional $R^2$ of 20–40% per month for single stocks, far higher for portfolios) and cover new listings from day one (an IPO has characteristics immediately). That $R^2$ is measured across stocks within one period, so it is not comparable to the per-stock time-series $R^2$ in §4.1. The two answer different questions.

That combination is why every major commercial risk model, MSCI Barra and SimCorp Axioma among them, is built this way, and why this primer's construction chapters (5–8) follow the architecture.

**Costs:** Heavy data infrastructure (point-in-time fundamentals, classifications, corporate actions, see [Chapter 16](16-practical-considerations.md)). A judgment-laden factor definition process (which characteristics, which descriptor recipes, see Chapters [3](03-factors-and-exposures.md) and [15](15-modifying-the-model.md)). And the model only knows about the characteristics it was given. A common driver not represented in $X$ leaks into residuals (the detection-and-repair loop of [Chapter 15](15-modifying-the-model.md)).

The [MiniModel](18-mini-example-source-code.md) is this type. Exposures were measured in [Chapter 3](03-factors-and-exposures.md). Factor returns get estimated in [Chapter 6](06-estimating-factor-returns.md).

## 4.3 Statistical models

**The idea:** Let the returns data choose the factors. No characteristics, no chosen series. Extract the directions of greatest common variation directly from the $T \times N$ panel of returns.

**PCA mechanics in brief:** Form the sample covariance $\hat\Sigma$ of returns. Eigendecompose $\hat\Sigma = Q \Lambda Q^\top$ with eigenvalues $\lambda_1 \ge \lambda_2 \ge \dots$ and orthonormal eigenvectors $q_1, q_2, \dots$. Take the top $K$ eigenvectors as the exposure matrix $X = [q_1, \dots, q_K]$. Factor returns are the projections $f_t = X^\top r_t$ (the eigenvectors are orthonormal, $X^\top X = I_K$, so the projection needs no $(X^\top X)^{-1}$ term). The truncated reconstruction $X \Lambda_K X^\top + \text{diag(residual variances)}$ is the factor risk model. The first principal component of an equity universe is almost always recognizably "the market" (weights all of one sign, conventionally taken as positive since an eigenvector's overall sign is arbitrary). The next few often resemble size, rate-sensitivity, or large industry blocks, but nothing guarantees it. The eigendecomposition is derived in the [appendix](17-appendix.md).
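The mechanics can be sketched directly with NumPy's symmetric eigensolver. The panel below is simulated with three planted factors, and $T > N$ is assumed so that the sample covariance is full rank. The loadings `B` and the volatility scales are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(3)
T, N, K = 1000, 50, 3                              # T > N keeps Sigma_hat full rank
B = rng.normal(size=(N, K))                        # hypothetical true loadings
R = rng.normal(size=(T, K)) @ B.T * 0.01 + rng.normal(size=(T, N)) * 0.02

Sigma_hat = np.cov(R, rowvar=False)                # N x N sample covariance

# eigh returns ascending eigenvalues for symmetric input; reorder to descending
eigvals, eigvecs = np.linalg.eigh(Sigma_hat)
order = np.argsort(eigvals)[::-1]
lam, Q = eigvals[order], eigvecs[:, order]

X = Q[:, :K]                                       # exposures: top-K eigenvectors
R_dm = R - R.mean(axis=0)
f = R_dm @ X                                       # T x K, row t is (X' r_t)'
Lambda_K = np.diag(lam[:K])                        # factor covariance

common = X @ Lambda_K @ X.T
resid_var = np.diag(Sigma_hat - common)            # per-stock residual variances
Sigma_model = common + np.diag(resid_var)          # the factor risk model
```

The sample covariance of `f` reproduces `Lambda_K`, so the statistical factors are uncorrelated in-sample by construction, and `Sigma_model` matches every stock's sample variance exactly while replacing the off-diagonal terms with their rank-$K$ approximation.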

**How many factors?** The central choice in a statistical model. There are three standard tools:

- _Scree plot_: keep components before the eigenvalue spectrum flattens.
- _Random matrix theory_: for a panel with $N$ assets and $T$ observations of pure noise with variance $\sigma^2$, the [Marchenko–Pastur law](https://en.wikipedia.org/wiki/Marchenko%E2%80%93Pastur_distribution) says the sample covariance's eigenvalues fall (asymptotically) inside $\left[(1-\sqrt{N/T})^2, (1+\sqrt{N/T})^2\right] \times \sigma^2$. Eigenvalues above the upper edge are evidence of genuine common structure, so the law gives a principled cutoff: keep those. With $N = 3000, T = 500$, the ratio $N/T = 6$ puts the nonzero noise bulk in roughly $[(1-\sqrt 6)^2, (1+\sqrt 6)^2]\,\sigma^2 \approx [2.1, 11.9]\,\sigma^2$, so a sample eigenvalue has to clear about $11.9\,\sigma^2$ to count as signal, and most apparent "factors" in a sample covariance are artifacts. (Worse, with $N > T$ the sample covariance is singular: at most $T = 500$ of its eigenvalues are nonzero, or $T - 1 = 499$ once returns are demeaned, and the remaining $N - T = 2500$ or more are exactly zero, the same rank deficiency [Chapter 1](01-introduction.md) flagged. The law describes the nonzero bulk.)
- _Asymptotic PCA_ (Connor–Korajczyk): when $N \gg T$, eigendecompose the $T \times T$ cross-product matrix instead: same factors, vastly cheaper, and statistically valid in the large-$N$ limit.
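The Marchenko–Pastur cutoff is easy to compute. The sketch below reproduces the edges for the $N = 3000$, $T = 500$ example and then applies the cutoff to a smaller simulated panel with one planted common factor. The helper names `mp_edges` and `eigenvalues_above_edge` are hypothetical, and the noise variance is taken as known ($\sigma^2 = 1$), which is an assumption: in practice it has to be estimated.

```python
import numpy as np

def mp_edges(n_assets, n_obs, sigma2=1.0):
    """Lower and upper Marchenko-Pastur edges of the nonzero noise bulk."""
    q = n_assets / n_obs
    return sigma2 * (1.0 - np.sqrt(q)) ** 2, sigma2 * (1.0 + np.sqrt(q)) ** 2

def eigenvalues_above_edge(panel, sigma2=1.0):
    """Sample-covariance eigenvalues of a T x N panel above the MP upper edge."""
    n_obs, n_assets = panel.shape
    eig = np.linalg.eigvalsh(np.cov(panel, rowvar=False))
    _, upper = mp_edges(n_assets, n_obs, sigma2)
    return eig[eig > upper]

lower, upper = mp_edges(3000, 500)          # about 2.10 and 11.90 for sigma2 = 1

# Simulated panel: unit-variance noise plus one common factor with unit loadings
rng = np.random.default_rng(4)
T, N = 100, 300
noise = rng.normal(size=(T, N))
planted = noise + rng.normal(size=(T, 1))   # broadcasts the factor to every stock

signal = eigenvalues_above_edge(planted)    # candidates for genuine factors
```

In finite samples the largest pure-noise eigenvalue fluctuates around the asymptotic edge, so an eigenvalue only marginally above the cutoff is weak evidence. The planted factor here clears the edge by a wide margin.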

**The interpretability problem:** Statistical factors are only identified up to rotation: for any orthogonal $K \times K$ matrix $R$, exposures $XR$ paired with factor returns $R^\top f_t$ fit identically. PCA picks one rotation by ordering on variance, but when eigenvalues are close, sampling noise mixes the corresponding eigenvectors from one estimation window to the next. So statistical factors have no stable names: "factor 7" this month may be a blend of last month's factors 6 and 9. Practitioners regress statistical factors on named factors to label them, but the labels are approximations. This is the family's defining trade-off: it captures _whatever_ is in the data (including drivers nobody has named yet, and fast-moving crisis structure) at the price of explaining nothing to a portfolio manager.
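The rotation indeterminacy can be checked numerically. In this sketch the exposures, factor returns, and the rotation `Rot` are all randomly generated for illustration. The point is only that the rotated pair reproduces the same common returns and the same common covariance.

```python
import numpy as np

rng = np.random.default_rng(5)
N, K, T = 40, 3, 250
X = np.linalg.qr(rng.normal(size=(N, K)))[0]                 # orthonormal exposures
f = rng.normal(size=(T, K)) * np.array([0.05, 0.03, 0.02])   # factor returns, T x K

# A random orthogonal K x K matrix, taken from a QR decomposition
Rot = np.linalg.qr(rng.normal(size=(K, K)))[0]

X_rot = X @ Rot                 # rotated exposures
f_rot = f @ Rot                 # row t is (Rot' f_t)'

common = f @ X.T                # T x N panel of common returns, row t is (X f_t)'
common_rot = f_rot @ X_rot.T    # identical, since Rot @ Rot.T = I

F = np.cov(f, rowvar=False)
F_rot = np.cov(f_rot, rowvar=False)
same_cov = np.allclose(X @ F @ X.T, X_rot @ F_rot @ X_rot.T)
```

Returns data alone cannot distinguish `(X, f)` from `(X_rot, f_rot)`, so any label attached to an individual column of `X` rests on a convention rather than on the fit.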

**Where they are used:** Short-horizon risk for statistical arbitrage, where reacting to current correlation structure beats interpreting it. The bigger use is as a _diagnostic_: if PCA on a fundamental model's residuals finds a large common component, the fundamental model is missing a factor ([Chapter 15](15-modifying-the-model.md)).

## 4.4 Hybrid models

**The most common hybrid:** a fundamental core, plus a small number of statistical factors extracted from the fundamental model's _residuals_, sweeping up systematic risk the named factors miss. Vendors sell exactly this as "fundamental + statistical" variants. Other combinations: macro factors regression-mapped onto a fundamental model's factor space, giving a macro lens on a fundamental engine, or characteristic-based exposures with PCA-cleaned covariance. The price is always interpretability at the margin. The benefit is robustness to the unknown-missing-factor problem.
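A minimal sketch of the fundamental-plus-statistical idea, on simulated data. The returns are driven by two named factors plus one `hidden` driver that is deliberately left out of $X$. Exposures are held fixed over time and the cross-sectional regressions are unweighted, both simplifications assumed for brevity.

```python
import numpy as np

rng = np.random.default_rng(6)
T, N, K = 250, 100, 2
X = rng.normal(size=(N, K))                     # named exposures (fixed over time here)
hidden = rng.normal(size=N)                     # exposure to a driver missing from X
f = rng.normal(0.0, 0.01, size=(T, K))          # named factor returns
g = rng.normal(0.0, 0.01, size=T)               # hidden factor's return
R = f @ X.T + np.outer(g, hidden) + rng.normal(0.0, 0.02, size=(T, N))

# Fundamental core: one cross-sectional OLS per period, solved for all periods at once
f_hat = np.linalg.lstsq(X, R.T, rcond=None)[0].T     # T x K estimated factor returns
resid = R - f_hat @ X.T                              # T x N residual panel

# Statistical layer: leading eigenvector of the residual covariance
eigvals, eigvecs = np.linalg.eigh(np.cov(resid, rowvar=False))
x_stat = eigvecs[:, -1]                              # exposures to the statistical factor
share = eigvals[-1] / eigvals.sum()                  # variance share of the top component
corr = np.corrcoef(x_stat, hidden)[0, 1]             # sign is arbitrary
```

A large `share` is the diagnostic signal that the residuals still contain common structure, and appending `x_stat` as an extra exposure column is the hybrid repair. In this simulation `x_stat` lines up closely with `hidden` (up to sign), but with real data the recovered direction comes with no name attached.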

## 4.5 Comparison

So far the families have been split by which ingredient is observed. Here they are side by side on the properties that actually drive the choice:

| Property          | Time-series (macro)                  | Cross-sectional (fundamental)          | Statistical                     |
| ----------------- | ------------------------------------ | -------------------------------------- | ------------------------------- |
| What's known      | factor series                        | exposures                              | nothing                         |
| What's estimated  | exposures, $F$, $\Delta$             | factor returns, $F$, $\Delta$          | everything (PCA)                |
| Main data need    | factor series + returns              | point-in-time fundamentals + returns   | returns panel only              |
| Reaction speed    | slow (stale betas)                   | fast (tracks data and price)           | fast (re-estimated each window) |
| Explanatory power | low (per-stock TS $R^2$ < 20%)       | high (cross-sectional $R^2$ 20–40%/mo) | high in-sample by construction  |
| New listings      | need history, must proxy             | covered on day one                     | need some history               |
| Interpretability  | high (named macro factors)           | high (named style/industry factors)    | low (rotation, unstable names)  |
| Typical use       | macro scenario, rate/oil sensitivity | risk, attribution, construction        | stat-arb, residual diagnostics  |

The data-need row is the quiet decider in practice: a fundamental model is only as good as the point-in-time fundamentals and classifications behind it ([Chapter 16](16-practical-considerations.md)), while a statistical model needs nothing but a returns panel, which is why it is the fallback when fundamentals are thin or absent.

## 4.6 Choosing a model type

| Use case                                           | Best fit                          | Why                                                              |
| -------------------------------------------------- | --------------------------------- | ---------------------------------------------------------------- |
| Institutional risk reporting, client communication | Fundamental                       | Interpretable factors. Responsive exposures. Broad coverage      |
| Performance attribution                            | Fundamental                       | Contributions must be explainable in investment-process language |
| Optimized portfolio construction                   | Fundamental (often + statistical) | Need interpretable constraints _and_ no missed common risk       |
| Stat-arb / short-horizon risk                      | Statistical                       | Reacts fastest and interpretability is irrelevant                |
| Macro scenario analysis, multi-asset               | Macro / time-series               | Factors are the very objects being stressed                      |
| Checking a fundamental model for blind spots       | Statistical on residuals          | Finds structure without needing to name it                       |

**The pragmatic summary:** fundamental for anything a human must read, statistical for anything only a machine must use, macro when the question is itself macroeconomic, and hybrids when the stakes justify the complexity.

## 4.7 Summary

- Every factor model produces the same four ingredients, $X$, $f$, $F$, and $\Delta$. The families differ only in which of $X$ and $f$ is observed and which is estimated; $F$ and $\Delta$ are always estimated.
- **Time-series (macro):** observable factor series, exposures regressed per stock. Interpretable, but slow to react, weak on single stocks, and silent on names without return history.
- **Cross-sectional (fundamental):** exposures measured from characteristics, factor returns regressed across stocks each period. Fast, high explanatory power, broad coverage, paid for with point-in-time data infrastructure. This primer's subject.
- **Statistical (PCA):** both $X$ and $f$ estimated from the returns panel alone. Captures unnamed structure at the cost of interpretability. Most useful as a diagnostic on a fundamental model's residuals.
- **Hybrids:** a fundamental core plus statistical factors on its residuals (the common variant), trading a little interpretability for robustness to factors no one has named.

---

_Next: [Chapter 5: Estimation Universe and Coverage Universe](05-universes.md)_
