# Chapter 7: Factor Portfolios

_Previous: [Chapter 6: Estimating Factor Returns](06-estimating-factor-returns.md)_

---

[Chapter 6](06-estimating-factor-returns.md) ended on a structural fact: the estimated factor returns are a linear map of stock returns, $\hat f = P r$. Each row of $P$ is a portfolio, and the factor return is that portfolio's return. Understanding these portfolios turns the regression from a statistical procedure into an investment object: something with weights, leverage, turnover, and costs. It also links the risk model to factor investing and hedging.

## 7.1 From regression weights to portfolios

Recall the unconstrained WLS solution:

$$\hat f = P r, \qquad P = (X^\top W X)^{-1} X^\top W \quad (K \times N).$$

Row $k$ of $P$ assigns a weight $P_{ki}$ to every stock $i$, and $\hat f_k = \sum_i P_{ki} r_i$. Since the weights are fixed at the start of the period (they depend only on $X$ and $W$), this is a buy-and-hold portfolio for the period, and the "estimated value factor return" is the realized return of a particular long–short basket. Estimating factor returns and running factor portfolios are the same act.
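The equivalence can be checked numerically. The sketch below uses made-up exposures, weights, and returns (none of these numbers come from the MiniModel) and assumes a diagonal $W$. It builds $P$, computes $Pr$, and compares the result with an ordinary weighted least-squares fit.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 50, 3                          # hypothetical universe size and factor count
X = rng.normal(size=(N, K))           # exposures, known at the start of the period
w = rng.uniform(0.5, 2.0, size=N)     # diagonal of W (regression weights)
r = rng.normal(scale=0.05, size=N)    # stock returns realized over the period

# P = (X' W X)^{-1} X' W. With diagonal W, X' W is X.T with column i scaled by w_i.
XtW = X.T * w
P = np.linalg.solve(XtW @ X, XtW)     # shape (K, N): one portfolio per row

f_portfolio = P @ r                   # realized returns of the K portfolios

# The same numbers from a WLS fit: scale rows by sqrt(w), then least squares.
sw = np.sqrt(w)
f_regression, *_ = np.linalg.lstsq(X * sw[:, None], r * sw, rcond=None)

print(np.allclose(f_portfolio, f_regression))
```

Because `P` is computed before `r` is used, the weights are known at the start of the period, which is what makes each row a holdable portfolio.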

## 7.2 The defining property: $PX = I$

Compute the exposure profile of factor portfolio $k$. A portfolio with weight vector $p_k$ (row $k$ of $P$, transposed) has exposures $X^\top p_k$. Stacking all $K$ portfolios gives:

$$P X = (X^\top W X)^{-1} X^\top W X = I_K.$$

Row $k$ of this identity says: factor portfolio $k$ has exposure exactly 1 to factor $k$ and exactly 0 to every other factor. Hence _pure_: the value factor portfolio is a unit bet on value with no market, industry, size, or momentum exposure whatsoever. Its return is the cleanest available realization of "what value did, holding everything else constant". That is precisely what a regression coefficient means, translated into portfolio language. These portfolios are known as _pure factor portfolios_ or _factor-mimicking portfolios_.
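A minimal check of the identity, on a hypothetical universe with two industry dummies and two styles (the factor names and random exposures are illustrative only, and no market factor is included so that $X^\top W X$ is invertible without a constraint):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 40
factors = ["IND_A", "IND_B", "VALUE", "MOM"]    # hypothetical factor names
industry = np.arange(N) % 2                     # alternate stocks between two industries
X = np.column_stack([industry == 0, industry == 1,
                     rng.normal(size=N), rng.normal(size=N)]).astype(float)
w = rng.uniform(0.5, 2.0, size=N)               # diagonal of W

XtW = X.T * w
P = np.linalg.solve(XtW @ X, XtW)

exposures = P @ X                               # row k = exposure profile of portfolio k
print(np.round(exposures, 10))

value = P[factors.index("VALUE")]
print("VALUE net weight:", value.sum())
```

The VALUE row has zero exposure to each industry dummy, so its weights sum to zero within each industry and therefore in total: the dollar-neutrality of style portfolios follows from purity whenever the industry dummies add up to a column of ones.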

**The constrained case:** With the identification constraint of [Chapter 6](06-estimating-factor-returns.md), $P = R(R^\top A R)^{-1} R^\top X^\top W$ (where $A = X^\top W X$), so $PX = R(R^\top A R)^{-1} R^\top A$. This is no longer the identity matrix, but $PX = I$ still holds on the constrained subspace ($PXR = R$): every style portfolio is still pure, while the market and industry portfolios have the structure the constraint dictates. In the MiniModel, the style rows of $PX$ are exactly $e_{\text{VALUE}}, e_{\text{MOM}}, e_{\text{SIZE}}$, and the MKT portfolio carries industry exposures equal to industry cap weights, meaning the "market factor portfolio" is a fully invested market-like portfolio, exactly as its interpretation requires.
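The structure of the constrained case can be seen in a toy example. The sketch below is not the MiniModel: it assumes one market factor, two industries, one style, $\sqrt{\text{cap}}$ weights, and the constraint that cap-weighted industry factor returns sum to zero. The matrix `R` maps the free coefficients to the full factor vector, and $P$ is built as the $K \times N$ matrix $R(R^\top A R)^{-1} R^\top X^\top W$, so that $PX = R(R^\top A R)^{-1} R^\top A$.

```python
import numpy as np

rng = np.random.default_rng(2)
N = 60
industry = np.arange(N) % 2                        # two hypothetical industries
cap = rng.lognormal(mean=0.0, sigma=1.0, size=N)   # market caps (made up)
style = rng.normal(size=N)

# Columns: MKT, IND_A, IND_B, STYLE. MKT = IND_A + IND_B, so X'WX is singular.
X = np.column_stack([np.ones(N), industry == 0, industry == 1, style]).astype(float)
w = np.sqrt(cap)                                   # sqrt-cap regression weights

# Assumed constraint: c_A * f_IND_A + c_B * f_IND_B = 0 (industry cap weights).
c_A = cap[industry == 0].sum() / cap.sum()
c_B = 1.0 - c_A
# f = R g with g = (f_MKT, f_IND_A, f_STYLE) and f_IND_B = -(c_A / c_B) * f_IND_A.
R = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, -c_A / c_B, 0.0],
              [0.0, 0.0, 1.0]])

XtW = X.T * w
A = XtW @ X
P = R @ np.linalg.solve(R.T @ A @ R, R.T @ XtW)    # shape (4, N)
PX = P @ X

print(np.round(PX, 3))
print("industry cap weights:", round(c_A, 3), round(c_B, 3))
```

In this toy setup the STYLE row of `PX` is a unit vector, while the MKT row has exposure 1 to the market and industry exposures equal to the industry cap weights, mirroring the MiniModel statement above.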

**MiniModel pure portfolios** (weights per $1 of factor bet, from the [source code](18-mini-example-source-code.md); net = sum of weights, gross = sum of absolute weights):

|           | AXIOM | BINARY | CIPHER | DIGIT | EVERGREEN | FIDELIS | GUARDIAN | HARVEST | INDIGO | JUNIPER |  net | gross |
| --------- | ----: | -----: | -----: | ----: | --------: | ------: | -------: | ------: | -----: | ------: | ---: | ----: |
| **MKT**   | 0.237 |  0.135 |  0.066 | 0.018 |     0.192 |   0.103 |    0.030 |   0.149 |  0.051 |   0.020 | 1.00 |  1.00 |
| **VALUE** | −0.83 |  −2.16 |   2.01 |  0.98 |      1.80 |   −0.18 |    −1.62 |    0.96 |  −1.16 |    0.20 | 0.00 | 11.87 |
| **MOM**   | 0.101 | −0.254 | −0.157 | 0.310 |     0.206 |  −0.034 |   −0.172 |   0.111 | −0.086 |  −0.026 | 0.00 |  1.46 |

Check against [Chapter 6](06-estimating-factor-returns.md): the MKT row is long-only and fully invested, with weights close to (slightly flattened versions of) cap weights. It is the market portfolio seen through a $\sqrt{\text{cap}}$ lens. The style rows are exactly dollar-neutral (weights sum to 0). And multiplying any row into the month-1 return vector reproduces the [Chapter 6](06-estimating-factor-returns.md) factor returns to the last decimal: e.g. the MOM row times $r$ gives +1.962%.
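The net and gross columns can be recomputed from the rounded weights printed in the table. Because the inputs are rounded to two or three decimals, the results agree with the table only up to rounding (the VALUE gross comes out near 11.90 rather than 11.87).

```python
import numpy as np

# Weights copied from the table above, in column order AXIOM ... JUNIPER.
rows = {
    "MKT":   [0.237, 0.135, 0.066, 0.018, 0.192, 0.103, 0.030, 0.149, 0.051, 0.020],
    "VALUE": [-0.83, -2.16, 2.01, 0.98, 1.80, -0.18, -1.62, 0.96, -1.16, 0.20],
    "MOM":   [0.101, -0.254, -0.157, 0.310, 0.206, -0.034, -0.172, 0.111, -0.086, -0.026],
}

summary = {}
for label, weights in rows.items():
    h = np.array(weights)
    summary[label] = (h.sum(), np.abs(h).sum())   # (net, gross)
    print(f"{label:5s} net={summary[label][0]:.3f} gross={summary[label][1]:.3f}")
```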

**Purity has a price: leverage.** The VALUE portfolio's gross leverage is 11.9x. To deliver value exposure with zero industry exposure in a 10-stock universe where value and industry are strongly entangled (the financials are all cheap, the techs all expensive), the portfolio must take large offsetting positions within each industry. The MOM portfolio, whose exposure pattern is less collinear with the industries, needs only 1.46x gross. The pattern generalizes: the more a factor's exposures are explainable by other factors, the more leveraged, costly, and crowded its pure portfolio. This is multicollinearity in portfolio space, and a preview of the redundancy tests of [Chapter 15](15-modifying-the-model.md).
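The link between collinearity and leverage can be illustrated with a synthetic universe. In the sketch below (all inputs are made up, and equal regression weights are assumed for simplicity), a style exposure is a mix of an industry-driven component with loading `rho` and a stock-specific component. As `rho` approaches 1 the style becomes almost fully explained by industry membership, and the gross leverage of its pure portfolio grows like $1/\sqrt{1-\rho^2}$.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 200
industry = np.arange(N) % 2
D = np.column_stack([industry == 0, industry == 1]).astype(float)  # industry dummies
tilt = np.where(industry == 0, 1.0, -1.0)   # industry-driven part of the style score
noise = rng.normal(size=N)                  # stock-specific part of the style score

def pure_style_gross(rho):
    """Gross leverage of the pure style portfolio for a given industry loading rho."""
    style = rho * tilt + np.sqrt(1.0 - rho**2) * noise
    X = np.column_stack([D, style])
    P = np.linalg.solve(X.T @ X, X.T)       # equal weights (W = I), an assumption
    return np.abs(P[-1]).sum()

gross = {rho: pure_style_gross(rho) for rho in (0.0, 0.9, 0.99)}
for rho, g in gross.items():
    print(f"rho={rho:.2f}  gross={g:.2f}")
```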

## 7.3 Characteristic portfolios: the optimization view

The regression route is not the only way to build a unit-exposure portfolio. Grinold & Kahn's _characteristic portfolio_ of a characteristic (exposure vector) $x$ is the minimum-variance portfolio with unit exposure to it:

$$\min_h\; h^\top \Sigma\, h \quad \text{s.t.} \quad x^\top h = 1 \qquad\Longrightarrow\qquad h_x = \frac{\Sigma^{-1} x}{x^\top \Sigma^{-1} x}.$$

The derivation is in the [appendix](17-appendix.md). This portfolio has minimum variance but not, in general, zero exposure to the other factors: it neutralizes other exposures only insofar as that reduces variance.
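A small numerical sketch of $h_x$ follows. The covariance is assumed to have the factor-model form $\Sigma = X F X^\top + \Delta$, with a made-up factor covariance `F` and made-up specific variances `delta`. These inputs are illustrative assumptions, not values from the MiniModel.

```python
import numpy as np

rng = np.random.default_rng(4)
N = 30
# Hypothetical exposures: MKT (ones), VALUE, MOM.
X = np.column_stack([np.ones(N), rng.normal(size=N), rng.normal(size=N)])
F = np.diag([0.04, 0.01, 0.01])             # assumed factor covariance
delta = rng.uniform(0.02, 0.10, size=N)     # assumed specific variances
Sigma = X @ F @ X.T + np.diag(delta)

x = X[:, 1]                                 # the characteristic: VALUE exposures
Sx = np.linalg.solve(Sigma, x)              # Sigma^{-1} x
h_x = Sx / (x @ Sx)                         # characteristic portfolio

var_h = h_x @ Sigma @ h_x
print("exposure to x :", x @ h_x)
print("all exposures :", np.round(X.T @ h_x, 3))
print("variance      :", var_h, "vs 1/(x' Sigma^-1 x) =", 1.0 / (x @ Sx))

# Another portfolio with unit exposure to x, for comparison.
h_alt = x / (x @ x)
print("variance of x/(x'x):", h_alt @ Sigma @ h_alt)
```

The printed exposures to the other two factors are generally nonzero, which is the point of the paragraph above: $h_x$ trades purity for variance.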

**When the two constructions coincide:** Run the [Chapter 6](06-estimating-factor-returns.md) regression as Generalized Least Squares (GLS) with $W = \Delta^{-1}$ (inverse specific variance) and consider the resulting pure portfolio for factor $k$. It can be shown that this portfolio is exactly the minimum-variance portfolio with unit exposure to factor $k$ _and zero exposure to every other factor_ (see the [appendix](17-appendix.md)). This is not the single-constraint $h_x$ defined just above, which leaves the other exposures free to float wherever variance is lowest. It is the characteristic-portfolio problem with the other factors' exposures pinned to zero as added constraints. In that sense the regression with statistically optimal weights solves a constrained version of the characteristic-portfolio problem. The $\sqrt{\text{cap}}$ convention is an approximation to this ideal. This is the deep link between estimation efficiency and portfolio efficiency: better regression weights = lower-variance factor portfolios = less noisy factor returns.
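The coincidence can be verified numerically under the assumption that $\Sigma = X F X^\top + \Delta$ with diagonal $\Delta$. In the sketch below, `F` and `delta` are made-up inputs. The constrained minimum-variance problem $\min_h h^\top \Sigma h$ subject to $X^\top h = e_k$ has the closed-form solution $h = \Sigma^{-1} X (X^\top \Sigma^{-1} X)^{-1} e_k$, which is compared against the rows of the GLS $P$.

```python
import numpy as np

rng = np.random.default_rng(5)
N, K = 30, 3
X = np.column_stack([np.ones(N), rng.normal(size=(N, K - 1))])
B = rng.normal(size=(K, K))
F = B @ B.T / 100 + 0.01 * np.eye(K)      # assumed factor covariance (positive definite)
delta = rng.uniform(0.02, 0.10, size=N)   # assumed specific variances (diagonal of Delta)
Sigma = X @ F @ X.T + np.diag(delta)

# GLS pure portfolios: rows of P with W = Delta^{-1}.
XtW = X.T / delta
P_gls = np.linalg.solve(XtW @ X, XtW)

# Minimum variance subject to X' h = e_k, solved for all k at once:
# column k of H is the constrained minimum-variance portfolio for factor k.
SinvX = np.linalg.solve(Sigma, X)
H = SinvX @ np.linalg.inv(X.T @ SinvX)
print(np.allclose(P_gls, H.T))

# Pure portfolios from equal weights are just as pure but have higher variance.
P_eq = np.linalg.solve(X.T @ X, X.T)
var_gls = np.einsum("ki,ij,kj->k", P_gls, Sigma, P_gls)
var_eq = np.einsum("ki,ij,kj->k", P_eq, Sigma, P_eq)
print(np.round(var_gls, 5), np.round(var_eq, 5))
```

Both sets of portfolios share the same factor variance $e_k^\top F e_k$. The difference lies entirely in specific variance, which is what the regression weights control.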

## 7.4 Practical properties of factor portfolios

Treating $P$'s rows as real portfolios exposes practical frictions:

- **Leverage and shorting:** Pure style portfolios are long–short with gross leverage typically 2–8x (per unit exposure) in realistic universes. The MiniModel VALUE portfolio's 11.9x above is inflated by its tiny, entangled 10-stock universe and is not representative. Implementing them requires holding a small position in essentially every stock in the universe, many of them short: feasible for an index arb desk, not for most investors.
- **Turnover:** Exposures move every period (momentum especially), so $P$ changes every period. Pure momentum portfolios turn over several hundred percent annually.
- **Liquidity:** Limited liquidity can make it difficult to get the necessary position size for some names or find stocks to borrow for a short position.
- **Trading costs:** Illiquid names with high bid/ask spreads, names that are hard to borrow, and high turnover all lead to expensive trading costs. Factor returns as estimated are gross of transaction costs. In practice the trading costs can be a significant drag on those returns.
- **Capacity and crowding:** The same trades concentrate in the same names for everyone running similar models. This is particularly an issue if you're using the same vendor models used by the majority of market participants.

## 7.5 Tradable factor portfolios

The frictions just listed make pure factor portfolios uninvestable in practice. The fix is a tradable factor portfolio: give up purity for investability by optimizing for low tracking error to the pure factor subject to real-world constraints (see [Chapter 11](11-portfolio-construction.md)'s machinery). Which constraints apply depends on the mandate. Common constraints are:

- **Position size controls:** capping the number of names or the gross position sizes.
- **Turnover controls:** capping the period-to-period rebalancing that pure momentum especially demands.
- **Trading restrictions:** dropping names that cannot be traded for compliance reasons.
- **Long-only or 130/30:** scaling down or dropping the short leg to reduce gross leverage.

Every constraint reintroduces the incidental exposures that $PX = I$ removed. A long-only value fund cannot short the expensive names, so it inherits whatever market, industry, and size tilts the cheap names carry. It tracks the pure factor with error. That gap has two parts: tracking error (the return wanders from the pure factor's) and cost drag (the transaction costs the pure return ignores). Both are measurable, and measuring them is the honest way to judge one of these portfolios.
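The reappearance of incidental exposures can be shown with a deliberately crude stand-in for a long-only constraint. The sketch below does not run the tracking-error optimization described above. It simply drops the short leg of a pure value portfolio and rescales to unit value exposure, on a made-up universe where value is entangled with industry.

```python
import numpy as np

rng = np.random.default_rng(6)
N = 100
industry = np.arange(N) % 2
# Hypothetical value scores: industry 0 is cheap on average, industry 1 expensive.
value = rng.normal(size=N) + np.where(industry == 0, 0.8, -0.8)
X = np.column_stack([industry == 0, industry == 1, value]).astype(float)

P = np.linalg.solve(X.T @ X, X.T)         # equal weights (W = I), an assumption
pure = P[2]                               # pure VALUE portfolio

# Crude long-only version: keep only the long leg, rescale to unit value exposure.
long_only = np.clip(pure, 0.0, None)
long_only = long_only / (X[:, 2] @ long_only)

exp_pure = X.T @ pure
exp_long = X.T @ long_only
print("pure exposures     :", np.round(exp_pure, 3))
print("long-only exposures:", np.round(exp_long, 3))
```

The long-only version keeps its unit value exposure but picks up positive exposure to both industries, which the pure portfolio had neutralized. A real tradable portfolio would choose its weights by optimization rather than clipping, but it faces the same trade-off.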

Index providers and ETF sponsors sell ready-made versions, far easier than building and trading the book yourself. They carry the same gap plus a management fee, and that total is the recurring disappointment in "smart beta" products: a fund sold as "the value premium" delivers that premium minus tracking error, minus costs, minus fee, which in a thin or crowded factor can erase it entirely.

## 7.6 The Fama–MacBeth procedure

The academic ancestor of [Chapter 6](06-estimating-factor-returns.md)'s regression, still the standard tool for testing whether a characteristic carries a _risk premium_:

1. **First pass (time series):** estimate each stock's exposures $\hat\beta_i$ by regressing its returns on candidate factor return series (when exposures are not directly measurable).
2. **Second pass (cross-section, each period):** regress the cross-section of returns on the exposures: $r_{it} = \gamma_{0t} + \gamma_t^\top \hat\beta_i + \epsilon_{it}$, exactly a [Chapter 6](06-estimating-factor-returns.md) regression, producing a time series $\{\hat\gamma_t\}$ of premia estimates.
3. **Inference:** the estimated premium is the time-series mean $\bar\gamma = \frac{1}{T}\sum_t \hat\gamma_t$, with standard error $\mathrm{se}(\bar\gamma) = \mathrm{sd}(\hat\gamma_t)/\sqrt{T}$. The trick that makes this work: the period-by-period $\hat\gamma_t$ are (nearly) serially uncorrelated, so the time-series variability of the estimates _is_ the right uncertainty measure. When the $\hat\beta$'s are themselves estimated, the Shanken correction inflates the standard errors to account for the resulting errors-in-variables problem.
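Steps 2 and 3 can be sketched on simulated data. The example below skips the first pass by treating exposures as known, uses plain OLS in each cross-section, and applies no Shanken correction. The premium, volatilities, and panel dimensions are arbitrary simulation assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
N, T = 200, 240
beta = rng.normal(loc=1.0, scale=0.4, size=N)   # exposures, taken as known here
true_premium = 0.005                            # assumed per-period premium

Z = np.column_stack([np.ones(N), beta])         # intercept and exposure
gammas = np.empty((T, 2))                       # columns: gamma_0t, gamma_t
for t in range(T):
    factor_return = true_premium + rng.normal(scale=0.03)
    r_t = factor_return * beta + rng.normal(scale=0.08, size=N)
    coef, *_ = np.linalg.lstsq(Z, r_t, rcond=None)   # second pass for period t
    gammas[t] = coef

gamma_bar = gammas.mean(axis=0)                 # premium estimates
se = gammas.std(axis=0, ddof=1) / np.sqrt(T)    # Fama-MacBeth standard errors
t_stat = gamma_bar / se
print("premium:", gamma_bar[1], "se:", se[1], "t:", t_stat[1])
```

The loop body is the same cross-sectional regression a fundamental model runs every period. The only Fama–MacBeth-specific part is the final averaging across time.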

The connection to this primer: a fundamental model runs the second pass every period with _measured_ exposures (no first pass, no errors-in-variables) and cares about the whole distribution of $\hat f_t$ (for risk), not just its mean (the premium). Same regression, different consumer.

## 7.7 Sorted-portfolio factors (the academic construction)

Fama–French factors are built without regression: rank stocks by a characteristic, form cap-weighted buckets, go long the top and short the bottom. For example, HML is the average of the two value (high B/M) portfolios minus the average of the two growth (low B/M) portfolios within a 2x3 size–value sort. SMB is the small-minus-big analog, and UMD is built likewise for momentum.
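A minimal sketch of the 2x3 sort on made-up data follows. The breakpoints used here (median size, 30th and 70th percentiles of B/M, both taken over the whole toy universe) follow the usual Fama–French convention but are an assumption of this example. The published factors use NYSE breakpoints and a specific rebalancing calendar that are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(8)
N = 300
cap = rng.lognormal(mean=7.0, sigma=1.5, size=N)   # market caps (made up)
bm = rng.lognormal(mean=-0.5, sigma=0.6, size=N)   # book-to-market ratios (made up)
r = rng.normal(scale=0.06, size=N)                 # one period of returns (made up)

big = cap >= np.median(cap)
lo, hi = np.quantile(bm, [0.3, 0.7])
growth, value = bm <= lo, bm >= hi

def bucket(mask):
    """Cap weights within a bucket, zero for stocks outside it."""
    w = np.where(mask, cap, 0.0)
    return w / w.sum()

# HML as a weight vector: average of the two value buckets minus
# average of the two growth buckets.
hml = (0.5 * (bucket(~big & value) + bucket(big & value))
       - 0.5 * (bucket(~big & growth) + bucket(big & growth)))

print("HML return:", hml @ r)
print("net:", hml.sum(), "gross:", np.abs(hml).sum())
```

Written as a weight vector, the sorted factor is directly comparable to a row of $P$: it is dollar-neutral with gross leverage of 2, and its exposures $X^\top h$ to other factors can be computed the same way as for any portfolio.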

|                       | Regression (pure)                        | Sorted (HML-style)                                                                                            |
| --------------------- | ---------------------------------------- | ------------------------------------------------------------------------------------------------------------- |
| Construction          | row of $P$                               | rank, bucket, long–short                                                                                      |
| Other-factor exposure | **zero by construction**                 | **incidental:** HML carries industry tilts (long financials/utilities, short tech in most eras) and size tilt |
| Leverage              | higher, less investable                  | ~2x gross, plausibly investable                                                                               |
| Transparency          | requires the whole model                 | extremely simple, replicable from a sort                                                                      |
| Best use              | risk decomposition, attribution, hedging | academic asset-pricing tests, long-horizon premia                                                             |

The incidental exposures are not a footnote: a notable fraction of HML's month-to-month variance is industry movement, so "value had a bad month" measured by HML can mean "financials had a bad month." When a clean answer to "what did value do?" is needed, as in attribution ([Chapter 10](10-performance-attribution.md)), the pure factor return is the right object.

The original Fama–MacBeth (1973) tests, like other early studies, used an equal-weighted index of NYSE stocks as the market proxy rather than the cap-weighted index standard today. Both that choice and the sort-based construction have a practical root: computation was scarce. A cap-weighted index needs every stock's shares outstanding and price each period. An equal-weighted average or a rank-and-bucket sort needs far less. What is a trivial calculation today was a real cost when these methods were developed.

## 7.8 What factor portfolios are used for

1. **Interpreting the model:** Every factor return in every later chapter is a portfolio return. When attribution says "momentum cost you 65bp in month 1" ([Chapter 10](10-performance-attribution.md)), it means "a position you implicitly held, a specific dollar amount of the pure momentum portfolio, lost 65bp."
2. **Implementing views:** Want +0.5 of value exposure with nothing else? Hold 0.5x the pure value portfolio (or its tradable version).
3. **Hedging:** Pure factor portfolios are the natural instruments for removing a factor exposure ([Chapter 12](12-hedging.md)), implemented in practice via the liquid proxies (tradable factor portfolios, futures, factor ETFs) that track them.
4. **Products:** Smart beta and factor ETFs are industrialized, constrained versions of these portfolios.
5. **Diagnostics:** A factor whose pure portfolio has exploded in leverage is collinear with others, a structural warning ([Chapter 15](15-modifying-the-model.md)).
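Items 2 and 3 reduce to simple arithmetic on rows of $P$. The sketch below uses a hypothetical four-factor universe with equal regression weights and a random long-only portfolio `h`, none of which come from the MiniModel. It scales the pure value portfolio to implement a view, then subtracts the right multiple of it to hedge an existing portfolio's value exposure.

```python
import numpy as np

rng = np.random.default_rng(9)
N = 80
factors = ["IND_A", "IND_B", "VALUE", "MOM"]     # hypothetical factor names
industry = np.arange(N) % 2
X = np.column_stack([industry == 0, industry == 1,
                     rng.normal(size=N), rng.normal(size=N)]).astype(float)
P = np.linalg.solve(X.T @ X, X.T)                # pure portfolios (W = I, an assumption)
k = factors.index("VALUE")

# Implementing a view: 0.5x the pure VALUE portfolio.
view = 0.5 * P[k]
print("view exposures  :", np.round(X.T @ view, 10))

# Hedging: remove an existing portfolio's VALUE exposure, leave the rest alone.
h = rng.dirichlet(np.ones(N))                    # some long-only portfolio
b = X.T @ h                                      # its factor exposures
hedged = h - b[k] * P[k]
print("before hedge    :", np.round(b, 3))
print("after hedge     :", np.round(X.T @ hedged, 3))
```

Because the pure portfolio has zero exposure to everything except its own factor, the hedge changes only the VALUE entry of the exposure vector.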

## 7.9 Summary

- $\hat f = Pr$: estimation is portfolio formation. Pure factor portfolio $k$ has unit exposure to factor $k$, zero to the rest ($PX = I$, suitably qualified under constraints).
- The optimization view (characteristic portfolios, $h$ proportional to $\Sigma^{-1}x$) and the regression view coincide under GLS weighting once the other factors' exposures are constrained to zero: efficiency in estimation and efficiency in portfolio variance are the same property.
- Leverage and turnover are the cost of purity. Tradable versions give up purity for implementability. Sorted academic factors sit at the simple-but-impure end of the spectrum.

---

_Next: [Chapter 8: Risk Model Assembly](08-risk-model-assembly.md)_
