with Raymond Kan and Jay Shanken, 2013, Journal of Finance 68, pp. 2617-2649. [Paper] [Internet Appendix] [Additional Material] [Code]. Over the years, many asset pricing studies have employed the sample cross-sectional regression (CSR) R² as a measure of model performance. We derive the asymptotic distribution of this statistic and develop associated model comparison tests, taking into account the impact of model misspecification on the variability of the CSR estimates. We encounter several examples of large R² differences that are not statistically significant. A version of the intertemporal CAPM exhibits the best overall performance, followed by the three-factor model of Fama and French (1993). Interestingly, the performance of prominent consumption CAPMs is sensitive to variations in experimental design.
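For readers unfamiliar with the statistic, the sample CSR R² can be illustrated with a minimal two-pass sketch. This is not the authors' code and not their test procedure. It assumes an OLS second pass with an intercept, and the function name `csr_r2` and its arguments are hypothetical. It computes only the point estimate, not the asymptotic distribution or the model comparison tests described in the abstract.

```python
import numpy as np

def csr_r2(returns, factors):
    """Sample OLS cross-sectional R^2 from a two-pass regression (illustrative).

    returns: T x N array of test-asset returns (assumed input layout)
    factors: T x K array of factor realizations (assumed input layout)
    """
    T = returns.shape[0]

    # First pass: time-series regression of each asset on the factors
    # (with a constant) to estimate the betas.
    X = np.column_stack([np.ones(T), factors])
    coef, *_ = np.linalg.lstsq(X, returns, rcond=None)  # (K+1) x N
    betas = coef[1:].T                                   # N x K

    # Second pass: cross-sectional OLS of average returns on a constant
    # and the estimated betas.
    mu = returns.mean(axis=0)
    Z = np.column_stack([np.ones(mu.shape[0]), betas])
    gamma, *_ = np.linalg.lstsq(Z, mu, rcond=None)

    # R^2 = 1 - (sum of squared pricing errors) /
    #           (sum of squared deviations of mean returns from their average)
    resid = mu - Z @ gamma
    dev = mu - mu.mean()
    return 1.0 - (resid @ resid) / (dev @ dev)
```

Because this R² is a sample statistic, two models can show very different values that are still statistically indistinguishable, which is why the paper develops formal comparison tests rather than relying on the point estimate alone.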