# PMSAMPSIZE: Stata module to calculate the minimum sample size required for developing a multivariable prediction model

*Joie Ensor*

Additional contact information

Joie Ensor: Institute of Applied Health Research, University of Birmingham

Statistical Software Components from Boston College Department of Economics

**Abstract:**
pmsampsize computes the minimum sample size required to develop a new multivariable prediction model, using the criteria proposed by Riley et al. (2018). It supports models with continuous, binary or survival (time-to-event) outcomes. The criteria aim to minimise overfitting and to ensure precise estimation of key parameters in the prediction model.

For continuous outcomes, there are four criteria: i) small overfitting, defined by an expected shrinkage of predictor effects of 10% or less; ii) a small absolute difference (0.05 or less) between the model's apparent and adjusted R-squared values; iii) precise estimation of the residual standard deviation; and iv) precise estimation of the average outcome value. The calculation requires the user to pre-specify (e.g. based on previous evidence) the anticipated R-squared of the model, and the average outcome value and standard deviation of outcome values in the population of interest.

For binary or survival (time-to-event) outcomes, there are three criteria: i) small overfitting, defined by an expected shrinkage of predictor effects of 10% or less; ii) a small absolute difference (0.05 or less) between the model's apparent and adjusted Nagelkerke's R-squared values; and iii) precise estimation (within +/- 0.05) of the average outcome risk in the population at a key timepoint of interest for prediction.
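To make the three binary-outcome criteria concrete, the following Python sketch works through the arithmetic using the closed-form expressions as they are commonly stated for the Riley et al. approach. It is an illustration only, not the module's code: the function names are hypothetical, the 1.96 multiplier assumes a 95% confidence interval for criterion iii, and the inputs (24 candidate parameters, an anticipated Cox-Snell R-squared of 0.288, an outcome prevalence of 0.174) are assumed values chosen for the example. Criterion ii is expressed through the Cox-Snell R-squared and its maximum attainable value, since Nagelkerke's R-squared is the ratio of the two. Survival outcomes are not covered.

```python
import math


def n_for_shrinkage(p, r2_cs, shrinkage=0.9):
    """Criterion i: sample size giving an expected shrinkage factor of
    at least `shrinkage` (0.9 corresponds to 10% shrinkage)."""
    return math.ceil(p / ((shrinkage - 1) * math.log(1 - r2_cs / shrinkage)))


def max_r2_cs(prevalence):
    """Largest Cox-Snell R-squared attainable for a binary outcome,
    derived from the log-likelihood of the intercept-only model."""
    ln_l_null = (prevalence * math.log(prevalence)
                 + (1 - prevalence) * math.log(1 - prevalence))
    return 1 - math.exp(2 * ln_l_null)


def n_for_r2_difference(p, r2_cs, prevalence, delta=0.05):
    """Criterion ii: apparent and adjusted Nagelkerke R-squared differ
    by at most `delta`; this reduces to criterion i with a stricter
    shrinkage target."""
    shrinkage = r2_cs / (r2_cs + delta * max_r2_cs(prevalence))
    return n_for_shrinkage(p, r2_cs, shrinkage)


def n_for_overall_risk(prevalence, margin=0.05):
    """Criterion iii: overall outcome risk estimated to within
    +/- `margin` (assumes a 95% confidence interval)."""
    return math.ceil((1.96 / margin) ** 2 * prevalence * (1 - prevalence))


def minimum_sample_size(p, r2_cs, prevalence):
    """Return the largest of the three criterion-specific sample sizes,
    together with the individual values."""
    criteria = {
        "shrinkage": n_for_shrinkage(p, r2_cs),
        "r2_difference": n_for_r2_difference(p, r2_cs, prevalence),
        "overall_risk": n_for_overall_risk(prevalence),
    }
    return max(criteria.values()), criteria


# Illustrative inputs (assumed for this example, not from the module's docs).
n, detail = minimum_sample_size(p=24, r2_cs=0.288, prevalence=0.174)
print(n, detail)
```

With these inputs the three criteria require 623, 662 and 221 participants respectively, so the binding constraint is criterion ii and the minimum sample size is 662. For real calculations the module itself should be used, as it also handles continuous and survival outcomes.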

**Language:** Stata

**Requires:** Stata version 12.1

**Keywords:** sample size; power; overfitting

**Date:** 2018-12-04, Revised 2023-12-04

**Note:** This module should be installed from within Stata by typing "ssc install pmsampsize". The module is made available under terms of the GPL v3 (https://www.gnu.org/licenses/gpl-3.0.txt). Windows users should not attempt to download these files with a web browser.

**Downloads:**

http://fmwww.bc.edu/repec/bocode/p/pmsampsize.ado program code (text/plain)

http://fmwww.bc.edu/repec/bocode/p/pmsampsize.sthlp help file (text/plain)

**Persistent link:** https://EconPapers.repec.org/RePEc:boc:bocode:s458569

**Ordering information:** This software item can be ordered from

http://repec.org/docs/ssc.php

Bibliographic data for series maintained by Christopher F Baum.