Ann Thorac Surg 2004;77:1142-1144
© 2004 The Society of Thoracic Surgeons


The statistician's page

Bootstrap resampling methods: something for nothing?

Gary L. Grunkemeier, PhDa*, YingXing Wu, MDa

a Providence Health System, Portland, Oregon, USA

* Address reprint requests to Dr Grunkemeier, Providence St. Vincent Hospital and Medical Center, 9205 SW Barnes Rd, Suite 33, Portland, OR 97225, USA
e-mail: gary.grunkemeier{at}providence.org

The paper by Brunelli and colleagues [1] in this issue of The Annals of Thoracic Surgery used bootstrap resampling to select the final variables for a logistic regression model to predict air leak after pulmonary lobectomy. Bootstrapping is a generic methodology, whose implementation involves a simple yet powerful principle: creating many repeated data samples from the single one in hand, and making inference from those samples. The name derives from the expression "pulling oneself up by one's bootstraps," meaning no outside help (additional data, parametric assumptions) is involved. It seems almost magical, like getting something for nothing. Bootstrapping can be used for many statistical purposes besides variable selection. One of the most common is to compute a confidence interval (CI). As a demonstration of the technique, we will compute a bootstrap CI for the O/E (observed-to-expected) ratio of mortality after myocardial infarction.

See page 1205

Background and rationale

Bootstrap methods are more than 20 years old [2], but they are computer-intensive and have only recently become widely available in statistical programs. Powerful and widely available computing capability is transforming statistics. Originally, computers merely sped up data handling and statistical computations. Now they enable exploratory analysis of gigantic data sets (data mining), the creation of neural network algorithms, the analysis of complex genomic data, and the supplementation of traditional statistical methods with large numbers of data simulations.

Simulation

Typically, we collect a data sample and compute a statistic of interest, say the mean value of the individual data points. We realize there is variability associated with this estimate, such that if we, or others, repeated our experiment or data collection maneuvers, the estimate would be different. We need a measure of the variability of this statistic, say the standard deviation (SD) or a CI. The conventional method is to assume we know the distribution of the statistic of interest, often the familiar normal (bell-shaped) distribution. Even if the underlying population's distribution is not normal, the distribution of the statistic itself may be approximately normal if the sample is large enough. (Often, however, we do not have a large enough sample for this asymptotic distribution to hold.) Given the distribution, we can use the formula for its SD and apply it to our sample.

But if we knew the distribution of the statistic, we would not need any formulas for it. We could simply use the computer to determine its properties by repeated simulation, randomly making many selections from the distribution in question. Figure 1 confirms this assertion. The histograms represent four computer simulations with increasing sample sizes from the standard normal distribution (with mean = 0 and SD = 1). As the number of simulated values increases, the distribution is approximated to as close a degree as desired. The simulation in the last panel produced a distribution with a mean of 0.003 and SD of 1.002. The formulas tell us that for the standard normal distribution, 95% of the values are between the critical values of −1.960 and +1.960 (this is where the "plus and minus 2 SD" for 95% limits comes from). The corresponding values from the last panel in Figure 1 are −1.959 and +1.959. We could make these even closer to the theoretical values by adding more simulations (on a 0.8-GHz laptop it took less than 0.5 seconds to generate 100,000 and less than 4 seconds to generate 1,000,000), but this gives more precision than needed.



Fig 1. Simulation (random draws) from a population can reproduce the original distribution exactly, if enough draws are made. With increasing sample size, simulations from a standard normal distribution can approximate the underlying distribution to whatever accuracy is desired.
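
The simulation idea behind Figure 1 can be sketched in a few lines of Python. This is a minimal illustration, not the program used for the figure: the function names, the random seed, and the four sample sizes are assumptions chosen for the example, and the exact numbers printed will differ from those quoted in the text.

```python
import random
import statistics


def simulate_standard_normal(n_draws, seed=0):
    """Draw n_draws values from the standard normal (mean 0, SD 1)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n_draws)]


def percentile(values, q):
    """Empirical quantile (0 <= q <= 1) by linear interpolation."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1 - frac) + ordered[upper] * frac


# Assumed sample sizes; the article does not list the four it used.
for n in (100, 1_000, 10_000, 100_000):
    draws = simulate_standard_normal(n, seed=1)
    print(
        n,
        round(statistics.mean(draws), 3),
        round(statistics.stdev(draws), 3),
        round(percentile(draws, 0.025), 3),
        round(percentile(draws, 0.975), 3),
    )
```

As the sample size grows, the simulated mean, SD, and 2.5th and 97.5th percentiles should settle near 0, 1, −1.96, and +1.96, which is the behavior the figure illustrates.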

 
Resampling

This exercise confirms the fact that simulations can recreate critical values for any known distribution. But there are other situations in which we do not know or do not want to assume the distribution. In such cases we can use a simulation technique, similar to that demonstrated in Figure 1, called bootstrap resampling. Instead of generating observations from a known theoretical distribution as before, we generate observations from the distribution of the sample itself (the empirical distribution). Each simulation results in a new sample, typically of the same size as the original, by randomly selecting (with replacement) individuals from the original sample. With replacement means that at each step in the selection process, every individual from the original sample is again eligible to be selected, whether or not that individual has already been selected. Thus, in each bootstrap sample, some of the original individuals may not be represented and others may be represented more than once.
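
Sampling with replacement is easy to see on a toy data set. The sketch below is illustrative only: the ten numbered "individuals" and the seed are hypothetical, and `bootstrap_sample` is a name introduced here, not one from the article.

```python
import random
from collections import Counter


def bootstrap_sample(sample, rng):
    """Return a resample of the same size, drawn with replacement."""
    return rng.choices(sample, k=len(sample))


rng = random.Random(42)
original = list(range(1, 11))  # hypothetical individuals labeled 1 to 10
resample = bootstrap_sample(original, rng)

counts = Counter(resample)
missing = [x for x in original if x not in counts]      # never selected
repeated = [x for x, c in counts.items() if c > 1]      # selected more than once

print("resample:", resample)
print("not represented:", missing)
print("represented more than once:", sorted(repeated))
```

For a sample of size n, the chance that a given individual is left out of one bootstrap sample is (1 − 1/n)^n, which approaches 1/e (about 37%) as n grows, so a typical bootstrap sample contains roughly 63% of the original individuals.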

Example: confidence interval for the observed-to-expected ratio

A common risk-adjusted measure of mortality is the O/E ratio, the observed mortality (or number of deaths) divided by the expected mortality (or number of deaths) predicted by a risk model. If the O/E ratio is greater than 1, the observed mortality is higher than expected; if the O/E ratio is less than 1, the observed mortality is lower than expected.

We will demonstrate the bootstrap method by deriving a 95% CI for the O/E ratio for mortality after myocardial infarction with ST elevation (STEMI). Several Providence Health System hospitals participate in the National Registry of Myocardial Infarction (NRMI), sponsored by Genentech Inc (South San Francisco, CA). From October 2002 through September 2003, nine Providence hospitals treated 913 STEMI patients, of whom 105 died, for an observed in-hospital mortality of 11.50%. The expected mortality was derived from the NRMI national registry of 36,214 myocardial infarction patients from 1,288 participating hospitals. Table 1 contains data stratified by TIMI (thrombolysis in myocardial infarction) risk scores [3, 4]. For each level of risk (TIMI score), the expected number of deaths equals the number of patients in that level times the expected mortality (expressed as a decimal). The overall Providence Health System mortality was 11.50%, and the expected mortality was 11.04%, for an O/E ratio of 1.04 (11.50 divided by 11.04).


Table 1. Observed and Expected Deaths of Patients With Myocardial Infarction With ST Elevation by Thrombolysis in Myocardial Infarction Risk Scores
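
The O/E arithmetic can be written out as a short sketch. Because the stratified rows of Table 1 are not reproduced here, the example treats all 913 patients as a single stratum with the overall expected mortality of 11.04%; that single-stratum input and the function names are assumptions made for illustration. With the real table, one (patients, expected mortality) pair per TIMI score would be supplied instead.

```python
def expected_deaths(strata):
    """Sum of patients x expected mortality (as a decimal) over risk strata."""
    return sum(n_patients * mortality for n_patients, mortality in strata)


def oe_ratio(observed_deaths, strata):
    """Observed deaths divided by expected deaths."""
    return observed_deaths / expected_deaths(strata)


# Aggregate figures from the text, used as one stratum (an assumption;
# Table 1 splits the same patients by TIMI risk score).
strata = [(913, 0.1104)]

print(f"expected deaths: {expected_deaths(strata):.1f}")
print(f"O/E ratio: {oe_ratio(105, strata):.2f}")  # prints "O/E ratio: 1.04"
```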

 
Like all statistics, a point estimate, such as the O/E ratio, is of little value unless it is accompanied by an interval estimate, which measures its precision. If the 95% CI includes 1, then there is insufficient evidence to say that the ratio is statistically different from 1. In general, only if the lower confidence limit is greater than 1 should one conclude that the mortality is worse than expected, and only if the upper confidence limit is less than 1 should one conclude that the mortality is better than expected. (And even then, for many technical reasons, including the multiple-comparison problem, one should be cautious in such conclusions.) The CI protects us from overreacting to the observed data.

Conventional method

The usual way to compute a CI uses a mathematical expression derived from assumptions about the underlying statistical distribution. For example, assume the statistic has a normal distribution, compute the SD of the statistic, and then use the fact that 95% of the values of a normal distribution are within ±1.96 SD of the mean. For the O/E ratio in our example, this method [5] gives a 95% CI of 0.86 to 1.22. This method has two shortcomings: the lower limit can become negative, and the CI is symmetric about the point estimate. But an O/E ratio cannot be negative, and its distribution is not symmetric. The smallest it can be is 0 (only if no deaths were observed), although it can range to an arbitrarily high value (when many more deaths than expected are observed). Thus, an alternative with better theoretical properties is based on a normal approximation to the logarithm of the O/E ratio [6], which produces an asymmetric, always-positive interval: 0.88 to 1.23 for the present example.
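
The two conventional intervals can be sketched as follows. This is not the exact procedure of references [5] and [6]: the sketch assumes the SD of the observed death count comes from a binomial model in which every patient has the same expected risk (11.04%), and it uses a delta-method standard error (SD divided by observed deaths) on the log scale. Because the published intervals were computed from the stratified data, this simplification will come close to them but will not reproduce them exactly.

```python
import math
from statistics import NormalDist


def normal_ci(observed, expected, sd_observed, level=0.95):
    """Symmetric interval: O/E plus or minus z x SD(O)/E."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    ratio = observed / expected
    half_width = z * sd_observed / expected
    return ratio - half_width, ratio + half_width


def log_ci(observed, expected, sd_observed, level=0.95):
    """Interval built on log(O/E) and transformed back; always positive."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    log_ratio = math.log(observed / expected)
    se_log = sd_observed / observed  # delta-method approximation (assumed)
    return math.exp(log_ratio - z * se_log), math.exp(log_ratio + z * se_log)


# Aggregate figures from the text; the common-risk binomial SD is an assumption.
n, observed, p = 913, 105, 0.1104
expected = n * p
sd_observed = math.sqrt(n * p * (1 - p))

print("normal approximation:", normal_ci(observed, expected, sd_observed))
print("log transformation:  ", log_ci(observed, expected, sd_observed))
```

The first interval is centered on the point estimate, whereas the second extends further above the estimate than below it, which is the asymmetry described in the text.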

Bootstrap method

To produce a bootstrap CI, the number of samples (B) to be generated from the original data set is specified, and for each sample the statistic of interest is computed. The range of values of the statistic is determined by the distribution of the observations in the original sample. The simplest bootstrap CI is simply the range within which 95% of these bootstrapped statistics fall. We used four increasing values of B, and for each value repeated the exercise five times, to determine the consistency of the resulting intervals (Fig 2). With B = 1,000, the endpoints of the interval were quite stable. These intervals were produced using the percentile method. Several transformations have been proposed for improving various properties of the bootstrap CI [7]; some of these were tried but gave similar results.



Fig 2. Bootstrap confidence intervals for the observed-to-expected (O/E) ratio. Five intervals were produced for each of four increasing numbers of bootstrap samples (B). As B increases, the different iterations converge to the same size intervals, within as close a tolerance as desired. The solid horizontal line represents O/E = 1, or observed = expected. All of the confidence intervals include this line, so there is no evidence of mortality different from expected. The dashed horizontal lines represent the interval computed from the usual normal approximation formula.
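
A percentile bootstrap CI for the O/E ratio can be sketched as below. The patient-level records are reconstructed from the aggregate counts in the text (105 deaths among 913 patients), and every patient is assigned the overall expected risk of 11.04%. That constant risk is a simplifying assumption, since the actual analysis had a TIMI-based expected risk for each patient. The function names and seeds are likewise illustrative, so the endpoints printed will not match Figure 2 exactly.

```python
import random


def oe_ratio(patients):
    """patients: list of (died, expected_risk) pairs; returns O/E."""
    observed = sum(died for died, _ in patients)
    expected = sum(risk for _, risk in patients)
    return observed / expected


def percentile(values, q):
    """Empirical quantile (0 <= q <= 1) by linear interpolation."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1 - frac) + ordered[upper] * frac


def bootstrap_percentile_ci(patients, n_boot=1000, level=0.95, seed=0):
    """Resample patients with replacement n_boot times; return the
    percentile interval of the resulting O/E ratios."""
    rng = random.Random(seed)
    n = len(patients)
    ratios = [oe_ratio(rng.choices(patients, k=n)) for _ in range(n_boot)]
    alpha = (1 - level) / 2
    return percentile(ratios, alpha), percentile(ratios, 1 - alpha)


# Reconstructed records: 105 deaths, 808 survivors, constant assumed risk.
ASSUMED_RISK = 0.1104
patients = [(1, ASSUMED_RISK)] * 105 + [(0, ASSUMED_RISK)] * 808

# Repeat with different seeds to see how stable the endpoints are at B = 1,000.
for seed in range(5):
    low, high = bootstrap_percentile_ci(patients, n_boot=1000, seed=seed)
    print(f"seed {seed}: {low:.2f} to {high:.2f}")
```

Rerunning with smaller values of `n_boot` should show the endpoints varying more from one seed to the next, which is the pattern of convergence the figure describes.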

 
Comparison

In our example, the bootstrap method, after only 1,000 resamplings, produced CIs similar to those of the normal approximation method. The advantage of the bootstrap is that no distributional assumptions are needed, which is particularly important for complex statistics for which such assumptions may be difficult to justify. Interestingly, of the two conventional intervals, the more usual method (indicated by dashed horizontal lines in Fig 2), not the one involving a logarithmic transformation and possessing theoretical advantages, corresponded better to the bootstrap intervals.

Comment

The birth of probability theory is usually taken to be 1654, when a French gambler sought the help of mathematicians to determine the probabilities of a dice game [8]. He wanted to learn, using equations derived from the properties of the statistical distribution, what the long-term results would be of his random throwing of the dice. It is ironic that we have now come full circle, using random simulation methods to derive properties of the statistical distribution. In reference to their close association with gambling, techniques using this simulation approach are called Monte Carlo methods. Two widely referenced books that provide a thorough treatment of bootstrapping methods are suggested for further reading [9, 10].

Acknowledgments

Patient data were supplied by the following hospitals: Providence Anchorage Medical Center (Anchorage, AK), Providence Everett Medical Center (Everett, WA), Providence Portland Medical Center (Portland, OR), Providence St. Vincent Medical Center (Portland, OR), Providence Milwaukie Hospital (Milwaukie, OR), Providence Newberg Hospital (Newberg, OR), Providence Seaside Hospital (Seaside, OR), Providence Medford Medical Center (Medford, OR), and Little Company of Mary Hospital (Torrance, CA). Data management for NRMI is provided by STATPROBE, Inc (Ann Arbor, MI), who kindly provided the stratified data shown in the table.

References

  1. Brunelli A, Monteverde M, Borri A, Salati M, Marasco RD, Fianchini A. Predictors of prolonged air leak after pulmonary lobectomy. Ann Thorac Surg 2004;77:1205-10.
  2. Efron B. Bootstrap methods: another look at the jackknife. Ann Statist 1979;7:1-26.
  3. Morrow DA, Antman EM, Charlesworth A, et al. TIMI risk score for ST-elevation myocardial infarction. A convenient, bedside, clinical score for risk assessment at presentation: an intravenous nPA for treatment of infarcting myocardium early II trial substudy. Circulation 2000;102:2031-2037.
  4. Morrow DA, Antman EM, Parsons L, et al. Application of the TIMI risk score for ST-elevation MI in the National Registry of Myocardial Infarction 3. JAMA 2001;286:1356-1359.
  5. Shwartz M, Ash AS, Iezzoni LI. Comparing outcomes across providers. In: Iezzoni LI, ed. Risk adjustment for measuring healthcare outcomes. Chicago: Health Administration Press, 1997:471-516.
  6. Hosmer DW, Lemeshow S. Confidence interval estimates of an index of quality performance based on logistic regression models. Stat Med 1995;14:2161-2172.
  7. Carpenter J, Bithell J, Swift MB. Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians. Stat Med 2000;19:1141-1164.
  8. Weaver W. Lady luck, the theory of probability. New York: Anchor Books, Doubleday and Co, 1963.
  9. Efron B, Tibshirani R. An introduction to the bootstrap. New York: Chapman and Hall, 1993.
  10. Davison AC, Hinkley DV. Bootstrap methods and their application. New York: Cambridge University Press, 1997.


