Ann Thorac Surg 2001;72:323-326
© 2001 The Society of Thoracic Surgeons


Statistician’s page

Receiver operating characteristic curve analysis of clinical risk models

Gary L. Grunkemeier, PhDa, Ruyun Jin, MDa

a Providence Health System, Portland, Oregon, USA

Address reprint requests to Dr Grunkemeier, 9155 SW Barnes, #33, Portland, OR 97225
e-mail: ggrunkemeier{at}providence.org


    Abstract
Receiver operating characteristic (ROC) curve analysis is a useful method to measure the ability of a clinical risk model to discriminate between hospital deaths and survivors. Its use in medicine originated as a method for synthesizing the specificity and sensitivity of diagnostic tests across a spectrum of possible cut points. The area under the ROC curve can be interpreted as a probability of correct classification or prediction. We illustrate its use in three steps: first, with a dichotomous variable to introduce specificity and sensitivity; next, with a categorical risk factor (surgical urgency) to produce a primitive ROC curve; and finally, with a continuous risk factor (age) to approximate the usual ROC curve used for clinical risk models.


    Introduction
This editorial inaugurates a planned series of articles, published at irregular intervals, designed to aid cardiothoracic surgeons in the interpretation and utility of statistics used in evaluating data obtained from both clinical and laboratory research. Dr Grunkemeier, Associate Editor for Statistics, is spearheading this effort that will include both full editorials with appropriate, highly selected references and/or linked commentary regarding the interpretation of a statistic used in one of the articles in that particular issue of The Annals.

L. Henry Edmunds, Jr, MD, Editor

There is increasing use of risk models for predicting outcomes from cardiac procedures. For each patient, such a model produces an estimated risk, which can then be compared with the observed outcome to evaluate the accuracy of the model. When the outcome is a continuous variable, such as cost or length of stay, the difference between the estimated value (dollars, days) and the observed value for a particular patient will usually be small if the model fits well.

But when the outcome is a binary (yes/no) variable, such as hospital mortality, and the estimated value (prediction) is a probability between 0 (0%) and 1 (100%), then the observed value—either no (0) or yes (1)—never exactly matches the prediction and agreement can only be achieved by aggregating groups of similar individuals. If 10 patients were all given a 10% risk of mortality and one of them died then, in aggregate, the prediction was correct, but it was wrong for each of the 10. It was off by 10% for those who lived and off by 90% for the one who died.
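
The 10-patient example can be checked with a few lines of arithmetic. The following is a minimal sketch for illustration only; the variable names are ours and are not part of the article's analysis.

```python
# Ten patients, each assigned a predicted mortality risk of 10%;
# exactly one of them died (1 = died, 0 = survived).
predicted = [0.10] * 10
observed = [1] + [0] * 9

# Aggregate agreement: expected deaths versus observed deaths.
expected_deaths = sum(predicted)
observed_deaths = sum(observed)

# Individual disagreement: absolute error for each patient.
errors = [abs(o - p) for o, p in zip(observed, predicted)]

print(f"expected deaths: {expected_deaths:.1f}, observed deaths: {observed_deaths}")
print(f"error for the patient who died: {errors[0]:.0%}")
print(f"error for each survivor: {errors[1]:.0%}")
```

The totals agree in aggregate, yet no individual prediction matches its observed outcome, which is why agreement has to be judged on groups of patients.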

Two different properties can be used to evaluate the predictive accuracy of such a model: reliability and discrimination [1]. The first property, also called calibration [2], measures the ability of the model to assign appropriate risk, and is evaluated by dividing the patients into groups according to expected risk and comparing expected-to-observed mortality in those groups (such as in the 10-patient example). The second property, also called resolution [3], measures the model’s ability to discriminate among those who die or live and is evaluated by receiver operating characteristic (ROC) curve analysis. Of the two, discrimination is more important, as model adjustments can be made to overcome poor calibration [1].


    Clinical material
To illustrate the derivation of ROC curves, we use data provided by a consortium of nine Providence Health System hospitals in four states. From 1995 to 2000, 14,601 patients underwent coronary artery bypass grafting, excluding concomitant or previous valve operations, but including other associated procedures.


    Dichotomous risk factor: specificity and sensitivity
Receiver operating characteristic curve analysis is a technique for assessing diagnostic tests, based on the notions of specificity and sensitivity, which can also be used to assess predictive models. There is an analogy between a diagnostic test for the presence of a disease, and a predictive model for hospital death (or other binary outcome). When a diagnostic test is given to detect the presence of a disease, the result of the test is either positive (has the disease) or negative (does not have the disease). In the case of a risk model for mortality, positive equates to high risk (for dying) and negative to low risk. Table 1 contains a simple risk model where a dichotomous variable based on surgical urgency is used to predict outcome (alive or dead). Compared to the observed outcome, the prediction is either true or false, resulting in one of four possible results for each patient: true positive, true negative, false positive, or false negative. The column percentages in Table 1 provide estimates of the sensitivity, the percentage of deaths correctly classified or predicted (60.2%), and specificity, the percentage of survivors correctly classified (54.3%).
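
As a hedged illustration, sensitivity and specificity can be computed from the four cells of a two-by-two table. Table 1 is not reproduced here, so the cell counts below are an assumption: they are back-calculated from the published totals (354 deaths, 14,247 survivors) and the published percentages, and may differ slightly from the actual table. The function name is ours.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) as fractions from 2x2 counts."""
    sensitivity = tp / (tp + fn)   # deaths correctly predicted as high risk
    specificity = tn / (tn + fp)   # survivors correctly predicted as low risk
    return sensitivity, specificity

# Totals reported in the article.
deaths, survivors = 354, 14_247

# ASSUMPTION: cell counts back-calculated from the published percentages
# (60.2% and 54.3%); they are approximations, not the Table 1 entries.
tp = round(0.602 * deaths)      # deaths classified as high risk
tn = round(0.543 * survivors)   # survivors classified as low risk
fn = deaths - tp                # deaths classified as low risk
fp = survivors - tn             # survivors classified as high risk

sens, spec = sensitivity_specificity(tp, fn, fp, tn)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Both quantities are column percentages: each is computed within one observed outcome group (deaths or survivors), so neither depends on the overall mortality rate.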


Table 1. Relationship Between Surgical Urgency, as a Dichotomous (Two-Level) Variable, and Hospital Mortality

 

    Receiver operating characteristic curve for surgical urgency (polychotomous risk factor)
Surgical urgency actually has five levels; it was collapsed to two levels (Table 1) to introduce specificity and sensitivity. Figure 1 shows the distribution of patients and the hospital mortality for each of the five levels. Table 2 shows the number of patients in each level of urgency, and is divided by the rows with bold-face lettering into four dichotomies at the cut points B to E. The row labeled cut point B contrasts elective (row above) with the combined four levels of urgent or emergent (rows below). If we predict patients below this row to be deaths, and those above it to be survivors, then we would correctly identify 60.2% of the 354 deaths (sensitivity) and 54.3% of the 14,247 survivors (specificity), the same combination used in Table 1.



Fig 1. Distribution of patients undergoing coronary artery bypass grafting according to surgical urgency. Hospital survivors are shown by gray bars and hospital deaths by black bars, with mortality percentages above them.

 

Table 2. Relationship Between Surgical Urgency, as a Five-Level Categorical Variable, and Hospital Mortality

 


Fig 2. The receiver operating characteristic curve based on the risk factor for surgical urgency. The circled letters correspond to the cut points in Table 2. (AUC = area under the curve.)

 
Moving the cut point down to the row marked C would decrease the sensitivity to 23.2% and increase the specificity to 93.4%. Two extreme possibilities for cutoff values are also shown. Using the cutoff indicated by A (predicting that all patients die) results in 100% sensitivity, as all deaths are identified, but the specificity is zero, because none of the survivors are correctly identified. At the other extreme, using cut point F would predict that all patients survive, ensuring 100% specificity, but resulting in 0% sensitivity.

We now have all the information to construct the ROC curve for this risk model based only on levels of urgency. The ROC curve is the plot of the sensitivity versus specificity, or more commonly, versus 100 - specificity, connected by line segments (Fig 2).
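
The construction can be sketched in code. This is a partial illustration, not the article's computation: it uses only the four cut points whose sensitivity and specificity are stated in the text (A, B, C, and F). Cut points D and E are left out because their values appear only in Table 2, which is not reproduced here. The helper name `trapezoid_area` is ours.

```python
# (sensitivity, specificity) as fractions for the cut points whose values
# are stated in the text. Cut points D and E are omitted because their
# values are not given in the text, so this is an incomplete curve.
cut_points = {
    "A": (1.000, 0.000),  # predict that all patients die
    "B": (0.602, 0.543),
    "C": (0.232, 0.934),
    "F": (0.000, 1.000),  # predict that all patients survive
}

# ROC coordinates: x = 1 - specificity, y = sensitivity, sorted left to right.
roc = sorted((1.0 - spec, sens) for sens, spec in cut_points.values())

def trapezoid_area(points):
    """Area under straight line segments joining consecutive (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

auc = trapezoid_area(roc)
print(f"area under the four-point curve: {auc:.3f}")
```

With these four points the area is about 0.606, close to the 0.607 reported for the full curve in Figure 2.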


    Area under the curve statistic
The ROC curve in Figure 2 synthesizes the entire spectrum of sensitivity/specificity information in Table 2 into a single summary statistic, the area under the curve. This area, 0.607 in Figure 2, equals the probability that a randomly chosen death will have been assigned a higher risk than a randomly chosen survivor [4]. The area under the curve is also called the c-index or c-statistic [1], and is equivalent to the Wilcoxon rank-sum (Mann-Whitney) statistic [4]. A model with no discrimination ability would have an area of 0.5 under the curve (the diagonal line in Fig 2), meaning that the probability of correctly identifying the death (from a randomly chosen pair) would be 50%.
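
The pairwise interpretation can be made concrete by counting, over every possible (death, survivor) pair, how often the death was assigned the higher risk. The sketch below assumes the common convention that tied pairs count as one half; the risk values are hypothetical toy numbers, not the article's data, and the function name is ours.

```python
from itertools import product

def concordance(risk_deaths, risk_survivors):
    """Fraction of (death, survivor) pairs in which the death has the
    higher risk; tied pairs count as one half."""
    score = 0.0
    for d, s in product(risk_deaths, risk_survivors):
        if d > s:
            score += 1.0
        elif d == s:
            score += 0.5
    return score / (len(risk_deaths) * len(risk_survivors))

# HYPOTHETICAL toy risks, not taken from the article's data.
risk_deaths = [0.30, 0.10, 0.20]
risk_survivors = [0.05, 0.10, 0.20, 0.02]

c_index = concordance(risk_deaths, risk_survivors)
print(f"c-index = {c_index:.3f}")
```

If every patient is assigned the same risk, every pair is tied and the function returns 0.5, matching the no-discrimination diagonal.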


    Receiver operating characteristic curve for age (continuous risk factor)
In the usual risk models, there are not just five categories of risk, as in the previous example, but many, defined by all possible combinations of the risk factors in the model. We approximate such a continuous risk model by using patient age, which has many levels. Figure 3 shows a bar graph of patient ages, with survivors shown by gray bars and deaths by black bars. The older ages can be seen to have a higher proportion of deaths. We compute the specificity and sensitivity associated with all possible cut points, which now correspond to individual values of age ranging from 25 to 95 years. The area under the curve for the resulting curve (Fig 4) is 0.705, which is near the lower limit of published risk models. (However, the curve in our model was applied to the data from which it was derived, and for which it was thus optimized.)
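
Sweeping every possible cut point of a many-valued risk factor can be sketched as follows. The cohort is a hypothetical toy example, not the article's data, and the rule "predict death when age is at or above the cut point" is our assumption, chosen because older patients are at higher risk. The function names are ours.

```python
def roc_points(values, died):
    """Return ROC points (1 - specificity, sensitivity), one per cut point.

    A patient is predicted to die when value >= cut point. Cut points are
    every distinct value plus one above the maximum (predict all survive).
    """
    n_deaths = sum(died)
    n_survivors = len(died) - n_deaths
    cuts = sorted(set(values)) + [max(values) + 1]
    points = []
    for cut in cuts:
        tp = sum(1 for v, d in zip(values, died) if v >= cut and d == 1)
        fp = sum(1 for v, d in zip(values, died) if v >= cut and d == 0)
        points.append((fp / n_survivors, tp / n_deaths))
    return sorted(points)

def trapezoid_area(points):
    """Area under straight line segments joining consecutive (x, y) points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# HYPOTHETICAL toy cohort (1 = died, 0 = survived), not the article's data.
ages = [45, 52, 60, 60, 67, 71, 71, 78, 83, 88]
died = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1]

points = roc_points(ages, died)
auc = trapezoid_area(points)
print(f"{len(points)} ROC points, area under the curve = {auc:.3f}")
```

Each distinct age contributes one point, which is why a continuous risk factor yields a nearly smooth curve, while a five-level factor yields only a few line segments.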



Fig 3. Distribution of patients undergoing coronary artery bypass grafting according to patient age. Hospital survivors are shown by gray bars and hospital deaths by black bars.

 


Fig 4. The receiver operating characteristic curve of ages. This receiver operating characteristic curve is based on patient age only, used as a risk model. Each dot represents a single year of age. (AUC = area under the curve.)

 
The ROC curve corresponding to a perfect test would consist of a broken line connecting the lower left, upper left, and upper right corners of the graph. The resulting area under the curve (1.0) would imply 100% discrimination. In this perfect model, the bars representing alive (gray) and dead (black) in Figures 1 and 3 would not overlap but would occupy separate regions of the horizontal axis, with all of the living patients being to the left of all of the dead patients.


    Comment
The ROC curve methodology was developed to evaluate signal detection, the ability to separate signal from noise [5]. It was introduced into medicine 30 years ago to address problems in the area of radiographic identification [6]. ROC curves have been used to assess prognosis in patients with coronary artery disease and in cardiology diagnosis, and more recently to evaluate cardiac surgical risk models. A MedLine search found that "ROC curve" was first used as a Medical Subject Heading abstraction term in 1987, and its use has increased steadily since (Fig 5).



Fig 5. Number of papers abstracted by MedLine under the "ROC curve" Medical Subject Heading, by year of publication.

 
The ROC curves are a natural way of integrating information from a predictive model for a binary outcome. They are easy to use, produce an interesting visual display, and have an intuitive interpretation. The graph allows inspection of the performance of the test over different areas (probabilities) of interest, and helps in the comparison of different models whose ROC curves can be superimposed for visual inspection. A general reference for further reading is an excellent overview by Hanley, with 78 references [7].


    Acknowledgments
Thanks to the following Providence Health System hospitals for use of their cardiac surgery data: Washington—Providence Everett Medical Center, Providence Seattle Medical Center, Providence St. Peter Hospital (Olympia), Providence Yakima Medical Center; Oregon—Providence Portland Medical Center, Providence St. Vincent Medical Center (Portland); California—Providence St. Joseph Medical Center (Burbank), Providence Holy Cross Medical Center (Mission Hills); Alaska—Providence Anchorage Medical Center. Special thanks to Vicki Anderson for coordinating the data collection, organization, and merging.


    References

  1. Harrell F.E., Jr, Lee K.L., Califf R.M., Pryor D.B., Rosati R.A. Regression modelling strategies for improved prognostic prediction. Statist Med 1984;3:143-152.
  2. Morise A.P., Duval R.D., Detrano R., Bobbio M., Diamond G.A. Comparison of logistic regression and Bayesian-based algorithms to estimate posttest probability in patients with suspected coronary artery disease undergoing exercise ECG. J Electrocardiol 1992;25:89-99.
  3. Diamond G.A. Future imperfect: the limitations of clinical prediction models and the limits of clinical prediction. J Am Coll Cardiol 1989;14:12A-22A.
  4. Hanley J.A., McNeil B.J. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29-36.
  5. Centor R.M. Signal detectability: the use of ROC curves and their analyses. Med Decision Making 1991;11:102-106.
  6. Lusted L.B. Signal detectability and medical decision-making. Science 1971;171:1217-1219.
  7. Hanley J.A. Receiver operating characteristic (ROC) methodology: the state of the art. Crit Rev Diagnostic Imaging 1989;29:307-335.


