Ann Thorac Surg 2002;73:1576-1581
© 2002 The Society of Thoracic Surgeons
a Division of Thoracic Surgery and Division of Pulmonology, Hospital de Clínicas "José de San Martín," University of Buenos Aires, Buenos Aires, Argentina
b Department of Pathology, UCLA School of Medicine, Cedars-Sinai Medical Center, Los Angeles, California, USA
Accepted for publication January 5, 2002.
* Address reprint requests to Dr Hugo Esteva, Av. San Martín 1039, (1661) Bella Vista, Provincia de Buenos Aires, República Argentina
e-mail: hesteva{at}intramed.net.ar
| Abstract |
|---|
|
|
|---|
Methods. Ninety-six clinical and laboratory features from each one of 141 patients who underwent lung resection were retrospectively collected. The variables were used as input data for the software. Cases were divided into a training set (n = 113) and a test set (n = 28). Four NN models were trained using the data from the training set: (1) using all variables; (2) using only the Goldman and Torrington scores; (3) using all variables except for the two scores. A fourth NN was programmed with all variables to estimate the development of major postoperative complications. The trained NN models were tested with the test set data.
Results. The NN using all variables with or without the scores were able to correctly classify all 28 test cases against actual outcome. The NN using all variables also estimated major postoperative complications correctly in all 28 test cases. The NN using only two indices (Goldman and Torrington) yielded 6 of 28 errors in classification.
Conclusions. These data suggest that NN can integrate results from multiple data predicting the individual outcome for patients, rather than assigning them to less-precise risk group categories.
| Introduction |
|---|
|
|
|---|
Several risk indices, such as the Torrington [1] and the Goldman [2] indices, have been developed using a variety of clinical and physiologic features, in attempts at stratifying the patients into risk groups. These risk indices accurately classify groups of patients into high-risk and low-risk groups [3], but have limited specificity and sensitivity. Moreover, these risk indices cannot estimate operative risk for individual patients.
Risk of dying or of developing major complications during the early postoperative period may be evaluated by multivariate statistical methods (such as multiple logistic regression) that were designed to compare population groups and to sort out the varied nature of risk factors and their relative contribution to outcome, eliminating confounders. Logistic regression analysis is only applicable to problems that have a binary solution and provides likelihood ratios for each population group [4].
Artificial neural networks (NN) are computerized models designed to emulate intelligence by attempting to reproduce the architecture of the human neural system. They were conceived in the 1940s as early attempts at artificial intelligence, and their use became more prevalent with the advent of powerful and relatively inexpensive microcomputers. The architecture of an NN is composed of processing units, or neurons, organized in parallel layers. The neurons of each layer are connected to those in other layers but not among themselves. The neurons consist of mathematical functions that process a numerical input and provide an output that is transferred to all neurons in another layer. A large number of NN models have been proposed, including backpropagation, probabilistic, general regression, polynomial, Kohonen, and others.
The first neuronal layer of an NN is the input layer, composed of a variable number of processing units that compute the data collected from observations. Additional neuronal layers compose the hidden layers, constructed to generate a variable number of numerical combinations. The last neuronal layer of an NN is the output layer, which generates numbers that represent the output or answer generated by the system. The outputs of the input and hidden neuronal layers are kept in a matrix of numbers, the so-called neuronal connections. This matrix represents the knowledge of an NN. Most NN models are based on the concept of supervised training.
The NN are provided with a set of data that includes known answers or correct output. The NN processes the data multiple times, progressively correcting the various neuronal connections, until it arrives automatically at the numerical combination that yields the best possible output. At this point the NN is trained. The trained NN can be then used to test cases with answers that are known to the investigator but not to the system. This testing process validates the accuracy of the model. Thereafter, the NN can be used to test true unknown cases in prospective studies.
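The supervised training cycle described above can be illustrated with a deliberately small sketch. It assumes a single sigmoid output unit corrected by a gradient step after each case, which is a stand-in chosen for brevity and is not the probabilistic NN used in this study; the function names `train_single_neuron` and `predict` and the parameters `epochs` and `step` are hypothetical.

```python
import math

def train_single_neuron(rows, targets, epochs=2000, step=0.5):
    """Fit one sigmoid unit by repeatedly correcting its connection strengths."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):                      # one pass over the data = one training cycle
        for x, t in zip(rows, targets):
            z = bias + sum(w * v for w, v in zip(weights, x))
            y = 1.0 / (1.0 + math.exp(-z))       # output between 0 and 1
            error = t - y                        # compare with the known answer
            weights = [w + step * error * v for w, v in zip(weights, x)]
            bias += step * error
    return weights, bias

def predict(weights, bias, x):
    """Present one case to the trained unit in a single pass."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))
```

Once training stops, the connection strengths are frozen and `predict` is applied to held-out cases whose answers are known to the investigator but were never shown to the unit.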
Neural networks have been used in one of our laboratories for the estimation of the long-term prognosis of patients with lung, breast, and colon cancer, using multiple clinical, pathologic, immunocytochemical, and molecular features [5-7]. Neural networks provide prognostic information for individual patients, in contrast to multivariate statistical methods designed to calculate the risk of various patient populations.
This retrospective study was designed to develop probabilistic NN that estimate the immediate postoperative prognosis of patients undergoing resection at the University Hospital of Buenos Aires, analyzing a large number of clinical and physiologic variables.
| Material and methods |
|---|
|
|
|---|
Ninety-six preoperative clinical, laboratory, and spirometric variables, both categorical and numerical (continuous) (Tables 1, 2), required to calculate the Goldman index (cardiac risk), the Torrington index (respiratory risk), and ergometric performance (cardiorespiratory risk) were retrospectively reviewed from the clinical records. The Goldman index includes clinical, electrocardiographic, and laboratory data scored from 0 to 25; the final result is graded into four categories of increasing risk. The Torrington index includes clinical and spirometric data scored from 0 to 12; it is graded into three categories of increasing risk. Ergometric performance is measured in metabolic equivalents (METS); 1 MET is basal oxygen consumption at rest, equivalent to 3.3 mL · kg⁻¹ · min⁻¹. Patients classified as Goldman grade III or IV or Torrington grade III, and patients who could not achieve more than 3 METS during the ergometric test, were classified as high-risk patients.
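The high-risk rule stated above can be written as a short predicate. This is only an illustrative sketch: the `RiskProfile` type, its field names, and the integer coding of the grades are assumptions introduced here, not part of the study's data dictionary.

```python
from dataclasses import dataclass

@dataclass
class RiskProfile:
    goldman_grade: int      # assumed coding: 1-4 for grades I-IV
    torrington_grade: int   # assumed coding: 1-3 for grades I-III
    max_mets: float         # peak workload reached on the ergometric test

def is_high_risk(p: RiskProfile) -> bool:
    """Goldman grade III/IV, Torrington grade III, or no more than 3 METS."""
    return p.goldman_grade >= 3 or p.torrington_grade == 3 or p.max_mets <= 3.0
```

Any one of the three criteria is sufficient, so a patient with low index grades but poor exercise capacity is still flagged.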
|
|
Neural network design
Four different probabilistic NN models were trained using Neuroshell 2 software (Ward Systems Group, Frederick, MD). Probabilistic NN models are superior to the more widely used backpropagation NN for the analysis of relatively sparse, probably nonlinear data. Probabilistic NN apply mathematical functions based on Bayesian analysis, similar to those used in multivariate statistical methods such as linear discriminant analysis. The neural networks were designed with three neuronal layers: input, hidden, and output.
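To make the Bayesian character of a probabilistic NN more concrete, the following minimal sketch implements the textbook form of the method: each stored training case contributes a Gaussian kernel, the kernels are averaged per class, and the class with the larger estimated density is chosen. It assumes equal class priors and a single smoothing factor `sigma`; the function name `pnn_predict` is hypothetical, and nothing here is claimed to reproduce the internals of Neuroshell 2.

```python
import math

def pnn_predict(train_x, train_y, x, sigma=0.1):
    """Classify x by the class whose stored cases give the highest average kernel value."""
    sums, counts = {}, {}
    for xi, yi in zip(train_x, train_y):
        # Squared Euclidean distance between the new case and a stored case.
        d2 = sum((a - b) ** 2 for a, b in zip(xi, x))
        sums[yi] = sums.get(yi, 0.0) + math.exp(-d2 / (2.0 * sigma ** 2))
        counts[yi] = counts.get(yi, 0) + 1
    # Average per class: an estimate of the class-conditional density at x.
    scores = {c: sums[c] / counts[c] for c in sums}
    return max(scores, key=scores.get)
```

In this formulation there is no iterative weight fitting; the only free parameter is the smoothing factor, which is the quantity an optimizer would typically tune.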
Neural network designed to estimate postoperative survival
Three NN were trained to estimate whether the patients would be dead or alive at 30 postoperative days. One NN used all 96 variables as input features. A second NN used all variables except for the Goldman and Torrington indices (ie, 92 data points for each patient). The third NN used only the two indices as input features. The hidden layer was composed of 141 neurons in all models selected to generate a large number of possible numerical combinations. The output layer of the first two NN models is composed of two neurons, one representing the output "dead," the other, "alive." The data were divided randomly into two groups: a training set used to develop the NN model (n = 113 cases) and a test set (n = 28 cases) used to validate the system. The NN was trained using genetic algorithms. No initial weight was assigned to any of the variables. The system automatically develops during the training process various connection strengths that assign more value to some variables than to others.
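A random division of 141 cases into 113 training and 28 test cases can be sketched as follows. The helper name `split_cases`, the use of integer case identifiers, and the fixed `seed` are assumptions made for reproducibility of the example; the original randomization procedure is not described in the text.

```python
import random

def split_cases(case_ids, n_test, seed=0):
    """Shuffle the case identifiers and hold out n_test of them for testing."""
    rng = random.Random(seed)
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]

# 141 patients in total, 28 held out, leaving 113 for training.
train_ids, test_ids = split_cases(range(141), n_test=28)
```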
Neural networks learn by adjusting the values in the neuronal connection matrix during iterative training cycles, until the optimal combination that yields the desired output is obtained. This process can be achieved by a very lengthy process of iteration, modifying each of the values during each training cycle by a preset number, or learning step. Various models of genetic computing have been developed to optimize the training process by developing an optimal set of values for analysis of data using rules that emulate genetics. Techniques of selection, crossover, deletion, mutation, and others are applied to the various numbers in the neuronal connection matrix, optimizing the training process. Our probabilistic NN was trained using the genetic algorithm option provided by Neuroshell 2 software; training was completed after 68 training cycles.
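The selection, crossover, and mutation steps mentioned above can be illustrated with a generic genetic search over a vector of values in the range 0 to 1. This is a sketch of the general technique only: the function `genetic_search`, its parameters, and the choice of keeping the fitter half of each generation are assumptions, and the actual genetic algorithm option in Neuroshell 2 may differ.

```python
import random

def genetic_search(fitness, n_genes, pop_size=30, generations=40,
                   mutation_rate=0.1, seed=0):
    """Return the fittest vector found; higher fitness is better."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                 # selection: keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes) if n_genes > 1 else 0
            child = a[:cut] + b[cut:]                  # one-point crossover
            child = [
                min(1.0, max(0.0, g + rng.gauss(0.0, 0.1)))  # mutation, clipped to [0, 1]
                if rng.random() < mutation_rate else g
                for g in child
            ]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

Because the fitter half survives unchanged into the next generation, the best fitness in the population can never decrease from one training cycle to the next.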
Neural network designed to estimate the development of major postoperative complications
A fourth NN was trained using similar training and test sets and the same methodology described above to estimate the development of 11 major postoperative complications during the early 30-day postoperative period. The same 96 variables for each patient were included. Major complications considered were (1) pneumonia, (2) respiratory insufficiency, (3) respiratory distress, (4) pulmonary embolism, (5) myocardial infarction, (6) ventricular tachyarrhythmia, (7) ventricular fibrillation, (8) left cardiac failure, (9) cardiac death, (10) sepsis, and (11) severe surgical complications. Severe surgical complications were defined as complications of technical origin that threatened life or prolonged hospitalization.
Statistical analysis
The data were analyzed with simple regression to determine which independent variables had a statistically significant correlation with the dependent variable "postoperative death." The independent variables that were statistically significant by univariate analysis were studied with multivariate regression analysis. A logistic regression model was constructed with the statistically significant variables selected by the previous multivariate method. Data analyses were performed using Statgraphics Plus, version 2.0 software (Manugistics, Rockville, MD).
| Results |
|---|
|
|
|---|
|
|
Analyzing these 13 variables with multivariate regression analysis yielded a statistically significant model (p < 0.002; df = 13; F ratio, 2.82). However, only the variable Torrington grade contributed significantly to this model (p < 0.032). A logistic regression model using this variable was also statistically significant (p < 0.000; df = 1; χ², 20.639; estimated odds ratio, 15.921).
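For readers less familiar with logistic regression, the reported odds ratio can be related to the model coefficient. The relation below is a sketch that assumes Torrington grade entered the model as a single predictor $x$; the symbols $p$, $\beta_0$, and $\beta_1$ are introduced here for illustration, and the coding of $x$ is not stated in the text.

```latex
\operatorname{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \beta_1 x,
\qquad
\mathrm{OR} = e^{\beta_1} = 15.921
\;\Longrightarrow\;
\beta_1 = \ln(15.921) \approx 2.77
```

Under that assumption, each one-unit increase in $x$ multiplies the odds of postoperative death by roughly 16.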
The NN models trained to estimate death versus survival with all variables, and with all variables except for the Goldman and Torrington grades, correctly classified all 28 test cases (100% sensitivity and specificity). In contrast, the NN model trained to estimate the same dependent variable using only the two indices as input features did not provide reliable classifications, as 6 of 28 test cases were misclassified. The NN trained to estimate the development of major complications classified all 28 test cases correctly.
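Sensitivity, specificity, and overall accuracy for a set of test cases can be computed as in the sketch below. The function name `classification_metrics` and the label strings are assumptions; the split of the 6 errors into false positives and false negatives is not reported, so only the overall accuracy of the two-index model is derived from the text.

```python
def classification_metrics(actual, predicted, positive="dead"):
    """Count agreement between known outcomes and NN classifications."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
        "accuracy": (tp + tn) / len(pairs),
    }

# Overall accuracy implied by 6 misclassified cases out of 28 (about 79%).
two_index_accuracy = (28 - 6) / 28
```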
| Comment |
|---|
|
|
|---|
Neural networks are best trained by using as large a database of information as possible, with numerous features of information and a large number of examples to learn from. They differ from multivariate statistical methods in that they are more flexible and can learn by adaptation.
Classifications generated by artificial NN can be used in medicine to develop objective classifications of illnesses or to estimate prognosis [15-19]. For example, artificial NN have been applied in research studies for automatic tumor classifications based on data collected with image analysis methods from histologic and cytologic slides, and are currently used in new computer-based automated systems for the interpretation of gynecologic cytology smears that have been approved by the U.S. Food and Drug Administration for quality assurance in gynecologic cytology [10]. They can also be used for the development of multivariate prognostic models of survival, local tumor recurrence, or the possibility of lymph node metastasis; for the selection of optimized groups of diagnostic tests; or to predict the response of patients to particular treatments and other questions that confront physicians on a daily basis [15-19].
The numerical data collected from patients are usually organized in a spreadsheet. Discrete data such as presence or absence of a symptom can be represented numerically such as 1 (present) or 0 (absent). The data are divided during the design of a study into two random groups: training and testing sets. A proportion of the cases (usually 70% to 80%) is presented to the system for adaptive learning, and the test cases are saved as unknowns to study the accuracy of the system. The software also has a facility that allows investigators to label certain columns in the data as input and others as output (the known answers provided to the system during learning).
The output layer could have a single neuron that estimates prognosis. All values calculated by the output neuron as 0 to 0.49 are equivalent to "patient alive," whereas outputs ranging from 0.5 to 1 are equivalent to "patient dead." After the architecture of the NN is defined, we proceed to study the data from a group of patients who have been operated on, using some of the data for training. The NN normalizes all data to values ranging from 0 to 1 and processes each row one at a time. During the first learning cycle each input neuron reads the value of its corresponding prognostic feature and calculates a numerical output, using a mathematical function, that is presented to the output neuron as a connection strength. The output neuron processes these given values, using a mathematical function, and calculates a value ranging from 0 to 1. The artificial NN compares this value with the correct answer (0 to 0.49 or 0.5 to 1) provided by the investigator. The training process is therefore supervised. Unsupervised training paradigms learn without the benefit of exposing the artificial NN to the correct answers during the training process. The network repeats the learning cycle but this time it modifies each connection strength using a mathematical function. The process is repeated multiple times until the artificial NN estimates the best possible combination of the connection strengths that will yield the fewest classification errors. At that time the artificial NN is trained and is no longer modified. The data from each test patient are then presented to the trained system, and analyzed in a single cycle. The results estimated by the artificial NN can then be compared with the answers already known to the investigator. If the trained NN is considered reliable, it can then be used to estimate prospectively the prognosis of other patients.
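Two small steps from the description above, rescaling each input column to the range 0 to 1 and reading the single output neuron against the 0.5 cutoff, can be sketched as follows. The function names are hypothetical, and treating any output below 0.5 as "patient alive" is an assumption that fills the narrow gap between the stated ranges of 0 to 0.49 and 0.5 to 1.

```python
def min_max_normalize(column):
    """Rescale a column of numbers linearly so that it spans 0 to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]   # constant column carries no information
    return [(v - lo) / (hi - lo) for v in column]

def label_from_output(y):
    """Map the output neuron's value to the two outcome labels."""
    return "patient dead" if y >= 0.5 else "patient alive"
```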
This example describes two layers of neurons composed of only a few processing elements each. However, in reality much larger numbers of possible combinations of connection strengths are needed to arrive at correct classifications. Additional combinations are created by designing artificial NN that have one or more layers of hidden neurons, as explained above.
Our preliminary data suggest that probabilistic NN can accurately estimate the early postoperative mortality risk of individual patients undergoing lung resection (pneumonectomy, lobectomy, minor resection). The immediate postoperative prognosis of these patients correlates with the Torrington and the Goldman risk indices and exercise test results, as previously reported by us [3]. Our division policy has always been not to deny operations to patients at high risk of postoperative complications when they were cancer patients with a reasonable chance of oncologic survival after the operation. Consequently, our current population includes a significant proportion of patients in high-risk categories, who might not be considered surgical candidates at other institutions. In those cases we have thoroughly discussed the clinical situation with patients and families and agreed to accept the risk with their consent. Approximately 5% of patients with Torrington grade II and 40% of patients with Torrington grade III died postoperatively (Table 4), but absolute reliance on this index may deny the potential benefits of surgical resection to the larger proportions of patients who survived the surgical procedures. In addition, risk indices and exercise tests can indicate the percentage risk for a patient as a member of a risk group, but tell less about the specific prognosis for a given individual. A more precise and individualized system would therefore be welcome in everyday practice.
Neural networks may offer a novel prognostic tool that will further estimate the potential prognosis of these particular patients, thus avoiding highly predictable postoperative deaths. Thoughtful, individualized clinical judgment is unlikely to be replaced in the foreseeable future by computer-based prognostic models. However, artificial NN could provide very valuable objective information for surgical decision-making regarding whether to subject high-risk patients to major pulmonary resections. Prospective studies of a larger group of patients from several institutions are needed to validate our preliminary results before this methodology can be applied in everyday practice.
It is important to study the true predictive value of artificial NN, their so-called generalization rate, before they are used in clinical practice [17]. Trained artificial NN need to be tested in prospective studies analyzing large numbers of patients to understand their reliability in real life. Analysis of multivariate data with various statistical methods and NN technology can yield spurious results because of multiple combination of random events. It has been estimated theoretically that the number of training cases required to develop robust multivariate statistical classificatory methods that generalize to large populations is approximately 10 times the number of features in the model [20]. It is still controversial whether these theoretical considerations apply to artificial NN technology, but this requirement could pose daunting practical challenges.
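Applied to the present models, the rule of thumb quoted above gives the following rough comparison. It is a worked illustration only, using the 96 input features and the 113 training cases reported in this study; the symbols are introduced here.

```latex
N_{\text{required}} \approx 10 \times 96 = 960
\quad\text{versus}\quad
N_{\text{training}} = 113
```

If the theoretical requirement does hold for NN, the training set would need to be roughly eight to nine times larger than the one available here.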
Our current models analyze a rather large and probably somewhat redundant number of variables from each patient to allow for comparisons between NN and the risk indices that are currently used in some institutions to stratify the surgical risk of patients with lung cancer. Future NN models probably need to include fewer input variables. Other computational analysis tools, such as a data-mining approach [21], could also be useful for that purpose.
Artificial NN that predict individualized early morbidity and mortality after lung resection from a large number of clinical variables can be developed. This retrospective model correctly estimated the immediate postoperative outcome, on an individual basis, of a small number of test patients.
An additional attractive aspect of this kind of model is its capability to specifically predict outcome in each surgical institution, regarding its technological and quality-of-care levels. To some extent, it provides a quality test, ie, no unpredictable deaths should happen unless technical mistakes are made.
Prospectively used, NN could help to decide which patient should receive special preoperative preparation and provide objective evaluation to avoid fatal operation in nonresponders. However, a prospective multiinstitutional study with a larger number of patients should be conducted to confirm our results.