Original Article | Volume 37, Issue 3, P415-422, March 2023

Independent External Validation of a Preoperative Prediction Model for Delirium After Cardiac Surgery: A Prospective Observational Cohort Study

Open Access | Published: December 05, 2022 | DOI: https://doi.org/10.1053/j.jvca.2022.11.038

      Objective

      This investigation provided independent external validation of an existing preoperative risk prediction model.

      Design

      A prospective observational cohort study of patients undergoing cardiac surgery covering the period between April 16, 2018 and January 18, 2022.

      Setting

      Two academic hospitals in Switzerland.

      Participants

      Adult patients (≥60 years of age) who underwent elective cardiac surgery, including coronary artery bypass graft, mitral or aortic valve replacement or repair, and combined procedures.

      Interventions

      None.

      Measurements and Main Results

      The primary outcome measure was the incidence of postoperative delirium (POD) in the intensive or intermediate care unit, diagnosed using the Intensive Care Delirium Screening Checklist. The prediction model contained 4 preoperative risk factors to which the following points were assigned: Mini-Mental State Examination (MMSE) score ≤23 received 2 points; MMSE 24-27, Geriatric Depression Scale (GDS) >4, prior stroke and/or transient ischemic attack (TIA), and abnormal serum albumin (≤3.5 or ≥4.5 g/dL) received 1 point each. The missing data were handled using multiple imputation. In total, 348 patients were included in the study. Sixty patients (17.4%) developed POD. For point levels in the prediction model of 0, 1, 2, and ≥3, the cumulative incidence of POD was 12.6%, 22.8%, 25.8%, and 35%, respectively. The validation resulted in a pooled area under the receiver operating characteristics curve of 0.60 (median CI, 0.525-0.679).

      Conclusions

      The evaluated predictive model for delirium after cardiac surgery in this patient cohort showed only poor discriminative capacity but fair calibration.

      Key Words

      With approximately 80 million surgical procedures performed in Europe each year, postoperative delirium (POD) is a major complication of surgery and poses a significant burden for patients, families, medical and nursing staff, and the healthcare system.
      • Saczynski JS
      • Marcantonio ER
      • Quach L
      • et al.
      Cognitive trajectories after postoperative delirium.
      • Aitken SJ
      • Blyth FM
      • Naganathan V.
      Incidence, prognostic factors and impact of postoperative delirium after major vascular surgery: A meta-analysis and systematic review.
      • Goettel N
      • Steiner LA.
      Postoperatives delirium: Früherkennung, prävention und therapie.
      Older patients undergoing surgery are more vulnerable to adverse postoperative outcomes due to advanced age, frailty, and medical comorbidities.
      • Story DA
      • Leslie K
      • Myles PS
      • et al.
      Complications and mortality in older surgical patients in Australia and New Zealand (the REASON study): A multicentre, prospective, observational study.
      Postoperative delirium is characterized by an acutely developing and fluctuating disturbance of awareness, attention, and cognition, and is classified as a postoperative neurocognitive disorder according to the new nomenclature.
      • Evered L
      • Silbert B
      • Knopman DS
      • et al.
      Recommendations for the nomenclature of cognitive change associated with anaesthesia and surgery-2018.
      Although POD is an acute and transient condition, it has a serious impact on the outcome and prognosis of patients.
      • Menzenbach J
      • Guttenthaler V
      • Kirfel A
      • et al.
      Estimating patients' risk for postoperative delirium from preoperative routine data - Trial design of the PRe-Operative prediction of postoperative DElirium by appropriate SCreening (PROPDESC) study - A monocentre prospective observational trial.
      Numerous epidemiologic studies reported widely divergent data on the incidence of POD, depending on the cohort of patients studied (eg, older versus younger patients), the type of surgical procedure, and treatment modalities (eg, elective versus emergency surgery).
      • Aldecoa C
      • Bettelli G
      • Bilotta F
      • et al.
      European Society of Anaesthesiology evidence-based and consensus-based guideline on postoperative delirium.
      However, POD occurs predominantly after cardiac surgery,
      • Lin Y
      • Chen J
      • Wang Z.
      Meta-analysis of factors which influence delirium following cardiac surgery.
      ,
      • Hollinger A
      • Siegemund M
      • Goettel N
      • et al.
      Postoperative delirium in cardiac surgery – an unavoidable menace?.
      with a reported incidence between 6% and 56%.
      • Cereghetti C
      • Siegemund M
      • Schaedelin S
      • et al.
      Independent predictors of the duration and overall burden of postoperative delirium after cardiac surgery in adults: An observational cohort study.
      Previous studies have shown that POD can be partially prevented by a targeted risk intervention strategy consisting of several components.
      • Inouye SK
      • Bogardus Jr, ST
      • Charpentier PA
      • et al.
      A multicomponent intervention to prevent delirium in hospitalized older patients.
      • Salvi F
      • Young J
      • Lucarelli M
      • et al.
      Non-pharmacological approaches in the prevention of delirium.
      • Hshieh TT
      • Yue J
      • Oh E
      • et al.
      Effectiveness of multicomponent nonpharmacological delirium interventions: A meta-analysis.
      Given the continuing growth of the older population through demographic aging in industrialized countries, and the clear interest in improving delirium care, an accurate POD prediction model may be a powerful tool to facilitate the early implementation of prevention measures in clinical practice.
      • Menzenbach J
      • Guttenthaler V
      • Kirfel A
      • et al.
      Estimating patients' risk for postoperative delirium from preoperative routine data - Trial design of the PRe-Operative prediction of postoperative DElirium by appropriate SCreening (PROPDESC) study - A monocentre prospective observational trial.
      Over the past few decades, numerous prediction models of POD,
      • Gosselt AN
      • Slooter AJ
      • Boere PR
      • et al.
      Risk factors for delirium after on-pump cardiac surgery: A systematic review.
      such as the preoperative prediction model by Rudolph et al.,
      • Rudolph JL
      • Jones RN
      • Levkoff SE
      • et al.
      Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.
      have been developed for cardiac surgery. From a clinical standpoint, their prediction model appeared to be practical as it was based on just the following 4 risk factors: impaired cognition, depressive symptoms, prior stroke or TIA, and abnormal serum albumin.
      • Rudolph JL
      • Jones RN
      • Levkoff SE
      • et al.
      Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.
      Nevertheless, most of these prediction models either completely lacked internal or external validation
      • van Meenen LC
      • van Meenen DM
      • de Rooij SE
      • et al.
      Risk prediction models for postoperative delirium: A systematic review and meta-analysis.
      ,
      • Lee A
      • Mu JL
      • Joynt GM
      • et al.
      Risk prediction models for delirium in the intensive care unit after cardiac surgery: A systematic review and independent external validation.
      or have been validated in only a single external cohort (eg, the Rudolph et al. model).
      • Adibi A
      • Sadatsafavi M
      • Ioannidis JPA.
      Validation and utility testing of clinical prediction models: Time to change the approach.
      These findings were consistent with results from systematic reviews in which the internal and external validations were performed a third (36%)
      • Bouwmeester W
      • Zuithoff NP
      • Mallett S
      • et al.
      Reporting and methods in clinical prediction research: A systematic review.
      and a quarter (25%-29%)
      • Bouwmeester W
      • Zuithoff NP
      • Mallett S
      • et al.
      Reporting and methods in clinical prediction research: A systematic review.
      ,
      • Siontis GC
      • Tzoulaki I
      • Castaldi PJ
      • et al.
      External validation of new risk prediction models is infrequent and reveals worse prognostic discrimination.
      of the time, respectively. Furthermore, the rate of prospective external validation of new risk-prediction models within 5 years after publication is small (16%).
      • Siontis GC
      • Tzoulaki I
      • Castaldi PJ
      • et al.
      External validation of new risk prediction models is infrequent and reveals worse prognostic discrimination.
      A potential reason for the limited validations could be the much stronger academic incentives for the development of new models rather than the validation of previously published models.
      • Wessler BS
      • Nelson J
      • Park JG
      • et al.
      External validations of cardiovascular clinical prediction models: A large-scale review of the literature.
      However, it is essential to test the generalizability of a model and to retest it on new data, in order to understand its robustness to distributional shifts over time and across settings, before implementing it in clinical practice.
      • Moons KG
      • Kengne AP
      • Woodward M
      • et al.
      Risk prediction models: I. Development, internal validation, and assessing the incremental value of a new (bio)marker.
      • Debray TP
      • Vergouwe Y
      • Koffijberg H
      • et al.
      A new framework to enhance the interpretation of external validation studies of clinical prediction models.
      • Toll DB
      • Janssen KJ
      • Vergouwe Y
      • et al.
      Validation, updating and impact of clinical prediction rules: A review.
      Likewise, previously existing prediction models should be tested prior to implementation.
      The present study aimed to externally validate the Rudolph et al. preoperative prediction model (hereafter “the original model”)
      • Rudolph JL
      • Jones RN
      • Levkoff SE
      • et al.
      Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.
      in a prospective cohort study of patients who had undergone cardiac surgery.

      Methods

      The study authors conducted and reported this prospective observational cohort study according to the Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD) guidelines.
      • Moons KG
      • Altman DG
      • Reitsma JB
      • et al.
      Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): Explanation and elaboration.
      The study protocol (No. 2020-00848) was approved by the institutional review board (Ethikkommission Nordwest- und Zentralschweiz) on July 27, 2020. A prior requirement for informed consent was later waived by Ethikkommission Nordwest- und Zentralschweiz.

      Design and Selection Criteria

      This broad prospective validation study was conducted at 2 academic medical centers in Basel and Zurich, Switzerland. The inclusion and exclusion criteria were identical to those of the derivation cohort used in the original model of Rudolph et al.
      • Rudolph JL
      • Jones RN
      • Levkoff SE
      • et al.
      Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.
      Briefly, the authors included patients aged ≥60 years who underwent elective cardiac surgery, including coronary artery bypass graft, mitral or aortic valve replacement or repair, and combined procedures. The exclusion criteria were non-German speaking, living >60 miles from the study center, emergency surgery, delirium before surgery, concurrent aortic or carotid surgical procedures, and medical instability limiting preoperative assessment.

      Study Participants

      The authors consecutively included 279 patients at the University Hospital Basel from April 16, 2018 to January 18, 2022, and 69 patients at the University Hospital Zurich from January 13, 2021 to January 18, 2022. The recruitment and inclusion process is shown in Figure 1.

      Preoperative Assessment

      The 4 preoperative predictors from the original model,
      • Rudolph JL
      • Jones RN
      • Levkoff SE
      • et al.
      Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.
      including the Mini-Mental State Examination (MMSE; range: 0-30 points, 0 = worst), the Geriatric Depression Scale (GDS; range: 0-15 points, 15 = worst), history of TIA and/or stroke, and serum albumin concentration were assessed during the routinely held preoperative anesthesia consultation. Demographic factors, age at the time of surgery, sex, and type of surgery were collected from the electronic medical record.

      Outcome

      The primary outcome was the incidence of delirium after cardiac surgery. POD was diagnosed using the Intensive Care Delirium Screening Checklist (ICDSC) with a score of ≥4 points (maximum score = 8) during the intensive care unit (ICU) or intermediate care unit stay. The ICDSC was administered 3 times per day by trained nursing staff, blinded to the predictor variables, until the patient was discharged from the ICU or intermediate care unit. The ICDSC is an 8-item screening instrument based on the Diagnostic and Statistical Manual of Mental Disorders (DSM)-IV-TR criteria, which was specifically designed for the intensive care setting.
      • Devlin JW
      • Fong JJ
      • Schumaker G
      • et al.
      Use of a validated delirium assessment tool improves the ability of physicians to identify delirium in medical intensive care unit patients.
      The checklist contains the following items, which are rated as absent or present: (1) consciousness (ie, comatose, stuporous, awake, or hypervigilant); (2) orientation; (3) hallucinations or delusions; (4) psychomotor activity; (5) inappropriate speech or mood; (6) attentiveness; (7) sleep-wake cycle disturbances; and (8) fluctuation of symptoms. The items are rated on the patient's behavior at the time of screening, and interrater reliability among intensive care staff is considered adequate.
      • Bergeron N
      • Dubois MJ
      • Dumont M
      • et al.
      Intensive Care Delirium Screening Checklist: evaluation of a new screening tool.

      Surgical Procedures

      All patients underwent cardiac surgery under general anesthesia. The anesthesia protocol, the operative procedure, and the postoperative care (eg, pain control) were performed according to local hospital policies and practice protocols. The use of aortic cross-clamp, cardiopulmonary bypass, high-dose heparin, and hypothermia was at the discretion of the attending surgeon. The intraoperative data were extracted from the surgical notes.

      Sample Size

      There are no generally accepted approaches or empirical evidence to estimate the sample size requirements for validation studies of risk prediction models.
      • Moons KG
      • Altman DG
      • Reitsma JB
      • et al.
      Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): Explanation and elaboration.
      Therefore, the authors determined their sample size according to the events per variable rule. This common rule of thumb was originally adapted to ensure stability in regression covariates and postulates that at least 10 events (cases with POD) must occur for each candidate predictor in the model.
      • Copas JB.
      Using regression models for prediction: Shrinkage and regression to the mean.
      In the authors’ analysis, they included 15 patients with POD per predictor variable. Therefore, the required sample size was a minimum of 60 patients presenting with POD (4 predictors × 15 events).
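The arithmetic behind the events-per-variable rule can be written out as a minimal sketch. The function name `required_events` is hypothetical and chosen for illustration only; the inputs (4 predictors, 15 events per predictor) are the figures quoted in the text above.

```python
def required_events(n_predictors: int, events_per_variable: int) -> int:
    """Minimum number of outcome events under an events-per-variable rule."""
    if n_predictors < 1 or events_per_variable < 1:
        raise ValueError("both arguments must be positive integers")
    return n_predictors * events_per_variable


# Figures quoted in the text: 4 predictors, 15 POD cases per predictor.
minimum_pod_cases = required_events(n_predictors=4, events_per_variable=15)
print(minimum_pod_cases)  # 60
```

With the more common threshold of 10 events per predictor, the same function would return 40, so the 15-event choice is the stricter of the two.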

      Missing Data

      In the overall cohort, data on POD were missing in 5%, education was missing in 7%, GDS in 8%, MMSE in 6%, and the serum albumin concentration in 2%. There were no missing values of age, sex, and history of TIA and/or stroke. The authors assumed the missing data occurred at random, and they performed multiple imputations using the multivariate imputation by chained equation procedure with the predictive mean-matching method. The missing values were predicted based on the demographic variables (ie, age, sex, and education), all predictor variables, and outcome. The continuous variables were maintained as continuous in the imputation and only subsequently categorized for the final predictive model. In accordance with the original model, the authors created 20 multiple imputed datasets.
      • Rudolph JL
      • Jones RN
      • Levkoff SE
      • et al.
      Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.
      They reported all results from the pooled dataset. Rubin's rules were used to pool the regression coefficient estimates from the imputed datasets. The authors also reported the results of the original dataset with missing data.
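As an illustration of the pooling step, the following minimal sketch applies Rubin's rules to one regression coefficient estimated in several imputed datasets. The function name `pool_rubin` and the example numbers are hypothetical and are not taken from the study; the analysis itself was run in SPSS, whose internal implementation is not shown here.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one coefficient across m imputed datasets using Rubin's rules.

    estimates: coefficient estimate from each imputed dataset
    variances: squared standard error of that estimate in each dataset
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least 2 imputations and matching lengths")
    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # average within-imputation variance
    between = variance(estimates)   # between-imputation variance (n - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Made-up numbers for three imputations, purely for illustration.
pooled, total = pool_rubin([0.50, 0.60, 0.70], [0.04, 0.05, 0.06])
print(round(pooled, 3), round(total, 4))
```

The total variance combines the uncertainty within each imputed dataset with the extra uncertainty that comes from the imputation itself, which is why it is larger than the average within-imputation variance alone.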

      Statistical Analysis

      For descriptive analysis, all continuous variables are presented as mean ± SD. The categorical variables are reported as frequencies and percentages. The preoperative characteristics of patients from Basel were compared to those recruited from Zurich using a t test for the continuous variables. The categorical variables were compared with a chi-square test. Before applying the clinical prediction model, which was developed in a previous study, to the overall cohort dataset, the continuous risk factors were categorized using identical clinically meaningful cutoff points as used in the original model.
      • Rudolph JL
      • Jones RN
      • Levkoff SE
      • et al.
      Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.
      Therefore, GDS was dichotomized at >4 points, which indicates clinical depression. The MMSE was categorized as not impaired (range: 28-30 points), mild impairment (range: 24-27 points), and definitive impairment (≤23 points). The variables TIA and/or history of stroke were combined into one variable. Serum albumin concentration was classified into a normal value (3.6-4.4 g/dL) versus an abnormal value (≤3.5 or ≥4.5 g/dL). The clinical prediction model points were assigned as follows: MMSE ≤23 points received 2 points; MMSE 24 to 27 points, GDS >4 points, prior stroke/TIA, and abnormal serum albumin received 1 point each.
      • Rudolph JL
      • Jones RN
      • Levkoff SE
      • et al.
      Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.
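The point assignment described above can be expressed as a short sketch. The function and parameter names are hypothetical; the cutoffs are those stated in the text. Rounding albumin to one decimal place before applying the cutoffs is an assumption made here so that every value falls into either the normal (3.6-4.4 g/dL) or the abnormal (≤3.5 or ≥4.5 g/dL) category.

```python
def delirium_risk_points(mmse: int, gds: int,
                         prior_stroke_or_tia: bool,
                         albumin_g_dl: float) -> int:
    """Preoperative point score using the cutoffs described in the text."""
    points = 0
    if mmse <= 23:            # definitive cognitive impairment
        points += 2
    elif mmse <= 27:          # mild impairment (24-27)
        points += 1
    if gds > 4:               # depressive symptoms
        points += 1
    if prior_stroke_or_tia:   # history of stroke and/or TIA
        points += 1
    albumin = round(albumin_g_dl, 1)  # assumption: values reported to 1 decimal
    if albumin <= 3.5 or albumin >= 4.5:
        points += 1
    return points


def risk_group(points: int) -> str:
    """Collapse the score into the four reported levels: 0, 1, 2, and >=3."""
    return ">=3" if points >= 3 else str(points)


print(risk_group(delirium_risk_points(29, 1, False, 4.0)))  # 0
print(risk_group(delirium_risk_points(22, 5, True, 3.2)))   # >=3
```

The maximum possible score is 5 (2 + 1 + 1 + 1), which is why the highest reported level is an open-ended category.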
      The incidence of POD is presented with increasing clinical prediction model points and a risk ratio relative to the lowest risk group. The summary statistics of the original model in the derivation cohort are based on the bootstrapping method, which was used for variable selection. Because the authors did not perform variable selection (model selection), they did not require bootstrapping. However, to make the results of the derivation cohort comparable to their validation cohort, the authors calculated the raw risk ratio, including associated CIs, of the prediction model for each score in their cohort and in the derivation cohort of Rudolph et al. For model validation, the authors assessed the model performance using measures of discrimination and calibration. They assessed model discrimination with the area under the receiver operating characteristic curve (AUROC; identical to the c-statistic) in each imputed dataset, and reported the median AUROC. Calibration was assessed using the Hosmer-Lemeshow test for goodness of fit in the imputed datasets. In a sensitivity analysis, the authors examined the c-statistic, excluding “off-pump” patients. All analyses were computed using IBM SPSS Statistics V.28.0.1.0 (IBM SPSS, Inc, Armonk, NY) for Windows.
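To make the discrimination measure concrete, the sketch below computes the AUROC (c-statistic) directly from its definition: the probability that a randomly chosen patient with POD has a higher score than a randomly chosen patient without POD, with ties counted as one-half. The function name and the toy data are hypothetical and are not study data; the study itself used SPSS.

```python
def auroc(scores, labels):
    """c-statistic: share of (case, non-case) pairs ranked correctly.

    scores: risk score for each patient (e.g., prediction model points)
    labels: 1 if the outcome occurred, 0 otherwise
    Tied pairs contribute 0.5, which matters for a coarse point score.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not non_cases:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for c in cases:
        for n in non_cases:
            if c > n:
                concordant += 1.0
            elif c == n:
                concordant += 0.5
    return concordant / (len(cases) * len(non_cases))


# Toy data for illustration only (not from the study).
toy_scores = [0, 0, 1, 1, 2, 3]
toy_labels = [0, 0, 0, 1, 1, 1]
print(round(auroc(toy_scores, toy_labels), 3))
```

Because the point score takes only a few distinct values, many case/non-case pairs are tied, and the handling of ties has a visible effect on the result.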

      Results

      Participants

      Among the 348 patients in this combined external validation cohort, 17.4% (n = 60) developed POD after cardiac surgery. The baseline characteristics of patients from Basel and Zurich were similar, with the exception that patients from Zurich had a slightly higher incidence of POD. Compared to patients from Zurich, patients from Basel were more likely to be female, to have a low serum albumin concentration, and to present with more depressive symptoms (Table 1). The mean patient age at surgery was 70.9 ± 5.7 years. Twenty-two patients underwent “off-pump” surgery.
      Table 1. Baseline Characteristics of the External Swiss Validation Cohort and the Derivation Cohort of Rudolph and Colleagues
      Characteristic | Basel Cohort (n = 279) | Zurich Cohort (n = 69) | All (n = 348) | Derivation Cohort, Rudolph et al. (n = 122)
      Data collection period | April 2018-January 2022 | January 2021-January 2022 | April 2018-January 2022 | September 2002-October 2004
      Study design | Prospective cohort | Prospective cohort | Prospective cohort | Prospective cohort
      Setting | Academic medical center in Basel, Switzerland | Academic medical center in Zurich, Switzerland | 2 academic medical centers in Switzerland | 2 academic medical centers and 1 VA hospital
      Outcome | Presence of POD | Presence of POD | Presence of POD | Presence of POD
      Reference standard | ICDSC score ≥4 | ICDSC score ≥4 | ICDSC score ≥4 | CAM/CAM-ICU
      Incidence of POD | 42 (15.1%) Missing: 17 | 18 (26.1%) | 60 (17.4%) Missing: 17 | 63 (52%)
      Age, y | 71.0 (5.7) | 70.4 (5.7) | 70.9 (5.7) | 74.7 (6.3)
      Female sex | 62 (22.0%) | 10 (14.5%) | 72 (20.7%) | 25 (20%)
      Education, y* | 13.1 (3.4) | 14.0 (3.3) Missing: 23 | 13.2 (3.4) Missing: 23 | -
      TIA/stroke | 40 (14.3%) | 12 (17.4%) | 52 (14.9%) | 26 (22%)
      GDS | 1.6 (1.8) Missing: 29 | 1.2 (1.6) | 1.5 (1.8) Missing: 29 | 3.3 (3.0)
      MMSE | 28.4 (1.5) Missing: 23 | 28.3 (1.8) | 28.4 (1.6) Missing: 23 | 26.9 (2.6)
      Albumin concentration, g/dL | Missing: 2 | Missing: 4 | Missing: 6 |
       3.6-4.4 (normal value) | 219 (78.5%) | 52 (75.4%) | 271 (78.0%) | 61 (64%)
       ≤3.5 or ≥4.5 (abnormal value) | 58 (21.0%) | 13 (19.0%) | 71 (20.4%) | 34 (36%)
      NOTE. Data are shown as mean (SD) or n (%).
      Abbreviations: CAM, Confusion Assessment Method; CAM-ICU, Confusion Assessment Method for Intensive Care Unit; GDS, Geriatric Depression Scale; ICDSC, Intensive Care Delirium Screening Checklist; MMSE, Mini-Mental State Examination; POD, postoperative delirium; TIA, transient ischemic attack; VA, Veteran's Affairs.
      * Maximum is 20 years of education.
      Education in the derivation cohort was reported as follows: <high school: 19 (17%); high school: 44 (36%); >high school: 59 (49%).
      In comparison to the original model in the derivation cohort, patients in this study were slightly younger (70.9 ± 5.7 v 74.7 ± 6.3 years), were mostly male (79.3%), and showed a much lower incidence of POD (17.4% v 52%). The prevalence of TIA and/or stroke was lower in the authors’ cohort (14.9% v 22%), as was the mean GDS (1.5 ± 1.8 v 3.3 ± 3.0 points). The mean MMSE was higher (28.4 ± 1.6 v 26.9 ± 2.6 points). Moreover, a higher percentage of the authors’ cohort had a normal serum albumin concentration (78.0% v 64%), and a correspondingly lower percentage had abnormal values (20.4% v 36%). Furthermore, most of the patients in their study had a high level of education (Table 1), similar to that reported by Rudolph et al.
      • Rudolph JL
      • Jones RN
      • Levkoff SE
      • et al.
      Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.

      External Validation

      The authors calculated the clinical prediction model points and applied them to the overall Swiss cohort. An increasing risk score was associated with an increased risk of POD, although the number of patients with a score ≥3 was too small (6 patients) to be representative. POD was identified in 12.6% of patients with a low-risk score, 22.8% with a moderate-risk score, 25.8% with a high-risk score, and 35% with a very-high-risk score. With no points as the reference in the risk stratification system, the presence of ≥1 point increased the delirium risk by a factor of 1.5, ≥2 points doubled it, and ≥3 points nearly tripled it (Table 2). In the imputed datasets, the Hosmer-Lemeshow test for goodness of fit showed good agreement between the observed numbers and the numbers estimated by the logistic regression model (P = 1.000; χ2 = 0.000). The median AUROC (identical to the c-statistic) was 0.60 (median CI, 0.525-0.679). A graphical representation of discrimination is shown in Figure 2. In the original dataset with missing data, the Hosmer-Lemeshow test likewise showed good agreement between the observed and estimated numbers (P = 1.000; χ2 = 0.000), and the AUROC was 0.60 (95% CI, 0.524-0.681). Excluding “off-pump” patients, the median AUROC was 0.61 (median CI, 0.530-0.685) in the imputed dataset; in the original dataset with missing data, the AUROC was 0.61 (95% CI, 0.529-0.688). Overall, compared to Rudolph et al., model performance was degraded in the authors’ validation cohort. The β coefficients for the logistic model based on the 4 preoperative predictors are presented in Table 1 in the supplement.
      Table 2. Performance of the Clinical Prediction Model in the Swiss External Validation Cohort Compared to the Derivation Cohort of Rudolph and Colleagues
      Risk Group | Prediction Model Points | Delirium Rate | Risk Ratio (95% CI)* | C-Statistic
      Swiss validation cohort (n = 348) | | | | 0.60
       | 0 | 21.5/170 (12.6%) | Reference |
       | 1 | 29.5/129 (22.8%) | 1.8 (1.1-3.0) |
       | 2 | 11.1/43 (25.8%) | 2.0 (1.1-3.9) |
       | ≥3 | 2.1/6 (35%) | 2.8 (0.9-8.8) |
      Derivation cohort, Rudolph et al. (n = 122) | | | | 0.74
       | 0 | 5/25 (19%) | Reference |
       | 1 | 20/44 (47%) | 2.3 (1.0-5.3) |
       | 2 | 23/36 (63%) | 3.2 (1.4-7.3) |
       | ≥3 | 15/18 (86%) | 4.2 (1.9-9.4) |
      * The authors applied formulae for a single sample.
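The risk ratios in Table 2 can be approximated with a short sketch. The exact interval formula used by the authors is not stated; the code below assumes the standard large-sample (Wald) interval on the log scale, and the function name is hypothetical. Under that assumption, the Swiss cohort counts from Table 2 (fractional because they are pooled over imputed datasets) give values that round to the reported ratios and intervals.

```python
import math


def risk_ratio_ci(events, n, ref_events, ref_n, z=1.96):
    """Risk ratio versus a reference group with a log-scale Wald interval.

    Assumption: standard error of log(RR) is
    sqrt(1/events - 1/n + 1/ref_events - 1/ref_n).
    """
    rr = (events / n) / (ref_events / ref_n)
    se_log = math.sqrt(1 / events - 1 / n + 1 / ref_events - 1 / ref_n)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper


# Swiss validation cohort rows from Table 2; 0 points is the reference group.
reference = (21.5, 170)
rows = {"1": (29.5, 129), "2": (11.1, 43), ">=3": (2.1, 6)}
for points, (events, n) in rows.items():
    rr, lower, upper = risk_ratio_ci(events, n, *reference)
    print(f"{points} points: RR {rr:.1f} ({lower:.1f}-{upper:.1f})")
```

The wide interval for the ≥3 group reflects the fact that only 6 patients fell into that category.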
      Fig 2. Area under the receiver operating characteristic curve (AUROC) showing the ability of the delirium prediction model by Rudolph et al. to correctly classify those with and without postoperative delirium after cardiac surgery in the independent external Swiss validation cohort. AUROC = 0.5 indicates no discrimination, whereas AUROC = 1.0 indicates perfect discrimination. The black dotted reference line refers to no discrimination.

      Discussion

      The aim of this prospective observational study was to externally validate a previously published clinical prediction model for predicting POD in an independent cohort of cardiac surgery patients in Switzerland, in line with recent framework guidelines.
      • Collins GS
      • Reitsma JB
      • Altman DG
      • et al.
      Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD statement. The TRIPOD Group.
      Although the authors applied the same inclusion and exclusion criteria as Rudolph et al.,
      • Rudolph JL
      • Jones RN
      • Levkoff SE
      • et al.
      Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.
      the prediction model, when validated in their contemporary patient cohort, yielded conflicting results: it showed fair calibration but degraded discrimination (AUROC = 0.60) in the prediction of POD after cardiac surgery. Observing a substantial decrement in discrimination during validation (compared with performance on the derivation dataset) was not surprising, as it was in line with previous reports.
      • Siontis GC
      • Tzoulaki I
      • Castaldi PJ
      • et al.
      External validation of new risk prediction models is infrequent and reveals worse prognostic discrimination.
      ,
      • Damen JA
      • Hooft L
      • Schuit E
      • et al.
      Prediction models for cardiovascular disease risk in the general population: Systematic review.
      There were several potential reasons for this. First, the observed magnitude of the AUROC may be explained by case mix and heterogeneity in the characteristics of the cohorts. The derivation cohort and the authors’ validation cohort differed, especially in the incidence of POD (52% v 17.4%), as well as in the predictor variables. In comparison to the original model of Rudolph et al.,
      • Rudolph JL
      • Jones RN
      • Levkoff SE
      • et al.
      Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.
      patients undergoing cardiac surgery in the authors’ sample reported fewer depressive symptoms (1.5 ± 1.8 v 3.3 ± 3.0 points), showed a lower prevalence of TIA and/or stroke (14.9% v 22%), and performed better on the MMSE (28.4 ± 1.6 v 26.9 ± 2.6 points). Moreover, a higher percentage of the authors’ cohort had a normal serum albumin concentration (78.0% v 64%), and a correspondingly lower percentage had abnormal values (20.4% v 36%), compared to the original model.
      • Rudolph JL
      • Jones RN
      • Levkoff SE
      • et al.
      Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.
      In addition, Rudolph et al. validated their prediction model in a US population, whereas the authors evaluated the prediction model in Switzerland. According to a previous large-scale review, the decrease in discriminatory performance is expected to be more pronounced when models are evaluated in populations that are dissimilar to the derivation population.
      • Wessler BS
      • Nelson J
      • Park JG
      • et al.
      External validations of cardiovascular clinical prediction models: A large-scale review of the literature.
      Second, Rudolph et al. originally developed the clinical prediction model based on data from 2002 to 2004.
      • Rudolph JL
      • Jones RN
      • Levkoff SE
      • et al.
      Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.
      The prediction model may not be applicable to current patients undergoing cardiac surgery: improvements in general healthcare, technical and technologic advances, and the establishment of preventive measures against delirium (such as dexmedetomidine infusion during surgery, at least in Zurich) may have resulted in a drift of the clinical prediction model performance over time.
      • Damen JA
      • Hooft L
      • Schuit E
      • et al.
      Prediction models for cardiovascular disease risk in the general population: Systematic review.
      ,
      • Siregar S
      • Nieboer D
      • Versteegh MIM
      • et al.
      Methods for updating a risk prediction model for cardiac surgery: A statistical primer.
      In this study, the authors used the ICDSC to diagnose POD, in line with the standard procedure at the 2 academic institutions, rather than the Confusion Assessment Method (CAM) or the CAM-ICU for intubated patients, which were used by Rudolph et al.
      • Inouye SK
      • van Dyck CH
      • Alessi CA
      • et al.
      Clarifying confusion: The confusion assessment method. A new method for detection of delirium.
      However, in 2 meta-analyses, the pooled sensitivity of the CAM-ICU was 75.5%-to-80.0%, and the pooled specificity was 95.8%-to-95.9% for detection of delirium, whereas the pooled sensitivity of the ICDSC was 74.0%-to-80.1%, and the pooled specificity was 74.6%-to-81.9%. Therefore, it can be assumed that both instruments are highly valid when compared to the gold standard (DSM-IV criteria) in detecting POD.
      • Neto AS
      • Nassar Jr, AP
      • Cardoso SO
      • et al.
      Delirium screening in critically ill patients: A systematic review and meta-analysis.
      ,
      • Gusmao-Flores D
      • Salluh JI
      • Chalhub RÁ
      • et al.
      The confusion assessment method for the intensive care unit (CAM-ICU) and intensive care delirium screening checklist (ICDSC) for the diagnosis of delirium: A systematic review and meta-analysis of clinical studies.
      Although some cases of delirium may have been missed, the observed incidence of POD in the authors’ study was relatively low compared to the derivation cohort of Rudolph et al.
      • Rudolph JL
      • Jones RN
      • Levkoff SE
      • et al.
      Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.
      Several aspects may have contributed to this. First, the reported incidence of delirium varied from 6%-to-56%,
      • Cereghetti C
      • Siegemund M
      • Schaedelin S
      • et al.
      Independent predictors of the duration and overall burden of postoperative delirium after cardiac surgery in adults: An observational cohort study.
      depending on the definition used, timing, characteristics of the studied population, selected assessment tool, type of surgical procedure, and the mode of treatment.
      • Aldecoa C
      • Bettelli G
      • Bilotta F
      • et al.
      European Society of Anaesthesiology evidence-based and consensus-based guideline on postoperative delirium.
      Rudolph et al. used further instruments in addition to the CAM and/or CAM-ICU, such as the Delirium Symptom Interview
      • Albert MS
      • Levkoff SE
      • Reilly C
      • et al.
      The delirium symptom interview: An interview for the detection of delirium symptoms in hospitalized patients.
      and the Memorial Delirium Assessment Scale,
      • Breitbart W
      • Rosenfeld B
      • Roth A
      • et al.
      The Memorial Delirium Assessment Scale.
      which also capture delirium symptoms and their severity. This may have contributed to the higher rate of POD in their sample. However, information regarding the duration of the CAM and/or CAM-ICU assessments, and whether the assessors were blinded to the predictors, was lacking. This may have biased the reported POD rate. Second, the prevalence of delirium increases with age. Many studies have found age to be a significant predictive factor of POD, even after regression analysis controlling for confounders. Age ≥60 years may be considered an implicit element of the original model by Rudolph et al., because patients <60 years were excluded. However, patients in the authors’ cohort had a mean age of 70.9 years, which was younger than in the derivation cohort of Rudolph et al. (74.7 ± 6.3 years). Third, besides advanced age, baseline cognitive impairment is the most frequently cited factor associated with an increased risk of delirium.
      • Vasilevskis EE
      • Han JH
      • Hughes CG
      • et al.
      Epidemiology and risk factors for delirium across hospital settings.
      ,
      • van der Sluis FJ
      • Buisman PL
      • Meerdink M
      • et al.
      Risk factors for postoperative delirium after colorectal operation.
      In the authors’ cohort, patients performed better on preoperative cognitive testing (MMSE, 28.4 ± 1.6 points) than patients in the derivation cohort (MMSE, 26.9 ± 2.6 points). According to established, clinically important ranges, a score of 28.4 points indicates no impairment, whereas a score of 26.9 points indicates mild impairment.
      • Folstein MF
      • Folstein SE
      • McHugh PR.
      "Mini-mental state". A practical method for grading the cognitive state of patients for the clinician.
      ,
      • Crum RM
      • Anthony JC
      • Bassett SS
      • et al.
      Population-based norms for the Mini-Mental State Examination by age and educational level.
      Fourth, the risk prediction model was applied retrospectively. Although this could have caused some errors in the risk stratification of individual patients, the authors herein think that this effect was small because all data used for the application of the Rudolph et al. prediction model were collected prospectively. Fifth, in recent years, guidelines have been developed that recommend the use of multicomponent, nonpharmacologic interventions to reduce delirium.
      • Devlin JW
      • Skrobik Y
      • Gélinas C
      • et al.
      Clinical practice guidelines for the prevention and management of pain, agitation/sedation, delirium, immobility, and sleep disruption in adult patients in the ICU.
      There are several simple, single-component interventions, such as reducing environmental stressors (eg, avoiding excessive noise, maintaining daylight and nighttime rhythm) and frequent orientation of patients to time and place, which can be implemented relatively easily.
      • Salvi F
      • Young J
      • Lucarelli M
      • et al.
      Non-pharmacological approaches in the prevention of delirium.
      However, although these measures seem relatively inexpensive at first sight, there are considerable “hidden costs,” such as higher nurse-to-patient ratios and specific training requirements for caregivers. Given the high burden on scarce human and material resources, these multicomponent interventions are most cost-effective when targeted at high-risk patients.
      • Inouye SK
      • Bogardus Jr, ST
      • Charpentier PA
      • et al.
      A multicomponent intervention to prevent delirium in hospitalized older patients.
      Therefore, it is useful to identify patients with an increased risk of POD at an early stage (ie, before surgery) with specific tools.
      • Monsch RJ
      • Burckhardt AC
      • Berres M
      • et al.
      Development of a novel self-administered cognitive assessment tool and normative data for older adults.
      In addition, there is high variability among different institutions, which may or may not apply preventative measures against delirium, and it is still uncertain as to which interventions are most effective. Therefore, the authors assumed that preventative measures, as administered in both participating institutions, may have played a role in lowering the incidence of POD in their cohort. Moreover, advances in surgical and anesthetic techniques and developments in cardiopulmonary bypass technology may have contributed to a lower delirium incidence as compared to 20 years ago.
      Overall, the poor discriminative performance (AUROC = 0.60) of the Rudolph et al. prediction model in the authors’ sample was in line with a previously published large head-to-head comparison study.
      • Wong CK
      • van Munster BC
      • Hatseras A
      • et al.
      Head-to-head comparison of 14 prediction models for postoperative delirium in elderly non-ICU patients: An external validation study.
      The aim of this previous study was to identify clinical prediction models for delirium developed and published since 1990, and to compare their performance head-to-head. In this large analysis, the model discrimination of the Rudolph et al. prediction model was considered poor (AUROC = 0.610).
      • Wong CK
      • van Munster BC
      • Hatseras A
      • et al.
      Head-to-head comparison of 14 prediction models for postoperative delirium in elderly non-ICU patients: An external validation study.
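      As an illustration of what the AUROC values above measure, the following minimal sketch applies the point assignment described in this article's abstract and computes the AUROC as the probability that a randomly chosen patient with POD has a higher point score than a randomly chosen patient without POD, with ties counted as one half. The function names, argument names, and toy patients are hypothetical and are not taken from the study; this is not the authors' analysis code.

```python
from typing import Sequence


def rudolph_points(mmse: int, gds: int, prior_stroke_tia: bool, albumin_g_dl: float) -> int:
    """Point score following the rule stated in the abstract (names are hypothetical)."""
    points = 0
    if mmse <= 23:
        points += 2  # MMSE <= 23: 2 points
    elif mmse <= 27:
        points += 1  # MMSE 24-27: 1 point
    if gds > 4:
        points += 1  # Geriatric Depression Scale > 4: 1 point
    if prior_stroke_tia:
        points += 1  # prior stroke and/or TIA: 1 point
    if albumin_g_dl <= 3.5 or albumin_g_dl >= 4.5:
        points += 1  # abnormal serum albumin: 1 point
    return points


def auroc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """AUROC via pairwise comparison of cases and non-cases; ties count 0.5."""
    cases = [s for s, y in zip(scores, outcomes) if y == 1]
    controls = [s for s, y in zip(scores, outcomes) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Made-up toy patients, not study data:
# (MMSE, GDS, prior stroke/TIA, albumin in g/dL, delirium yes=1/no=0)
toy = [
    (29, 1, False, 4.0, 0),
    (26, 2, False, 4.1, 0),
    (22, 6, True, 3.2, 1),
    (28, 5, False, 4.0, 1),
    (30, 0, False, 4.6, 0),
    (25, 3, True, 3.9, 1),
]
toy_points = [rudolph_points(m, g, s, a) for m, g, s, a, _ in toy]
toy_auroc = auroc(toy_points, [d for *_, d in toy])
```

      An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is why values near 0.60 are described as poor discrimination. Because the score takes only a few integer values, ties are frequent, and the tie handling chosen here affects the result.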

      Strengths and Limitations

      There were several important strengths to this study. First, to the best of the authors’ knowledge, this was the first broad validation of the Rudolph et al. preoperative prediction model for POD after cardiac surgery in a German-speaking, Swiss population using real-world data and, therefore, was wholly independent of the development and validation sample of the original study. Furthermore, patients were recruited from more than 1 hospital in Switzerland. Second, the authors’ sample was almost 3 times larger than that of Rudolph et al. Third, the primary outcome (POD) was ascertained by investigators blinded to the predictor variables. Finally, the authors handled missing data using multiple imputation, a popular statistical methodology that replaces missing values with plausible values. Creating multiple imputed data sets explicitly accounts for the uncertainty inherent in the imputed values. Moreover, this approach is superior to older approaches such as complete case analysis, mean imputation, and single imputation.
      • Austin PC
      • White IR
      • Lee DS
      • et al.
      Missing data in clinical research: A tutorial on multiple imputation.
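      To make concrete how multiple imputed data sets carry the uncertainty of the imputed values into a final result, the sketch below pools one estimate across imputations using Rubin's rules, a commonly used pooling approach. This is an assumption for illustration: the article does not state which pooling rule the authors applied, and the function name and numbers are hypothetical.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float], variances: Sequence[float]) -> Tuple[float, float]:
    """Pool one estimate across m imputed data sets using Rubin's rules.

    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from at least 2 imputations")
    pooled = mean(estimates)        # average of the per-imputation estimates
    within = mean(variances)        # average sampling variance within imputations
    between = variance(estimates)   # variance of the estimates between imputations (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Made-up numbers for illustration only, not results from the study
est = [0.10, 0.12, 0.14]
var = [0.004, 0.004, 0.004]
pooled_est, total_var = pool_rubin(est, var)
```

      The between-imputation term is what single imputation omits: when the imputed data sets disagree, the total variance grows, so the pooled confidence interval widens accordingly.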
      However, a number of critical considerations pertaining to the authors’ study can be made. First, the participants of this study were relatively well-educated (13.2 ± 3.4 years of education), which may have affected MMSE performance and the incidence of POD. Although all patients undergoing elective cardiac surgery at the participating institutions tested negative for SARS-CoV-2 preoperatively, possible effects of the COVID-19 pandemic during the recruitment period and seasonal variations should be kept in mind because they may limit the generalizability of the authors’ findings.
      • Vlisides PE
      • Vogt KM
      • Pal D
      • et al.
      Roadmap for conducting neuroscience research in the COVID-19 era and beyond: Recommendations from the SNACC research committee.
      Data on patients’ history of prior SARS-CoV-2 infection were not available. Second, because the authors’ purpose was to validate the prediction model externally and to avoid causing additional unnecessary distress to patients before surgery, they collected only a minimal number of variables from patients and medical reports. Hence, establishing a new prediction model or updating the existing one (eg, by recalibrating it or extending it with newly discovered predictors) was beyond the scope of this study. In addition, a previous systematic review and meta-analysis found no strong evidence of a relationship between AUROCs and the number of predictors used in prediction models.
      • van Meenen LC
      • van Meenen DM
      • de Rooij SE
      • et al.
      Risk prediction models for postoperative delirium: A systematic review and meta-analysis.
      It seems more important that the predictors can be applied in clinical practice, where time is often short. However, given the relative scarcity of external validations, it seems reasonable to prioritize the study of existing prediction models (as opposed to developing new ones) and to determine how they might be optimized for clinical use.
      • Wessler BS
      • Nelson J
      • Park JG
      • et al.
      External validations of cardiovascular clinical prediction models: A large-scale review of the literature.

      Conclusions

      Risk prediction models play an important role in current cardiac surgical practice. The study authors herein have provided an independent external validation of a previously developed preoperative prognostic model for incident POD in patients who underwent cardiac surgery in Switzerland. The evaluated prognostic model showed only poor discriminative capacity but fair calibration. However, poor performance in a single validation cohort does not reliably forecast performance on subsequent validations. Therefore, further rigorous studies are warranted to evaluate the generalizability and the clinical validity of this prognostic model and to determine how it might be optimized for clinical use.

      Conflict of Interest

      None.

      Acknowledgments

      The authors gratefully acknowledge the help of the numerous residents and nurses who assisted in the study implementation as well as with data collection. The authors also thank Allison Dwileski, BSc, for proofreading the manuscript.

      References

        • Saczynski JS
        • Marcantonio ER
        • Quach L
        • et al.
        Cognitive trajectories after postoperative delirium.
        N Engl J Med. 2012; 367: 30-39
        • Aitken SJ
        • Blyth FM
        • Naganathan V.
        Incidence, prognostic factors and impact of postoperative delirium after major vascular surgery: A meta-analysis and systematic review.
        Vasc Med. 2017; 22: 387-397
        • Goettel N
        • Steiner LA.
        Postoperatives delirium: Früherkennung, prävention und therapie.
        Swiss Medical Forum. 2013; 13: 522-526
        • Story DA
        • Leslie K
        • Myles PS
        • et al.
        Complications and mortality in older surgical patients in Australia and New Zealand (the REASON study): A multicentre, prospective, observational study.
        Anaesthesia. 2010; 65: 1022-1030
        • Evered L
        • Silbert B
        • Knopman DS
        • et al.
        Recommendations for the nomenclature of cognitive change associated with anaesthesia and surgery-2018.
        Anesthesiology. 2018; 129: 872-879
        • Menzenbach J
        • Guttenthaler V
        • Kirfel A
        • et al.
        Estimating patients' risk for postoperative delirium from preoperative routine data - Trial design of the PRe-Operative prediction of postoperative DElirium by appropriate SCreening (PROPDESC) study - A monocentre prospective observational trial.
        Contemp Clin Trials Commun. 2019; 17: 100501
        • Aldecoa C
        • Bettelli G
        • Bilotta F
        • et al.
        European Society of Anaesthesiology evidence-based and consensus-based guideline on postoperative delirium.
        Eur J Anaesthesiol. 2017; 34 (Erratum in: Eur J Anaesthesiol. 2018;35:718-9): 192-214
        • Lin Y
        • Chen J
        • Wang Z.
        Meta-analysis of factors which influence delirium following cardiac surgery.
        J Card Surg. 2012; 27: 481-492
        • Hollinger A
        • Siegemund M
        • Goettel N
        • et al.
        Postoperative delirium in cardiac surgery – an unavoidable menace?.
        J Cardiothorac Vasc Anesth. 2015; 29: 1677-1687
        • Cereghetti C
        • Siegemund M
        • Schaedelin S
        • et al.
        Independent predictors of the duration and overall burden of postoperative delirium after cardiac surgery in adults: An observational cohort study.
        J Cardiothorac Vasc Anesth. 2017; 31: 1966-1973
        • Inouye SK
        • Bogardus Jr, ST
        • Charpentier PA
        • et al.
        A multicomponent intervention to prevent delirium in hospitalized older patients.
        N Engl J Med. 1999; 340: 669-676
        • Salvi F
        • Young J
        • Lucarelli M
        • et al.
        Non-pharmacological approaches in the prevention of delirium.
        Eur Geriatr Med. 2020; 11: 71-81
        • Hshieh TT
        • Yue J
        • Oh E
        • et al.
        Effectiveness of multicomponent nonpharmacological delirium interventions: A meta-analysis.
        JAMA Intern Med. 2015; 175 (Erratum in: JAMA Intern Med 2015;175:659): 512-520
        • Gosselt AN
        • Slooter AJ
        • Boere PR
        • et al.
        Risk factors for delirium after on-pump cardiac surgery: A systematic review.
        Crit Care. 2015; 19: 346
        • Rudolph JL
        • Jones RN
        • Levkoff SE
        • et al.
        Derivation and validation of a preoperative prediction rule for delirium after cardiac surgery.
        Circulation. 2009; 119: 229-236
        • van Meenen LC
        • van Meenen DM
        • de Rooij SE
        • et al.
        Risk prediction models for postoperative delirium: A systematic review and meta-analysis.
        J Am Geriatr Soc. 2014; 62: 2383-2390
        • Lee A
        • Mu JL
        • Joynt GM
        • et al.
        Risk prediction models for delirium in the intensive care unit after cardiac surgery: A systematic review and independent external validation.
        Br J Anaesth. 2017; 118: 391-399
        • Adibi A
        • Sadatsafavi M
        • Ioannidis JPA.
        Validation and utility testing of clinical prediction models: Time to change the approach.
        JAMA. 2020; 324: 235-236
        • Bouwmeester W
        • Zuithoff NP
        • Mallett S
        • et al.
        Reporting and methods in clinical prediction research: A systematic review.
        PLoS Med. 2012; 9: 1-12
        • Siontis GC
        • Tzoulaki I
        • Castaldi PJ
        • et al.
        External validation of new risk prediction models is infrequent and reveals worse prognostic discrimination.
        J Clin Epidemiol. 2015; 68: 25-34
        • Wessler BS
        • Nelson J
        • Park JG
        • et al.
        External validations of cardiovascular clinical prediction models: A large-scale review of the literature.
        Circ Cardiovasc Qual Outcomes. 2021; 14: e007858
        • Moons KG
        • Kengne AP
        • Woodward M
        • et al.
        Risk prediction models: I. Development, internal validation, and assessing the incremental value of a new (bio)marker.
        Heart. 2012; 98: 683-690
        • Debray TP
        • Vergouwe Y
        • Koffijberg H
        • et al.
        A new framework to enhance the interpretation of external validation studies of clinical prediction models.
        J Clin Epidemiol. 2015; 68: 279-289
        • Toll DB
        • Janssen KJ
        • Vergouwe Y
        • et al.
        Validation, updating and impact of clinical prediction rules: A review.
        J Clin Epidemiol. 2008; 61: 1085-1094
        • Moons KG
        • Altman DG
        • Reitsma JB
        • et al.
        Transparent Reporting of a multivariable prediction model for Individual Prognosis or Diagnosis (TRIPOD): Explanation and elaboration.
        Ann Intern Med. 2015; 162: W1-73
        • Devlin JW
        • Fong JJ
        • Schumaker G
        • et al.
        Use of a validated delirium assessment tool improves the ability of physicians to identify delirium in medical intensive care unit patients.
        Crit Care Med. 2007; 35 (quiz 2725): 2721-2724
        • Bergeron N
        • Dubois MJ
        • Dumont M
        • et al.
        Intensive Care Delirium Screening Checklist: evaluation of a new screening tool.
        Intensive Care Med. 2001; 27: 859-864
        • Copas JB.
        Using regression models for prediction: Shrinkage and regression to the mean.
        Stat Methods Med Res. 1997; 6: 167-183
        • Collins GS
        • Reitsma JB
        • Altman DG
        • et al.
        Transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD): The TRIPOD statement. The TRIPOD Group.
        Circulation. 2015; 131: 211-219
        • Damen JA
        • Hooft L
        • Schuit E
        • et al.
        Prediction models for cardiovascular disease risk in the general population: Systematic review.
        BMJ. 2016; 353: i2416
        • Siregar S
        • Nieboer D
        • Versteegh MIM
        • et al.
        Methods for updating a risk prediction model for cardiac surgery: A statistical primer.
        Interact Cardiovasc Thorac Surg. 2019; 28: 333-338
        • Inouye SK
        • van Dyck CH
        • Alessi CA
        • et al.
        Clarifying confusion: The confusion assessment method. A new method for detection of delirium.
        Ann Intern Med. 1990; 113: 941-948
        • Neto AS
        • Nassar Jr, AP
        • Cardoso SO
        • et al.
        Delirium screening in critically ill patients: A systematic review and meta-analysis.
        Crit Care Med. 2012; 40: 1946-1951
        • Gusmao-Flores D
        • Salluh JI
        • Chalhub RÁ
        • et al.
        The confusion assessment method for the intensive care unit (CAM-ICU) and intensive care delirium screening checklist (ICDSC) for the diagnosis of delirium: A systematic review and meta-analysis of clinical studies.
        Crit Care. 2012; 16: R115
        • Albert MS
        • Levkoff SE
        • Reilly C
        • et al.
        The delirium symptom interview: An interview for the detection of delirium symptoms in hospitalized patients.
        J Geriatr Psychiatry Neurol. 1992; 5: 14-21
        • Breitbart W
        • Rosenfeld B
        • Roth A
        • et al.
        The Memorial Delirium Assessment Scale.
        J Pain Symptom Manage. 1997; 13: 128-137
        • Vasilevskis EE
        • Han JH
        • Hughes CG
        • et al.
        Epidemiology and risk factors for delirium across hospital settings.
        Best Pract Res Clin Anaesthesiol. 2012; 26: 277-287
        • van der Sluis FJ
        • Buisman PL
        • Meerdink M
        • et al.
        Risk factors for postoperative delirium after colorectal operation.
        Surgery. 2017; 161: 704-711
        • Folstein MF
        • Folstein SE
        • McHugh PR.
        "Mini-mental state". A practical method for grading the cognitive state of patients for the clinician.
        J Psychiatr Res. 1975; 12: 189-198
        • Crum RM
        • Anthony JC
        • Bassett SS
        • et al.
        Population-based norms for the Mini-Mental State Examination by age and educational level.
        JAMA. 1993; 269: 2386-2391
        • Devlin JW
        • Skrobik Y
        • Gélinas C
        • et al.
        Clinical practice guidelines for the prevention and management of pain, agitation/sedation, delirium, immobility, and sleep disruption in adult patients in the ICU.
        Crit Care Med. 2018; 46: e825-e873
        • Monsch RJ
        • Burckhardt AC
        • Berres M
        • et al.
        Development of a novel self-administered cognitive assessment tool and normative data for older adults.
        J Neurosurg Anesthesiol. 2019; 31: 218-226
        • Wong CK
        • van Munster BC
        • Hatseras A
        • et al.
        Head-to-head comparison of 14 prediction models for postoperative delirium in elderly non-ICU patients: An external validation study.
        BMJ Open. 2022; 12: e054023
        • Austin PC
        • White IR
        • Lee DS
        • et al.
        Missing data in clinical research: A tutorial on multiple imputation.
        Can J Cardiol. 2021; 37: 1322-1331
        • Vlisides PE
        • Vogt KM
        • Pal D
        • et al.
        Roadmap for conducting neuroscience research in the COVID-19 era and beyond: Recommendations from the SNACC research committee.
        J Neurosurg Anesthesiol. 2021; 33: 100-106