Abstract
Cranioplasty is associated with a substantial burden of postoperative complications. In this multicenter study, we developed a machine learning–based clinical decision-support tool to predict the risk of postoperative complications following cranioplasty. Nine features were selected for model development. Among the 15 algorithms evaluated, the random forest model demonstrated the best overall performance and was validated in both geographical and temporal external cohorts (AUROC = 0.949, internal cross-validation; 0.930, geographical validation; 0.932, temporal validation). Subgroup analyses by age and sex demonstrated consistently high discriminative performance (lowest AUROC = 0.927) and good calibration (observed-to-expected [O/E] ratio = 1.16, 95% CI: 0.97–1.40). We then estimated the causal effects of modifiable intraoperative variables on postoperative complications using diverse counterfactual explanations and causal inference methods, including double machine learning and the T-learner framework. This analysis indicated a protective effect of subcutaneous negative-pressure drainage (average treatment effect [ATE] = −0.241) and titanium mesh (ATE = −0.191). Finally, we present the model as an accessible web-based tool for individualized, real-time clinical decision-making (http://www.cranioplastycomplicationprediction.top). These findings provide a practical framework for postoperative risk stratification and support the optimization of intraoperative decision-making in cranioplasty.
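To make the reported ATE values easier to interpret, the following is a minimal sketch of how a T-learner arrives at an average treatment effect on a binary outcome. It is not the authors' implementation: the toy cohort, the "low"/"high" baseline-risk strata, and all function names are hypothetical, and a simple stratum-mean base learner stands in for whatever machine learning models the study actually used.

```python
from collections import defaultdict


def fit_stratum_means(rows):
    """Toy base learner: mean outcome per covariate stratum.

    rows is a list of (x, y) pairs. Falls back to the overall mean
    for strata that were not seen during fitting.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y in rows:
        sums[x] += y
        counts[x] += 1
    overall = sum(y for _, y in rows) / len(rows)
    means = {x: sums[x] / counts[x] for x in sums}
    return lambda x: means.get(x, overall)


def t_learner_ate(data):
    """T-learner: fit one outcome model per treatment arm, then average
    the difference of the two predictions over the whole cohort.

    data is a list of (x, t, y) with t in {0, 1} and y in {0, 1}.
    """
    mu1 = fit_stratum_means([(x, y) for x, t, y in data if t == 1])
    mu0 = fit_stratum_means([(x, y) for x, t, y in data if t == 0])
    effects = [mu1(x) - mu0(x) for x, _, _ in data]
    return sum(effects) / len(effects)


def naive_difference(data):
    """Unadjusted difference in complication rates between the arms."""
    treated = [y for _, t, y in data if t == 1]
    control = [y for _, t, y in data if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def make_rows(x, t, n, events):
    """Build n hypothetical patients in stratum x and arm t, of whom
    `events` had a complication (y = 1)."""
    return [(x, t, 1 if i < events else 0) for i in range(n)]


# Hypothetical cohort: the treatment (t = 1) is used more often in
# low-risk patients, so the naive comparison is confounded.
data = (
    make_rows("low", 1, 8, 2) + make_rows("low", 0, 4, 2)
    + make_rows("high", 1, 4, 2) + make_rows("high", 0, 8, 8)
)

print(t_learner_ate(data))     # -0.375
print(naive_difference(data))  # -0.5
```

A negative ATE means the treatment lowers the estimated complication probability, which is how the values for subcutaneous negative-pressure drainage and titanium mesh should be read. In this toy cohort the unadjusted difference overstates the benefit because treated patients are disproportionately low-risk; the T-learner estimate is only as trustworthy as the covariates used to adjust for such confounding.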
Data availability
The datasets analyzed during this study are available from the corresponding author upon reasonable request. The code for training and validation is available on GitHub (BigEarAsk/A-Causal-and-Interpretable-Machine-Learning-Framework-for-Post-Cranioplasty-Complications).
Acknowledgements
We sincerely thank Ms. Wenyun Xia for her contributions to the design of the visualization website. This study was supported by Qilu hygiene and health outstanding youth project, Shandong Provincial Natural Science Foundation (ZR2025MS1184), Shandong University project (6010124061, Study on postoperative complications of cranioplasty using polyetheretherketone material), Shaanxi Province Youth Science and Technology Rising Star (2023KJXX-028), and Tangdu Youth Independent Innovation Science Fund (2023ATDQN004). The funders played no role in study design, data collection, analysis and interpretation of data, or the writing of this manuscript.
Author information
Contributions
N.Y. (Ning Yang) and C.Q. (Chen Qiu) were responsible for project administration, conceptualization, and investigation of the study, and also reviewed and edited the manuscript. W.B.L. (Wenbo Li), B.W. (Bao Wang), and T.L. (Tianzun Li) contributed equally to methodology development, formal analysis, data validation, web-based application development, and drafting of the original manuscript. Y.W.M. (Yiwen Ma) contributed to formal analysis and data validation. H.J. (Haoyong Jin), J.Z. (Jiangli Zhao), Z.W.X. (Zhiwei Xue), N.S. (Nan Su), and Y.H. (Yanya He) were involved in data curation and formal analysis. J.S. (Jiaqi Shi), X.C.L. (Xuchen Liu), X.Y.L. (Xiaoyang Liu), T.W. (Tianzi Wang), J.W. (Jiwei Wang), C.L. (Chao Li), C.Y. (Can Yan), Y.M. (Yang Ma), and Q.Q. (Qichao Qi) contributed to data curation. X.Y.W. (Xinyu Wang), W.G.L. (Weiguo Li), B.H. (Bin Huang), D.W. (Donghai Wang), X.L.W. (Xuelian Wang), Y.Q. (Yan Qu), and X.G.L. (Xingang Li) provided supervision for the study. N.Y. and C.Q. had full access to all data in the study and took responsibility for the integrity of the data and the accuracy of the data analysis. The corresponding authors assumed final responsibility for the decision to submit the manuscript. All authors reviewed and approved the final manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Li, W., Wang, B., Li, T. et al. A Causal and interpretable machine learning framework for postcranioplasty risk prediction and surgical decision support. npj Digit. Med. (2026). https://doi.org/10.1038/s41746-026-02370-6
DOI: https://doi.org/10.1038/s41746-026-02370-6


