Original Research

Open Access

Mortality prediction of patients with sepsis in the emergency department using machine learning models: a retrospective cohort study according to the Sepsis-3 definitions

  • Eun-Tae Jeon1,2
  • Juhyun Song3,*
  • Dae Won Park4
  • Ki-Sun Lee2
  • Sejoong Ahn5
  • Joo Yeong Kim5
  • Jong-hak Park5
  • Sungwoo Moon5
  • Han-jin Cho5

1Department of Neurology, Korea University Ansan Hospital, 15355 Ansan, Republic of Korea

2Medical Science Research Center, Korea University Ansan Hospital, 15355 Ansan, Republic of Korea

3Department of Emergency Medicine, Korea University Anam Hospital, 02841 Seoul, Republic of Korea

4Division of Infectious Diseases, Department of Internal Medicine, Korea University Ansan Hospital, 15355 Ansan, Republic of Korea

5Department of Emergency Medicine, Korea University Ansan Hospital, 15355 Ansan, Republic of Korea

DOI: 10.22514/sv.2023.046 Vol. 19, Issue 5, September 2023, pp. 112-124

Submitted: 11 September 2022 Accepted: 09 November 2022

Published: 08 September 2023

*Corresponding Author(s): Juhyun Song E-mail:


Although clinical scoring systems and biomarkers have been used to predict outcomes in sepsis, their prognostic value is limited. Machine learning (ML) models have therefore been proposed to predict the outcomes of sepsis. This study aimed to propose ML algorithms that produce robust models for predicting mortality in patients with sepsis diagnosed in the emergency department using the Sepsis-3 definitions. The study used a prospectively collected registry of adult patients with sepsis covering January 2016 to February 2020. Of the 810 patients, 607 (75%) were assigned to the training set and 203 (25%) to the test set. The primary outcome was 30-day mortality. Using the area under the receiver operating characteristic curve (AUROC), the area under the precision-recall curve (AUPRC), balanced accuracy, and Brier score, we compared the performance of different ML algorithms with that of logistic regression models and clinical scoring systems. The ML models’ performance was superior to that of the clinical scoring systems. A light gradient boosting machine achieved the highest AUROC among the ML models in predicting 30-day mortality. Most of the ML models had significantly higher AUROC and balanced accuracy than the logistic regression models. All the ML models exhibited higher AUPRC and lower Brier scores than the scoring systems and logistic regression models. The ML models can be used as supportive tools for predicting mortality in patients with sepsis. In future studies, the performance of the proposed models will be validated using more data from different hospitals or departments.
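
For readers less familiar with the four evaluation metrics named in the abstract, the following is a minimal sketch of how each can be computed for a binary outcome such as 30-day mortality. It is not the authors' code: the function names, the 0.5 decision threshold, and the eight toy labels and probabilities are all assumptions made for illustration, and average precision is used here as one common estimate of AUPRC, which may differ from the estimator used in the study.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Chance that a random positive is scored above a random negative (ties count 0.5)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean precision at the rank of each positive; assumes no tied scores."""
    order = sorted(range(len(y_true)), key=lambda i: y_prob[i], reverse=True)
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Average of sensitivity and specificity."""
    tp = sum(1 for y, yh in zip(y_true, y_pred) if y == 1 and yh == 1)
    tn = sum(1 for y, yh in zip(y_true, y_pred) if y == 0 and yh == 0)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared error of the predicted probabilities (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy data invented for illustration only (1 = died within 30 days); not study data.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.7, 0.05]
# Assumed decision threshold of 0.5 for the balanced-accuracy calculation.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

print("AUROC:", auroc(y_true, y_prob))
print("Average precision (AUPRC estimate):", average_precision(y_true, y_prob))
print("Balanced accuracy:", balanced_accuracy(y_true, y_pred))
print("Brier score:", brier_score(y_true, y_prob))
```

AUROC and AUPRC depend only on how the model ranks patients, balanced accuracy depends on the chosen threshold, and the Brier score also reflects how well calibrated the predicted probabilities are, which is why the four are reported together.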


Emergency department; Machine learning; Mortality; Sepsis; Septic shock

Cite and Share

Eun-Tae Jeon, Juhyun Song, Dae Won Park, Ki-Sun Lee, Sejoong Ahn, Joo Yeong Kim, Jong-hak Park, Sungwoo Moon, Han-jin Cho. Mortality prediction of patients with sepsis in the emergency department using machine learning models: a retrospective cohort study according to the Sepsis-3 definitions. Signa Vitae. 2023; 19(5): 112-124.

