Integrating SHAP analysis with machine learning to predict postpartum hemorrhage in vaginal births
BMC Pregnancy and Childbirth volume 25, Article number: 529 (2025)
Abstract
Objective
This study aimed to develop a machine learning (ML) model integrated with SHapley Additive exPlanations (SHAP) analysis to predict postpartum hemorrhage (PPH) following vaginal deliveries, offering a potential tool for personalized risk assessment and prevention in clinical settings.
Methods
We conducted a retrospective multicenter cohort study in Northeast China, including women who had vaginal deliveries at three tertiary hospitals from September 2018 to December 2023. Data were extracted from electronic medical records. The dataset was split into a training set (70%) and an internal validation set (30%) to prevent overfitting. External validation was performed on a separate dataset. Several evaluation metrics, including the area under the receiver operating characteristic curve (AUC), were used to compare prediction performance. Features were ranked using SHAP, and the final model was explained.
Results
The XGBoost model demonstrated superior predictive accuracy for PPH, with an AUC of 0.997 in the training set. SHAP value-based feature selection identified 15 key features contributing to the model’s predictive power. SHAP dependence and summary plots provided intuitive insights into each feature’s contribution, enabling the identification of anomalies. The final model maintained high predictive power, with an AUC of 0.894 in internal validation and 0.880 in external validation.
Conclusion
This study successfully developed an interpretable ML model that predicts PPH with high accuracy. Future studies with larger and more diverse datasets are necessary to further validate and refine the model, particularly to assess its generalizability across different populations and healthcare settings.
Introduction
Postpartum hemorrhage (PPH) is a significant global health concern that can lead to severe and potentially fatal complications for women. It has been studied extensively because it is one of the leading causes of maternal mortality, particularly in developing countries and other low-resource settings [1]. The majority of these deaths are preventable through the establishment of clinical guidelines and policies [2], as well as the promotion of relevant research and training.
In routine clinical practice, physicians typically estimate the probability of PPH by assessing clinical history, conducting physical examinations, and performing laboratory tests. However, the limited sensitivity and specificity of these assessments, combined with the low incidence of PPH, mean that traditional bleeding assessment tools [3, 4], such as structured history taking and systematic evaluation scales, have shown low efficacy in assessing the incidence of PPH. With the advancement of medical big data and artificial intelligence, predictive models for PPH have begun to emerge, offering new opportunities for early risk assessment and intervention. However, most current studies do not differentiate between PPH following cesarean section and vaginal delivery [4]. Moreover, the selection of predictive factors in model construction is often constrained by data collection limitations and sample size [5], resulting in a lack of comprehensive assessment in most clinical predictive models.
In recent years, the rise of smart medicine and artificial intelligence has highlighted the unparalleled advantages of machine learning (ML) techniques over traditional statistics [6]. ML involves fitting predictive models to data or identifying informative patterns within datasets [7], leveraging data features to establish automated data analysis processes that enhance predictive capabilities for new data. Scholars have applied ML algorithms to various fields to construct predictive models, such as disease prediction and diagnosis [8], prognosis or mortality prediction [9], drug interaction prediction [10], rehospitalization prediction [11], and patient care needs prediction [12], all of which have shown good predictive performance [13].
Despite its advantages, ML research in the medical field faces several challenges, including handling missing data, avoiding model overfitting, and accounting for interrelationships among dataset attributes [14]. Additionally, the “black box” issue, where model inputs and operations are not visible to users or stakeholders, complicates interpretability [15]. Due to the complexity and multi-dimensionality of its algorithmic structure, understanding ML models can be difficult for clinicians. SHapley Additive exPlanations (SHAP), a method inspired by game theory and proposed by Lundberg et al. [16], addresses this issue by assigning a value to each input feature, indicating how the feature contributes to the prediction for a specific data point. Some factors positively impact the prediction probability, while others have a negative effect [17]. This can help clinicians quantify risk factors, improving their ability to focus on and prevent them in clinical practice.
This study aims to use various ML algorithms to construct an optimal PPH prediction model for vaginal delivery. Additionally, it seeks to evaluate and quantify risk factors, providing a highly reliable reference for personalized assessment and prevention of PPH in high-risk pregnant women.
Methods
Study population
This retrospective multicenter cohort study was conducted in Northeast China, focusing on women who underwent vaginal deliveries to develop and validate predictive models for PPH. The derivation cohort included women who delivered vaginally at three independent tertiary hospitals (Shengjing Hospital of China Medical University, Liaoning Maternal and Child Health Hospital, and Shenyang Women’s and Children’s Hospital) from September 2018 to December 2023. At the time of admission, all women were informed that their clinical data, excluding personally identifiable information, might be used for research purposes. Those who consented after being fully informed were included in the study. Exclusion criteria were: (1) age less than 18 years or more than 50 years; (2) gestational age at delivery less than 37 weeks or more than 42 weeks; (3) multiple births; and (4) stillbirth, neonatal death, or any induced labor performed with the intention of terminating the fetus’s life.
This study was conducted in accordance with the Declaration of Helsinki and was approved by the Institutional Review Boards of Shengjing Hospital (No. 2016PS344K, Date: 17/12/2016).
Data collection and processing
Using the electronic medical systems of the hospitals, data were collected on basic characteristics, obstetric history, pregnancy complications, delivery processes, and neonatal conditions to identify features for constructing predictive models. The data categories included:
1. Basic Characteristics: Age, ethnicity, education level, occupation (classified into three categories based on physical labor intensity: light physical labor (LPL), moderate physical labor (MPL), and heavy physical labor (HPL)), family per capita monthly income, pre-pregnancy Body Mass Index (BMI), smoking status, and alcohol consumption status.
2. Obstetric History and Pregnancy Complications: Gravidity, parity, history of miscarriage, spontaneous abortion history, induced and medical abortion history, history of labor induction for fetal demise (induced with the intention of terminating a nonviable or deceased fetus), use of assisted reproductive technology, gestational age at delivery, gestational diabetes, pregnancy-induced hypertension (PIH, including gestational hypertension, preeclampsia, and eclampsia), anemia during pregnancy, coagulation dysfunction, uterine fibroids or adenomyosis, polyhydramnios, umbilical cord entanglement, premature rupture of membranes, placental abruption, vaginal bleeding during pregnancy, and presence of a scarred uterus.
3. Delivery Process and Neonatal Conditions: Delivery time, total duration of labor, first stage of labor time (including latent and active phases), second stage duration, third stage duration, placental retention/adhesion/implantation, instrumental assistance in delivery, cervical, vaginal, and perineal lacerations, newborn weight, and newborn length.
PPH was defined as vaginal bleeding exceeding 500 ml within 24 h after vaginal delivery, corresponding to the clinical concept of early postpartum hemorrhage. Blood loss was primarily measured using the weighing method, which calculates the difference in weight of the absorbent materials before and after blood collection. In cases of heavier bleeding, blood was collected in a container and measured using a graduated cup. Because multicollinearity can affect predictive accuracy, pairs of features with a high correlation (correlation coefficient > 0.6) in Spearman’s correlation analysis were handled by removing the feature of the pair that was less correlated with the outcome. The resulting correlations are illustrated in Supplementary Figure S1.
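The correlation-based filter described above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' code: the feature names and toy values are hypothetical, and the 0.6 threshold is the only parameter taken from the text.

```python
from itertools import combinations


def _ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0


def drop_collinear(features, outcome, threshold=0.6):
    """features maps name -> list of values; returns the names that are kept."""
    kept = list(features)
    for a, b in combinations(list(features), 2):
        if a in kept and b in kept and abs(spearman(features[a], features[b])) > threshold:
            # Of a highly correlated pair, drop the one less correlated with the outcome.
            ra = abs(spearman(features[a], outcome))
            rb = abs(spearman(features[b], outcome))
            kept.remove(a if ra < rb else b)
    return kept


# Hypothetical toy data: two labor-duration features that track each other closely.
toy_features = {
    "total_labor_min": [300, 420, 510, 640, 700, 820],
    "first_stage_min": [250, 380, 450, 640, 600, 760],
    "newborn_weight_g": [3400, 2900, 3900, 3100, 3600, 3300],
}
toy_pph = [0, 0, 0, 1, 0, 1]
kept_features = drop_collinear(toy_features, toy_pph)
```

In this toy example the two duration features exceed the threshold, so only the one more strongly associated with the outcome is retained.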
Model development and comparison
Data from the derivation cohort, collected from September 2018 to December 2022 at the three independent tertiary hospitals, were divided into a training set (70%) and a validation set (30%) to prevent overfitting. An additional test dataset from admissions between January 2023 and December 2023, with the same inclusion and exclusion criteria as the derivation cohort, was used for external validation.
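One way to express this partitioning is sketched below. It assumes each record carries a `delivery_date` field (a hypothetical name) and uses a simple seeded shuffle for the 70/30 split; whether the original split was stratified by outcome is not stated, so no stratification is applied here.

```python
import random
from datetime import date


def split_cohort(records, cutoff=date(2023, 1, 1), train_fraction=0.7, seed=42):
    """Return (training, internal_validation, external_validation) record lists."""
    # Deliveries before the cutoff form the derivation cohort; later ones are held out.
    derivation = [r for r in records if r["delivery_date"] < cutoff]
    external = [r for r in records if r["delivery_date"] >= cutoff]

    # Seeded shuffle so that the random 70/30 assignment is reproducible.
    shuffled = derivation[:]
    random.Random(seed).shuffle(shuffled)
    n_train = round(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:], external
```

Holding out the most recent year as a separate set gives a temporal check on the model in addition to the random internal split.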
A total of 34 features were used to develop predictive models. Missing data were handled using median imputation, a common approach for dealing with missing values in clinical datasets. Six ML models were employed to predict PPH in women undergoing vaginal delivery: eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LGBM), Gradient Boosting Decision Tree (GBDT), Gradient Boosting Machine (GBM), Adaptive Boosting (AdaBoost), and Bernoulli Naive Bayes (BNB). Common evaluation metrics, including the area under the receiver operating characteristic (ROC) curve (AUC), accuracy, precision, recall, and F1 score, were employed to assess the reliability and performance of the models. Additionally, Decision Curve Analysis (DCA) and calibration curves were applied to validate the predictive models on both the internal validation dataset and the external validation dataset.
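For readers less familiar with these steps, the following sketch shows median imputation and the threshold-based metrics named above, using only the Python standard library. The function names are illustrative, and missing values are assumed to be encoded as `None`. In practice the median would normally be computed on the training set and reused for the validation sets, which is an assumption here rather than a detail reported in the text.

```python
from statistics import median


def median_impute(column):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in column if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in column]


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = PPH, 0 = no PPH)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Because PPH is a relatively rare outcome, accuracy alone can look high even when few cases are detected, which is why precision, recall, and F1 are reported alongside it.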
Feature selection and model explanation
To ensure clinicians can accept and understand the predictive models, the SHapley Additive exPlanations (SHAP) methodology was used to calculate the contribution of each variable to the prediction, thereby explaining the output of the final model. This interpretability approach provides two types of explanations: a global explanation that describes the overall functionality of the model at the feature level, and a local explanation that shows how individual features impact the model’s output through a dependence plot.
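The idea behind SHAP can be illustrated with a brute-force computation of Shapley values for a tiny model. The sketch below is only an illustration: it enumerates every feature coalition, which is feasible for a handful of features, and it replaces "absent" features with a single baseline value, a simplification of the background-data expectation used by SHAP. Tree-based explainers use a far more efficient algorithm. The toy scoring function is hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline instance."""
    n = len(x)

    def value(coalition):
        # Features in the coalition take their real value; the rest take the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                # Weighted marginal contribution of feature i to this coalition.
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_score(z):
    """Hypothetical risk score with one interaction term (features 0 and 2)."""
    return 0.5 * z[0] + 2.0 * z[1] + 1.0 * z[0] * z[2]


toy_phi = shapley_values(toy_score, x=[2.0, 1.0, 3.0], baseline=[0.0, 0.0, 0.0])
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the additivity property that waterfall and force plots rely on. The interaction term is shared equally between the two features involved.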
SHAP values were utilized to assist in feature selection, ranking the features of the predictive model by importance and selecting those with the strongest predictive power for further analysis. The non-parametric method of DeLong et al. was used to compare differences in AUC using MedCalc version 19.6 (https://www.medcalc.org). Features of the selected ML models were gradually reduced until a significant decrease in AUC occurred.
Webpage deployment tool
To enhance the clinical utility of the model, the final predictive model was implemented into a web application using the Streamlit Python framework. This application allows users to input values for the corresponding features of the final model, returning the probability of PPH and a force plot for individual sub-items.
Statistical analysis
Data analysis was conducted using Python version 3.6.5 (https://www.python.org) and SPSS statistical software version 23.0 (https://www.ibm.com/spss). Continuous variables with a skewed distribution are presented as medians with interquartile ranges and were compared using the Mann-Whitney U test or Kruskal-Wallis H test. Categorical variables are presented as numbers with percentages and were compared using the chi-square test or Fisher’s exact test. Analysis of covariance (ANCOVA) was used to adjust for confounding factors. AUCs were used to evaluate predictive efficacy. DCA was performed using R version 4.1.0 (https://www.r-project.org). A two-tailed p-value < 0.05 was considered statistically significant.
Results
Patient characteristics
This retrospective study included a total of 30,745 parturients for the identification of the predictive model cohort. During the study period from September 2018 to December 2022, 27,389 parturients were admitted to the obstetrics departments of the three hospitals, with 2,556 excluded based on the study’s exclusion criteria. The remaining 24,833 parturients who met the inclusion criteria were randomly assigned to separate training and internal validation groups (see Table 1). Additionally, in the external validation cohort admitted from January 2023 to December 2023, 257 parturients were excluded, resulting in 3,099 parturients included (Supplementary Table 1). Study design details are shown in Fig. 1.
Model development and performance comparison
Data collected during pregnancy and within 24 h after delivery were used to generate six ML models to predict the likelihood of PPH in parturients during the perinatal period. Among the six models, the XGBoost model (AUC = 0.997, CI: 0.997–0.998) demonstrated the best predictive performance for PPH, followed by the LGBM model (AUC = 0.980, CI: 0.977–0.984) and the GBDT model (AUC = 0.966, CI: 0.960–0.972). The ROC curves for all six ML models with all features included are shown in Fig. 2, with predictive values detailed in Table 2.
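As a reminder of what the AUC measures, it equals the probability that a randomly chosen PPH case receives a higher predicted score than a randomly chosen non-case, with ties counted as one half. The sketch below computes it directly from that definition. It is quadratic in the sample size and intended only as an illustration, not as the procedure used to obtain the reported values or their confidence intervals.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

For example, `roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` evaluates three of the four case/non-case pairs as correctly ordered.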
An initial predictive model incorporating all 34 identified risk factors was constructed, and SHAP value analysis was applied for feature selection. By plotting the SHAP values for each feature across all samples, we gained an intuitive understanding of the overall patterns in the data, which also facilitated the detection of outlier predictions. In these visualizations (Fig. 3 and Supplementary Fig. 2), each row corresponds to a different feature, with the horizontal axis representing the SHAP values. Individual data points represent samples, color-coded to indicate the magnitude of feature values—red for high and blue for low. This approach allowed us to discern the contribution of each feature to the model’s predictions and to identify any anomalies that might suggest a need for further investigation.
Identification of the final model
During the feature reduction process, based on feature importance ranking, the XGBoost model’s AUC and F1 score demonstrated that the model maintained good predictive power, with no significant change in predictive ability when the number of features was reduced to 15 (Fig. 4A-B, Supplementary Table 2, Supplementary Fig. 3). Thus, the final model was selected when the feature set was narrowed down to 15 features.
Multicollinearity among the 15 features was assessed to determine its potential impact on predictive accuracy. A correlation coefficient close to 0 indicates a weak correlation, and absolute values below 0.8 are generally not considered to indicate problematic collinearity. Figure 4C shows that each feature exhibits independence, suggesting that multicollinearity is not a significant issue in this model.
Model explanation
As illustrated in the SHAP summary plot (Fig. 5), the contribution of the 15 selected features to the model was evaluated using average SHAP values, displayed in descending order. Figure 6 depicts the relationship between the actual values and SHAP values of these 15 features. SHAP values above zero correspond to a higher risk of PPH in the model’s positive class prediction. For instance, parturients with a newborn weight ≥ 3500 g or a second stage of labor ≥ 100 min have SHAP values above zero, pushing the decision towards PPH.
SHAP dependence plot. Each dependence plot shows how a single feature affects the output of the prediction model, and each dot represents a single patient. The SHAP values for specific features exceeding zero push the decision towards the “PPH” class. LPL: light physical labor; MPL: moderate physical labor; HPL: heavy physical labor; PROM: premature rupture of membranes; BMI: body mass index
Local explanations analyze how specific predictions for individual patients are made by combining personalized input data. Figure 7A, C, and E show parturients who did not experience PPH within 24 h postpartum, illustrating the impact of the selected features on the model’s output. In the waterfall plot (Fig. 7A), the x-axis represents the model output, and the y-axis lists the selected features and their corresponding values. The waterfall plot starts with the expected model output on the x-axis (E[f(X)] = -3.051). This “baseline” value of -3.051 is the average model output over the test set. Because these values are negative, they are on the model’s raw output scale (log-odds for a gradient-boosted classifier) rather than probabilities. The combination of positive contributions (red) and negative contributions (blue) shifts the expected value to the final model output (f(x) = -6.327). Positive SHAP values increase the probability of the sample being classified as PPH, while negative SHAP values decrease it. The force plot provides further insights through an additive force layout (Fig. 7C).
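The two values quoted from the waterfall plot can be related to probabilities with a short calculation. This sketch assumes that the plotted outputs are log-odds, which is the usual raw output of a gradient-boosted binary classifier explained with SHAP; the text does not state the scale explicitly.

```python
from math import exp


def sigmoid(log_odds):
    """Convert a log-odds value to a probability."""
    return 1.0 / (1.0 + exp(-log_odds))


expected_output = -3.051  # E[f(X)] reported for the waterfall plot
patient_output = -6.327   # f(x) reported for this low-risk patient

# By additivity, the SHAP values of all features sum to f(x) - E[f(X)].
total_shap = patient_output - expected_output

baseline_probability = sigmoid(expected_output)  # roughly 0.045
patient_probability = sigmoid(patient_output)    # roughly 0.002
```

Under this assumption, the features of this patient jointly shift the output by about -3.28 log-odds, lowering the predicted probability from roughly 4.5% at baseline to roughly 0.2%.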
Local model explanation by the SHAP method. (A) Waterfall plot of risks contributed by each feature for an individual patient at low risk; (B) Waterfall plot of risks contributed by each feature for an individual patient at high risk; (C) Force plot of risks contributed by each feature for an individual patient at low risk; (D) Force plot of risks contributed by each feature for an individual patient at high risk; (E) Evolution of risks contributed by each feature for an individual patient at low risk; (F) Evolution of risks contributed by each feature for an individual patient at high risk
Model evaluation. (A) ROC of train cohort; (B) ROC of internal validation cohort; (C) ROC of external validation cohort; (D) calibration curve of train cohort; (E) DCA curve of train cohort; (F) calibration curve of internal validation cohort; (G) DCA curve of internal validation cohort; (H) calibration curve of external validation cohort; (I) DCA curve of external validation cohort. ROC: receiver operating characteristic curve; AUC: area under curve; DCA: decision curve analysis
Similarly, Fig. 7B, D, and F show parturients who experienced PPH within 24 h postpartum. Figure 7B highlights the features that push or pull the decision towards the PPH category and their actual measured values, indicating that the decision for this case inclines towards PPH, with a probability of 32.3%.
Validation of the final model
To verify the robustness of the model and ensure an adequate sample size, we applied internal and external validation datasets. The AUC for the internal validation dataset was 0.894 (95% CI: 0.875–0.912) and for the external validation dataset was 0.880 (95% CI: 0.855–0.905), as shown in Fig. 8. Although these AUC values are slightly lower than those observed in the training set, they still indicate strong predictive performance in both internal and external validations. The calibration curves and DCA also showed improvement in the internal and external validation datasets, addressing the imbalance in positive data seen in the test set.
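Decision curve analysis summarizes clinical usefulness as net benefit across risk thresholds. The sketch below uses the standard net-benefit formula, with true positives credited and false positives penalized by the odds of the threshold. The study performed DCA in R, so this Python version is only an illustration, and the example numbers in the usage note are hypothetical.

```python
def net_benefit(y_true, probs, threshold):
    """Net benefit of treating everyone whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, probs) if p >= threshold and t == 0)
    # False positives are weighted by the odds corresponding to the threshold.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero. For instance, with ten hypothetical patients of whom three have PPH, a model that flags all three cases plus three non-cases at a threshold of 0.2 has a net benefit of 0.225, compared with 0.125 for treating everyone.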
Convenient application for clinical utility
The final predictive model has been implemented into a web application to facilitate practical use in clinical scenarios. By entering the actual values of the 15 features required by the model, the application predicts the risk of PPH for individual parturients. It also displays a force plot for each parturient, indicating the features contributing to the decision on PPH: features on the right side in blue indicate factors pushing the prediction towards “non-PPH,” while features on the left side in red push the prediction towards “PPH.” This web application is accessible online at https://postpartum-hemorrhage-prediction-model6.streamlit.app/.
Discussion
The era of big data has revolutionized clinical healthcare, and the application of ML in disease prediction and prognosis is increasingly prevalent. This study leverages SHAP values to assist in identifying risk factors and constructs an ML predictive model for PPH to predict the risk in women undergoing vaginal delivery. By utilizing big data to develop a diagnostic system for PPH, we can significantly enhance the accuracy of PPH diagnosis. While artificial intelligence, including ML, is making strides in obstetric disease diagnosis and treatment globally, the application of interpretable ML in clinical practice is still in its infancy. This study represents an important step towards improving the standardization of obstetric medical care and reducing maternal mortality, particularly by providing a tool for early identification of high-risk patients.
Currently, three widely used postpartum hemorrhage risk assessment tools globally are the California Maternal Quality Care Collaborative (CMQCC) [18] toolkit, the Association of Women’s Health, Obstetric and Neonatal Nurses (AWHONN) [19] guidelines, and the New York State Department of Health (NYSDOH) [3] guidelines. These tools have summarized and classified risk factors for PPH into low, medium, and high categories based on expert consensus. However, a comparative study of these three risk assessment scales found that they only have moderate reliability in predicting severe PPH in high-risk cesarean section groups [20]. In these tools, the incidence of PPH is significantly higher in the high-risk group only when a pregnant woman is classified as such [20]. Additionally, in Dilla et al.’s study, the sensitivity of the CMQCC toolkit in predicting PPH requiring transfusion was only 22%, and the probability of severe PPH in the low-risk group was still 0.4–0.6% [21]. Therefore, adding more assessment indicators and improving modeling methods may enhance the accuracy of PPH prediction.
In 2020, Venkatesh et al. published a study utilizing ML to predict PPH [22]. This study included 152,279 childbirth cases, of which 7,279 (4.8%) experienced PPH exceeding 1000 milliliters. They included 55 risk factors and used random forest and extreme gradient boosting algorithms to develop ML models. The extreme gradient boosting algorithm achieved the best performance (AUC: 0.93; 95% CI: 0.92–0.93), followed by the random forest, demonstrating the high predictive performance of ML. However, this study primarily focused on PPH cases associated with cesarean sections, with 28% of the patients undergoing cesarean delivery, and 91% of PPH cases occurring in cesarean section patients.
Akazawa’s study [23], conducted from 1995 to 2020 at the Tokyo Women’s Medical University East Center and involving 9,894 women who underwent vaginal delivery, applied eleven clinical variables to create an ML model predicting PPH, defined as blood loss > 1000 mL. The study utilized an ensemble learning approach with five ML classifiers (logistic regression, support vector machine, random forest, boosting tree, and decision tree) and a deep learning model consisting of two-layer neural networks. The deep learning model demonstrated the best performance, achieving an AUC of 0.708 for PPH prediction, with an accuracy of 0.686, false positive rate (FPR) of 0.312, and false negative rate (FNR) of 0.398.
However, previous models lacked interpretability, as they did not explain the results of the ML algorithm predictions due to their black box nature. SHAP, an ML model interpretation method based on Shapley values from game theory, addresses this issue by assigning the contribution of model predictions to each feature, thus explaining the decision-making process of the model [24]. SHAP values quantify the impact of each feature on the model’s prediction results, aiding in understanding why the model gives specific predictions. In this study, SHAP was employed in the XGBoost model for its superior predictive performance and interpretability. Personalized explanations constructed through SHAP force analysis help doctors understand why the model makes specific high-risk recommendations, enhancing understanding of the decision-making process.
To further validate the contribution of risk factors to the model, SHAP feature importance and feature effects were calculated. Then 15 key variables that significantly predict PPH were identified. The most important input parameter for PPH was newborn weight, followed by stages of labor, nature of work, premature rupture of membranes, among others. The clinical significance of these variables is consistent with existing literature, emphasizing the importance of these common clinical characteristics in predicting PPH. Previous studies have demonstrated that larger neonatal birth weight is an independent risk factor for PPH [25]. Heavier infants are generally associated with uterine distension, prolonged labor, and difficult placental separation, all of which increase the risk of bleeding [26]. Prolonged labor, which has also been confirmed as a significant predictor of PPH, may indicate insufficient uterine contractions, abnormal fetal position, or difficulty in placental separation, thus increasing the likelihood of postpartum hemorrhage [27,28,29]. Interestingly, occupational factors were also identified as important predictors of PPH. Individuals engaged in moderate to heavy physical labor occupations had a lower probability of developing PPH, which may reflect the potential influence of lifestyle and socioeconomic factors. Additionally, older maternal age and obesity were found to be associated with a higher risk of PPH. Pregnant women delivering at > 40 weeks of gestation require enhanced surveillance, aligning with findings from previous research, especially for those delivering between 41 and 42 weeks [30]. Furthermore, we observed that certain pregnancy complications significantly increased the risk of PPH, highlighting the importance of timely preventive and management measures in clinical practice, such as infection prevention, blood pressure control, and anemia correction, to reduce the risk of PPH. 
Thus, SHAP analysis in this study not only helped us gain a deeper understanding of the predictive mechanisms of the ML model but also provided a bridge for applying the model in clinical practice, significantly enhancing its clinical feasibility.
However, this study has several limitations. Firstly, the data were derived solely from the Shenyang area, potentially introducing significant selection bias and limiting generalizability. Secondly, while the predictive model incorporated multiple risk factors and demonstrated high overall efficacy, the restrictive selection criteria may have influenced PPH prediction, hindering objective clinical utility assessment. Thirdly, while ML techniques require ‘big data’ for predictive model construction, there are no established standards for calculating the sample size needed. Therefore, caution should be exercised in interpreting the conclusions, and further evidence is warranted to confirm these findings in diverse populations.
Conclusion
In conclusion, our study has successfully developed an interpretable ML model capable of predicting postpartum hemorrhage (PPH) in patients undergoing vaginal delivery using readily available clinical data extracted from the hospital information system (HIS). This model represents a promising tool for early risk assessment and intervention in clinical practice. The final XGBoost model exhibited outstanding predictive performance for PPH, as validated internally and externally. Moving forward, prospective studies are essential to assess whether implementing individualized and timely treatment measures guided by our predictive model can lead to improved maternity outcomes, particularly in reducing PPH-related morbidity and mortality. This represents a crucial step towards personalized healthcare in obstetrics, potentially enhancing patient care and reducing maternal morbidity and mortality rates.
Data availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.
References
Patek K, Friedman P. Postpartum Hemorrhage-Epidemiology, risk factors, and causes. Clin Obstet Gynecol. 2023;66(2):344–56.
Federspiel JJ, Eke AC, Eppes CS. Postpartum hemorrhage protocols and benchmarks: improving care through standardization. Am J Obstet Gynecol MFM. 2023;5(2s):100740.
Zheutlin AB, Vieira L, Shewcraft RA, Li S, Wang Z, Schadt E, Gross S, Dolan SM, Stone J, Schadt E, et al. Improving postpartum hemorrhage risk prediction using longitudinal electronic medical records. J Am Med Inf Assoc. 2022;29(2):296–305.
Faysal H, Araji T, Ahmadzia HK. Recognizing who is at risk for postpartum hemorrhage: targeting anemic women and scoring systems for clinical use. Am J Obstet Gynecol MFM. 2023;5(2s):100745.
Gonzalez-Brown V, Schneider P. Prevention of postpartum hemorrhage. Semin Fetal Neonatal Med. 2020;25(5):101129.
Bi Q, Goodman KE, Kaminsky J, Lessler J. What is machine learning? A primer for the epidemiologist. Am J Epidemiol. 2019;188(12):2222–39.
Mitchell TM. Machine learning and data mining. Commun ACM. 1999;42(11).
Li R, Chen Y, Ritchie MD, Moore JH. Electronic health records and polygenic risk scores for predicting disease risk. Nat Rev Genet. 2020;21(8):493–502.
Heo J, Yoon JG, Park H, Kim YD, Nam HS, Heo JH. Machine Learning-Based model for prediction of outcomes in acute stroke. Stroke. 2019;50(5):1263–5.
McCoubrey LE, Gaisford S, Orlu M, Basit AW. Predicting drug-microbiome interactions with machine learning. Biotechnol Adv. 2022;54:107797.
Leong KTG, Wong LY, Aung KCY, Macdonald MR, Cao Y, Lee SSG, Chow WL, Doddamani S, Richards AM. Risk stratification model for 30-Day heart failure readmission in a multiethnic South East Asian community. Am J Cardiol. 2017;119(9):1428–32.
Lindberg DS, Prosperi M, Bjarnadottir RI, Thomas J, Crane M, Chen Z, Shear K, Solberg LM, Snigurska UA, Wu Y, et al. Identification of important factors in an inpatient fall risk prediction model to improve the quality of care using EHR and electronic administrative data: A machine-learning approach. Int J Med Inf. 2020;143:104272.
Wang H, Fu T, Du Y, Gao W, Huang K, Liu Z, Chandak P, Liu S, Van Katwyk P, Deac A, et al. Scientific discovery in the age of artificial intelligence. Nature. 2023;620(7972):47–60.
Lewanowicz A, Wiśniewski M, Oronowicz-Jaśkowiak W. The use of machine learning to support the therapeutic process - strengths and weaknesses. Postep Psychiatr Neurol. 2022;31(4):167–73.
Handelman GS, Kok HK, Chandra RV, Razavi AH, Huang S, Brooks M, Lee MJ, Asadi H. Peering into the black box of artificial intelligence: evaluation metrics of machine learning methods. AJR Am J Roentgenol. 2019;212(1):38–43.
Lundberg SM, Lee S-I. A Unified Approach to Interpreting Model Predictions. In: Neural Information Processing Systems: 2017; 2017.
Ali S, Akhlaq F, Imran AS, Kastrati Z, Daudpota SM, Moosa M. The enlightening role of explainable artificial intelligence in medical & healthcare domains: A systematic literature review. Comput Biol Med. 2023;166:107555.
Main EK, Goffman D, Scavone BM, Low LK, Bingham D, Fontaine PL, Gorlin JB, Lagrew DC, Levy BS. National partnership for maternal safety: consensus bundle on obstetric hemorrhage. Anesth Analg. 2015;121(1):142–8.
Bingham D, Scheich B, Bateman BT. Structure, process, and outcome data of AWHONN’s postpartum hemorrhage quality improvement project. J Obstetric Gynecologic Neonatal Nurs. 2018;47(5):707–18.
Kawakita T, Mokhtari N, Huang JC, Landy HJ. Evaluation of Risk-Assessment tools for severe postpartum hemorrhage in women undergoing Cesarean delivery. Obstet Gynecol. 2019;134(6):1308–16.
Dilla AJ, Waters JH, Yazer MH. Clinical validation of risk stratification criteria for peripartum hemorrhage. Obstet Gynecol. 2013;122(1):120–6.
Venkatesh KK, Strauss RA, Grotegut CA, Heine RP, Chescheir NC, Stringer JSA, Stamilio DM, Menard KM, Jelovsek JE. Machine learning and statistical models to predict postpartum hemorrhage. Obstet Gynecol. 2020;135(4):935–44.
Akazawa M, Hashimoto K, Katsuhiko N, Kaname Y. Machine learning approach for the prediction of postpartum hemorrhage in vaginal birth. Sci Rep. 2021;11(1):22620.
Lundberg SM, Nair B, Vavilala MS, Horibe M, Eisses MJ, Adams T, Liston DE, Low DK-W, Newman S-F, Kim J, et al. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nat Biomed Eng. 2018;2(10):749–60.
Beta J, Khan N, Khalil A, Fiolna M, Ramadan G, Akolekar R. Maternal and neonatal complications of fetal macrosomia: systematic review and meta-analysis. Ultrasound Obstet Gynecol. 2019;54(3):308–18.
Biguzzi E, Franchi F, Ambrogi F, Ibrahim B, Bucciarelli P, Acaia B, Radaelli T, Biganzoli E, Mannucci PM. Risk factors for postpartum hemorrhage in a cohort of 6011 Italian women. Thromb Res. 2012;129(4):E1–7.
Nyflot LT, Stray-Pedersen B, Forsen L, Vangen S. Duration of labor and the risk of severe postpartum hemorrhage: A case-control study. PLoS ONE. 2017;12(4):e0175306.
Ladfors LV, Liu X, Sandstrom A, Lundborg L, Butwick AJ, Muraca GM, Snowden JM, Ahlberg M, Stephansson O. Risk of postpartum hemorrhage with increasing first stage labor duration. Sci Rep. 2024;14(1):22152.
Frolova AI, Stout MJ, Tuuli MG, Lopez JD, Macones GA, Cahill AG. Duration of the third stage of labor and risk of postpartum hemorrhage. Obstet Gynecol. 2016;127(5):951–6.
Reale SC, Bateman BT, Farber MK. Exploring new risk factors for postpartum hemorrhage: time to consider gestational age? Anesthesiology. 2021;134(6):832–4.
Acknowledgements
We would like to express our gratitude to all those who helped us during the writing of this manuscript. Thanks to all the peer reviewers for their opinions and suggestions.
Funding
This study was supported in part by grants from the 345 Talent Project of Shengjing Hospital of China Medical University (No. M0946), the Medical Education Research Project of Liaoning Province (No. 2024-N004-03), and the Liaoning Province Science and Technology Plan Joint Program (No. 2024-MSLH-561).
Author information
Contributions
ZS and DZ designed the study and drafted the manuscript. HL, MS, and XW performed the data collection. YZ and XC designed the statistical analysis plan. ZS participated in the training and reviewed and co-authored the manuscript with DZ.
Ethics declarations
Ethics approval and consent to participate
The study was approved by the Ethics Committee of Shengjing Hospital of China Medical University (No. 2016PS344K; date: 17/12/2016). All participants provided informed consent.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Song, Z., Lin, H., Shao, M. et al. Integrating SHAP analysis with machine learning to predict postpartum hemorrhage in vaginal births. BMC Pregnancy Childbirth 25, 529 (2025). https://doi.org/10.1186/s12884-025-07633-w
DOI: https://doi.org/10.1186/s12884-025-07633-w