Exploring the potential of cell-free RNA and Pyramid Scene Parsing Network for early preeclampsia screening

Zhao, Zhuo; Liu, Xiaoxu; Guan, Yonghui; Li, Chunfang; Wang, Zheng

doi:10.1186/s12884-025-07503-5

Research
Open access
Published: 14 April 2025

Zhuo Zhao^1,2,3,
Xiaoxu Liu⁴,
Yonghui Guan⁵,
Chunfang Li⁶ &
…
Zheng Wang^1,2

BMC Pregnancy and Childbirth volume 25, Article number: 445 (2025)


Abstract

Background

Circulating cell-free RNA (cfRNA) is gaining recognition as an effective biomarker for the early detection of preeclampsia (PE). However, the current methods for selecting disease-specific biomarkers are often inefficient and typically one-dimensional.

Purpose

This study introduces a Pyramid Scene Parsing Network (PSPNet) model to predict PE, aiming to improve early risk assessment using cfRNA profiles.

Methods

The theoretical maximum Preeclamptic Risk Index (PRI) of patients clinically diagnosed with PE is defined as “1”, and the control group (NP) is defined as “0”, referred to as the clinical PRI. A data preprocessing algorithm was used to screen relevant cfRNA indicators for PE. The cfRNA expression profiles were obtained from the Gene Expression Omnibus (GSE192902), consisting of 180 normal pregnancies (NP) and 69 preeclamptic (PE) samples, collected at two gestational time points: ≤ 12 weeks and 13–20 weeks. Based on the differences in cfRNA expression profiles, the Calculated Ground Truth values of the NP and PE groups in the sequencing data were acquired (Calculated PRI). The differential algorithm was embedded in the PSPNet neural network and the network was then trained using the generated dataset. Subsequently, the real-world sequencing dataset was used to validate and optimize the network, ultimately outputting the PRI values of the healthy control group and the PE group (PSPNet-based PRI). The model’s predictive ability for PE was evaluated by comparing the fit between Calculated PRI (Calculated Ground Truth) and PSPNet-based PRI.

Results

The mean absolute error (MAE) between the Calculated Ground Truth and the PSPNet-based PRI was 0.0178 for cfRNA data sampled at ≤ 12 gestational weeks (gws) and 0.0195 for data sampled at 13–20 gws. For cfRNA data sequenced at ≤ 12 gws (13–20 gws), the loss value was 0.0178 (0.0195), the maximum absolute error 0.098 (0.13), the peak-to-valley error 0.114 (0.195), and the mean absolute error 0.032 (0.055); the average prediction time was 10^–4 s per sample.

Conclusions

The present PSPNet model is reliable and fast for cfRNA-based PE prediction and its PRI output allows for continuous PE risk monitoring, introducing an innovative and effective method for early PE prediction. This model enables timely interventions and better management of pregnancy complications, particularly benefiting densely populated developing countries with high PE incidence and limited access to routine prenatal care.


Background

Preeclampsia (PE) is a critical pregnancy complication marked by the emergence of hypertension after 20 weeks of gestation, often leading to multi-organ dysfunction in the mother. This condition is a significant global health concern, accounting for around 70,000 maternal deaths and 500,000 fetal and neonatal deaths each year [1,2,3,4]. Despite extensive research efforts utilizing maternal risk factors, mean arterial pressure, uterine artery pulsatility index, and various biochemical markers like pregnancy-associated plasma protein A, soluble vascular endothelial growth factor receptor 1, soluble endoglin, placental growth factor (PlGF), and soluble fms-like tyrosine kinase 1 (sFlt-1) [5,6,7,8,9,10], PE is frequently diagnosed late or missed, underscoring the necessity for more precise and early biomarkers and predictive tools.

Recently, circulating cell-free RNA (cfRNA) has emerged as a promising area of study. CfRNA consists of a diverse mixture of transcripts, including microRNA, long non-coding RNA, circular RNA, transfer RNA, and messenger RNA, derived from various cell types. Its association with numerous health conditions and presence in multiple body fluids have made cfRNA a valuable target for clinical applications such as bone marrow transplantation, neurodegeneration, cardiovascular diseases, oncology, and obstetrics [11,12,13,14,15,16,17,18,19]. A pivotal study by Quake et al. highlighted that a set of 18 cfRNA markers [20], identifiable between 5 to 16 weeks of gestation, could form the basis of a liquid biopsy test to predict potential PE cases well before symptoms appear. This correlation between cfRNA levels and organ health in PE suggests cfRNA’s potential as a vital biomarker.

Traditionally, cfRNA studies have focused on the overexpression and mutations of known genes, with polymerase chain reaction (PCR) being the primary technique used. However, the biomarker selection process for specific diseases remains largely inefficient and predominantly one-dimensional. Developing a comprehensive cfRNA data analysis approach could significantly enhance the use of extensive sequencing data, leading to more accurate early PE screening.

Artificial intelligence, particularly deep learning, offers promising advancements in medical predictions and diagnostics. These models, trained to learn from data, have been successfully applied in various medical fields such as interpreting chest radiographs, identifying hypertension, and classifying breast cancer [21,22,23,24,25,26]. Schmidt et al. [27] recently demonstrated that integrating extensive medical history, current condition, and laboratory data into machine learning algorithms, such as gradient-boosted trees and random forests, can effectively predict adverse PE outcomes. As more data are incorporated and algorithms are refined, the accuracy of these predictive models is expected to improve, making them valuable tools for PE prediction.

In our research, we have developed a deep learning algorithm to calculate a Preeclamptic Risk Index (PRI) for pregnant women using cfRNA profiling. We implemented a Pyramid Scene Parsing Network (PSPNet) [28, 29], which achieved a remarkable alignment with the ground truth, exhibiting an average prediction error of 0.043 across 249 samples and a computational time of 10^–4 s per sample. This innovative method facilitates rapid and precise PE risk assessment, offering significant potential to transform prenatal care by enabling timely intervention and personalized monitoring for at-risk pregnancies (Fig. 1).

Methods

Ethical statement

This study was exempt from ethics approval by the Ethical Committee of Xi’an Jiaotong University as it involved the analysis of publicly available data from the Gene Expression Omnibus (GSE192902) database and did not involve direct interaction with human subjects or animal models.

Study design and prediction mechanism

To predict PE risk, we approached it as a data regression problem, creating a mapping between maternal plasma cfRNA profiles and probability vectors. Initially, we preprocessed the cfRNA sequencing data to filter out markers with significant differences between normotensive (NP) and preeclamptic (PE) groups. Based on the guidelines from the American College of Obstetricians and Gynecologists (ACOG), PE is defined as new-onset hypertension (systolic ≥ 140 mmHg or diastolic ≥ 90 mmHg) occurring after 20 weeks of gestation, accompanied by proteinuria or signs of end-organ dysfunction. Women with normotensive, uncomplicated pregnancies served as the normal control group. These filtered markers were used to construct training and validation datasets for our neural network model. The trained network predicts PE risk by generating a Preeclamptic Risk Index (PRI) based on variations in cfRNA profiles during early pregnancy (Fig. 2A).

Real-World cfRNA profiling data

We obtained standardized and cleaned cfRNA sequencing data from the Gene Expression Omnibus (GSE192902) dataset. PE diagnosis was based on guidelines from the American College of Obstetricians and Gynecologists (ACOG), and women with uncomplicated pregnancies served as the normal control group. Exclusion criteria ensured no participants had chronic hypertension or gestational diabetes. Additionally, the NP and PE groups were matched for race and ethnicity. Detailed demographic data are shown in Supplementary Table 1; differences were analyzed using a chi-squared test for categorical variables and ANOVA for continuous variables. A total of 87 cfRNA profile sets were used as the training dataset, and 249 sets were used as the validation dataset.

Data filtration

To establish a robust relationship between multidimensional cfRNA expression profiling and PE risk, we filtered the data to select cfRNAs with significant changes that could serve as PE risk indicators. Selection was based on descriptive statistics of the initial sequencing dataset and followed four rules (Fig. 2B). (1) Indicators with zero expression abundance in both the PE and NP groups were excluded, because they cannot discriminate between PE and NP. (2) Indicators whose abundance ranges overlapped completely (100%) between PE and NP were excluded, because the same value could support either PE or non-PE. (3) Indicators whose abundance ranges overlapped at a rate below 0.6 between PE and NP were considered suitable for PE probability evaluation. (4) The mean abundance had to differ significantly between the PE and NP groups; here, the means were required to differ by a factor of 1.5 to 2.0. Indicators satisfying conditions (1) to (4) were used for PRI evaluation, and each indicator contributed equal weight to the final PRI. Preprocessing is therefore a trade-off between the number of dimensions entering the PRI evaluation and the strength of the differences among the retained indicators. Selecting more cfRNA indicators takes more biological information into account, but including indicators with smaller differences lowers the PRI discrimination between PE and NP. The parameters and thresholds of the preprocessing stage were therefore optimized against the final PRI result.

Only 29 cfRNA indicators were selected from a total of 7163 sequencing entries for PRI evaluation at 0–12 weeks of pregnancy, and 25 cfRNA indicators for evaluation at 13–20 weeks.
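The four filtering rules can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' preprocessing code: the function names `range_overlap_rate` and `select_indicators` are hypothetical, the overlap rate is assumed to be the shared part of the two [min, max] ranges divided by their combined span, rule (1) is read as "all values are zero in both groups", and the fold threshold of 1.5 is the lower end of the reported 1.5 to 2.0 interval.

```python
import numpy as np


def range_overlap_rate(pe, nrm):
    """Shared part of the two [min, max] ranges divided by their combined span.

    Assumed definition: the article does not state how the overlap rate is computed.
    """
    shared = min(pe.max(), nrm.max()) - max(pe.min(), nrm.min())
    span = max(pe.max(), nrm.max()) - min(pe.min(), nrm.min())
    if span == 0:
        return 1.0  # both groups collapse to a single value: full overlap
    return max(0.0, shared) / span


def select_indicators(pe, nrm, max_overlap=0.6, min_fold=1.5):
    """Return row indices of cfRNAs that pass rules (1) to (4).

    pe  : array [n_cfRNA, n_PE_samples] of expression abundances
    nrm : array [n_cfRNA, n_NP_samples] of expression abundances
    """
    kept = []
    for i in range(pe.shape[0]):
        p, c = pe[i], nrm[i]
        # Rule 1: zero abundance in both groups carries no information.
        if not p.any() and not c.any():
            continue
        # Rules 2 and 3: reject full overlap and any overlap rate >= threshold.
        if range_overlap_rate(p, c) >= max_overlap:
            continue
        # Rule 4: group means must differ by at least `min_fold`.
        hi_mean = max(p.mean(), c.mean())
        lo_mean = min(p.mean(), c.mean())
        if lo_mean > 0 and hi_mean / lo_mean < min_fold:
            continue
        kept.append(i)
    return kept


pe_demo = np.array([[0, 0, 0], [1, 2, 3], [10, 12, 14], [5, 6, 7]], dtype=float)
np_demo = np.array([[0, 0, 0], [1, 2, 3], [1, 2, 3], [5, 6, 8]], dtype=float)
print(select_indicators(pe_demo, np_demo))
```

In the toy data only the third row survives: the first is all zeros, the second overlaps completely, and the fourth overlaps at a rate of about 0.67.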

Dataset generation and definition of Preeclamptic Risk Index (PRI)

To train our neural network model, we needed a substantial dataset. This was achieved by analyzing real-world cfRNA profiling data and generating a synthetic dataset [30,31,32]. The dataset construction process is detailed as follows:

Before filtering, cfRNAs in NP and PE groups were represented as CN [cfRNA, normal pregnancy] ([r, n]) and CP [cfRNA, preeclamptic pregnancy] ([r, p]), respectively. Initially, we had 7,160 cfRNAs, where “r” is the identification number for each cfRNA, and “n” and “p” are identifiers for participants in NP and PE groups.

After filtering, cfRNAs in NP and PE groups were represented as SN [selected cfRNA, normal pregnancy] ([s, n]) and SP [selected cfRNA, preeclamptic pregnancy] ([s, p]), respectively. Here, “s” denotes the number of cfRNAs post-filtration, determined by the parameters of the preprocessing algorithm.

Using these filtered cfRNAs, we built a training dataset for the proposed PSPNet model. The training dataset consisted of the parameters x_train and y_train, and the validation dataset comprised x_test and y_test. In this context, x_train and x_test are the cfRNA expression matrices, and y_train and y_test are the corresponding probability vectors contributing to PE risk. The mean value of the PE probability vector was denoted as the PRI. Our dataset included two components: 1) actual cfRNA expression sequencing data from maternal peripheral blood and 2) values generated randomly using a Gaussian function. The expression quantities in x_train and x_test adhered to practical sequencing range distributions. The dataset generation using the Gaussian function is outlined in Eq. (1):

$$\begin{gathered} \left\{ {\begin{array}{*{20}c} {x\_train[i] = rands\left\{ {Max\left[ {Max\left( {SP[i]} \right),Max\left( {SN[i]} \right)} \right],Min\left[ {Min\left( {SP[i]} \right),Min\left( {SN[i]} \right)} \right],M} \right\}} \\ {x\_test[i] = rands\left\{ {Max\left[ {Max\left( {SP[i]} \right),Max\left( {SN[i]} \right)} \right],Min\left[ {Min\left( {SP[i]} \right),Min\left( {SN[i]} \right)} \right],Q} \right\}} \\ \end{array} } \right. \hfill \\ i \in [1,2,3, \cdots ,N] \hfill \\ \end{gathered}$$

(1)

In this equation, rands() denotes the Gaussian random function, Max() and Min() are the maximum and minimum value functions, and M and Q are the numbers of samples to be generated. We calculated the contribution of each cfRNA to the occurrence of PE or NP based on the expression levels and clinical diagnosis. For instance, if the expression of ENSG00000000460 was significantly higher in the PE group compared to the NP group, it was assigned a contribution value of “1” to PE.
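A possible reading of Eq. (1) is sketched below. The article only states that rands() is a Gaussian random function bounded by the pooled maximum and minimum of each cfRNA, so the parameterization here is an assumption: the mean is placed at the midpoint of the range, the standard deviation is one sixth of the range, and values are clipped to the bounds. The name `generate_samples` is hypothetical.

```python
import numpy as np


def generate_samples(sp, sn, n_samples, rng):
    """Draw synthetic expression values for each selected cfRNA.

    sp : array [N, n_PE] of selected cfRNAs in the PE group (SP)
    sn : array [N, n_NP] of selected cfRNAs in the NP group (SN)
    Returns an array [n_samples, N].
    """
    lower = np.minimum(sp.min(axis=1), sn.min(axis=1))  # Min[Min(SP[i]), Min(SN[i])]
    upper = np.maximum(sp.max(axis=1), sn.max(axis=1))  # Max[Max(SP[i]), Max(SN[i])]
    centre = (lower + upper) / 2.0   # assumption: Gaussian centred on the range
    spread = (upper - lower) / 6.0   # assumption: +/- 3 sd spans the range
    x = rng.normal(centre, spread, size=(n_samples, lower.size))
    return np.clip(x, lower, upper)  # keep values inside the observed range


rng = np.random.default_rng(0)
sp_demo = np.array([[1.0, 4.0], [10.0, 20.0]])
sn_demo = np.array([[2.0, 3.0], [5.0, 15.0]])
x_train_demo = generate_samples(sp_demo, sn_demo, 8000, rng)  # M = 8000
print(x_train_demo.shape)
```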

The cfRNA contribution vector sets (y_train and y_test) were computed from x_train and x_test and clinical diagnosis data. This process is described by Eqs. (2) and (3):

$$\begin{gathered} \left\{ {\begin{array}{*{20}c} {y\_train[i] = \frac{x\_train[i] - Min(x\_train[i])}{{Max(x\_train[i]) - Min(x\_train[i])}} \, , \, if \, avg(SP[i])> avg(SN[i])} \\ {y\_train[i] = \frac{Max(x\_train[i]) - x\_train[i]}{{Max(x\_train[i]) - Min(x\_train[i])}} \, , \, if \, avg(SP[i]) < avg(SN[i])} \\ \end{array} } \right. \hfill \\ i \in [1,2,3, \cdots N] \hfill \\ \end{gathered}$$

(2)

$$\begin{gathered} \left\{ {\begin{array}{*{20}c} {y\_test[i] = \frac{x\_test[i] - Min(x\_test[i])}{{Max(x\_test[i]) - Min(x\_test[i])}} \, , \, if \, avg(SP[i])> avg(SN[i])} \\ {y\_test[i] = \frac{Max(x\_test[i]) - x\_test[i]}{{Max(x\_test[i]) - Min(x\_test[i])}} \, , \, if \, avg(SP[i]) < avg(SN[i])} \\ \end{array} } \right. \hfill \\ i \in [1,2,3, \cdots N] \hfill \\ \end{gathered}$$

(3)

where avg() is the average value function, and N = s.

Ultimately, we obtained x_train and y_train with dimensions [M, N] and x_test and y_test with dimensions [Q, N]. These values were reshaped to [1, N, M] and [1, N, Q], respectively, through dimension transformation. Here, N = s, M = 8000, and Q = 500, indicating that there were 8000 training sample vectors in 1 × N and 500 validation sample vectors in 1 × N. The PRI was computed as the average of y_test. Table 1 provides details of the dataset.
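Eqs. (2) and (3), the reshape, and the PRI average can be written compactly as follows. This is a sketch under stated assumptions: the function names are hypothetical, `up_in_pe` is a precomputed boolean vector encoding avg(SP[i]) > avg(SN[i]), and a constant column (Max equal to Min, where the equations are undefined) is mapped to 0.

```python
import numpy as np


def contribution_vectors(x, up_in_pe):
    """Per-cfRNA min-max scaling, oriented so that 1 means 'PE-like'.

    x        : array [samples, N] of expression values
    up_in_pe : boolean array [N], True where avg(SP[i]) > avg(SN[i])
    """
    lo = x.min(axis=0)
    hi = x.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against division by zero
    rising = (x - lo) / span    # first case of Eqs. (2) and (3)
    falling = (hi - x) / span   # second case of Eqs. (2) and (3)
    return np.where(up_in_pe, rising, falling)


def preeclamptic_risk_index(y):
    """PRI of each sample: mean of its contribution vector (equal weights)."""
    return y.mean(axis=1)


x_demo = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
y_demo = contribution_vectors(x_demo, np.array([True, False]))
print(preeclamptic_risk_index(y_demo))   # one PRI per sample
print(y_demo.T[np.newaxis, ...].shape)   # [1, N, samples] layout fed to the network
```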

Table 1 Dataset for PSPNet training and validation


Pyramid Scene Parsing Network (PSPNet) construction

Model architecture

The PSPNet was designed to predict PRI using cfRNA expressions from maternal peripheral blood samples collected within 12 weeks and between 13–20 weeks of gestation. The model comprises three primary modules: the Convolutional Neural Network (CNN) module, the Pyramid Pooling module, and the Output module (Fig. 3A, B).

CNN module

The CNN module serves as the input layer, extracting feature maps from the cfRNA expression dataset and capturing low-level semantic information essential for subsequent layers. Configuration: convolutional kernel size 3 × 3, 128 channels, stride 5, ReLU activation.

Pyramid pooling module

This module employs a pyramid structure to capture multi-scale global contextual features from each sub-region. Intermediate feature maps are further processed through max-pooling layers to produce refined feature maps at different scales. The convolutional layers extract semantic information from low to high levels. Configuration: convolutional kernel size 3 × 3, 128 channels, stride 5, ReLU activation.
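The idea behind pyramid pooling can be illustrated on a single one-dimensional feature vector. The sketch below is not the network described here: it omits the convolutions and the 128 channels, the function name `pyramid_pool_1d` is hypothetical, and the bin counts (1, 2, 3, 6) are an assumption borrowed from the original PSPNet design rather than a value reported in this article. It uses max pooling, as stated above.

```python
import numpy as np


def pyramid_pool_1d(features, bins=(1, 2, 3, 6)):
    """Stack a feature vector with max-pooled copies at several scales.

    For each bin count b the vector is cut into b contiguous segments, each
    segment is replaced by its maximum, and the result is expanded back to
    the original length. Requires len(features) >= max(bins).
    """
    length = features.shape[0]
    levels = [features]
    for b in bins:
        edges = np.linspace(0, length, b + 1).astype(int)
        pooled = np.array([features[s:e].max() for s, e in zip(edges[:-1], edges[1:])])
        # Repeat each segment maximum over the positions it summarizes.
        levels.append(np.repeat(pooled, np.diff(edges)))
    return np.stack(levels)  # shape [1 + len(bins), length]


demo = np.array([1.0, 5.0, 2.0, 8.0, 3.0, 4.0])
print(pyramid_pool_1d(demo, bins=(1, 2, 3)))
```

Coarse levels (one bin) carry global context, finer levels carry local context, and stacking them with the original features is what lets later layers combine both.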

Output module

The Output module concatenates local pointwise features with learned multi-scale contextual features, resulting in more accurate predictions than using a baseline model alone. The final PE probability vector, generated by a dense layer, reveals the contributions of significant cfRNA expressions to PE risk. Configuration: convolutional kernel size 3 × 3, 128 channels, stride 5, ReLU activation. The dense layer kernel size is 1 × 1, with 128 channels.

Preprocessing and data filtering

Before inputting the data into PSPNet, the cfRNA sequencing data undergo preprocessing to filter out cfRNAs with significant features. This step ensures that only the most relevant cfRNAs are used as input, with the predicted PE probability (PRI) as the output.

Overfitting prevention

To mitigate overfitting during model training, Batch Normalization layers are inserted before each convolutional layer to normalize input features. Additionally, a dropout operator with a parameter set to 0.25 is added after the convolutional layers.

Training configuration

The training of PSPNet is configured with the following hyperparameters (Fig. 3C). Optimizer: Adam; Learning Rate: 0.0005; Loss Function: Mean Absolute Error (MAE); Metrics: Accuracy; Batch Size: 16; Epochs: 500; Validation Split: 0.05.
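To make the scale of this configuration concrete, the hyperparameters can be collected in a small structure and the number of optimizer updates derived from them. The sketch assumes that the 0.05 validation split is taken from the M = 8000 generated training samples; the class and function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    optimizer: str = "adam"
    learning_rate: float = 0.0005
    loss: str = "mae"
    batch_size: int = 16
    epochs: int = 500
    validation_split: float = 0.05
    n_samples: int = 8000  # assumption: M generated samples before the split


def training_budget(cfg):
    """Return (training samples, validation samples, steps per epoch, total updates)."""
    n_val = round(cfg.n_samples * cfg.validation_split)
    n_train = cfg.n_samples - n_val
    steps = math.ceil(n_train / cfg.batch_size)
    return n_train, n_val, steps, steps * cfg.epochs


print(training_budget(TrainingConfig()))
```

Under these assumptions, 7600 samples are used for fitting and 400 for monitoring the loss, giving 475 batches per epoch and 237,500 parameter updates over 500 epochs.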

The MAE loss function is defined as:

$$MAE\left( {y_{i} ,\hat{y}_{i} } \right) = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left| {y_{i} - \hat{y}_{i} } \right|}$$

where y_i is the ground truth of PE probability, $\hat{y}_{i}$ is the predicted PE probability, and N is the number of samples.

Deep learning offers advantages in data fitting, target classification, and prediction, and it generally performs better as more samples are used for training and validation. Although the NP and PE sequencing data total 249 samples, this is still too few for a mainstream deep learning model to be trained and validated optimally. We therefore took the following measures to enlarge the dataset and ensure its validity: (1) multiple rules were set to select significant cfRNA indicators for prediction; (2) the sample data were expanded with a Gaussian random function according to the distribution of those indicators; (3) 30% of the real-world sequencing data were randomly selected and used to generate the virtual dataset for model training and validation; and (4) 100% of the real-world sequencing data were used for the final test and evaluation of the proposed technique. The experimental dataset does not overlap with the data used in the training stage. All results in this article are based on the real sequencing data. The 95–5 training-validation split means that 95% of the samples in the dataset were used for training and 5% for validating the loss value.

Computational efficiency

PSPNet provides an effective global contextual prior for single cfRNA expression-level scene parsing. The pyramid pooling module collects multi-level information more representatively than global pooling. PSPNet does not significantly increase computational cost compared to the original dilated Fully Convolutional Network (FCN). Both the global pyramid pooling module and the local FCN features are optimized simultaneously in end-to-end learning.

Hardware configuration

The PSPNet model was trained and validated on a computer with the following specifications. CPU: Intel Core i5-13600K, 3.5 GHz, 14C/20T; RAM: DDR4 3000 MHz, 32 GB; GPU: Nvidia RTX 2080 Ti, 11 GB; Storage: M.2 SSD, 3600 Mb/s, 1 TB. This hardware configuration ensured efficient handling of the computational demands during the PSPNet training process.

Results

Data filtration for cfRNA indicators

A total of 7,160 cfRNAs were initially detected. After filtering through multiple tests and parameter optimizations, we identified cfRNAs with significant differences between the NP and PE groups. Specifically, 29 cfRNAs were selected as PE indicators for samples collected at ≤ 12 weeks of gestation (gws), and 25 cfRNAs were chosen for samples collected at 13–20 gws (Table 2).

Table 2 Filtered cfRNA indicators for different sampling time


Calculation of PRI ground truth

Clinical diagnosis outcomes for enrolled women were classified as either NP or PE. The PRI for women diagnosed with PE was defined as “1,” while for those with NP, it was defined as “0”. The PRI for each participant was calculated accordingly. Using Eqs. (2) and (3) to process the cfRNA profiles, we derived the PRI for each sampling time, which served as the “calculated PRI” or ground truth.

For samples collected at ≤ 12 gws, the average calculated PRI was 0.05 for the NP group and 0.35 for the PE group (Fig. 4A). For samples collected at 13–20 gws, the average calculated PRI was 0.39 for the NP group and 0.56 for the PE group (Fig. 4B). The distinct differences in the average calculated PRI between NP and PE groups underscore the effectiveness of the filtered cfRNA indicators in distinguishing between these two conditions, supporting the model’s clinical applicability.

PSPNet-based PRI verification

Training and validation of the PSPNet model reduced the MAE of probability prediction to 0.019, resulting in an optimized model. To validate the model’s accuracy, real-world cfRNA expression data were used as the input set x_test for the PSPNet model to obtain the PSPNet-based PRI. This PRI was then compared with the ground truth (calculated PRI) from real-world cfRNA profiling. For samples collected at ≤ 12 gws, the predicted PSPNet-based PRI closely matched the ground truth, indicating the model’s fitting ability (Fig. 4C). In the figure, the abscissa is the index of the predicted sample and the value of each point is the corresponding PRI. The MAE between the prediction and the ground truth was only 0.0178, and the average PSPNet-based PRI effectively distinguished PE from NP. Similarly, for samples collected at 13–20 gws, the PSPNet predictions approximated the ground truth, with an MAE of only 0.0195 (Fig. 4D).

Prediction error and time efficiency of PSPNet

The error amplitude of the PSPNet-based PRI for cfRNA samples collected at ≤ 12 gws is shown in Fig. 5A. The maximum absolute error, peak-to-valley (PV) error, and mean absolute error were 0.098, 0.114, and 0.032, respectively. For samples collected at 13–20 gws (Fig. 5B), the maximum absolute error was 0.13, the PV error was 0.195, and the mean absolute error was 0.055. Overall, the prediction error for PRI was well-contained within a small range, demonstrating the PSPNet model’s strong data-fitting capabilities.
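The three error figures reported above can be computed from paired PRI vectors as sketched below. The function name `error_summary` is hypothetical, and the peak-to-valley error is assumed to be the difference between the largest and smallest signed error, which the article does not define explicitly.

```python
import numpy as np


def error_summary(y_true, y_pred):
    """Maximum absolute, peak-to-valley, and mean absolute error of a prediction."""
    err = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    return {
        "max_abs": float(np.abs(err).max()),
        "pv": float(err.max() - err.min()),  # assumed definition of PV error
        "mae": float(np.abs(err).mean()),
    }


truth = [0.10, 0.40, 0.30]       # illustrative calculated PRI values
predicted = [0.15, 0.38, 0.30]   # illustrative PSPNet-based PRI values
print(error_summary(truth, predicted))
```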

We also evaluated the processing efficiency for large datasets from population screenings, using the prediction time efficiency as a benchmark. The time required to output a PRI value was recorded for cfRNA profiling samples fed into the trained PSPNet model. As shown in Fig. 5C and D, across 15 consecutive experiments, the average time to output a PRI was 10^–4 s per sample.
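A per-sample prediction time of this kind can be measured as sketched below. The predictor here is a stand-in (a simple row mean) rather than the trained PSPNet, and the function name `mean_time_per_sample` is hypothetical; only the measurement pattern, averaging over 15 consecutive runs, follows the text.

```python
import time

import numpy as np


def mean_time_per_sample(predict, x, repeats=15):
    """Average wall-clock time per sample over `repeats` consecutive runs."""
    per_sample = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict(x)
        per_sample.append((time.perf_counter() - start) / len(x))
    return sum(per_sample) / repeats


# Stand-in predictor: any callable that maps [samples, N] to one value per sample.
placeholder_predict = lambda batch: batch.mean(axis=1)
batch_demo = np.random.default_rng(0).random((249, 29))
print(mean_time_per_sample(placeholder_predict, batch_demo))
```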

Comparative experiments and comprehensive evaluation

To further evaluate the effectiveness of the proposed method, we performed comparative experiments to test prediction performance along several dimensions. Convolutional neural networks (CNN) and multilayer perceptrons (MLP) are widely applied techniques for data prediction, classification, and fitting, and play key roles in sequencing data analysis. The ≤ 12 gws and 13–20 gws datasets were therefore also used in prediction tests with CNN and MLP, and the results were compared with those of the proposed method (Fig. 4C and D). Figure 6A and B show the predictions for the ≤ 12 gws and 13–20 gws groups produced by MLP; Fig. 6C and D show the corresponding predictions produced by CNN. The abscissa of each plot is the index of the predicted sample, and the value of each point is the corresponding PRI. The plots show that the predicted PRI still differs substantially from the ground truth at both ≤ 12 gws and 13–20 gws, although CNN follows the trend of the PRI distribution more closely than MLP. MLP predicts the PRI of NP samples more accurately, whereas CNN has the advantage for PE samples.

Fig. 4C and D show that the proposed method yields the smallest error between the predictions and the ground truth for both the ≤ 12 gws and 13–20 gws data, and that it performs well for both NP and PE samples. In terms of prediction accuracy, CNN outperforms MLP, especially for samples with higher PRI. Based on these comparative experiments, the proposed model, CNN, and MLP were all included in a comprehensive evaluation using the metrics MAE, precision, recall, AUC, ROC curve, and F1-score. The results for the ≤ 12 gws and 13–20 gws data are shown in Supplementary Table 3 and Supplementary Table 4, respectively, where the best value for each metric is highlighted. A lower MAE indicates a smaller prediction error; for precision, recall, AUC, ROC curve, and F1-score, higher scores indicate a more accurate classifier.

The comparative experiments show that the proposed method performs best in PRI prediction. The receiver operating characteristic (ROC) curves in Fig. 7 reflect the classification performance of each model. Overall, the proposed method has a higher ROC curve and a larger AUC than the other models, indicating more effective classification of PE samples.
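The classification metrics named above can be derived from PRI scores and clinical labels as in the following sketch. The PRI cut-off used to call a sample PE is not reported in the article, so the `threshold` argument is an assumption, as is the function name `classification_metrics`. AUC is computed as the probability that a randomly chosen PE sample scores higher than a randomly chosen NP sample, with ties counted as one half.

```python
import numpy as np


def classification_metrics(labels, scores, threshold=0.5):
    """Precision, recall, F1 at a PRI cut-off, plus threshold-free AUC.

    labels : 1 for PE, 0 for NP (clinical PRI)
    scores : predicted PRI per sample
    """
    labels = np.asarray(labels).astype(bool)
    scores = np.asarray(scores, dtype=float)
    predicted_pe = scores >= threshold  # assumed decision rule

    tp = int(np.sum(predicted_pe & labels))
    fp = int(np.sum(predicted_pe & ~labels))
    fn = int(np.sum(~predicted_pe & labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    # Pairwise comparison of every PE score with every NP score.
    diff = scores[labels][:, None] - scores[~labels][None, :]
    auc = float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)
    return {"precision": precision, "recall": recall, "f1": f1, "auc": auc}


labels_demo = [1, 1, 0, 0, 0]
scores_demo = [0.6, 0.3, 0.4, 0.1, 0.2]
print(classification_metrics(labels_demo, scores_demo, threshold=0.35))
```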

Discussion

In this study, we identified 29 cfRNAs as indicators of PE for samples sequenced at ≤ 12 gws and 25 cfRNAs for samples sequenced at 13–20 gws. During the training and validation phases, we developed a PSPNet model that processes cfRNA profiling data to generate the PRI. The MAE between the predicted PRI and the ground truth was 0.0178 for cfRNA data sampled at ≤ 12 gws and 0.0195 for data sampled at 13–20 gws. The predicted PRI values closely matched the ground truth, with maximum absolute error values of 0.098 and 0.13, PV error values of 0.114 and 0.195, and mean absolute errors of 0.032 and 0.055 for samples at ≤ 12 gws and 13–20 gws, respectively. Additionally, the average prediction time for PRI was 10^–4 s per sample. These results demonstrate the strong fitting ability of our PSPNet model, suggesting its potential for effective clinical implementation to predict PE before 20 gws almost instantaneously for individual patients.

Early prediction of PE significantly enhances prophylactic measures, benefiting maternal and neonatal healthcare. Quake et al. constructed a logistic regression model with an elastic net penalty and identified a panel of 18 cfRNAs from 5 to 16 gws, forming the basis of a liquid biopsy test for early PE detection [31]. Compared to the sFlt-1/PlGF ratio used in mid-gestation PE prediction [32], cfRNA offers earlier and more sensitive predictive capabilities. Our approach involved downloading cfRNA profiles and applying our data filtration principles, resulting in the selection of 29 cfRNAs as PE risk indicators for samples sequenced at ≤ 12 gws and 25 cfRNAs for samples sequenced at 13–20 gws. Unlike previous studies, we did not filter the same cfRNAs identified by Quake et al. Our analysis revealed common cfRNAs (RN7SL5P, RN7SL665P, RN7SL674P, RN7SL736P, RN7SL752P, RNA5SP202, and RNA5SP267) across both time points, suggesting further investigation into their roles in PE pathogenesis.

Advancements in artificial intelligence have also contributed to PE detection and diagnosis. Maric et al. developed a machine learning-based PE prediction model using statistical learning methods to analyze clinical and laboratory data from routine prenatal visits, achieving an area under the curve (AUC) of 0.89 for early-onset PE prediction. Schmidt et al. integrated real-world medical history, current condition, and laboratory variables into a machine learning-based algorithm using gradient-boosted tree and random forest models [27]. These examples underscore the potential of machine learning to integrate conventional maternal risk factors, biophysical markers, and maternal plasma protein levels in PE prediction. In contrast, our PSPNet model, a deep learning algorithm distinct from statistical learning methods, demonstrated significant advantages in multi-object classification, image segmentation, data fitting, and prediction. Unlike previous studies relying on medical history and laboratory variables as input data, we utilized novel cfRNA biomarkers to train and evaluate our PSPNet model.

Our study achieved an average prediction time of 10^–4 s per sample, a metric not addressed in previous research. This rapid prediction capability is crucial for processing large datasets from population screenings, allowing thousands of sequencing data points to be analyzed in seconds. This high-speed prediction method is suitable for clinical practice as an auxiliary diagnostic tool, particularly in remote rural areas with limited access to prenatal care.

The integration of sensitive cfRNA biomarkers with our PSPNet model facilitates consistent evaluation of PRI from the first clinical prenatal visit. This approach enables continuous monitoring of PE risk and serves as a comprehensive response indicator for prophylactic treatments in pregnant women. Given the PSPNet model’s rapid processing time of 10^–4 s per sample, it is feasible to implement a cloud laboratory system to predict PE from cfRNA profiling samples across China, providing early warnings for women with hidden-onset PE, especially in remote areas.

Our findings suggest several avenues for future research. Integrating medical history, laboratory variables, and additional biomarkers into the current PSPNet model could enhance the accuracy of PRI predictions for pregnant women, enabling precise PE risk assessment at any gestational week. As cfRNA profiling is associated with PE-related tissue and organ function, the model could be modified to provide warnings about specific tissues or organs compromised by PE development. With the rapid advancement of AI and the increasing use of public databases in healthcare research, it is crucial to ensure patient privacy protection and responsible data usage. Additionally, the future of medical artificial intelligence will require improved clinical data availability and interoperability, necessitating the construction of large-scale medical information and data storage facilities.

This study introduces a novel deep learning-based PSPNet model using sensitive cfRNA biomarkers, providing a foundation for further research. The model can evaluate the PRI of pregnant women as early as 12 gws, earlier than previous methods. The prediction error of our PSPNet model is well-controlled within 0.043 (mean value), and the processing time is only 10^–4 s, indicating excellent potential for clinical application.

However, the study has limitations. Deep learning models are prone to fitting errors and may overfit the training data due to their high dimensionality. Additionally, the small size of real-world cfRNA profiling data poses challenges, as no algorithm can fully replicate human-derived sequencing profiles. Factors such as hereditary differences, individual variations, preanalytical conditions, background noise, quantification strategies, batch effects, and operational errors can affect cfRNA levels, compromising reproducibility, interpretability, and specificity.

Conclusions

In this study, we utilized novel cfRNA biomarkers in conjunction with a PSPNet model to develop a reliable PRI for predicting PE. This approach demonstrates significant potential for rapid, minimally invasive monitoring of individual PE risk. The integration of cfRNA biomarkers and advanced deep learning techniques facilitates early detection and continuous risk assessment, contributing to enhanced maternal and neonatal healthcare. The PSPNet model’s high accuracy, low prediction error, and rapid processing time position it as a valuable tool for clinical applications, especially in regions with limited access to prenatal care.

Data availability

The data underlying this article are available in the article and in its online supplementary material. The cfRNA data used in the current study were downloaded from the Gene Expression Omnibus (GSE192902).

Abbreviations

cfRNA: Circulating cell-free RNA
PE: Preeclampsia
PSPNet: Pyramid Scene Parsing Network
PRI: Preeclamptic Risk Index
NP: Normal pregnancy (control group)
MAE: Mean absolute error

Acknowledgements

We extend our gratitude to Mira N. Moufarrej, Sevahn K. Vorperian, Ronald J. Wong, Ana A. Campos, Cecele C. Quaintance, Rene V. Sit, Michelle Tan, Angela M. Detweiler, Honey Mekonen, Norma F. Neff, Courtney Baruch-Gravett, James A. Litch, Maurice L. Druzin, Virginia D. Winn, Gary M. Shaw, David K. Stevenson, and Stephen R. Quake for their illuminating research and significant efforts in sample collection and cfRNA sequencing.

Clinical trial number

Not applicable.

Code availability

Codes and scripts developed for this study are available on reasonable request.

Funding

This work was supported by the Natural Science Basic Research Plan in Shaanxi Province of China (2023-JC-QN-0954), Funding of State Key Laboratory of Oral Diseases (SKLOD2023OF010), and the Xi’an Science and Technology Plan (24YXYJ0219) to ZW, and by Funding of Clinical Research Center of Shaanxi Province for Dental and Maxillofacial Diseases, College of Stomatology, Xi’an Jiaotong University (2022YHJB02) to ZZ.

Author information

Authors and Affiliations

Key Laboratory of Shaanxi Province for Craniofacial Precision Medicine Research, College of Stomatology, Xi’an Jiaotong University, No.98, Xiwu Road, Xi’an, Shaanxi, People’s Republic of China
Zhuo Zhao & Zheng Wang
Clinical Research Center of Shaanxi Province for Dental and Maxillofacial Diseases, College of Stomatology, Xi’an Jiaotong University, No.98, Xiwu Road, Xi’an, Shaanxi, People’s Republic of China
Zhuo Zhao & Zheng Wang
State Key Laboratory for Manufacturing System Engineering, Xi’an Jiaotong University, Xi’an, China
Zhuo Zhao
Department of Physiology and Pathophysiology, School of Basic Medical Sciences, Xi’an Jiaotong University, Xi’an, China
Xiaoxu Liu
Department of Urology, The First Affiliated Hospital of Xinjiang Medical University, Urumqi, China
Yonghui Guan
Department of Obstetrics & Gynecology, The First Affiliated Hospital of Xi’an Jiaotong University, No.76, Yanta West Road, Xi’an, Shaanxi, People’s Republic of China
Chunfang Li

Authors

Zhuo Zhao
Xiaoxu Liu
Yonghui Guan
Chunfang Li
Zheng Wang

Contributions

Zhuo Zhao: Software, Validation, Writing- Original draft preparation; Xiaoxu Liu: Validation, Writing- Original draft preparation; Yonghui Guan: Validation, Writing- Original draft preparation; Chunfang Li: Conceptualization, Writing- Reviewing and Editing, Supervision; Zheng Wang: Conceptualization, Writing- Reviewing and Editing, Supervision.

Corresponding authors

Correspondence to Chunfang Li or Zheng Wang.

Ethics declarations

Ethics approval and consent to participate

This study was exempt from ethics approval by the Ethical Committee of Xi’an Jiaotong University as it involved the analysis of publicly available data from the Gene Expression Omnibus (GSE192902) database and did not involve direct interaction with human subjects or animal models.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary Material 1.

Supplementary Material 2.

Supplementary Material 3.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Zhao, Z., Liu, X., Guan, Y. et al. Exploring the potential of cell-free RNA and Pyramid Scene Parsing Network for early preeclampsia screening. BMC Pregnancy Childbirth 25, 445 (2025). https://doi.org/10.1186/s12884-025-07503-5

Received: 10 December 2024
Accepted: 20 March 2025
Published: 14 April 2025
DOI: https://doi.org/10.1186/s12884-025-07503-5
