TY - JOUR
T1 - Ribonucleic-Acid protein interaction prediction based on deep learning
T2 - A comprehensive survey
AU - Li, Danyu
AU - Huang, Rubing
AU - Cui, Chenhui
AU - Towey, Dave
AU - Zhou, Ling
AU - Tian, Jinyu
AU - Zou, Bin
N1 - Publisher Copyright: © 2025 Elsevier B.V.
PY - 2025/12
Y1 - 2025/12
N2 - The interaction between Ribonucleic Acids (RNAs) and proteins, also called RNA Protein Interaction (RPI), governs biological processes, including gene regulation and disease pathogenesis. This comprehensive survey examines Artificial Intelligence (AI) applications in Deep Learning-based RPI Prediction (DL-based RPIP) through eight Research Questions (RQs), analyzing 179 studies (2014–2023). The key findings include: sustained technical evolution through embryonic (2014–2017), accelerated (2018–2022), and expansion phases (2023) (RQ1); hybrid models integrating Graph Neural Networks (GNNs) (for topological interface modeling) and Transformers (for long-range dependencies) achieve state-of-the-art performance (RQ4); pretrained language models enhance small-sample learning, but the cross-species generalization declines sharply with evolutionary distance (RQ5). Critical challenges persist, including data heterogeneity across databases, the scarcity of standardized benchmarks (RQ2), and balancing the trade-off between feature encoding and information preservation (RQ3). Future advancements require biologically informed DL architectures, multi-feature fusion, and rigorous cross-validation to bridge the generalization-interpretability gap (RQ8), which would accelerate the clinical translation of predictive tools (RQ6/RQ7). As the first comprehensive analysis spanning feature encoding, modeling, evaluation, applications, and tools, this work fills a critical gap in the DL-based RPIP literature.
AB - The interaction between Ribonucleic Acids (RNAs) and proteins, also called RNA Protein Interaction (RPI), governs biological processes, including gene regulation and disease pathogenesis. This comprehensive survey examines Artificial Intelligence (AI) applications in Deep Learning-based RPI Prediction (DL-based RPIP) through eight Research Questions (RQs), analyzing 179 studies (2014–2023). The key findings include: sustained technical evolution through embryonic (2014–2017), accelerated (2018–2022), and expansion phases (2023) (RQ1); hybrid models integrating Graph Neural Networks (GNNs) (for topological interface modeling) and Transformers (for long-range dependencies) achieve state-of-the-art performance (RQ4); pretrained language models enhance small-sample learning, but the cross-species generalization declines sharply with evolutionary distance (RQ5). Critical challenges persist, including data heterogeneity across databases, the scarcity of standardized benchmarks (RQ2), and balancing the trade-off between feature encoding and information preservation (RQ3). Future advancements require biologically informed DL architectures, multi-feature fusion, and rigorous cross-validation to bridge the generalization-interpretability gap (RQ8), which would accelerate the clinical translation of predictive tools (RQ6/RQ7). As the first comprehensive analysis spanning feature encoding, modeling, evaluation, applications, and tools, this work fills a critical gap in the DL-based RPIP literature.
KW - Artificial intelligence application
KW - Deep learning
KW - Interaction prediction
KW - Protein
KW - Ribonucleic acids
UR - https://www.scopus.com/pages/publications/105015294971
U2 - 10.1016/j.asoc.2025.113795
DO - 10.1016/j.asoc.2025.113795
M3 - Review article
AN - SCOPUS:105015294971
SN - 1568-4946
VL - 184
JO - Applied Soft Computing Journal
JF - Applied Soft Computing Journal
M1 - 113795
ER -