Abstract
The protracted timeline, prohibitive costs, and high attrition rates of conventional drug discovery pipelines present formidable barriers to the development of novel small-molecule anticancer therapeutics. Artificial intelligence (AI) has emerged as a paradigm-shifting force, offering sophisticated computational methodologies to navigate the vastness of chemical space and accelerate the identification of promising therapeutic candidates. This thesis addresses critical challenges in AI-driven drug discovery by developing and applying a series of progressively advanced machine learning (ML) frameworks for molecular representation learning and virtual screening.
Initially, this research demonstrates the practical efficacy of established ML techniques through a ligand-based virtual screening campaign for Son of Sevenless 1 (SOS1) inhibitors, a key target in RAS-driven cancers. This work successfully identified novel chemical scaffolds with validated inhibitory activity, affirming the utility of ML in hit identification. Building upon this, the thesis introduces the Three-branch Molecular Representation Learning Framework (TMRLF), an architecture that synergistically integrates a graph neural network with dual, complementary molecular fingerprints. This multimodal approach is designed to overcome the inherent limitations of graph-based models in capturing long-range atomic interactions and pharmacologically relevant substructures, demonstrating superior performance on benchmark datasets and in the identification of SOS1 inhibitors.
To further enhance representational power and address the complexities of intermolecular relationships, the thesis culminates in the development of the Dual-branch Molecular Property Encapsulation (DMPE) framework. DMPE features two key innovations: the Refined Interactive Graph Attention Framework (RIGAF), which captures both intramolecular topology and intermolecular structural similarities across the chemical space, and a Kolmogorov-Arnold Network (KAN)-based Embedding and Fusion (KAEF) module, which offers a more expressive and interpretable method for fusing heterogeneous features. DMPE achieves competitive performance on multiple public benchmarks and demonstrates its practical application in a successful virtual screening for novel hematopoietic progenitor kinase 1 (HPK1) inhibitors for hepatocellular carcinoma.
Collectively, this body of work contributes a suite of robust, validated computational tools that enhance the accuracy, efficiency, and chemical insight of the anticancer drug discovery process. The progression from applying existing methods to engineering novel, sophisticated deep learning architectures underscores the transformative potential of AI to rationalize and accelerate the development of next-generation cancer therapeutics.
| Date of Award | 15 Mar 2026 |
|---|---|
| Original language | English |
| Awarding Institution | |
| Supervisor | Bencan Tang (Supervisor), Jonathan D. Hirst (Supervisor) & Jianfeng Ren (Supervisor) |