Projects

BioAI

Interpretable Protein LLMs for Ubiquitination Prediction

  • Built an ESM2-centered pipeline linking representation extraction, dimensionality reduction, and prediction.
  • Made the models interpretable through embedding-based and attention-based analysis; contributed to code, data processing, and manuscripts.
Protein LLMs

Graph AI for Predicting Drug-Disease Links (Drug Repositioning)

  • Built a multi-view heterogeneous GNN to integrate noisy similarity networks and meta-paths for drug-disease association prediction.
  • Improved robustness to noise in the input networks via contrastive learning; led implementation, benchmarking, and manuscript preparation.
Drug Repositioning

Multi-Omics for Cancer Biomarker Discovery

  • Performed integrative analysis of WGS, RNA-seq, antibody-based protein profiling, and clinical metadata to support biomarker discovery and cross-cohort assessment.
  • Contributed to data preprocessing, quality control (QC), statistical analysis, and interpretation of results.
Cancer Biomarker

Integrative Multi-Omics in the Non-model Malaria Vector (Anopheles sinensis)

  • Generated a de novo genome assembly and analyzed gene families associated with hematophagy and disease transmission.
  • Integrated transcriptome, microRNA, metagenome, and resequencing data from pyrethroid-susceptible and pyrethroid-resistant populations to infer resistance mechanisms (variants, selection, regulation, and temporal expression); contributed to all stages of the project.
Malaria Vector

BioSynthesis + BioAI

Screening, Optimization, and Rational Design of Key Enzymes in Microbial Biosynthesis of Capsaicinoids

  • Applied AI-driven approaches for enzyme screening and optimization in capsaicinoid biosynthesis pathways.
Capsaicinoids

Identification, Optimization, and Rational Design of Functional Components from Pangolin Scales and Porcine Hooves

  • Used computational methods to identify and optimize bioactive components for pharmaceutical applications.
Pangolin

Kaggle Competitions

My Kaggle Profile: https://www.kaggle.com/yujuanzhangai