Our research interests span bioinformatics, machine learning, genomics, and pharmacogenomics, with the goal of systematically modeling genomic and pharmacogenomic data to enhance our understanding of biology and therapeutic responses in complex diseases such as cancer. Working at the cross-disciplinary nexus of UPMC Hillman Cancer Center and the University of Pittsburgh School of Medicine, we collaborate with clinical, translational, and basic scientists to align advanced computational algorithms with the unmet needs of precision medicine.

Deep learning of pharmacogenomics: implementing in silico drug and CRISPR screens and modeling the underlying mechanisms

Our aim is to address the challenge of applying deep learning to genomics and pharmacogenomics, especially when sample sizes are limited. We hypothesize that advanced computational models can capture important genomic characteristics of cells and predict their responses to chemical perturbations (drugs) and genetic perturbations (CRISPR gene knockouts). To test this hypothesis, we have developed a suite of deep learning models (Science Advances 2021 and BMC Medical Genomics 2019). The models are trained to extract genomic features from high-dimensional profiles of cell lines, enabling accurate predictions of responses to a diverse array of oncology and non-oncology drugs and to gene knockouts. Using a "transfer learning" approach, the models are first trained to identify key features of tumor genomics, which equips them to predict tumor behavior and to yield biologically relevant results that are validated by clinical data. By deciphering complex genomic and pharmacogenomic data, these models have the potential to improve the selection of existing drugs for treatment and to aid the discovery of new therapeutic targets. Our findings from individual pharmacogenomic data modalities, such as genetic or drug screens, highlight the potential of a comprehensive, multi-modal data integration model to advance our understanding of pharmacogenomics.

Advancing FAIR principles (findable, accessible, interoperable, and reusable) in AI models and enhancing multi-omics data for AI readiness

In response to a growing demand for more accessible AI tools and large multi-omics data resources, our research program develops user-friendly software that broadens access to deep learning models and to large genomic and pharmacogenomic datasets. Because many biomedical researchers have limited programming expertise, we have built intuitive platforms to bridge this gap. Our web-based application, shinyDeepDR, is built on the R Shiny framework. It allows users to input mutation or gene expression profiles and obtain predicted responses to 265 oncology and non-oncology compounds (Patterns 2024). Our DepLink application integrates data from diverse sources, such as CRISPR screens, drug screens, and molecular signatures, so that users can explore the relationships among gene knockouts, drug treatments, and those signatures (Bioinformatics Advances 2023). Both tools feature interactive user interfaces with dynamic visualization, searching, and filtering capabilities. This work demonstrates the feasibility and broad applicability of user-friendly tools in advancing deep learning and pharmacogenomics.

Identifying prognostic biomarkers and studying treatment resistance using integrative clinical genomics: tackling liver and blood cancers as examples

While cancer diagnosis and treatment have improved significantly over the past decade, liver and blood cancers remain more lethal than most other types. During that time, we have collaborated with hematologists and oncologists to study the most prevalent type of liver cancer (hepatocellular carcinoma; HCC) and both acute (acute myeloid leukemia; AML) and chronic (myelodysplastic syndrome; MDS) forms of blood cancer. Our hypothesis is that integrated bioinformatics analyses of high-dimensional genomic profiles and clinical records can lead to the discovery of novel prognostic markers, resistance mechanisms, and therapeutic targets. In HCC, our analyses have revealed promising targets for a prevalent “undruggable” gene mutation in β-catenin (CTNNB1), consistent with findings from in vivo models (Patterns 2024). In AML, through integrated analyses of mutation, gene expression, and miRNA expression profiles of patients with de novo AML, we identified strategies for biomarker implementation based on different modalities of cancer multi-omics: miRNAs, mRNAs, gene regulation, and pathways. We proposed and verified a novel mechanism in which mutation of NPM1, a frequently mutated gene in AML, modulates miRNA-gene regulation associated with prognosis (Leukemia 2016). At the pathway level, we further demonstrated that crosstalk among the Myc, OXPHOS, mTOR, and stemness pathways governs patients’ response to "7 + 3" induction chemotherapy (European Journal of Haematology 2019). Overall, our studies provide biological and clinical insights into prognostic markers and chemoresistance mechanisms.