Martin Aryee, Ph.D.

Associate Investigator
Molecular Pathology Unit, Mass General Research Institute
Associate Professor of Pathology
Harvard Medical School
Associate Member
Broad Institute
PhD, Harvard T.H. Chan School of Public Health, 2008
cancer epigenetics; cancer genome analysis; cpg islands; dna methylation; dna sequence analysis; epigenesis genetic; epigenetics; gene expression profiling; genetic heterogeneity; genome; genome editing; high-throughput nucleotide sequencing; high-throughput screening assays; meta-analysis as topic; oligonucleotide array sequence analysis; programming languages; proteome


My research involves computational methods that enable us to elucidate the genetic and epigenetic basis of cancer and other diseases from large genomic datasets.

Tumor Heterogeneity

We develop statistical methods to improve our understanding of tumor cell-to-cell variability and its relationship to cancer progression. Much of this work relates to the computational and statistical challenges posed by single-cell transcriptome and epigenome data.

Different tumors, even of the same type, can harbor extremely heterogeneous genetic and epigenetic alterations. To investigate the role of epigenetic stochasticity in cancer, we recently applied a statistical model to study patterns of inter- and intra-individual tumor heterogeneity during metastasis. We established that each patient with metastatic prostate cancer develops a unique DNA methylation signature that is subsequently maintained across metastatic dissemination. The stability of these individualized DNA methylation profiles has implications for the promise of epigenetic alterations as diagnostic and therapeutic targets in cancer.

Epigenome Mapping

Unlike genome sequencing, which has well-established experimental and analytical protocols, epigenome mapping strategies are still in their infancy and, like other high-throughput techniques, are plagued by technical artifacts. A central theme of our research involves the development of methods for extracting signal from noisy high-throughput genomic assays. The goal of such preprocessing methods is to transform raw data from high-throughput assays into reliable measures of the underlying biological process.

Until recently, studies of DNA methylation in cancer had focused almost exclusively on CpG-dense regions in gene promoters. We helped develop the statistical tools used to analyze the first genome-scale DNA methylation assays designed without bias towards CpG islands. These tools enabled the discovery that the majority of both tissue-specific and cancer-associated variation occurs in regions outside of CpG islands. We showed that there is a strong overlap between genomic regions involved in normal tissue differentiation, reprogramming during induced pluripotency, and cancer.

Epigenomic Studies of Complex Disease

Despite the discovery of numerous disease-associated genetic variants, the majority of phenotypic variance remains unexplained for most diseases, suggesting that non-genetic factors play a significant role. Part of the explanation will lie in a better understanding of epigenetic mechanisms. These mechanisms are influenced by both genetic and environmental effects and, as downstream effectors of these factors, may be more directly related to phenotype. However, the broad extent of epigenetic dysregulation in cancer and many other diseases complicates the search for the small subset of alterations with a causal role in pathogenesis. We are developing computational methods to integrate genome-wide genetic and epigenetic data with the goal of identifying the subset of functionally important epigenetic alterations.

See our lab website for more information:

Research lab website Publications

CNY-Building #149
149 13th Street
Charlestown, MA 02129