Center for Immunology and Inflammatory Diseases,
Mass General Research Institute
PhD, University of Wisconsin-Madison, 2013
big data; bioinformatics; machine learning; single-cell genomics
The Li Laboratory focuses on developing standard operating procedures for single-cell genomics data analysis, building on state-of-the-art big data, machine learning and graphics processing unit computing technologies, and on applying single-cell genomics to immunology questions.
Thanks to recent advances in biotechnology, researchers can now study the immune system at an unprecedentedly fine scale. At the same time, they are producing far more data than they can harness. Interpreting this ocean of existing data quickly and appropriately has become a key barrier that keeps immunologists from embracing the fruits of these advances.
The Li laboratory research program aims to develop standard operating procedures (SOPs) for single-cell genomics data analysis that can significantly lower this barrier. We use state-of-the-art big data and machine learning technologies to develop a series of cloud-based tools that will free immunologists from complicated computation and help them focus on answering biological questions.
Because the bioinformatic challenges we address are ubiquitous in the broader field of genomics, we expect the tools we develop to have a high impact across the field as a whole.
Besides developing novel tools, we will also work with faculty members to answer key immunology questions by leveraging the wide range of computational tools we have developed.
Human Immune Cell Atlas Project
This multi-institution project aims to map cell types and cell states in the human immune system at the single-cell level. So far, we have profiled 1.7 million single cells from a variety of human hematopoietic tissues, such as bone marrow, cord blood and peripheral blood. The Li laboratory leads the computational analysis for this project.
scCloud, cloud-based single-cell and single-nucleus genomics analysis pipeline
To handle the huge amount of data produced by the Human Immune Cell Atlas project, we have developed scCloud, the first cloud-based single-cell data analysis platform that can scale up to millions of cells. scCloud is both fast and cost-effective. Besides the Human Immune Cell Atlas project, scCloud is also used in NCI’s Human Tumor Atlas Pilot Project (HTAPP).
Nucleus hashing, a novel method for multiplexing snRNA-Seq samples
Single-nucleus RNA-Seq (snRNA-Seq) is a key technology for investigating tissues that are difficult to dissociate, such as brain tissue. To popularize snRNA-Seq, we need novel methods for multiplexing snRNA-Seq libraries in order to reduce batch effects and cost. Together with the Regev lab at the Broad Institute and the Gaublomme lab at Columbia University, we developed nucleus hashing, a novel single-nucleus multiplexing protocol, and demuxEM, a novel demultiplexing algorithm.