Latent variable models for functional genomic data
Functional genomic data (such as RNA-seq or DNA methylation) are composed of many layers of overlapping signals that reflect the output of individual upstream pathways. We develop methods to automatically decouple, extract, and name all biological latent variables present in a dataset.
Intrinsically interpretable models
Advances in neural networks have enabled models that accurately recapitulate the complex input-output behavior of biological systems. We can now predict context-specific DNA activity and gene expression directly from sequence. However, top-performing models have millions of parameters, and their internal representations are not interpretable. We seek to develop models with interpretable parameters that do not sacrifice performance.
Together with Nathan Clark's lab, we develop methods for understanding the relationship between evolutionary forces and phenotypes. Using our Relative Evolutionary Rate (RERconverge) method, we have demonstrated that eye-specific genes and non-coding sequences can be identified based on patterns of relaxation in lineages of subterranean mammals.
Automatic representation learning
Genomic data are often noisy and complex, making it difficult to identify signals relevant to the underlying molecular mechanisms. We develop methods that combine machine learning techniques with insights about the biological process to learn useful data representations.
Our group is part of the Molecular Transducers of Physical Activity Consortium (MoTrPAC). This is a large study examining the effects of exercise through multiple genomic assays.
Epigenetic CHaracterization and Observation (ECHO)
We are part of the Epigenetic CHaracterization and Observation (ECHO) program. The program is building a man-portable device that analyzes an individual’s epigenetic “fingerprint” to potentially reveal a detailed history of that individual’s exposure to infectious and chemical agents.
Assessing Immune Memory (AIM)
We are part of the Assessing Immune Memory (AIM) program. The program seeks to determine early on whether a vaccine candidate will later provide long-lasting immune protection in humans, a current impossibility that would benefit the warfighter and nation immensely.
Genomics of memory B cells
Memory B cells are long-lived immune cells that "remember" antigen exposure and are important for vaccine efficacy. The Shlomchik lab has identified functionally distinct subsets of these cells, and we are working to characterize their molecular differences using a multi-omics approach.
We are working with several UPMC research teams to use single-cell assay technologies to understand the role of the tumor microenvironment in tumor progression and treatment response.