Academic year 2022-23 seminars

Friday, December 9 – Fall 2022 Student Rotation Presentations

Yujia Cai: Identifying Glioblastoma Patient Subgroups with Sequencing Data

Emma Croxford: DR.BENCH Evaluation Metrics

Jie Sheng: TBD

Angela Wu: Statistical Methods for Testing Reliability and Validity

Tinghui Xu: A Relaxed Doubly Robust Estimation

Huan Liang: Critical Deterioration in Hospitalized Children

Fall 2022 PhD Rotation Presentations Poster

Thursday, December 8 – Fall 2022 Student Rotation Presentations

Spencer Halberg: Network-based data integration and visualization provide a global understanding of regulatory mechanism in A. fumigatus

Sierra Strutz: Pushing pCART Forward

Emma Graham Linck: Methods for predicting individualized treatment effects

Yuda Liu: An Exploratory Analysis of Zombie SARS-CoV-2

Ryan Kassab: Modeling EHR Data for Predicting Ovarian Cancer

Friday, November 11

Speaker: Yin Li, Department of Biostatistics and Medical Informatics, UW-Madison


Title: Towards Medical Video Analysis


Video has become an increasingly common form of data recorded at clinics. Cameras are installed in operating rooms, at intensive care units (ICUs), around nursing homes, and during telehealth visits, resulting in a plethora of medical videos capturing various aspects of care delivery and the patients’ responses to them. At present, these videos are recorded primarily for quality assurance, and their potential for medicine has yet to be explored. A main barrier lies in the extraction of clinically relevant signals from the high-dimensional video data.

This talk will describe our efforts to develop algorithms and tools for video analytics and their application to medical videos. I will introduce our work on developing computer vision models to detect individual events in videos and to model temporal relationships among these events. Further, I will present our preliminary results on using our models to analyze neonatal and adult ICU videos, with the goals of quantifying care delivery activities and monitoring high-risk patients.


Friday, November 4 – DeMets Lecture

Speaker: Michael Proschan, National Institutes of Health, National Institute of Allergy and Infectious Diseases. See DeMets Lectures page for abstract.

Friday, October 28

Speaker: Duygu Ucar, The Jackson Laboratory for Genomic Medicine (JAX-GM).


Title: Sexual-dimorphism in human immune system aging and vaccine responses

Abstract: Differences in immune function and responses contribute to health- and life-span disparities between sexes. However, the role of sex in immune system aging and immune responses is not well understood. By profiling peripheral blood mononuclear cells from 172 healthy adults (22–93 years old) using ATAC-seq, RNA-seq and flow cytometry, we uncovered an accelerated aging phenotype in men, in which T and B cell inactivation and monocyte activation with age were more pronounced than in women. To study whether there are sex differences in the vaccine responses of older adults, we recruited 40 older adults (60 years and older) and studied their responses to two available pneumococcal vaccines: T-dependent Prevnar and T-independent Pneumovax. Using flow cytometry and bulk and single-cell RNA-seq, we uncovered an activated baseline immune phenotype that was negatively associated with Prevnar vaccine responses. Individuals with this activated phenotype had more circulating pro-inflammatory Th17 cells and more cytotoxic CD16+ NK cells, and fewer Th1 cells. This immune phenotype was associated with the age and sex of donors: older men were more likely to have this phenotype and did not mount strong responses to Prevnar. Overall, our study uncovered how older adults respond to different pneumococcal vaccines and demonstrated the significance of considering biological sex and the baseline immune state when administering these vaccines.


Friday, October 21

Speaker: Huy Dinh, Assistant Professor, Department of Oncology and Department of Biostatistics and Medical Informatics, UW-Madison

Poster: Dinh, Huy Seminar Poster 10-21-2022

Title: Cancer progression and response to immunotherapy through the lens of single-cell analysis

Abstract: In this talk, I will discuss our recent data on using single-cell analysis to stratify the granularity of immune cell heterogeneity in cancer progression and response to immunotherapy, with examples in ovarian cancer and head and neck cancer.

Friday, October 14

Speaker: Assistant Professor Marinka Zitnik, Department of Biomedical Informatics, Harvard University

Poster: Zitnik Poster

Title: Graph AI to Enable Precision Medicine

Abstract: Graph representation learning leverages knowledge, geometry, and structure to develop powerful machine learning methods. First, I will introduce Shepherd, a graph neural network for personalized diagnosis of patients with rare genetic diseases. Diagnostic delay is pervasive in patients with rare genetic conditions. It can lead to numerous problems, including redundant testing and unnecessary procedures, delays in obtaining disease-appropriate management and therapies, and even irreversible disease progression. Shepherd uses knowledge-guided geometric deep learning to gather information from different parts of a knowledge graph and logically connect a patient’s clinical-genomic information to the region in the knowledge graph relevant to diagnosis. Evaluation of patients from the Undiagnosed Diseases Network shows that Shepherd accurately identifies causal disease genes, finds other patients with the same causal gene and disease, and provides interpretable characterizations of novel diseases. Second, I will describe applications of graph neural networks in drug discovery. These are available through Therapeutics Data Commons, an initiative to access and evaluate AI capability across therapeutic modalities and stages of drug discovery. The Commons supports the development of machine learning methods, with a strong bent towards developing the foundations for which methods are most suitable for drug discovery and why.

Friday, October 7

Speaker: Moo Chung, Associate Professor, Department of Biostatistics and Medical Informatics, UW-Madison

Poster: Chung Poster

Title: Topological Inference and Learning for Cycles in Graphs

Abstract: Cycles or loops in a network embed higher-order interactions beyond dyadic relations. The cycles are essential for the parallel processing of information and enable feedback loops. Despite the fundamental importance of cycles in networks, identifying and extracting them is computationally prohibitive. In this talk, we propose an efficient algorithm for the systematic identification of a cycle basis using the Hodge Laplacian and persistent homology. In the vector space of cycles, we can perform regression and clustering. The method is applied to modeling human brain networks obtained from resting-state functional magnetic resonance imaging (fMRI). We demonstrate for the first time that the dynamic pattern of the state space in brain activity is itself a heritable trait. The talk is based on Songdechakraiwut and Chung 2022 (arXiv:2012.00675) and Anand et al. 2022 (arXiv:2110.14599).

Friday, September 30

  • Speaker: Heping Zhang, PhD, Susan Dwight Bliss Professor of Biostatistics, Professor of Child Study, Professor of Statistics and Data Science, Yale University
  • Poster: Zhang Poster
  • Title: Genes, Brains, and Us
  • Abstract: Many human conditions, including cognition, are complex and depend on both genetic and environmental factors. After the completion of the Human Genome Project, genome-wide association studies have associated genetic markers such as single-nucleotide polymorphisms with many human conditions and diseases. Despite the progress, it remains difficult to identify genes and environmental factors for complex diseases – the so-called geneticist’s nightmare. Furthermore, although the impact of these discoveries on human health is not shock and awe, “drugs with support from human genetic studies for related effects succeed from phase I trials to final approval twice as often as those without such evidence.” Therefore, it is important and promising, while challenging, to identify genetic variants for complex human health-related conditions.

    This talk is not intended to provide a comprehensive review of the massive progress in related methods and discoveries. Instead, I will focus on some of the work that many of my students assisted me in over the past several years. The first area is the identification of super-variants. A super-variant is a set of alleles at multiple loci of the human genome; unlike the loci in a gene, however, the loci contributing to a super-variant can be anywhere in the genome. The concept of a super-variant follows a common practice in genetic studies of collapsing a set of variants, specifically single-nucleotide polymorphisms. The novelty and challenge lie in how to find, replicate, interpret, and eventually make use of the super-variants. Our work has been mainly based on the use of tree- and forest-based methods, and a data analytic flow that we proposed in 2007, which in retrospect resembles the spirit of “deep learning” that Hinton coined in 2006. The second area is our progress in conducting statistical inference for high-dimensional and structured data objects. Such data objects appear more and more commonly not only in imaging genetic studies but also in other areas of data science, including artificial intelligence. They do not belong to a Euclidean space, for which most statistical theory and methods, such as the distribution function, are developed. How do we analyze data objects in non-Euclidean spaces?

Friday, September 23

  • Speaker: Anoop Mayampurath, Assistant Professor, Department of Biostatistics and Medical Informatics, UW-Madison
  • Poster
  • Title: Using Electronic Health Record Data to Predict Deterioration in Hospitalized Children
  • Abstract: Children who are admitted to the hospital and subsequently deteriorate have a high risk of mortality. They are also more likely to experience complications in their long-term health. Because timely intervention increases survival, it is critically important to detect deterioration in hospitalized children as early as possible. Current early warning scores that predict deterioration are subjectively derived, utilize limited data elements, and have been shown in randomized controlled trials not to reduce in-hospital mortality.

    This talk will focus on our recent efforts to utilize machine learning and electronic health record (EHR) data to revolutionize pediatric early warning scores. Using a cohort of approximately 56,000 patients over a 12-year period, we developed a gradient boosted machine learning model that detects deterioration earlier and more accurately than current standards in both internal and external settings. I will also highlight recent efforts within the lab to extend risk prediction to other hospital settings and the use of notes for explaining prediction model outputs. Our research facilitates early detection of pediatric patients at risk for deterioration, thereby creating opportunities for timely intervention that could decrease preventable death and improve long-term outcomes in children.

Friday, September 16

  • Speaker: Annie Qu, Chancellor’s Professor, Department of Statistics, University of California – Irvine
  • Poster: Qu, Annie Poster
  • Title: Query-augmented Active Metric Learning
  • Abstract: We propose an active metric learning method for clustering with pairwise constraints. The proposed method actively queries the label of informative instance pairs, while estimating underlying metrics by incorporating unlabeled instance pairs, which leads to a more accurate and efficient clustering process. In particular, we augment the queried constraints by generating more pairwise labels to provide additional information in learning a metric to enhance clustering performance. Furthermore, we increase the robustness of metric learning by updating the learned metric sequentially and penalizing the irrelevant features adaptively. Specifically, we propose a new active query strategy that evaluates the information gain of instance pairs more accurately by incorporating the neighborhood structure, which improves clustering efficiency without extra labeling cost. In theory, we provide a tighter error bound of the proposed metric learning method utilizing augmented queries compared with methods using existing constraints only. Furthermore, we also investigate the improvement using the active query strategy instead of random selection. Numerical studies on simulation settings and real datasets indicate that the proposed method is especially advantageous when the signal-to-noise ratio between significant features and irrelevant features is low.

Friday, September 9, 2022

Friday, September 2, 2022

  • Speaker: Dr. Xuanyao Liu, Assistant Professor of Medicine and Assistant Professor of Human Genetics, University of Chicago
  • Poster: LIU, Xuanyao Poster
  • Title: Moving beyond cis gene regulation: the next steps to understand complex trait genetics
  • Abstract: Early genome-wide association studies (GWASs) led to the surprising discovery that, for typical complex traits, most of the heritability is due to huge numbers of common variants with tiny effect sizes. Here, we provide a formal model in which genetic contributions to complex traits are partitioned into direct effects from core genes and indirect effects from peripheral genes acting in trans. We propose that most heritability is driven by weak trans-eQTL SNPs, whose effects are mediated through peripheral genes to impact the expression of core genes. In particular, if the core genes for a trait tend to be co-regulated, then the effects of peripheral variation can be amplified such that nearly all of the genetic variance is driven by weak trans effects. Detecting trans-eQTLs is very challenging. To improve the power of detecting trans-eQTLs, we propose a PC-based multivariate association pipeline, trans-PCO, that combines multiple PCs to detect trans-eQTLs of regulatory networks. We showed through simulations and real data applications that trans-PCO is a powerful and reliable tool that detects trans-eQTLs of cellular pathways and networks, which opens up new opportunities to learn the trans-regulatory mechanism of complex traits and diseases.