Abstract. Estimating the Tree of Life is one of the grand computational challenges in science, and has applications to many areas of science and biomedical research. Despite intensive research over the last several decades, many problems remain inadequately solved. In this talk I will discuss species tree estimation from genome-scale datasets, where different regions of the genome can have different histories due to incomplete lineage sorting. I will describe the current state of the art for these problems, explain what is understood about them from a mathematical perspective, and identify some of the open problems in this area where mathematical research, drawing on graph theory, combinatorial optimization, and probability and statistics, is needed.
Short Bio. Tandy Warnow is the Founder Professor of Computer Science at the University of Illinois at Urbana-Champaign, where she is Associate Head of the Department of Computer Science. She is also a member of the Carl R. Woese Institute for Genomic Biology and an affiliate in seven other departments at UIUC (Statistics, Mathematics, Bioengineering, Electrical and Computer Engineering, Plant Biology, Animal Biology, and Entomology). Tandy received her PhD in Mathematics at UC Berkeley under the direction of Gene Lawler, and did postdoctoral training with Simon Tavaré and Michael Waterman at the University of Southern California. She received the National Science Foundation Young Investigator Award in 1994, the David and Lucile Packard Foundation Award in Science and Engineering in 1996, an Emeline Bigelow Conland Fellowship at the Radcliffe Institute for Advanced Study in 2006, and a Guggenheim Foundation Fellowship for 2011. In 2016 she was elected as an ACM Fellow, and in 2017 she was elected as an ISCB Fellow. Her research combines mathematics, computer science, and statistics to develop improved models and algorithms for reconstructing complex and large-scale evolutionary histories in both biology and historical linguistics. Her current research focuses on phylogeny and alignment estimation for very large datasets (10,000 to 1,000,000 sequences), estimating species trees from collections of gene trees, and metagenomics.