Contact:

 

Department of Statistics and Data Science
National University of Singapore
21 Lower Kent Ridge Road             
Singapore 117546
Email: zhigang.yao@nus.edu.sg

Center of Mathematical Sciences and Applications
Harvard University
20 Garden Street              
Cambridge MA 02138
Email: zhigang.yao@cmsa.fas.harvard.edu



 

A few words about me:


Zhigang Yao is a tenured Associate Professor in the Department of Statistics and Data Science at the National University of Singapore (NUS), with a courtesy joint appointment in the Department of Mathematics and an affiliation with the Institute of Data Science. He received his Ph.D. in Statistics from the University of Pittsburgh in 2011, advised by Bill Eddy and Leon Gleser, and was a postdoctoral researcher at EPFL with Victor Panaretos before joining NUS in 2014. He currently serves as an Associate Editor of the Journal of the Royal Statistical Society: Series B.

 

 

Since 2022, he has been a member of the Center of Mathematical Sciences and Applications (CMSA) at Harvard University, where he collaborates with Shing-Tung Yau on manifold fitting and the interface between statistics and geometry. He has also played an active role in promoting this emerging interface internationally, through the Harvard CMSA conferences on Statistics and Geometry, the ISAG series in Singapore, and related programs at Tsinghua, BIMSA, and SIMIS. In 2025, he delivered a CMSA Colloquium at Harvard, and in 2026 he initiated and organized a two-month IDSG school on statistics and geometry at SIMIS. He is also a co-organizer of the 2026 HIM Bonn trimester program on Geometric Statistics, continuing these efforts to build a sustained international platform for statistics and geometry.

 

 

A few words about my work:

My research focuses primarily on statistical inference for complex data, with an emphasis on the interaction between statistics and geometry.

Linearity has long been a central principle in the development of statistical methodology. Many classical methods are built on the idea of representing data in Euclidean spaces and extracting linear or approximately linear structures. Modern scientific data, however, are often high-dimensional, heterogeneous, and intrinsically nonlinear. Although each observation may be recorded as a long vector, a matrix, a shape, or another complex object, such data often contain hidden low-dimensional geometric structure.

A central theme of my work is to uncover and exploit this hidden structure. I use the term manifold fitting to describe the statistical problem of estimating a lower-dimensional manifold, or a system of submanifolds, from noisy high-dimensional data. This is related to manifold learning, but with a stronger emphasis on inference, geometric estimation, uncertainty quantification, and downstream learning tasks.

My group has developed theory and methodology for this problem, including principal flows, principal boundaries, principal submanifolds, non-Euclidean statistics, and inference for manifold-structured data. A key goal is to estimate geometric structures with statistical guarantees and to use them for tasks such as clustering, classification, denoising, and scientific interpretation.

 

These ideas have led to applications in single-cell data analysis, biomedical data science, shape and object data, cryo-electron microscopy, and other high-dimensional scientific problems. Earlier parts of my work also studied inverse problems in brain imaging and tomographic reconstruction, where useful signals are often rare, weak, and embedded in noisy high-dimensional observations.

 

Overall, my work aims to develop principled statistical tools that move beyond purely linear representations and help reveal meaningful geometric structure in complex scientific data.