|
A few words about me:
Zhigang Yao is a tenured Associate Professor in the Department of Statistics and
Data Science at the National University of Singapore (NUS), with a
courtesy joint appointment in the Department of Mathematics and
an affiliation with the Institute of Data Science. He received his Ph.D. in Statistics
from the University of Pittsburgh in 2011, advised by Bill Eddy and Leon Gleser,
and was a postdoctoral researcher at EPFL with Victor Panaretos
before joining NUS in 2014. He currently serves as an Associate Editor of
the Journal of the Royal
Statistical Society: Series B.
Since 2022, he has been a member of the Center of
Mathematical Sciences and Applications (CMSA)
at Harvard University, where he collaborates
with Shing-Tung Yau on manifold fitting and
the interface between statistics and geometry. He has also played an
active role in promoting this emerging interface internationally, through
the Harvard CMSA conferences on Statistics and Geometry, the ISAG series
in Singapore, and related programs at Tsinghua, BIMSA and SIMIS. In 2025,
he delivered a CMSA Colloquium at Harvard, and in 2026 he initiated and
organized a two-month IDSG school on statistics and geometry at SIMIS. He
is also a co-organizer of the 2026 HIM Bonn trimester program
on Geometric Statistics, continuing these
efforts to build a sustained international platform for statistics and
geometry.
A few words about my work:
My research focuses primarily on statistical inference for complex data, with
an emphasis on the interaction between statistics and geometry.
Linearity has long been a central principle in the development of statistical
methodology. Many classical methods are built on the idea of representing
data in Euclidean spaces and extracting linear or approximately linear
structures. Modern scientific data, however, are often high-dimensional,
heterogeneous, and intrinsically nonlinear. Although each observation may
be recorded as a long vector, a matrix, a shape, or another complex
object, such data often contain hidden low-dimensional geometric structure.
A central theme of my work is to uncover and exploit this hidden structure.
I use the term manifold fitting to describe the statistical problem of
estimating a lower-dimensional manifold, or a system of submanifolds,
from noisy high-dimensional data. This problem is related to manifold learning,
but places a stronger emphasis on inference, geometric estimation,
uncertainty quantification, and downstream learning tasks.
My group has developed theory and methodology for this problem, including
principal flows, principal boundaries, principal submanifolds,
non-Euclidean statistics, and inference for manifold-structured data. A
key goal is to estimate geometric structures with statistical guarantees
and to use them for tasks such as clustering, classification, denoising,
and scientific interpretation.
These ideas have led to applications in single-cell data analysis, biomedical
data science, shape and object data, cryo-electron microscopy, and other
high-dimensional scientific problems. My earlier work also
studied inverse problems in brain imaging and tomographic reconstruction,
where useful signals are often rare, weak, and embedded in noisy
high-dimensional observations.
Overall, my work aims to develop principled statistical tools that move beyond
purely linear representations and help reveal meaningful geometric
structure in complex scientific data.
|