Principal Boundary

Overview

This project studies the classification problem, focusing on non-linear methods for datasets that lie on non-linear Riemannian manifolds embedded in higher-dimensional ambient spaces. Our objective is to construct a classification boundary between labeled classes using the intrinsic metric of these manifolds.

To find a boundary that separates the classes well, we introduce a new concept, the "principal boundary." In the context of classification, the principal boundary is defined as an optimal curve positioned between the principal flows of two distinct classes of data; at every point along the curve, it maximizes the margin between the two classes. We estimate the quality and direction of this boundary, guided by the two principal flows. We also show that the principal boundary coincides locally with the decision boundary given by a support vector machine, so the two approaches classify consistently in a local sense.
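The idea of a curve that lies between two flows and is equally far from both in the intrinsic metric can be illustrated with a toy example. The sketch below is not the estimation procedure from the paper: it assumes the manifold is the unit sphere, uses two arcs of circles of latitude as hypothetical stand-ins for the principal flows, pairs their points by index, and takes the geodesic midpoint of each pair. All function names (`geodesic_distance`, `geodesic_midpoint`, `latitude_curve`) are made up for this illustration.

```python
import numpy as np

def geodesic_distance(p, q):
    """Great-circle distance between unit vectors p and q on the sphere."""
    return np.arccos(np.clip(np.dot(p, q), -1.0, 1.0))

def geodesic_midpoint(p, q):
    """Midpoint of the shortest geodesic between non-antipodal unit vectors."""
    m = p + q
    return m / np.linalg.norm(m)

def latitude_curve(lat, n=50):
    """Toy stand-in for a principal flow: an arc of a circle of latitude."""
    lon = np.linspace(-0.5, 0.5, n)
    return np.stack(
        [np.cos(lat) * np.cos(lon),
         np.cos(lat) * np.sin(lon),
         np.full(n, np.sin(lat))],
        axis=1,
    )

# Two hypothetical "flows", one per class (assumed, not estimated from data).
flow_a = latitude_curve(0.3)
flow_b = latitude_curve(-0.2)

# Toy boundary: geodesic midpoints of index-paired points on the two flows.
boundary = np.array([geodesic_midpoint(p, q) for p, q in zip(flow_a, flow_b)])

# Intrinsic distance from each boundary point to its paired point on each flow.
margins_a = np.array([geodesic_distance(m, p) for m, p in zip(boundary, flow_a)])
margins_b = np.array([geodesic_distance(m, q) for m, q in zip(boundary, flow_b)])
```

In this construction each boundary point is equidistant, in geodesic distance, from its paired points on the two curves, which is the balance a margin-based boundary aims for. The actual principal boundary is estimated from data and from the principal flows, as described in the paper.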

We also present optimality and convergence properties of the random principal boundary and of its population counterpart. We illustrate how to find, apply, and interpret the principal boundary through an application to real data. Supplementary materials for the article are available online.

A detailed description and discussion can be found in the paper cited below.
To cite:

@article{yao2020principal,
    author = {Yao, Zhigang and Zhang, Zhenyue},
    title = {Principal Boundary on Riemannian Manifolds},
    journal = {Journal of the American Statistical Association},
    volume = {115},
    number = {531},
    pages = {1435--1448},
    year  = {2020}
}

Selected Talks

Duke Math Seminar

Department of Mathematics, Duke
March 21, 2023