Canonical Correlation Analysis and Network Data Modeling : Statistical and Computational Properties / Zhuang Ma.

LIBRA HA001 2017 .M1113

Available from offsite location. This item is stored in our repository but can be checked out.

Format:
Book
Manuscript
Thesis/Dissertation
Author/Creator:
Ma, Zhuang, author.
Contributor:
Foster, Dean P., degree supervisor, degree committee member.
Ma, Zongming, degree supervisor, degree committee member.
Brown, Lawrence D., degree committee member.
Stine, Robert A., degree committee member.
University of Pennsylvania. Department of Statistics, degree granting institution.
Language:
English
Subjects (All):
Penn dissertations--Statistics.
Statistics--Penn dissertations.
Local Subjects:
Penn dissertations--Statistics.
Statistics--Penn dissertations.
Physical Description:
xii, 162 leaves : illustrations ; 29 cm
Production:
[Philadelphia, Pennsylvania] : University of Pennsylvania, 2017.
Summary:
Classical decision theory evaluates an estimator mostly by its statistical properties: either its closeness to the underlying truth or its predictive ability for new observations. The goal is to find estimators that achieve statistical optimality. Modern "Big Data" applications, however, necessitate efficient processing of large-scale ("big-n-big-p") datasets, which poses a great challenge to the classical decision-theoretic framework, since that framework seldom takes into account the scalability of estimation procedures. On the one hand, statistically optimal estimators can be computationally intensive; on the other hand, fast estimation procedures might suffer from a loss of statistical efficiency. So the challenge is to kill two birds with one stone. This thesis brings together statistical and computational perspectives to study canonical correlation analysis (CCA) and network data modeling, where we investigate both the optimality and the scalability of the estimators. Interestingly, in both cases, we find that iterative estimation procedures based on non-convex optimization can significantly reduce the computational cost while achieving desirable statistical properties.
In the first part of the thesis, motivated by the recent success of using CCA to learn low-dimensional feature representations of high-dimensional objects, we propose novel metrics that quantify the estimation loss of CCA by the excess prediction loss defined through a prediction-after-dimension-reduction framework. These new metrics have rich statistical and geometric interpretations, which suggest viewing CCA estimation as estimating the subspaces spanned by the canonical variates. We characterize, with minimal assumptions, the non-asymptotic minimax rates under the proposed error metrics, especially how the minimax rates depend on key quantities including the dimensions, the condition number of the covariance matrices, and the canonical correlations. Finally, by formulating sample CCA as a non-convex optimization problem, we propose an efficient (stochastic) first-order algorithm that scales to large datasets.
In the second part of the thesis, we propose two universal fitting algorithms for networks (possibly with edge covariates) under latent space models: one based on finding the exact maximizer of a convex surrogate of the non-convex likelihood function, and the other based on finding an approximate optimizer of the original non-convex objective. Both algorithms are motivated by a special class of inner-product models but are shown to work for a much wider range of latent space models, which allow the latent vectors to determine the connection probability of the edges in flexible ways. We derive the statistical rates of convergence of both algorithms and characterize the basin of attraction of the non-convex approach. The effectiveness and efficiency of the non-convex procedure are demonstrated by extensive simulations and real-data experiments.
Notes:
Ph. D. University of Pennsylvania 2017.
Department: Statistics.
Supervisor: Dean P. Foster; Zongming Ma.
Includes bibliographical references.
OCLC:
1312240954