Venue: Room 105, Statistics Building, Minhang Campus
Speaker: Prof. Changliang Zou (邹长亮), Nankai University
Time: Friday, October 16, 2015, 10:00-11:00 a.m.
Title: On Surveillance of High-Dimensional Datastreams
Abstract:
Monitoring high-dimensional data streams has become increasingly important for real-time detection of abnormal activities in many statistical process control (SPC) applications. This talk consists of two parts: outlier identification (Phase I) and sequential detection (Phase II). In the first part, I will introduce an outlier detection procedure for high-dimensional data. The method replaces the classical minimum covariance determinant estimator with a high-breakdown minimum diagonal product estimator. The cutoff value is obtained from the asymptotic distribution of the distance, which enables us to control the type I error and deliver robust outlier detection. In the second part, we propose a test statistic based on a divide-and-conquer strategy and integrate it into the multivariate EWMA charting scheme for on-line detection. The key idea is to combine many T-square statistics calculated on low-dimensional sub-vectors. The proposed procedure is efficient in both computation and storage. The control limit is obtained from the asymptotic distribution of the test statistic under some mild conditions on the dependence structure. Both asymptotic analysis and numerical results show that the proposed method performs well for high-dimensional data.
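
For readers unfamiliar with the Phase II idea, the following is a rough illustrative sketch, not the speaker's actual procedure. It assumes a known in-control mean and covariance, consecutive equal-size blocks as the sub-vectors, and a plain sum as the rule for combining the block-wise T-square statistics. The function names, the smoothing weight `lam`, and `block_size` are hypothetical choices made for illustration only; the control limit from the talk's asymptotic theory is not reproduced here.

```python
import numpy as np


def blockwise_t2(z, cov, block_size):
    """Sum Hotelling-type T-square statistics over consecutive sub-vectors of z.

    Assumption: sub-vectors are consecutive blocks and the statistics are
    combined by summation; the talk may use a different combination rule.
    """
    p = z.shape[0]
    total = 0.0
    for start in range(0, p, block_size):
        idx = slice(start, min(start + block_size, p))
        z_block = z[idx]
        cov_block = cov[idx, idx]
        # Only a small block_size x block_size system is solved per block,
        # which avoids inverting the full p x p covariance matrix.
        total += float(z_block @ np.linalg.solve(cov_block, z_block))
    return total


def ewma_blockwise_chart(stream, mean, cov, lam=0.1, block_size=5):
    """Yield one charting statistic per observation in the stream."""
    z = np.zeros_like(mean, dtype=float)
    # Asymptotic covariance of the EWMA vector is lam / (2 - lam) * cov.
    scale = lam / (2.0 - lam)
    for x in stream:
        z = lam * (x - mean) + (1.0 - lam) * z
        yield blockwise_t2(z, scale * cov, block_size)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = 50
    mean, cov = np.zeros(p), np.eye(p)
    # 100 in-control observations followed by 100 with a small mean shift.
    data = rng.standard_normal((200, p))
    data[100:, :5] += 1.0
    stats = list(ewma_blockwise_chart(data, mean, cov))
    print("average statistic before shift:", np.mean(stats[:100]))
    print("average statistic after shift: ", np.mean(stats[100:]))
```

In practice an alarm would be raised the first time the statistic exceeds a control limit; choosing that limit is exactly what the asymptotic distribution mentioned in the abstract is for, and the sketch leaves it out.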