I'm an Assistant Professor of Business Analytics at Towson University, where I explore how data, machine learning, and statistical reasoning can illuminate complex problems in finance, risk management, and information systems. My research focuses on high-dimensional statistics, machine learning, and categorical data analysis, with applications in bankruptcy prediction, sentiment analysis, and asset pricing. I received my Ph.D. in Business Analytics from the University of Cincinnati, Lindner College of Business, where I was advised by Dr. Yan Yu. Before that, I received a master's degree in Finance from Penn State University and a master's degree in Applied Statistics from Beijing University of Technology.
I'm especially interested in building interpretable models that balance statistical rigor with practical insight. I've developed several R packages, including SurrogateRsq, PAsso, and SSCI. My work has been published in journals such as the Journal of Business & Economic Statistics and Decision Support Systems.
I believe staying Hungry, Foolish, and Creative is the secret to success. My long-term goal is to contribute to a form of machine intelligence that is the "epitome" of a good human being and human intelligence.
I'm interested in making research results more intuitive and understandable, so I use Shiny apps to tell data stories interactively. Here are some of the latest Shiny apps and R packages I have created. Anyone interested in my research or software is welcome to contact me.
An R package for the Surrogate \(R^2\) measure for categorical data analysis. It can generate a point or interval measure of the surrogate \(R^2\), and a ranking measure of each variable's contribution.
This course emphasizes hands-on data analysis experience using recent advances in data mining and machine learning for business analytics with the statistical software R. Topics include modern data wrangling techniques, data visualization, linear regression, logistic regression, variable selection, model evaluation, K-nearest neighbors, and classification and regression trees (CART).
This course focuses on using standard business analytics models to summarize and analyze data, build models, and drive impact through quantitative decision-making.
Data Wrangling with R! This course provides an intensive, hands-on introduction to Data Wrangling with the R programming language. You will learn the fundamental skills required to acquire, munge, transform, manipulate, and visualize data in a computing environment that fosters reproducibility.
This course covers time series analysis, emphasizing the appropriate models for estimation, testing, and forecasting. Topics include univariate Box-Jenkins methods for fitting and forecasting time series; ARIMA models; stationarity and nonstationarity; diagnosing time series models; point and interval forecasts; seasonal time series models; and modeling volatility with ARCH, GARCH, and other methods. R Shiny app development is also covered to help students build prototypes of their models and ideas.
This course develops fundamental knowledge and skills for applying statistics to business decision-making. Topics include descriptive statistics, probability distributions, sampling, confidence intervals, hypothesis testing, and computer software for statistical applications. (2018 Spring & Fall)
The statistical methods in these two courses include Linear Regression, Generalized Linear Models (e.g., Logistic Regression), Variable Selection, Cross Validation, K-nearest neighbors, Classification and Regression Trees (CART), Bagging, Boosting, Random Forests, Generalized Additive Models (GAM, for nonlinearity), Nonparametric Smoothing, Neural Networks, Support Vector Machines, Clustering (K-means), Principal Component Analysis, Association Rules, and Text Mining.