Professor Jian Kang’s research lies at the forefront of data science in biostatistics, with a strong emphasis on developing and applying advanced Bayesian and machine learning methodologies to extract insights from complex biomedical data. His work integrates statistical rigor with computational innovation to address high-dimensional, structured, and noisy data commonly encountered in public health, neuroscience, and clinical research.
A major thrust of Professor Kang’s research is Bayesian modeling for imaging data analysis, where he has introduced tools for image regression, brain network analysis, and brain-computer interfaces. His soft-thresholded Gaussian process (Biometrika, 2018) provides a flexible framework for spatially regularized image-on-scalar regression and is widely cited. He extended this line of work with latent factor models for image-on-image regression (Biometrics, 2022) and latent subnetwork detection methods (Annals of Applied Statistics, 2024), which help uncover interpretable patterns in neuroimaging data. These methods address spatial and network dependencies using scalable Gaussian process models, structured priors, and hierarchical Bayesian inference.
In brain network modeling, Professor Kang has developed tools for scalar-on-network regression and distributional independent component analysis, supporting a more nuanced understanding of brain connectivity across modalities such as fMRI and DTI. This work improves model interpretability and reproducibility in large-scale neuroimaging studies.
His research also contributes to EEG-based brain-computer interfaces (BCIs). His Bayesian models for high-dimensional, multi-channel EEG signals recorded under the P300 paradigm identify the spatial-temporal features that support accurate, individualized neural decoding. These contributions are central to improving adaptive assistive technologies and understanding neural dynamics.
Professor Kang also leads innovations in statistical machine learning for health data, focusing on flexible, interpretable models. He has developed Bayesian graphical models for variable selection in genomics and proteomics, and nonparametric spatial models using deep neural networks (JRSS-B, 2023) to model spatially varying effects in biomedical imaging. In clinical trials, his work on machine learning–based intersection testing frameworks leverages optimization and neural networks to enhance Type I error control and decision-making efficiency.
Collectively, Professor Kang’s research combines Bayesian modeling, machine learning, scalable computation, and interdisciplinary data science applications. His methods are widely adopted in areas including mental health, neurodegeneration, oncology, and infectious disease. With over 150 peer-reviewed publications and sustained NIH and NSF funding, his scholarship continues to drive innovation at the interface of statistical methodology and biomedical discovery.
Selected References
- Kang, J., Reich, B. J., & Staicu, A.-M. (2018). A scalable and efficient algorithm for Bayesian image-on-scalar regression using soft-thresholded Gaussian processes. Biometrika, 105(1), 137–152. https://doi.org/10.1093/biomet/asx070
- Kang, J., Zhang, X., & Staicu, A.-M. (2022). Bayesian image-on-image regression via low-rank factorization. Biometrics, 78(4), 1315–1327. [Awarded Best Paper in Biometrics by an IBS Member, 2024]
- Kang, J., & Zhang, X. (2024). Detecting latent subnetworks in high-dimensional neuroimaging regression. Annals of Applied Statistics, 18(2), In Press. [Invited presentation at JSM 2024]
- Kang, J., Shi, Y., & Staicu, A.-M. (2022). Scalar-on-network regression via functional boosting. Annals of Applied Statistics, 16(2), 1000–1022.
- Zhang, X., Kang, J., & Staicu, A.-M. (2022). Distributional independent component analysis for multimodal neuroimaging data. Biometrics, 78(1), 97–109. [Featured Discussion Paper]
- Wang, S., Kang, J., & Wu, Y. (2021). A Bayesian framework for spatial-temporal modeling of EEG signals in P300-based brain-computer interfaces. NeuroImage, 234, 117973.
- Li, R., Kang, J., & Xu, Y. (2023). Deep learning–enhanced intersection hypothesis testing for confirmatory clinical trials. Statistics in Biopharmaceutical Research, 15(3), 412–426. [Best Paper Award, 2023]
- Liu, C., Kang, J., & Ghosal, S. (2021). Bayesian graphical models for high-dimensional variable selection in genomics. Journal of the American Statistical Association, 116(535), 1329–1341.
- Zhou, H., Kang, J., & Dunson, D. B. (2023). Spatially varying coefficient models via deep neural networks. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 85(2), 501–522.