Article Highlight | 26-May-2026

SKKU develops Bayesian inference for hidden dependence structures in multi-group high-dimensional data

j-LANCE, a Bayesian method for joint local dependence learning, delivers theoretical guarantees and computational efficiency at the same time

Sungkyunkwan University External Affairs Division (PR team)

Professor Kyoungjae Lee's research team in the Department of Statistics at Sungkyunkwan University, in joint research with Professor Won Chang of Seoul National University and Professor Xuan Cao of the University of Cincinnati, has developed a Bayesian inference method for the hidden dependence structures of multi-group high-dimensional data.

A Dependence Map in High-Dimensional Data

In today’s scientific and industrial fields, high-dimensional data in which many variables are observed simultaneously, such as genomic, climate, financial, and sensor data, are growing rapidly. An important problem for such data is to learn the dependence structure connecting the variables, that is, to identify a “dependence map” that reveals information hidden in massive datasets. For example, in climate data, temperatures in nearby regions may be related to one another, and in genomic data, genes located in adjacent positions may act together. If such dependence is incorporated into the analysis, inference can be more efficient than when each variable is analyzed separately.

Development of the j-LANCE Method for Joint Inference of Dependence Across Multiple Groups

The j-LANCE (joint LocAl depeNdence CholEsky) method proposed in this study builds on the observation that, in real data such as genomic and climate data, variables have a natural ordering and are related mainly to their neighbors in that ordering. Based on this idea, the method estimates how far each variable's connections extend to neighboring variables, and it is designed to learn structures that are similar across multiple groups while still allowing group-specific differences. Many existing methods either analyze each group separately or simplify the problem by assuming that all groups share the same structure. In contrast, this study uses a Markov random field prior so that similarities and differences across groups can be learned flexibly from the data.
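To make the idea of local dependence in an ordered set of variables concrete, the following minimal sketch shows the standard modified Cholesky construction: each variable is regressed on a few immediate predecessors, and the resulting coefficients and residual variances determine a banded precision (inverse covariance) matrix. This is an illustration only, not the j-LANCE procedure itself. The function name, the toy coefficients, and the per-variable neighborhood sizes are assumptions made for this example.

```python
def precision_from_local_cholesky(coefs, noise_vars):
    """Precision matrix implied by regressing each ordered variable
    on its nearest predecessors (hypothetical helper for illustration).

    coefs[j]      : coefficients of variable j on its len(coefs[j])
                    immediate predecessors, nearest predecessor last.
    noise_vars[j] : residual variance of that regression.
    """
    p = len(coefs)
    # T = I - A, where row j of A holds the coefficients of variable j.
    T = [[0.0] * p for _ in range(p)]
    for j in range(p):
        k = len(coefs[j])
        if k > j:
            raise ValueError(f"variable {j} has only {j} predecessors")
        T[j][j] = 1.0
        for offset, a in enumerate(coefs[j]):
            T[j][j - k + offset] = -a
    # Omega = T^T D^{-1} T, with D the diagonal matrix of noise variances.
    return [
        [sum(T[r][i] * T[r][j] / noise_vars[r] for r in range(p))
         for j in range(p)]
        for i in range(p)
    ]


# Toy example: four ordered variables with neighborhood sizes 0, 1, 1, 2.
coefs = [[], [0.5], [0.5], [0.2, 0.3]]
noise_vars = [1.0, 1.0, 1.0, 1.0]
omega = precision_from_local_cholesky(coefs, noise_vars)
for row in omega:
    print([round(v, 3) for v in row])
```

In this toy example the first variable is never used as a predictor for the third or fourth, so the corresponding entries of the precision matrix are zero. That sparsity pattern is the kind of "dependence map" described above.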

Simultaneously Achieving Theoretical Accuracy and Fast Computation

An important achievement of this study is that it attains theoretical accuracy and computational efficiency together, even in high-dimensional settings. The study proves that j-LANCE can accurately estimate the dependence structures of multiple groups, and shows that the rate at which the estimates approach the true values is nearly minimax-optimal. In addition, the methodology is designed so that Bayesian inference can be carried out without Markov chain Monte Carlo (MCMC), a computationally intensive iterative sampling procedure, which makes fast analysis possible even for high-dimensional data.
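One general way that Bayesian inference can avoid MCMC, when the unknown structure takes only a small number of candidate values, is to score every candidate and normalize the scores directly. The sketch below illustrates that idea for a single variable with a handful of candidate neighborhood sizes. It is an assumption-laden illustration, not a description of how j-LANCE performs its computation: the function name and the log marginal likelihood values are hypothetical.

```python
import math


def posterior_over_candidates(log_marginals, log_priors):
    """Exact posterior probabilities over a finite list of candidates.

    Combines log marginal likelihoods with log prior probabilities and
    normalizes using the log-sum-exp trick, so no sampling is needed.
    """
    scores = [m + q for m, q in zip(log_marginals, log_priors)]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical log marginal likelihoods for neighborhood sizes 0, 1, 2, 3.
log_marginals = [-120.0, -112.5, -113.0, -115.0]
log_priors = [math.log(0.25)] * 4  # uniform prior over the four candidates
probs = posterior_over_candidates(log_marginals, log_priors)
best = probs.index(max(probs))
print(best, [round(q, 3) for q in probs])
```

Because the cost grows only with the number of candidates considered for each variable, this kind of direct calculation can stay fast as the number of variables grows, provided the candidate set for each variable remains small.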

Practical Applicability Confirmed Through Climate Data Analysis

In this study, ERA5 data were used to analyze temperatures at 30 locations in the Pacific Northwest region of the United States from 2019 to 2021, and the dependence structure of temperatures across regions was estimated based on a spatial ordering that reflects wind flow. As a result, j-LANCE was found to capture similar dependence patterns across years while also detecting distinctive dependence structures that appeared in a specific year. This confirmed the practical applicability of j-LANCE to real data, and the method is expected to be applicable in a wide range of fields that require simultaneous analysis of complex data from multiple groups, including climate, genomics, finance, and sensor time series.

*This research was published in Bayesian Analysis, an international statistics journal.
