Analyzing data from diverse practical settings in depth is an important modern challenge. This is true of data with values in [-1,1], which can represent standardized scores of various kinds, including psychological scores or environmental measures. Few models in the literature handle such data efficiently, mainly because the data are typically asymmetric. This paper offers an elegant one by improving the accuracy of an existing model.
In applied science, we often deal with variables that naturally take values in the interval [-1,1], or whose values are better rescaled to this interval for interpretation. For example, a correlation coefficient, such as one measuring the linear relationship between a person's height and weight, always falls within [-1,1]. Similarly, standardized scores, such as psychological or exam scores, may fall within this interval. In addition, continuous data such as temperature anomalies, which represent deviations from a baseline, can be rescaled to [-1,1] to allow consistent comparisons across scales, improving interpretability and analysis.
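The rescaling mentioned above can be sketched as a simple linear (min-max) map. This is only an illustration: the function name `rescale_to_symmetric_interval` and the sample anomaly values are hypothetical, and the source does not say which rescaling the paper applies to its data.

```python
def rescale_to_symmetric_interval(values):
    """Linearly map values onto [-1, 1] (min-max rescaling).

    Assumption: a plain min-max map; the paper may rescale differently.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant sample has no spread, so the map is undefined.
        raise ValueError("cannot rescale a constant sample")
    # The smallest value maps to -1 and the largest to 1.
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]


# Hypothetical temperature anomalies (illustrative numbers, not real data).
anomalies = [-0.8, 0.1, 0.4, 1.2]
rescaled = rescale_to_symmetric_interval(anomalies)
```

Because the map is linear and increasing, it preserves the ordering and relative spacing of the observations, so any asymmetry in the original data carries over to the rescaled values.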
As in the general case, most of these real-world data are "not exactly symmetric" or "clearly asymmetric" in distribution. Surprisingly, the literature does not provide many models for this case.
With this in mind, we have modified a well-known symmetric distribution, the "cosine distribution", in a simple but efficient way to make it asymmetric and thus practical for a wide range of analyses of data with values in [-1,1]. The modification adds tuning components that can be activated to give a significant gain in modelling efficiency.
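For reference, the symmetric starting point can be sketched as follows. This assumes one common form of the cosine distribution on [-1,1], the raised cosine density f(x) = (1 + cos(πx))/2; the source does not state which form the paper uses, and the asymmetric modification itself is not reproduced here. The function names are illustrative.

```python
import math


def raised_cosine_pdf(x):
    """Density of the standard raised cosine distribution on [-1, 1].

    Assumption: f(x) = (1 + cos(pi * x)) / 2, one common 'cosine distribution'.
    """
    if x < -1.0 or x > 1.0:
        return 0.0
    return 0.5 * (1.0 + math.cos(math.pi * x))


def raised_cosine_cdf(x):
    """Distribution function obtained by integrating the density from -1 to x."""
    if x <= -1.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    return 0.5 * (1.0 + x + math.sin(math.pi * x) / math.pi)


# Symmetry about zero: the density takes the same value at x and -x.
left, right = raised_cosine_pdf(-0.3), raised_cosine_pdf(0.3)
```

The symmetry f(x) = f(-x) is exactly what limits this base model on skewed data, which is the motivation given above for adding tuning components.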
The paper supports this claim with all the necessary ingredients, including theoretical guarantees, expressions for numerous key measures, and diverse applications to simulated and real data. The real data include rescaled earthquake and air quality data. The results of the new model are quite satisfactory in most situations. The model can also be used within more sophisticated models, such as machine learning models, that require data in [-1,1]; we have opened a door for new developments in this direction.
This paper was published in Asymmetry. Chesneau C. The asymmetric cosine distribution. Asymmetry 2024(1):0004, https://doi.org/10.55092/asymmetry20240004.
Journal: Asymmetry
Method of Research: Data/statistical analysis
Subject of Research: Not applicable
Article Title: The asymmetric cosine distribution
Article Publication Date: 14-Oct-2024