Preamble
Correlation analysis is a classic technique that underlies many algorithms and modeling tasks. It can be used to express relationships and trends among features. There are three common correlation coefficients: the Pearson correlation coefficient, the Spearman correlation coefficient, and Kendall's tau-b rank correlation coefficient. Each has its own uses and scenarios. I will cover the principles, algorithms, and code for all three in my column. The mathematical modeling column already contains seven or eight articles on traditional machine learning prediction algorithms, dimensionality reduction algorithms, time series prediction algorithms, and weighting algorithms; interested readers are welcome to take a look. The previous article in this series: Pearson correlation analysis, a detailed article + Python example code.
I. Definitions
Spearman's correlation coefficient is often denoted by the Greek letter ρ. It is a nonparametric measure of the dependence between two variables: it assesses how well the relationship between two statistical variables can be described by a monotonic function. If the data contain no repeated values and the two variables are perfectly monotonically related, the Spearman correlation coefficient is +1 or -1. The Spearman correlation coefficient is defined as the Pearson correlation coefficient between the rank variables. For a sample of size n, the n raw observations are converted to rank data, and (when there are no tied ranks) the correlation coefficient ρ is:
$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$
where d_i is the rank difference between X_i and Y_i, calculated as:
$$d_i = \operatorname{rank}(X_i) - \operatorname{rank}(Y_i)$$
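The rank-difference definition above can be checked numerically. The following is a minimal sketch, assuming data without repeated values; the helper name `spearman_from_ranks` and the sample values are made up for illustration. It computes ρ = 1 - 6Σd_i² / (n(n² - 1)) by hand and compares it with the Pearson correlation of the ranks.

```python
import numpy as np
from scipy import stats


def spearman_from_ranks(x, y):
    """Spearman's rho via the rank-difference formula (assumes no tied values)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Convert the raw values to ranks 1..n.
    rank_x = stats.rankdata(x)
    rank_y = stats.rankdata(y)
    # d_i is the difference between the two ranks of observation i.
    d = rank_x - rank_y
    return 1.0 - 6.0 * np.sum(d ** 2) / (n * (n ** 2 - 1))


# Hypothetical sample data with no repeated values.
x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [1.2, 0.8, 3.5, 2.9, 6.1]

rho_formula = spearman_from_ranks(x, y)
# The Pearson correlation of the ranks should give the same number.
rho_pearson_of_ranks = np.corrcoef(stats.rankdata(x), stats.rankdata(y))[0, 1]
print(rho_formula, rho_pearson_of_ranks)
```

Here the rank differences are -1, 1, -1, 1, 0, so Σd_i² = 4 and ρ = 1 - 24/120 = 0.8. With tied values the shortcut formula is no longer exact, and the Pearson correlation of the (average) ranks should be used instead.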
II. When to use Spearman's correlation coefficient
Spearman's correlation coefficient is more widely applicable than Pearson's. As long as the observations of the two variables are paired ranked (ordinal) data, or ranks obtained by transforming observations of continuous variables, Spearman's rank correlation coefficient can be used regardless of the population distribution of the two variables and regardless of the sample size. It is suitable whenever the data follow a monotonic relationship (for example linear, exponential, or logarithmic).
The Spearman correlation coefficient is less sensitive to outliers because it is calculated based on rank order and the magnitude of the difference between the actual values has no direct effect on the calculation.
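Both points can be illustrated with a small experiment. This is only a sketch on made-up data: y grows exponentially with x (monotonic but not linear), and then the largest value is replaced by an extreme outlier. The variable names are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical data: a monotonic but strongly non-linear relationship.
x = np.arange(1, 11, dtype=float)
y = np.exp(x)

pearson_r, _ = stats.pearsonr(x, y)
spearman_rho, _ = stats.spearmanr(x, y)
print("Pearson: ", pearson_r)      # noticeably below 1
print("Spearman:", spearman_rho)   # 1 up to floating-point error

# Turn the largest observation into an extreme outlier.
# The rank order is unchanged, so Spearman's rho does not move.
y_outlier = y.copy()
y_outlier[-1] = y[-1] * 1000

pearson_outlier, _ = stats.pearsonr(x, y_outlier)
spearman_outlier, _ = stats.spearmanr(x, y_outlier)
print("Pearson with outlier: ", pearson_outlier)
print("Spearman with outlier:", spearman_outlier)
```

Because only the order of the values enters the calculation, inflating one value changes Pearson's coefficient but leaves Spearman's untouched, as long as the ranks stay the same.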
III. Calculation of Spearman's correlation coefficient
As with the function used in the previous article, the pandas function corr can be used:
DataFrame.corr(method='pearson', min_periods=1, numeric_only=_NoDefault.no_default)
Parameter Description:
method: {'pearson', 'kendall', 'spearman'} or callable. Method of correlation.
- pearson : standard correlation coefficient, Pearson's coefficient
- kendall : Kendall Tau correlation coefficient, Kendall coefficient
- spearman : Spearman rank correlation, Spearman coefficient
min_periods: int, optional. The minimum number of observations required for each pair of columns to produce a valid result. Currently only available for Pearson and Spearman correlations.
numeric_only: bool, default True. Include only float, int, or boolean data. (In pandas 2.0 and later the default is False.)
It's easy to implement.
rho = df_test.corr(method='spearman')
rho
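For a version that runs on its own, here is a minimal sketch with a made-up `df_test`; the column names and values are hypothetical and stand in for whatever data you actually have.

```python
import pandas as pd

# Hypothetical stand-in for df_test; column names and values are invented.
df_test = pd.DataFrame({
    "height": [150, 160, 165, 172, 180, 191],
    "weight": [48, 55, 60, 59, 75, 88],
    "score": [90, 72, 85, 60, 66, 51],
})

# Pairwise Spearman rank correlation between all numeric columns.
rho = df_test.corr(method="spearman")
print(rho.round(3))
```

The result is a symmetric matrix with ones on the diagonal. Each off-diagonal entry is the Spearman coefficient of one pair of columns.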
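Heat map:
import matplotlib.pyplot as plt
import seaborn as sns
plt.rcParams['font.sans-serif'] = ['SimHei']  # font that can display Chinese labels
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly
sns.heatmap(rho, annot=True)
plt.title('Heat Map', fontsize=18)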
Or use scipy's stats module, which gives the same result:
import numpy as np
from scipy import stats
stats.spearmanr(data1, data2)
IV. Hypothesis testing of Spearman's correlation coefficient
There are two cases: small samples and large samples.
For the small-sample case (n ≤ 30), look up the critical value table directly. The hypotheses are H0: r_s = 0; H1: r_s ≠ 0.
The computed Spearman correlation coefficient r_s is compared with the corresponding critical value; H0 is rejected when |r_s| exceeds it.
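In the large-sample case, the statistic
$$r_s \sqrt{n - 1} \sim N(0, 1)$$
approximately follows the standard normal distribution under H0. The hypotheses are again H0: r_s = 0; H1: r_s ≠ 0. It is sufficient to calculate the test value z* = r_s√(n - 1) and derive the corresponding p-value, which is then compared with the significance level 0.05.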
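The large-sample procedure can be sketched in a few lines. This assumes the test statistic is z* = r_s√(n - 1), treated as approximately standard normal under H0; the function name `spearman_large_sample_test` and the simulated data are hypothetical.

```python
import numpy as np
from scipy import stats


def spearman_large_sample_test(x, y):
    """Return (r_s, z, two-sided p-value), assuming z = r_s * sqrt(n - 1) ~ N(0, 1) under H0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    r_s, _ = stats.spearmanr(x, y)
    z = r_s * np.sqrt(n - 1)
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * stats.norm.sf(abs(z))
    return r_s, z, p_value


# Hypothetical large sample (n = 200) with a positive relationship.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = x + rng.normal(size=200)

r_s, z, p_value = spearman_large_sample_test(x, y)
print(f"r_s = {r_s:.3f}, z* = {z:.3f}, p = {p_value:.3g}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

Note that `scipy.stats.spearmanr` also returns a p-value, but it is based on a t-distribution approximation, so it can differ slightly from the normal-approximation value computed here.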