where \(x_i\) and \(y_i\) are the \(i\)-th values of the two variables, \(\mu_x\) and \(\mu_y\) are the means of the two variables, and \(n\) is the number of values.
The covariance can be positive, negative, or zero. A positive covariance indicates that the two variables tend to increase or decrease together, a negative covariance indicates that one variable tends to increase as the other decreases, and a covariance of zero indicates no linear relationship between them.
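As a rough illustration, the covariance can be computed by hand and compared with pandas. The data below is hypothetical, and the sketch assumes the sample form of the covariance (division by n - 1), which is what Series.cov() uses by default.

```python
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
y = pd.Series([2.0, 4.0, 5.0, 4.0, 5.0])

# Sum of the products of the deviations from the means.
dev_products = ((x - x.mean()) * (y - y.mean())).sum()

# Assumption: divide by n - 1 (sample covariance), matching the
# default behaviour of Series.cov().
cov_manual = dev_products / (len(x) - 1)
cov_pandas = x.cov(y)

# Negating one variable flips the sign of the covariance.
cov_negated = x.cov(-y)

print(cov_manual, cov_pandas, cov_negated)
```

Here x and y tend to rise together, so the covariance is positive. Negating y reverses the direction of the relationship and makes it negative.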
Correlations
Correlation is a statistical measure of the strength and direction of the relationship between two variables. It can be positive, negative, or zero.
Positive correlation: As one variable increases, the other tends to increase.
Negative correlation: As one variable increases, the other tends to decrease.
Zero correlation: There is no linear relationship between the two variables.
The correlation coefficient ranges from -1 to 1. A value of 1 indicates a perfect positive correlation, a value of -1 indicates a perfect negative correlation, and a value of 0 indicates no linear correlation.
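The three cases can be sketched with small, hypothetical series. The example uses Series.corr() with its default method, and the data is constructed so that the coefficient lands at, or very near, each extreme.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration.
x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])

r_pos = x.corr(2 * x + 1)      # exactly linear and increasing
r_neg = x.corr(-3 * x + 10)    # exactly linear and decreasing
r_zero = x.corr((x - 3) ** 2)  # U shape, symmetric around the mean of x

print(r_pos, r_neg, r_zero)
```

The last case is worth noting: the second series is fully determined by x, yet the coefficient is zero because the relationship is not linear.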
There are several methods to compute the correlation between two variables. The two most common are the Pearson correlation coefficient and the Spearman correlation coefficient.
Pearson Correlation Coefficient
The Pearson correlation coefficient measures the linear relationship between two variables. It ranges from -1 to 1.
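\[
r = \frac{\sum_{i=1}^{n} (x_i - \mu_x)(y_i - \mu_y)}{\sqrt{\sum_{i=1}^{n} (x_i - \mu_x)^2} \, \sqrt{\sum_{i=1}^{n} (y_i - \mu_y)^2}}
\]
where \(x_i\) and \(y_i\) are the \(i\)-th values of the two variables, \(\mu_x\) and \(\mu_y\) are the means of the two variables, and \(n\) is the number of values.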
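A minimal sketch of the computation, using hypothetical data: the sum of the products of the deviations from the means, divided by the square root of the product of the summed squared deviations. The result is compared with Series.corr(), whose default method is Pearson.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
x = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0])
y = pd.Series([2.0, 4.0, 5.0, 4.0, 5.0])

# Deviations of each value from the mean of its variable.
dx = x - x.mean()
dy = y - y.mean()

# Pearson coefficient computed directly from the deviations.
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# The same quantity via pandas; method='pearson' is the default.
r_pandas = x.corr(y)

print(r_manual, r_pandas)
```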
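Spearman Correlation Coefficient
The Spearman correlation coefficient measures the monotonic relationship between two variables, based on the ranks of their values rather than the values themselves. It ranges from -1 to 1.
\[
\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}
\]
where \(d_i\) is the difference between the ranks of the \(i\)-th values of the two variables and \(n\) is the number of values.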
x.corr(y, method='spearman')
0.7432486904455022
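To make the rank-based idea concrete, the sketch below computes the Spearman coefficient in two ways on hypothetical, tie-free data: from the rank differences, and as the Pearson correlation of the ranks. The data is not the x and y used above.

```python
import pandas as pd

# Hypothetical, tie-free data, chosen only for illustration.
x = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0])
y = pd.Series([1.0, 3.0, 2.0, 5.0, 4.0])

# Replace each value by its rank within its own series.
rx, ry = x.rank(), y.rank()

# Rank-difference formula (assumes there are no tied values).
d = rx - ry
n = len(x)
rho_formula = 1 - 6 * (d ** 2).sum() / (n * (n ** 2 - 1))

# Equivalent view: the Pearson correlation of the ranks.
rho_ranks = rx.corr(ry)

print(rho_formula, rho_ranks)
```

Both values should agree with x.corr(y, method='spearman') on the same data. When ties are present, the rank-difference shortcut is only approximate, while the correlation of the ranks remains valid.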
.corr()
The .corr() method computes the pairwise correlation between the columns of a DataFrame and returns the result as a correlation matrix. By default, it computes the Pearson correlation coefficient; the method parameter selects another method, such as 'spearman' or 'kendall'.
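A short sketch of the DataFrame form, with hypothetical column names and values. Each call returns a square matrix indexed by column name on both axes.

```python
import pandas as pd

# Hypothetical DataFrame; the column names and values are illustrative.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, 5.0, 4.0, 5.0],
    "c": [5.0, 3.0, 4.0, 1.0, 2.0],
})

pearson = df.corr()                    # default: method='pearson'
spearman = df.corr(method="spearman")  # rank-based alternative

print(pearson.round(3))
print(spearman.round(3))
```

The diagonal of each matrix is 1, since every column is perfectly correlated with itself, and the matrix is symmetric, so pearson.loc["a", "b"] equals pearson.loc["b", "a"].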