Covariance and Correlation
- Statistics 4.5
Covariance
Covariance is a measure of how much two random variables change together. The covariance between two random variables $X$ and $Y$ is defined as:
\[ \text{Cov}(X, Y) = \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])] \]
Properties of Covariance
By expanding the expectation and denoting the means as $\mu_X = \mathbb{E}[X]$ and $\mu_Y = \mathbb{E}[Y]$, we can rewrite the covariance as:
\[ \text{Cov}(X, Y) = \mathbb{E}[XY] - \mu_X \mu_Y \]
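The identity $\text{Cov}(X, Y) = \mathbb{E}[XY] - \mu_X \mu_Y$ can be checked numerically. The sketch below (using NumPy; the coefficient $0.5$ and the sample size are arbitrary choices for illustration) estimates both sides from simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)  # constructed to be correlated with x

# Definition: E[(X - mu_X)(Y - mu_Y)], estimated from the sample
cov_direct = np.mean((x - x.mean()) * (y - y.mean()))

# Shortcut: E[XY] - mu_X * mu_Y
cov_shortcut = np.mean(x * y) - x.mean() * y.mean()

print(cov_direct, cov_shortcut)  # both close to the true value Cov = 0.5 Var(X) = 0.5
```

The two estimates agree up to floating-point rounding, since the identity is an exact algebraic rearrangement of the definition.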
Also, covariance has the following properties:
\[ \text{Var}(aX + bY) = a^2 \text{Var}(X) + b^2 \text{Var}(Y) + 2ab\, \text{Cov}(X, Y) \]
If $X$ and $Y$ are independent, then:
\[ \text{Cov}(X, Y) = 0 \]
However, be cautious: a covariance of zero does not imply independence of $X$ and $Y$. For example, if $X$ is symmetric about zero and $Y = X^2$, then $\text{Cov}(X, Y) = \mathbb{E}[X^3] - \mathbb{E}[X]\,\mathbb{E}[X^2] = 0$, even though $Y$ is completely determined by $X$.
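This caveat is easy to verify numerically. Taking $X$ standard normal (symmetric about zero) and $Y = X^2$, the sample covariance is near zero despite the deterministic dependence:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200_000)  # symmetric about 0
y = x ** 2                    # fully determined by x, hence dependent

# Cov(X, X^2) = E[X^3] - E[X] E[X^2] = 0 for a symmetric distribution
cov = np.mean(x * y) - x.mean() * y.mean()
print(cov)  # close to 0 despite the dependence
```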
Correlation
Correlation is a standardized measure of the relationship between two random variables, defined as:
\[ \rho_{X,Y} = \text{Corr}(X, Y) = \frac{\text{Cov}(X, Y)}{\sqrt{\text{Var}(X) \text{Var}(Y)}} \]
By denoting the standard deviations as $\sigma_X = \sqrt{\text{Var}(X)}$ and $\sigma_Y = \sqrt{\text{Var}(Y)}$, we can express correlation as:
\[ \rho_{X,Y} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y} \]
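As a sketch of this formula in NumPy (the linear relationship $Y = 2X + \text{noise}$ is an arbitrary example), computing $\text{Cov}(X, Y)/(\sigma_X \sigma_Y)$ by hand agrees with `np.corrcoef`:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100_000)
y = 2.0 * x + rng.normal(size=100_000)

# rho = Cov(X, Y) / (sigma_X * sigma_Y), estimated from the sample
cov = np.mean((x - x.mean()) * (y - y.mean()))
rho = cov / (x.std() * y.std())

# np.corrcoef returns the matrix of pairwise correlations
rho_np = np.corrcoef(x, y)[0, 1]
print(rho, rho_np)  # both near the true value 2 / sqrt(5)
```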
Properties of Correlation
If $X$ and $Y$ are independent, then:
\[ \rho_{X,Y} = \frac{0}{\sigma_X \sigma_Y} = 0 \]
Also, correlation has the following important properties:
- $-1 \leq \rho_{X,Y} \leq 1$
- $|\rho_{X,Y}| = 1$ if and only if there exists $a \neq 0$ such that $Y = aX + b$ for some constant $b$; in that case, $\rho_{X,Y} = \operatorname{sgn} a$.
Proof
\[ \begin{align*} h(t) &:= \mathbb{E}\left[ \left( (X - \mu_X)t + (Y - \mu_Y) \right)^2 \right] \\ &= t^2 \sigma_X^2 + 2t\, \text{Cov}(X, Y) + \sigma_Y^2 \geq 0 \\ \Rightarrow \;\; \text{Disc}_t\, h &= 4\, \text{Cov}(X, Y)^2 - 4 \sigma_X^2 \sigma_Y^2 \leq 0 \\ \therefore \;\; & -1 \leq \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y} \leq 1 \end{align*} \]
Equality holds if and only if the discriminant is zero, i.e., $h(t_0) = 0$ for some $t_0$. Since $h(t_0)$ is the expectation of a nonnegative square, this forces $(X - \mu_X)t_0 + (Y - \mu_Y) = 0$ almost surely. Hence $Y = aX + b$ for constants $a = -t_0 \neq 0$ and $b$, leading to $\rho_{X,Y} = \operatorname{sgn} a$.
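The equality case can be illustrated numerically: for an exact linear relation $Y = aX + b$, the sample correlation equals $\operatorname{sgn} a$ up to floating-point rounding. The values $a = -3$, $b = 7$ below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=10_000)

# Y = aX + b with a = -3 < 0, so the correlation should be sgn(a) = -1
y = -3.0 * x + 7.0
rho = np.corrcoef(x, y)[0, 1]
print(rho)  # -1 up to floating-point rounding
```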
Multivariate Normal Distribution
The multivariate normal distribution is a generalization of the normal distribution to multiple dimensions. A random vector $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ follows a multivariate normal distribution if every linear combination of its components is normally distributed. The multivariate normal distribution is characterized by its mean vector $\bs{\mu} = (\mu_1, \mu_2, \ldots, \mu_n)$ and covariance matrix $\bs{\Sigma}$, which is a symmetric positive semi-definite matrix.
The probability density function of a multivariate normal distribution is given by:
\[ f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^n \det \bs{\Sigma}}} \exp\left(-\frac{1}{2}(\mathbf{x} - \bs{\mu})^\top \bs{\Sigma}^{-1} (\mathbf{x} - \bs{\mu})\right) \]
where $\mathbf{x}$ is a vector in $\mathbb{R}^n$, $\bs{\mu}$ is the mean vector, and $\bs{\Sigma}$ is the covariance matrix, with entries $\Sigma_{i,j} = \text{Cov}(X_i, X_j)$. We then write:
\[ \mathbf{X} \sim \mathcal{N}(\bs{\mu}, \bs{\Sigma}) \]
An important property of the multivariate normal distribution is that each marginal is itself normally distributed:
\[ X_i \sim \mathcal{N}(\mu_i, \sigma_i^2) \]
where $\sigma_i^2 = \Sigma_{i,i}$ is the variance of $X_i$ and $\mu_i$ is the mean of $X_i$.
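This marginal property can be sketched with NumPy's multivariate normal sampler (the mean vector and covariance matrix below are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(4)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

# draw samples from N(mu, Sigma)
samples = rng.multivariate_normal(mu, Sigma, size=200_000)

# the marginal of X_1 should have mean mu[0] = 1 and variance Sigma[0, 0] = 2
print(samples[:, 0].mean(), samples[:, 0].var())
```

The sample mean and variance of the first coordinate match $\mu_1$ and $\Sigma_{1,1}$, as the marginal property predicts.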