Covariance and Correlation
- Statistics 4.5
Covariance
Covariance is a measure of how much two random variables change together. The covariance between two random variables $X$ and $Y$ is defined as:
\[ \text{Cov}(X, Y) = \mathbb{E}[(X - \mathbb{E}[X])(Y - \mathbb{E}[Y])] \]
Properties of Covariance
By expanding the expectation and denoting the means as $\mu_X = \mathbb{E}[X]$ and $\mu_Y = \mathbb{E}[Y]$, we can rewrite the covariance as:
\[ \text{Cov}(X, Y) = \mathbb{E}[XY] - \mu_X \mu_Y \]
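The identity $\text{Cov}(X, Y) = \mathbb{E}[XY] - \mu_X \mu_Y$ can be checked numerically. The sketch below (using NumPy; the coefficient $0.5$ and the sample size are arbitrary choices for illustration) estimates both sides from simulated data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)  # constructed to be correlated with x

# Definition: E[(X - mu_X)(Y - mu_Y)], estimated from the sample
cov_direct = np.mean((x - x.mean()) * (y - y.mean()))

# Shortcut: E[XY] - mu_X * mu_Y
cov_shortcut = np.mean(x * y) - x.mean() * y.mean()

print(cov_direct, cov_shortcut)  # both close to the true value Cov = 0.5 Var(X) = 0.5
```

The two estimates agree up to floating-point rounding, since the identity is an exact algebraic rearrangement of the definition.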
Also, covariance has the following properties:
\[ \text{Var}(aX + bY) = a^2 \text{Var}(X) + b^2 \text{Var}(Y) + 2ab\, \text{Cov}(X, Y) \]
If $X$ and $Y$ are independent, then:
\[ \text{Cov}(X, Y) = 0 \]
However, be cautious: a covariance of zero does not imply independence of $X$ and $Y$. For example, if $X$ is symmetric about zero and $Y = X^2$, then $\text{Cov}(X, Y) = \mathbb{E}[X^3] - \mathbb{E}[X]\,\mathbb{E}[X^2] = 0$, even though $Y$ is completely determined by $X$.
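This caveat is easy to verify numerically. Taking $X$ standard normal (symmetric about zero) and $Y = X^2$, the sample covariance is near zero despite the deterministic dependence:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=200_000)  # symmetric about 0
y = x ** 2                    # fully determined by x, hence dependent

# Cov(X, X^2) = E[X^3] - E[X] E[X^2] = 0 for a symmetric distribution
cov = np.mean(x * y) - x.mean() * y.mean()
print(cov)  # close to 0 despite the dependence
```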
Correlation
Correlation is a standardized measure of the relationship between two random variables, defined as:
\[ \rho_{X,Y} = \text{Corr}(X, Y) = \frac{\text{Cov}(X, Y)}{\sqrt{\text{Var}(X) \text{Var}(Y)}} \]
By denoting the standard deviations as $\sigma_X = \sqrt{\text{Var}(X)}$ and $\sigma_Y = \sqrt{\text{Var}(Y)}$, we can express correlation as:
\[ \rho_{X,Y} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y} \]
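As a sketch of this formula in NumPy (the linear relationship $Y = 2X + \text{noise}$ is an arbitrary example), computing $\text{Cov}(X, Y)/(\sigma_X \sigma_Y)$ by hand agrees with `np.corrcoef`:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=100_000)
y = 2.0 * x + rng.normal(size=100_000)

# rho = Cov(X, Y) / (sigma_X * sigma_Y), estimated from the sample
cov = np.mean((x - x.mean()) * (y - y.mean()))
rho = cov / (x.std() * y.std())

# np.corrcoef returns the matrix of pairwise correlations
rho_np = np.corrcoef(x, y)[0, 1]
print(rho, rho_np)  # both near the true value 2 / sqrt(5)
```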
Properties of Correlation
If $X$ and $Y$ are independent, then:
\[ \rho_{X,Y} = \frac{0}{\sigma_X \sigma_Y} = 0 \]
Also, correlation has the following important properties:
- $-1 \leq \rho_{X,Y} \leq 1$
- $|\rho_{X,Y}| = 1$ if and only if there exists $a \neq 0$ such that $Y = aX + b$ for some constant $b$; in that case, $\rho_{X,Y} = \operatorname{sgn} a$.
Proof
\[ \begin{align*} h(t) &:= \mathbb{E}\left[ \left( (X - \mu_X)t + (Y - \mu_Y) \right)^2 \right] \\ &= t^2 \sigma_X^2 + 2t\, \text{Cov}(X, Y) + \sigma_Y^2 \geq 0 \\ \Rightarrow \;\; \text{Disc}_t\, h &= 4\, \text{Cov}(X, Y)^2 - 4 \sigma_X^2 \sigma_Y^2 \leq 0 \\ \therefore \;\; & -1 \leq \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y} \leq 1 \end{align*} \]
Equality holds if and only if the discriminant is zero, i.e., $h(t_0) = 0$ for some $t_0$. Since $h(t_0)$ is the expectation of a nonnegative square, this forces $(X - \mu_X)t_0 + (Y - \mu_Y) = 0$ almost surely. Hence $Y = aX + b$ for constants $a = -t_0 \neq 0$ and $b$, leading to $\rho_{X,Y} = \operatorname{sgn} a$.
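The equality case can be illustrated numerically: for an exact linear relation $Y = aX + b$, the sample correlation equals $\operatorname{sgn} a$ up to floating-point rounding. The values $a = -3$, $b = 7$ below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=10_000)

# Y = aX + b with a = -3 < 0, so the correlation should be sgn(a) = -1
y = -3.0 * x + 7.0
rho = np.corrcoef(x, y)[0, 1]
print(rho)  # -1 up to floating-point rounding
```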
Multivariate Normal Distribution
The multivariate normal distribution is a generalization of the normal distribution to multiple dimensions. A random vector $\mathbf{X} = (X_1, X_2, \ldots, X_n)$ follows a multivariate normal distribution if every linear combination of its components is normally distributed. The multivariate normal distribution is characterized by its mean vector $\bs{\mu} = (\mu_1, \mu_2, \ldots, \mu_n)$ and covariance matrix $\bs{\Sigma}$, which is a symmetric positive semi-definite matrix.
The probability density function of a multivariate normal distribution is given by:
\[ f(\mathbf{x}) = \frac{1}{\sqrt{(2\pi)^n \det \bs{\Sigma}}} \exp\left(-\frac{1}{2}(\mathbf{x} - \bs{\mu})^\top \bs{\Sigma}^{-1} (\mathbf{x} - \bs{\mu})\right) \]
where $\mathbf{x}$ is a vector in $\mathbb{R}^n$, $\bs{\mu}$ is the mean vector, and $\bs{\Sigma}$ is the covariance matrix, with entries $\Sigma_{i,j} = \text{Cov}(X_i, X_j)$. We then write:
\[ \mathbf{X} \sim \mathcal{N}(\bs{\mu}, \bs{\Sigma}) \]
An important property of the multivariate normal distribution is that each marginal is itself normally distributed:
\[ X_i \sim \mathcal{N}(\mu_i, \sigma_i^2) \]
where $\sigma_i^2 = \Sigma_{i,i}$ is the variance of $X_i$ and $\mu_i$ is the mean of $X_i$.
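This marginal property can be sketched with NumPy's multivariate normal sampler (the mean vector and covariance matrix below are arbitrary illustrative values):

```python
import numpy as np

rng = np.random.default_rng(4)
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])

# draw samples from N(mu, Sigma)
samples = rng.multivariate_normal(mu, Sigma, size=200_000)

# the marginal of X_1 should have mean mu[0] = 1 and variance Sigma[0, 0] = 2
print(samples[:, 0].mean(), samples[:, 0].var())
```

The sample mean and variance of the first coordinate match $\mu_1$ and $\Sigma_{1,1}$, as the marginal property predicts.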