# Independence & Conditioning

Basics of Probability

Here we introduce a purely probabilistic concept: independence.

Let $(\Omega, \mathcal H, \mathbb P)$ be a probability space.

Definition: Given $A, B \in \mathcal H$ with $\mathbb P(B) \neq 0$, we define the conditional probability of $A$ given $B$ to be $$\mathbb P(A | B) = \frac{\mathbb P(A \cap B)}{\mathbb P(B)}.$$
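As a quick illustration, the defining ratio can be estimated by relative frequencies. The sketch below (plain Python; the die example and event choices are arbitrary) estimates $\mathbb P(A | B)$ for $A = \{\text{roll is even}\}$ and $B = \{\text{roll} \geq 4\}$, where the exact value is $\mathbb P(A \cap B)/\mathbb P(B) = (1/3)/(1/2) = 2/3$.

```python
import random

random.seed(0)
N = 100_000
rolls = [random.randint(1, 6) for _ in range(N)]

# A = "roll is even", B = "roll >= 4"
count_B = sum(1 for r in rolls if r >= 4)
count_AB = sum(1 for r in rolls if r >= 4 and r % 2 == 0)

# P(A | B) = P(A ∩ B) / P(B), estimated by relative frequencies
p_cond = count_AB / count_B
print(p_cond)  # close to (1/3) / (1/2) = 2/3
```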

Definition: Two events $A, B \in \mathcal H$ are said to be independent if $\mathbb P(A \cap B) = \mathbb P(A) \mathbb P(B)$, or equivalently, when $\mathbb P(B) \neq 0$, if $\mathbb P(A | B) = \mathbb P(A)$.

More generally, $A_1, A_2, \ldots$ are independent if for every finite subcollection $A_{m_1}, A_{m_2}, \ldots, A_{m_N}$ we have $\mathbb P(A_{m_1} \cap \cdots \cap A_{m_N} ) = \mathbb P(A_{m_1}) \cdots \mathbb P(A_{m_N})$.
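Note that pairwise independence does not imply independence of the whole collection; the finite-subcollection condition is genuinely stronger. A small exhaustive check in Python (using the classic two-coin example, chosen here for illustration) makes this concrete:

```python
from itertools import product
from fractions import Fraction

# Sample space: two fair coin flips; each outcome has probability 1/4.
omega = list(product("HT", repeat=2))
P = lambda event: Fraction(sum(1 for w in omega if event(w)), len(omega))

A = lambda w: w[0] == "H"   # first flip is heads
B = lambda w: w[1] == "H"   # second flip is heads
C = lambda w: w[0] == w[1]  # the two flips agree

both = lambda e1, e2: (lambda w: e1(w) and e2(w))
# Each pair satisfies the product rule ...
assert P(both(A, B)) == P(A) * P(B)
assert P(both(A, C)) == P(A) * P(C)
assert P(both(B, C)) == P(B) * P(C)
# ... but the triple does not: pairwise independence is weaker.
triple = lambda w: A(w) and B(w) and C(w)
print(P(triple), P(A) * P(B) * P(C))  # 1/4 vs 1/8
```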

We can extend this definition to $\sigma$-algebras by saying that two sub-$\sigma$-algebras $\mathcal F$ and $\mathcal G$ of $\mathcal H$ are independent if $A$ and $B$ are independent for every choice of $A \in \mathcal F$ and $B \in \mathcal G$. A collection of sub-$\sigma$-algebras $(\mathcal F_{\lambda})_{\lambda \in \Lambda}$ is called an independency if for every choice of $A_{\lambda} \in \mathcal F_{\lambda}$, only finitely many of which are not equal to $\Omega$, $$\mathbb P\left( \bigcap_{\lambda \in \Lambda} A_{\lambda} \right) = \prod_{\lambda \in \Lambda} \mathbb P(A_{\lambda}).$$ Note that because only finitely many of the probabilities $\mathbb P(A_{\lambda})$ differ from $1$, the product which appears here is in fact a finite product.

Definition: If $X_1$ and $X_2$ are random variables on $\mathcal H$, then $X_1$ and $X_2$ are said to be independent if $\sigma(X_1)$ and $\sigma(X_2)$ are independent. More generally $\{X_{\lambda}\}_{\lambda \in \Lambda}$ is an independency of random variables if $(\sigma(X_{\lambda}))_{\lambda \in \Lambda}$ is an independency of $\sigma$-algebras.

The notion of independence of random variables can be recast to say that $X$ and $Y$ are independent when $\mathbb P\{X \in A, Y \in B\} = \mathbb P\{X \in A\} \mathbb P\{Y \in B\}$ for all $A, B \in \mathcal B(\mathbb R)$. This extends to multiple random variables: $(X_{\lambda})_{\lambda \in \Lambda}$ is an independency if for every finite collection $X_{m_1}, \ldots, X_{m_N}$, and $A_1, \ldots, A_N \in \mathcal B(\mathbb R)$, $\mathbb P\{X_{m_1} \in A_1, \ldots, X_{m_N} \in A_N\} = \mathbb P\{X_{m_1} \in A_1\} \cdots \mathbb P\{X_{m_N} \in A_N\}$. This in turn can be extended to say something about the joint distributions and densities of independencies of random variables.
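The product rule for the events $\{X \in A\}$ and $\{Y \in B\}$ can be spot-checked numerically. A sketch in Python, where the Gaussian draws and the particular intervals are arbitrary choices:

```python
import random

random.seed(3)
N = 100_000
# Two independent draws per trial; the product rule should hold for
# any Borel sets, so we spot-check it on a few half-lines.
xs = [random.gauss(0, 1) for _ in range(N)]
ys = [random.gauss(0, 1) for _ in range(N)]

def pr(pred_x, pred_y):
    """Empirical probability that both predicates hold."""
    return sum(1 for x, y in zip(xs, ys) if pred_x(x) and pred_y(y)) / N

for a, b in [(0.0, 0.5), (-1.0, 1.0)]:
    lhs = pr(lambda x: x > a, lambda y: y < b)                  # P{X in A, Y in B}
    rhs = (pr(lambda x: x > a, lambda y: True)
           * pr(lambda x: True, lambda y: y < b))               # P{X in A} P{Y in B}
    print(round(lhs, 3), round(rhs, 3))  # the two columns should nearly agree
```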

For simplicity, here we assume that $\mathbf X = (X_1, \ldots, X_N)$ is a random vector on $\mathcal H$ with joint distribution $\mu_{\mathbf X}$. As usual, the cumulative distribution function and joint density of $\mathbf X$ (assuming the latter exists) are denoted $F_{\mathbf X}$ and $f_{\mathbf X}$ respectively.

Theorem: $X_1, \ldots, X_N$ form an independency if and only if any one of the following holds.

• $\mu_{\mathbf X}(d x_1, \ldots, d x_N) = \mu_{X_1}(dx_1) \cdots \mu_{X_N}(d x_N)$.
• $F_{\mathbf X}(x_1, \ldots, x_N) = F_{X_1}(x_1) \cdots F_{X_N}(x_N)$.
• $f_{\mathbf X}(x_1, \ldots, x_N) = f_{X_1}(x_1) \cdots f_{X_N}(x_N)$.
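For instance, the CDF criterion can be checked empirically: for independent draws, the empirical joint CDF should approximately factor into the product of the marginal CDFs. A sketch, with an arbitrarily chosen pair of distributions (uniform and exponential):

```python
import math
import random

random.seed(1)
N = 200_000
# Independent draws: X uniform on [0, 1], Y exponential with rate 1.
xs = [random.random() for _ in range(N)]
ys = [random.expovariate(1.0) for _ in range(N)]

def F_emp(x, y):
    """Empirical joint CDF F_{X,Y}(x, y)."""
    return sum(1 for a, b in zip(xs, ys) if a <= x and b <= y) / N

def Fx(x):  # marginal CDF of Uniform[0, 1]
    return min(max(x, 0.0), 1.0)

def Fy(y):  # marginal CDF of Exp(1)
    return 1 - math.exp(-y) if y > 0 else 0.0

# The joint CDF should (approximately) factor into the marginals.
for x, y in [(0.3, 0.5), (0.7, 1.0), (0.9, 2.0)]:
    print(round(F_emp(x, y), 3), round(Fx(x) * Fy(y), 3))
```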

#### Conditional Densities

Given two random variables $X$ and $Y$, the conditional probability of $\{X = x\}$ given $\{Y = y\}$, when defined, is $$\mathbb P( X = x | Y = y) = \frac{\mathbb P\{X = x, Y = y\}}{\mathbb P\{Y = y\}}.$$ We say ‘when defined’ because it is often the case that $\mathbb P\{Y = y\} = 0$, in which case this conditional probability does not make sense. When $X$ and $Y$ have a joint density, the notion of conditional probability can be replaced with a conditional density of $X$ given $Y$.

Definition: If $X, Y$ are jointly distributed random variables with joint density $f_{X,Y}(x,y)$, then the conditional density of $X$ given $Y$ is defined to be $$f_{X|Y}(x,y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}$$ wherever $f_Y(y) > 0$.
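Dividing by $f_Y(y)$ renormalizes each ‘slice’ of the joint density so that it integrates to $1$ in $x$. The sketch below checks this numerically for an assumed joint density $f_{X,Y}(x,y) = x + y$ on the unit square, whose marginal is $f_Y(y) = 1/2 + y$:

```python
# Sketch: conditional density f_{X|Y}(x, y) = f_{X,Y}(x, y) / f_Y(y) for the
# (assumed) joint density f(x, y) = x + y on the unit square.
def f_joint(x, y):
    return x + y

def f_Y(y):  # marginal: integral of x + y over x in [0, 1]
    return 0.5 + y

def f_cond(x, y):
    return f_joint(x, y) / f_Y(y)

# A conditional density must integrate to 1 in x for each fixed y;
# we approximate the integral by a midpoint rule.
n = 10_000
for y in (0.2, 0.5, 0.9):
    total = sum(f_cond((i + 0.5) / n, y) for i in range(n)) / n
    print(y, round(total, 6))  # each total should be 1.0
```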

With some care this definition can be extended to the conditional density of a random vector $\mathbf X$ given a subvector $\mathbf Y \subseteq \mathbf X$, but for now we concentrate on the case of two random variables.

Theorem: If $A, B \in \mathcal B(\mathbb R)$ and $\mathbb P\{Y \in B\} \neq 0$, then $$\mathbb P\{X \in A | Y \in B\} = \frac{1}{\mathbb P\{Y \in B\}} \int_B \int_A f_{X|Y}(x, y) \, f_Y(y) \, dx \, dy.$$
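One way to sanity-check this kind of identity is to compare a Monte Carlo estimate of $\mathbb P\{X \in A | Y \in B\}$ against the integral of $f_{X|Y}(x,y) f_Y(y)$ over $B \times A$, normalized by $\mathbb P\{Y \in B\}$. The model below ($Y$ exponential, and $X$ uniform on $[0, Y]$ given $Y$, so $f_{X|Y}(x, y) = 1/y$ on $[0, y]$) and the sets $A = [0,1]$, $B = [1,2]$ are assumptions chosen for illustration:

```python
import math
import random

random.seed(2)

# Assumed model: Y ~ Exp(1); given Y = y, X ~ Uniform[0, y].
N = 200_000
pairs = []
for _ in range(N):
    y = random.expovariate(1.0)
    x = random.uniform(0.0, y)
    pairs.append((x, y))

# Monte Carlo estimate of P{X in [0,1] | Y in [1,2]}.
in_B = [(x, y) for x, y in pairs if 1.0 <= y <= 2.0]
mc = sum(1 for x, y in in_B if x <= 1.0) / len(in_B)

# Integral side: for y in [1, 2], the inner integral of f_{X|Y}(x, y)
# over x in [0, 1] is 1/y; weight by f_Y(y) = e^{-y} (midpoint rule).
n = 100_000
h = 1.0 / n
num = sum((1.0 / y) * math.exp(-y) * h
          for y in (1.0 + (i + 0.5) * h for i in range(n)))
den = math.exp(-1.0) - math.exp(-2.0)  # P{Y in [1, 2]}
print(round(mc, 3), round(num / den, 3))  # the two values should nearly agree
```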

Exercise: Show that if $X$ and $Y$ are independent, then $f_{X|Y}(x, y) = f_X(x)$ wherever $f_Y(y) > 0$.