Let $(\Omega, \mathcal H, \mathbb P)$ be a probability space and let $(E, \mathcal E, \mu)$ be a measure space. Sets $A \in \mathcal H$ and $B \in \mathcal E$ are called *null sets* if $\mathbb P(A) = 0$ and $\mu(B) = 0$, respectively. Null sets are not detectable by the measures, but they introduce complications in the analysis. These complications are usually best ignored, but occasionally we need to recognize null sets and/or have language that describes when we are ignoring them. The word ‘almost’ is usually an indicator that we are purposefully ignoring null sets.

**Definition:** If $A$ is an event in $\mathcal H$ for which $\mathbb P(A) = 0$, then we say $A$ *almost never* occurs. If, on the other hand, $\mathbb P(A) = 1$, then we say $A$ *almost surely* occurs. If $B$ is a set in $\mathcal E$ for which $\mu(B^c) = 0$, then we say $B$ holds *almost everywhere*. These are often abbreviated *a.n.*, *a.s.*, and *a.e.*

We say two measurable functions $f$ and $g$ on $(E, \mathcal E)$ are *equivalent* (or $\mu$-*equivalent*) when they agree almost everywhere, that is, when $\mu\{ x \in E : f(x) \neq g(x) \} = 0$. We denote the equivalence class of $f$ by $[f]$. Similarly, two random variables are equivalent when they agree almost surely, and we denote the equivalence class of $X$ by $[X]$.
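
As a concrete illustration (an example chosen here, not part of the definition), assume $E = [0,1]$ with its Borel $\sigma$-algebra and Lebesgue measure $\lambda$, and take $f = \mathbf 1_{\mathbb Q \cap [0,1]}$ and $g = 0$. Since $\mathbb Q \cap [0,1]$ is countable and each singleton has Lebesgue measure zero, countable additivity gives

$$
\lambda\{ x \in [0,1] : f(x) \neq g(x) \} = \lambda(\mathbb Q \cap [0,1]) = \sum_{q \in \mathbb Q \cap [0,1]} \lambda(\{q\}) = 0 .
$$

So $f$ and $g$ are equivalent, $[f] = [g]$, even though they differ at infinitely many points.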

**Exercise:** Suppose $f$ and $g$ are equivalent. Show that $\int f \, d\mu = \int g \, d\mu$ whenever either integral exists.

This notation gives us an ahistorical and ad hoc explanation for why we denote expectation by $\mathbb E[X]$: the expectation depends only on the equivalence class $[X]$, not on the particular representative $X$.

Some authors define $L^p$ to be a set of equivalence classes (as opposed to a set of functions). This perspective eliminates some technical complications at the expense of a more complicated definition. It is not hard to see that the equivalence classes inherit the linear structure of functions. That is, $L^p$ is a vector space whether we view it as a set of functions or of equivalence classes; on equivalence classes $\|\cdot\|_p$ is a genuine norm, whereas on functions it is only a seminorm, since $\|f\|_p = 0$ forces only $f = 0$ almost everywhere. We will mostly conflate measurable functions with their equivalence classes.
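
One way to check that operations on equivalence classes are well defined is sketched below for addition. The primed functions $f'$ and $g'$ are names introduced here for illustration, assumed to satisfy $[f'] = [f]$ and $[g'] = [g]$. Wherever $f = f'$ and $g = g'$ both hold, the sums agree, so

$$
\{ f + g \neq f' + g' \} \subseteq \{ f \neq f' \} \cup \{ g \neq g' \},
\qquad
\mu\{ f + g \neq f' + g' \} \le \mu\{ f \neq f' \} + \mu\{ g \neq g' \} = 0 .
$$

Hence $[f] + [g] := [f + g]$ does not depend on the chosen representatives. The same kind of inclusion handles scalar multiples, and $\|f\|_p = \|f'\|_p$ because $|f|^p$ and $|f'|^p$ are equivalent and therefore have the same integral.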