Skip to content

Borel-Cantelli Lemma

BOREL-CANTELLI LEMMA
If n=1P(An)<\sum_{n=1}^{\infty} \mathbb{P}(A_n) < \infty then P(Ani.o.)=0\mathbb{P}(A_n \, \text{i.o.}) = 0.
If (An)(A_n) are independent and n=1P(An)=\sum_{n=1}^{\infty} \mathbb{P}(A_n) = \infty then P(Ani.o.)=1\mathbb{P}(A_n \, \text{i.o.}) = 1.
The Borel-Cantelli Lemma is a fundamental result in probability theory which provides criteria to determine whether an infinite sequence of events will occur infinitely often or only finitely often.
Before proving the both Borel-Cantelli lemmas let's consider three examples.
Example of Symmetric coin
Consider an infinite sequence of independent tosses of a fair coin, represented by random variables X1,X2,X_1, X_2, \ldots, where each toss has probability
P(Head)=12.\begin{equation*}\mathbb{P}(\text{Head}) = \frac{1}{2}.\end{equation*}
Will we see infinitely many heads, or will the sequence eventually contain only tails?
While our intuition suggests we should see infinitely many heads, the second Borel-Cantelli lemma provides mathematical certainty. Consider:
k=1nP(Xk=Head)=k=1n12\begin{equation*}\sum_{k=1}^n \mathbb{P} \left( X_k = \text{Head} \right) = \sum_{k=1}^n \frac{1}{2} \to \infty\end{equation*}
By the second Borel-Cantelli lemma, since the events are independent, with probability 1 we will observe infinitely many heads.
Example Biased coin 1
Now consider a more interesting case where the probability of heads decreases with each toss:
P(Xn=Head)=1n.\begin{equation*}\mathbb{P}(X_n = \text{Head}) = \frac{1}{n}.\end{equation*}
The sum is
k=1nP(Xk=Head)=k=1n1k\begin{equation*}\sum_{k=1}^n \mathbb{P} \left( X_k = \text{Head} \right) = \sum_{k=1}^n \frac{1}{k} \to \infty\end{equation*}
This is the harmonic series, which diverges. Therefore, by the second Borel-Cantelli lemma, we will still see infinitely many heads, even though the probability of heads becomes arbitrarily small!
Example of Biased coin 2
Finally, consider an even more extreme bias where
P(Xn=Head)=1n2.\begin{equation*}\mathbb{P}(X_n = \text{Head}) = \frac{1}{n^2}.\end{equation*}
Here, the Borel-Cantelli lemma reveals a fundamentally different behavior:
k=1nP(Xk=Head)=k=1n1k2k=11k2<\begin{equation*}\sum_{k=1}^n \mathbb{P} \left( X_k = \text{Head} \right) = \sum_{k=1}^n \frac{1}{k^2} \to \sum_{k=1}^{\infty} \frac{1}{k^2} < \infty\end{equation*}
Since this series converges, the Borel-Cantelli lemma tells us that, almost surely, there exists some finite NN after which we will never see another head. The probability of heads decreases so rapidly that only finitely many heads will occur.
Before discussing the two Borel-Cantelli Lemmas, we first need to define some technical terms related to an infinite sequence of events. Consider the sequence of events A1,A2,A3,A_1, A_2, A_3, \dots. We say that AnA_n happens infinitely often (i.o.) if
{An,i.o.}={ω:mN,nmsuch thatωAn}=m=1mnmnmAn=lim supnAn=limmnmAn\begin{equation*}\{ A_n, \text{i.o.} \} = \{ \omega : \forall m \in \mathbb{N}, \, \, \exists n \geq m \,\, \text{such that} \, \, \omega \in A_{n} \} = \underbrace{\bigcap_{m=1}^{\infty}}_{\forall m} \underbrace{\bigcup_{n \geq m}}_{\exists n \geq m} A_n = \limsup_{n \rightarrow \infty} A_n = \lim_{m \rightarrow \infty} \bigcup_{n \geq m} A_n\end{equation*}
and we say that AnA_n happens eventually, that is, for all but finitely many nn,
{An,eventually}={ω:mN,nmsuch thatωAn}=m=1mnmnmAn=lim infnAn=limmnmAn\begin{equation*}\{ A_n, \text{eventually} \} = \{ \omega : \exists m \in \mathbb{N}, \,\, \forall n \geq m \, \, \text{such that} \, \, \omega \in A_{n} \} = \underbrace{\bigcup_{m=1}^{\infty}}_{\exists m} \underbrace{\bigcap_{n \geq m}}_{\forall n \geq m} A_n = \liminf_{n \rightarrow \infty} A_n = \lim_{m \rightarrow \infty} \bigcap_{n \geq m} A_n\end{equation*}
Suppose we focus on two subsets within the interval (0,1](0,1] : A1=(0,12]A_1 = (0, \frac{1}{2}] and A2=(12,1]A_2 = (\frac{1}{2}, 1]. We define the sequence AnA_n
An={(0,12],n=2k+1,kN(12,1],n=2k,kN.\begin{equation*}A_n = \begin{cases} (0, \frac{1}{2}] , & n = 2k + 1, \quad k \in \mathbb{N} \\ (\frac{1}{2}, 1], & n = 2k, \quad k \in \mathbb{N} .\end{cases}\end{equation*}
With this sequence, we can compute the upper limit lim supnAn\limsup_{n \rightarrow \infty} A_n
{An,i.o.}=m=1nmAn=(0,1]\begin{equation*}\{ A_n, \text{i.o.} \} = \bigcap_{m=1} \bigcup_{n \geq m} A_n = (0, 1]\end{equation*}
and the lower limit lim infnAn\liminf_{n \rightarrow \infty} A_n
{An,eventually}=m=1nmAn=.\begin{equation*}\{ A_n, \text{eventually} \} = \bigcup_{m=1} \bigcap_{n \geq m} A_n = \emptyset\end{equation*}.
THE FIRST PART OF BOREL-CANTELLI LEMMA
If n=1P(An)<\sum_{n=1}^{\infty} \mathbb{P}(A_n) < \infty then P(Ani.o.)=0\mathbb{P}(A_n \, \text{i.o.}) = 0.
\quad Proof. If we denote Gm=nmAnG_m = \bigcup_{n \geq m} A_n then we can see that Gm+1GmG_{m+1} \subset G_m and Gmlim supnAnG_m \downarrow \limsup_{n \rightarrow \infty} A_n. So using the continuity property of P\mathbb{P}:
P(An,i.o.)=P(m=1Gm)=limmP(Gm)=limmP(nmAn)limmnmP(An)\begin{equation*}\mathbb{P}(A_n, \text{i.o.}) = \mathbb{P}\left(\bigcap_{m=1}^{\infty} G_m\right) = \lim_{m \rightarrow \infty} \mathbb{P}(G_m) = \lim_{m \rightarrow \infty} \mathbb{P}\left(\bigcup_{n \geq m} A_n\right) \leq \lim_{m \rightarrow \infty} \sum_{n \geq m} \mathbb{P}(A_n)\end{equation*}
where the last inequality is due the sub-additivity property of P\mathbb{P}. When n=1P(An)<\sum_{n=1}^{\infty} \mathbb{P}(A_n) < \infty, the right-hand side becomes 00, and we get P(Ani.o.)=0\mathbb{P}(A_n \, \text{i.o.}) = 0. \Box
Plot of 1x1 - x and exp(x)\exp(-x)
THE SECOND PART OF BOREL-CANTELLI LEMMA
If the events AnA_n are independent and n=1P(An)=\sum_{n=1}^{\infty} \mathbb{P}(A_n) = \infty , then P(Ani.o.)=1\mathbb{P}(A_n \, \text{i.o.}) = 1
\quad Proof. If A1,A2,...A_1, A_2, ... are independent, so are A1,A2,...\overline{A}_1, \overline{A}_2, .... Hence for NnN \geq n we have,
P(k=nNAck)=k=nNP(Ack)=k=nN(1P(Ak))\begin{equation*}\mathbb{P} \left( \bigcap^N_{k=n} {A^c}_k \right) = \prod^N_{k=n} \mathbb{P}({A^c}_k) = \prod^N_{k=n} (1 - \mathbb{P}(A_k))\end{equation*}
using 1xexp(x) 1-x \leq \exp(-x)
k=nN(1P(Ak))k=nNexp(P(Ak))=exp(k=nNP(Ak))asN\begin{equation*}\prod^N_{k=n} (1 - \mathbb{P}(A_k)) \leq \prod^N_{k=n} \exp(-\mathbb{P}(A_k)) = \exp \left( - \sum^N_{k=n} \mathbb{P}(A_k) \right) \quad \text{as} \quad N \rightarrow \infty\end{equation*}
Hence
P ⁣(k=nNAkc)exp ⁣(k=nNP(Ak))NP ⁣(k=nAkc)=0,\begin{equation*}\mathbb{P}\!\left(\bigcap_{k=n}^N A_k^c\right)\le\exp\!\left(-\sum_{k=n}^N \mathbb{P}(A_k)\right)\overset{N\to\infty}{\Longrightarrow}\mathbb{P}\!\left(\bigcap_{k=n}^{\infty} A_k^c\right)=0,\end{equation*}
because k=nP(Ak)=\sum_{k=n}^{\infty}\mathbb{P}(A_k)=\infty. Therefore
P ⁣(k=nAk)=1P ⁣(k=nAkc)=1.\begin{equation*}\mathbb{P}\!\left(\bigcup_{k=n}^{\infty} A_k\right)=1-\mathbb{P}\!\left(\bigcap_{k=n}^{\infty} A_k^c\right)=1.\end{equation*}
for all nn, and since
k=nAkn=1k=nAk=lim supnAn,\begin{equation*}\bigcup^{\infty}_{k=n} A_k \downarrow \bigcap_{n=1}^{\infty}\bigcup^{\infty}_{k=n} A_k = \limsup_{n\to\infty} A_n,\end{equation*}
it follows that P(Ani.o.)=1\mathbb{P}(A_n \, \text{i.o.}) = 1. \Box

The foolowing example is Theorem 2.3.5, page 59 from [Durrett2019]

THEOREM
Let X1,X2,X_1, X_2, \ldots be i.i.d. with EX=μ\mathbb{E}X = \mu and EX4<\mathbb{E}X^4 < \infty. If Sn=X1++XnS_n = X_1 + \cdots + X_n, then Sn/nμS_n/n \to \mu a.s.
\quad Proof. By letting Xi=XiμX_i' = X_i - \mu, we can suppose without loss of generality that μ=0\mu = 0. Now
ESn4=E(i=1nXi)4=E1i,j,k,lnXiXjXkXl\begin{equation*}\mathbb{E}S_n^4 = \mathbb{E}\left( \sum_{i=1}^n X_i \right)^4 = \mathbb{E} \sum_{1 \leq i,j,k,l \leq n} X_i X_j X_k X_l\end{equation*}
Terms in the sum of the form E(Xi3Xj)\mathbb{E}(X_i^3 X_j), E(Xi2XjXk)\mathbb{E}(X_i^2 X_j X_k), and E(XiXjXkXl)\mathbb{E}(X_i X_j X_k X_l) are 00 (if i,j,k,li,j,k,l are distinct) since the expectation of the product is the product of the expectations because of the independence, and in each case one of the terms has expectation 0. The only terms that do not vanish are those of the form EXi4\mathbb{E}X_i'^4 and EXi2Xj2=(EXi2)2\mathbb{E}X_i'^2 X_j'^2 = (\mathbb{E}X_i'^2)^2. There are nn and 3n(n1)3n(n - 1) of these terms, respectively. In the second case, we can pick the two indices in n(n1)/2n(n - 1)/2 ways, and with the indices fixed, the term can arise in a total of six ways: {1,2},{1,3},{1,4},{2,3},{2,4}\{ 1, 2\}, \{ 1, 3\}, \{ 1, 4\}, \{ 2, 3\}, \{ 2, 4\} and {3,4}\{ 3, 4\}. The last observation implies
ESn4=nEXi4+3(n2n)(EXi2)2Cn2\begin{equation*}\mathbb{E}S_n^4 = n\mathbb{E}X_i^4 + 3(n^2 - n)(\mathbb{E}X_i^2)^2 \leq Cn^2\end{equation*}
where C<C < \infty. Chebyshev’s inequality gives us
P(Snn>ε)E(Sn4)(nε)4C(n2ε4)\begin{equation*}\mathbb{P}(\frac{|S_n|}{n} > \varepsilon) \leq \frac{\mathbb{E}(S_n^4)}{(n\varepsilon)^4} \leq \frac{C}{(n^2 \varepsilon^4)}\end{equation*}
Summing on nn and using the Borel-Cantelli lemma gives P(Sn>nε i.o.)=0\mathbb{P}(|S_n| > n\varepsilon \text{ i.o.}) = 0 or n0,nn0Snnε \exists n_0, \, \forall n \geq n_0 \: \frac{|S_n|}{n} \leq \varepsilon. Since ε\varepsilon is arbitrary, the proof is complete. \Box