Borel-Cantelli Lemma

Last updated: 2026-04-12

The Borel-Cantelli Lemma is a fundamental result in probability theory that provides criteria for determining whether the events in an infinite sequence occur infinitely often or only finitely often.
Theorem (Borel-Cantelli Lemma)
If $\sum_{n=1}^{\infty} P(A_n) < \infty$, then $P\{ A_n \, \text{i.o.}\} = 0$.
If the events $A_n$ are independent and $\sum_{n=1}^{\infty} P(A_n) = \infty$, then $P\{ A_n \, \text{i.o.}\} = 1$.

Motivation

The Borel-Cantelli lemma provides powerful insights into the behavior of infinite sequences of events. Let's explore this through three illuminating examples of coin tossing experiments.

Symmetric coin

Consider an infinite sequence of independent tosses of a fair coin, represented by random variables $X_1, X_2, \ldots$, where each toss has probability $P(\text{Head}) = \frac{1}{2}$. Will we see infinitely many heads, or will the sequence eventually contain only tails?
Example of Symmetric coin
While our intuition suggests we should see infinitely many heads, the Borel-Cantelli lemma provides mathematical certainty. Consider:
\begin{equation*}\sum_{k=1}^{\infty} P \left( X_k = \text{Head} \right) = \sum_{k=1}^{\infty} \frac{1}{2} = \infty\end{equation*}
Since the tosses are independent and this series diverges, the second part of the lemma confirms our intuition: with probability 1 (almost surely), we will observe infinitely many heads.
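A short Monte Carlo sketch can illustrate this (assuming nothing beyond the Python standard library; any long finite prefix of the infinite sequence should keep producing heads):

```python
import random

random.seed(0)

# Simulate a long but finite prefix of the infinite fair-coin sequence.
# The second Borel-Cantelli lemma predicts infinitely many heads almost
# surely, so heads should keep appearing arbitrarily far into the prefix.
n_tosses = 10_000
tosses = [random.random() < 0.5 for _ in range(n_tosses)]

heads = sum(tosses)
last_head = max(k for k, h in enumerate(tosses, start=1) if h)
print(f"heads observed: {heads}, position of last head: {last_head}")
```

In a typical run roughly half the tosses are heads and the last head appears very close to the end of the prefix, consistent with heads occurring infinitely often.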

Biased coin 1

Now consider a more intriguing case where the probability of heads decreases with each toss: $P(X_n = \text{Head}) = \frac{1}{n}$. Despite this decreasing probability, the Borel-Cantelli lemma reveals that:
\begin{equation*}\sum_{k=1}^{\infty} P \left( X_k = \text{Head} \right) = \sum_{k=1}^{\infty} \frac{1}{k} = \infty\end{equation*}
This is the harmonic series, which diverges. Therefore, remarkably, we will still see infinitely many heads almost surely, even though the probability of heads becomes arbitrarily small!
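This case is easy to probe numerically. The sketch below (standard library only) tosses coin $n$ with success probability $1/n$; heads become rare but, per the second Borel-Cantelli lemma, never stop entirely:

```python
import random

random.seed(1)

# Toss coin n with P(head) = 1/n.  The success probabilities sum like the
# harmonic series (divergent), so by the second Borel-Cantelli lemma heads
# occur infinitely often almost surely, despite becoming ever rarer.
n_tosses = 100_000
head_positions = [n for n in range(1, n_tosses + 1) if random.random() < 1 / n]

print(f"number of heads: {len(head_positions)}")
print(f"last head seen at toss: {head_positions[-1]}")
```

The expected number of heads in the first $N$ tosses is the harmonic number $H_N$ (about 12 for $N = 100{,}000$), so only a handful of heads appear, yet they keep turning up at ever larger positions.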

Biased coin 2

Finally, consider an even more extreme bias where $P(X_n = \text{Head}) = \frac{1}{n^2}$. Here, the Borel-Cantelli lemma reveals a fundamentally different behavior:
Example of Biased coin 2
\begin{equation*}\sum_{k=1}^{\infty} P \left( X_k = \text{Head} \right) = \sum_{k=1}^{\infty} \frac{1}{k^2} = \frac{\pi^2}{6} < \infty\end{equation*}
Since this series converges, the Borel-Cantelli lemma tells us that, almost surely, there exists some finite $N$ after which we will never see another head. The probability of heads decreases so rapidly that only finitely many heads will occur.
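The contrast with the previous case shows up immediately in simulation (again a standard-library sketch, not a proof):

```python
import random

random.seed(2)

# Toss coin n with P(head) = 1/n^2.  The success probabilities sum to the
# finite value pi^2/6, so by the first Borel-Cantelli lemma only finitely
# many heads occur almost surely: eventually the sequence is all tails.
n_tosses = 100_000
head_positions = [n for n in range(1, n_tosses + 1) if random.random() < 1 / n**2]

print(f"heads observed at positions: {head_positions}")
```

A typical run produces just one or two heads, all near the very start of the sequence, matching the prediction that the heads stop for good after some finite point.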

What is the limit of a sequence of events

Before discussing the two Borel-Cantelli Lemmas, we first need to define some technical terms related to an infinite sequence of events. Consider the sequence of events $A_1, A_2, A_3, \dots$. We say that $A_n$ happens infinitely often (i.o.) if
\begin{align*}\{ A_n, \text{i.o.} \} &= \{ \omega : \forall m \in \mathbb{N}, \, \, \exists n(\omega) \geq m \,\, \text{such that} \, \, \omega \in A_{n(\omega)} \} = \\&= \bigcap_{m=1}^{\infty} \bigcup_{n \geq m} A_n = \limsup_{n \rightarrow \infty} A_n = \lim_{m \rightarrow \infty} \bigcup_{n \geq m} A_n\end{align*}
and we say that $A_n$ happens ultimately (ult.), i.e. for all but finitely many $n$, if
\begin{align*}\{ A_n, \text{ult.} \} &= \{ \omega : \exists m(\omega) \in \mathbb{N} \,\, \text{such that} \,\, \forall n \geq m(\omega), \, \, \omega \in A_{n} \} = \\&= \bigcup_{m=1}^{\infty} \bigcap_{n \geq m} A_n = \liminf_{n \rightarrow \infty} A_n = \lim_{m \rightarrow \infty} \bigcap_{n \geq m} A_n\end{align*}
Example
The concept of the limit of an infinite sequence of events can be intricate. To better understand this, consider a simple example that fits well with the definitions of the limit superior ($\limsup$) and limit inferior ($\liminf$). Suppose we focus on two subsets within the interval $(0,1]$: $A_1 = (0, \frac{1}{2}]$ and $A_2 = (\frac{1}{2}, 1]$. We define the whole sequence $A_n$ as follows:
\begin{equation*}A_n = \begin{cases} (0, \frac{1}{2}] , & n = 2k + 1, \quad k \in \mathbb{N} \\ (\frac{1}{2}, 1], & n = 2k, \quad k \in \mathbb{N} .\end{cases}\end{equation*}
With this sequence, we can compute the upper limit $\limsup_{n \rightarrow \infty} A_n$
\begin{equation*}\{ A_n, \text{i.o.} \} = \bigcap_{m=1}^{\infty} \bigcup_{n \geq m} A_n = (0, 1]\end{equation*}
and the lower limit $\liminf_{n \rightarrow \infty} A_n$
\begin{equation*}\{ A_n, \text{ult.} \} = \bigcup_{m=1}^{\infty} \bigcap_{n \geq m} A_n = \emptyset .\end{equation*}
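The two set computations above can be verified mechanically. The sketch below discretizes $(0,1]$ into a grid and truncates the sequence after $N$ terms (the tail unions and intersections stabilize after two terms here, since every tail contains both an odd- and an even-indexed set):

```python
# Finite-resolution check of the limsup/liminf example: odd-indexed sets
# are (0, 1/2], even-indexed sets are (1/2, 1].
N = 20
grid = [k / 1000 for k in range(1, 1001)]  # sample points in (0, 1]

def A(n):
    # Odd n: (0, 1/2]; even n: (1/2, 1].
    if n % 2 == 1:
        return {x for x in grid if x <= 0.5}
    return {x for x in grid if x > 0.5}

# limsup = intersection over m of the tail unions U_{n >= m} A_n
limsup = set(grid)
for m in range(1, N):
    tail_union = set().union(*(A(n) for n in range(m, N + 1)))
    limsup &= tail_union

# liminf = union over m of the tail intersections ∩_{n >= m} A_n
liminf = set()
for m in range(1, N):
    tail_inter = set(grid)
    for n in range(m, N + 1):
        tail_inter &= A(n)
    liminf |= tail_inter

print(len(limsup) == len(grid))  # True: limsup is the whole grid, i.e. (0, 1]
print(len(liminf) == 0)          # True: liminf is empty
```

Every tail union covers the whole interval (some later set of each type always appears), while every tail intersection is empty (the two alternating sets are disjoint), exactly as computed above.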

Proof of Borel-Cantelli Lemma

Theorem (The first part of Borel-Cantelli Lemma)
If $\sum_{n=1}^{\infty} P(A_n) < \infty$ then $P\{ A_n \, \text{i.o.}\} = 0$.
\begin{proof}
\quad Proof. If we denote $G_m = \bigcup_{n \geq m} A_n$ then we can see that $G_{m+1} \subset G_m$ and $G_m \downarrow \limsup_{n \rightarrow \infty} A_n$. So, using the continuity from above of $P$:
\begin{equation*}P \{ A_n, \text{i.o.} \} = P \left( \bigcap_{m=1}^{\infty} G_m \right) = \lim_{m \rightarrow \infty} P ( G_m ) = \lim_{m \rightarrow \infty} P \left( \bigcup_{n \geq m} A_n \right) \leq \lim_{m \rightarrow \infty} \sum_{n \geq m} P (A_n)\end{equation*}
where the last inequality is due to the countable sub-additivity of $P$. When $\sum_{n=1}^{\infty} P(A_n) < \infty$, the tail sums $\sum_{n \geq m} P(A_n)$ tend to $0$ as $m \rightarrow \infty$, and we get $P\{ A_n \, \text{i.o.}\} = 0$. $\Box$
\end{proof}
The typical application of the first part of the Borel-Cantelli Lemma is to consider events $A_n$ with $\sum_{n=1}^{\infty} P(A_n) < \infty$; then from the statement $P\{ A_n \, \text{i.o.} \} = 0$ we get, by taking the complement, $P(\{ A_n \, \text{i.o.} \}^c) = 1$. The last probability can be rewritten as $P \{ \omega : \exists m(\omega) \in \mathbb{N} \,\, \text{such that} \,\, \forall n \geq m(\omega), \, \, \omega \notin A_{n} \} = 1$, which means that, almost surely, there exists $m$ such that none of the events $A_n$ with $n \geq m$ occur; in other words, only finitely many $A_n$ occur.
For example, if the probability of a head at the $n$-th coin toss equals $p_n = \frac{1}{n^2}$, then the series is convergent, and starting from some toss $m$ we observe only tails.
[Figure: plot of $1 - x$ and $\exp(-x)$, illustrating the inequality $1 - x \leq \exp(-x)$]
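The inequality $1 - x \leq \exp(-x)$ shown in the plot, which the proof below relies on, holds for all real $x$ (by convexity of $\exp$); a quick numerical spot check on a grid:

```python
import math

# Check the inequality 1 - x <= exp(-x) on a grid of x values in [-5, 5].
# A tiny tolerance absorbs floating-point rounding near the equality point x = 0.
xs = [i / 100 for i in range(-500, 501)]
assert all(1 - x <= math.exp(-x) + 1e-12 for x in xs)
print("1 - x <= exp(-x) holds on the sampled grid")
```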
Theorem (The second part of Borel-Cantelli Lemma)
If the events $A_n$ are independent and $\sum_{n=1}^{\infty} P(A_n) = \infty$, then $P\{ A_n \, \text{i.o.}\} = 1$.
\begin{proof}
\quad Proof. If $A_1, A_2, \ldots$ are independent, so are $A_1^c, A_2^c, \ldots$. Hence for $N \geq n$ we have, using $1 - x \leq \exp(-x)$:
\begin{align*}P \left( \bigcap^N_{k=n} A^c_k \right) &= \prod^N_{k=n} P(A^c_k) = \prod^N_{k=n} (1 - P(A_k)) \leq \prod^N_{k=n} \exp(-P(A_k)) \\&= \exp \left( - \sum^N_{k=n} P(A_k) \right) \rightarrow 0 \quad \text{as} \quad N \rightarrow \infty\end{align*}
Consequently
\begin{equation*}P \left( \bigcup^{\infty}_{k=n} A_k \right) = 1\end{equation*}
for all $n$, and since $\bigcup^{\infty}_{k=n} A_k \downarrow \limsup_{n \rightarrow \infty} A_n$ it follows that $P(A_n \, \text{i.o.}) = 1$. $\Box$
\end{proof}

Applications

Proof of Strong Law of Large Numbers

The following example is Theorem 2.3.5, page 59 from [Durrett2019].

Theorem
Let $X_1, X_2, \ldots$ be i.i.d. with $\mathbb{E}X = \mu$ and $\mathbb{E}X^4 < \infty$. If $S_n = X_1 + \cdots + X_n$, then $S_n/n \to \mu$ a.s.
\begin{proof}
\quad Proof. By letting $X_i' = X_i - \mu$, we can suppose without loss of generality that $\mu = 0$. Now
\begin{equation*}\mathbb{E}S_n^4 = \mathbb{E}\left( \sum_{i=1}^n X_i \right)^4 = \mathbb{E} \sum_{1 \leq i,j,k,l \leq n} X_i X_j X_k X_l\end{equation*}
Terms in the sum of the form $\mathbb{E}(X_i^3 X_j)$, $\mathbb{E}(X_i^2 X_j X_k)$, and $\mathbb{E}(X_i X_j X_k X_l)$ with distinct indices are $0$: by independence the expectation of the product is the product of the expectations, and in each case one of the factors has expectation $0$. The only terms that do not vanish are those of the form $\mathbb{E}X_i^4$ and $\mathbb{E}X_i^2 X_j^2 = (\mathbb{E}X_i^2)^2$. There are $n$ and $3n(n - 1)$ of these terms, respectively. In the second case, we can pick the two indices in $n(n - 1)/2$ ways, and with the indices fixed, the term can arise in a total of six ways, according to which pair of positions carries the first index: $\{ 1, 2\}, \{ 1, 3\}, \{ 1, 4\}, \{ 2, 3\}, \{ 2, 4\}$ and $\{ 3, 4\}$. The last observation implies
\begin{equation*}\mathbb{E}S_n^4 = n\mathbb{E}X_i^4 + 3(n^2 - n)(\mathbb{E}X_i^2)^2 \leq Cn^2\end{equation*}
where $C < \infty$. Chebyshev's inequality gives us
\begin{equation*}P\left(\frac{|S_n|}{n} > \epsilon\right) \leq \frac{\mathbb{E}(S_n^4)}{(n\epsilon)^4} \leq \frac{C}{n^2 \epsilon^4}\end{equation*}
Summing on $n$ and using the Borel-Cantelli lemma gives $P(|S_n| > n\epsilon \text{ i.o.}) = 0$, i.e. almost surely $\exists n_0$ such that $\forall n \geq n_0$, $\frac{|S_n|}{n} \leq \epsilon$. Since $\epsilon$ is arbitrary, the proof is complete. $\Box$
\end{proof}
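The conclusion of the theorem is easy to observe numerically. The sketch below (a Monte Carlo illustration, not part of the proof) uses i.i.d. Uniform$(0,1)$ draws, which have $\mu = 1/2$ and all moments finite, so the theorem applies:

```python
import random

random.seed(3)

# Strong law of large numbers, illustrated: sample means of i.i.d.
# Uniform(0, 1) draws (mean 1/2, finite fourth moment) approach mu = 0.5.
def sample_mean(n):
    return sum(random.random() for _ in range(n)) / n

for n in (10, 1_000, 100_000):
    print(n, sample_mean(n))
```

As $n$ grows the printed sample means settle ever closer to $0.5$, mirroring the almost-sure convergence $S_n/n \to \mu$.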

References

[Durrett2019] R. Durrett, Probability: Theory and Examples, 5th edition, Cambridge University Press, 2019.