Hopefully Interesting

March 14, 2008

Quantitative Borel-Cantelli Lemma

Filed under: math.CA,motivated theorems — Pietro @ 12:00 pm
Tags: ,

Everyone knows and loves the Borel-Cantelli lemma: it is widely used, intuitive and has a cute proof. Let’s recall it.

Borel-Cantelli Lemma. Let (X,\mathcal{M},\mu) be a measure space, and (E_n)_{n\in\mathbb{N}} a sequence of measurable sets such that \sum_{n\in\mathbb{N}} \mu(E_n) < \infty. Then the points of X that lie in infinitely many E_n form a set of measure zero.

Proof. We present the usual slick proof of the lemma, and postpone intuition until the development of a quantitative statement. Define F_\infty to be the set of points lying in infinitely many E_n, and notice that F_\infty = \bigcap_{n\in\mathbb{N}} \bigcup_{k\geq n} E_k. It follows that F_\infty is measurable, and contained in each cofinite union \bigcup_{k\geq n} E_k. Now, \mu(\bigcup_{k\geq n}E_k) \leq \sum_{k\geq n} \mu(E_k), and the latter goes to zero as n\to\infty, because \sum_{n\in\mathbb{N}} \mu(E_n) converges. Thus \mu(F_\infty)=0, as claimed. \Box

Alright, so the set of points lying in “very many” E_n is “very small”. A natural quantitative question to ask at this point is: how small is the set of points lying in “moderately many” E_n? More precisely, if F_k is the set of points lying in at least k of the E_n, and G_k = F_k \backslash F_{k+1} the set of points lying in exactly k of the E_n, can we get bounds on \mu(F_k) and \mu(G_k), depending on k?

It seems plausible. If a point x\in X lies in at least k of the sets E_n, adding up all the \mu(E_n) seems to be “counting” x at least k times. Therefore, intuitively, \sum_{n\in\mathbb{N}} \mu(E_n) should be at least k\cdot\mu(F_k), for each k, which gives an O(1/k) upper bound on \mu(F_k). Since G_k is a subset of F_k, the same bound applies.

This is consistent with the usual Borel-Cantelli lemma as k\to\infty, and indeed, in the finitary case, we have a

Counting lemma. Suppose S is a finite set, endowed with the structure of a measure space (S,2^S,\#), where \# is the counting measure (ie, \# A is the number of elements in A). Let E_1,\ldots,E_N be subsets of S, and define F_k, G_k as above. Then

\sum_{n=1}^N \# E_n = \sum_{k=1}^N \# F_k = \sum_{k=1}^N k\cdot \# G_k

Proof. This is a very proof-friendly lemma. One may use induction on N, or on the cardinality of S, or even reason visually, by counting edges in an appropriate bipartite graph. \Box

Since F_{k+1} \subseteq F_k, we easily get, for each m,

\sum_{n=1}^N \# E_n = \sum_{k=1}^N \# F_k \geq \sum_{k=1}^m \# F_k \geq m\cdot \# F_m \Longrightarrow \# F_m \leq \frac{1}{m} \sum_{n=1}^N \# E_n.

The trouble with extending this analysis to general measure spaces is that, typically, \mu(A) has very little to do with the number of elements in A, which at any rate might be infinite. In fact, are F_k and G_k even measurable in general? Fortunately, yes.

It suffices to establish measurability of all the F_k, and the idea is just a finitary version of the argument expressing F_\infty, above, as a union of intersections. Let \mathfrak{E}_k be the (denumerable) set of k-element subsets of \{E_n : n\in\mathbb{N}\}. Obviously, if \mathcal{E} \in \mathfrak{E}_k, then \cap \mathcal{E} is measurable, whence \bigcup_{\mathcal{E}\in\mathfrak{E}_k} \cap\mathcal{E} is measurable. But that’s exactly F_k.

But still, it isn’t clear how to adapt the proofs of the counting lemma to the general conjecture

\sum_{n\in\mathbb{N}} \mu(E_n) = \sum_{k\in\mathbb{N}} \mu(F_k) = \sum_{k\in\mathbb{N}} k\cdot \mu(G_k).

Let’s take a look at the three straightforward proofs of the c.l. Two of them somehow rely on the “individuality” of each point: induction on the cardinality of S proceeds by adding a point to some of the E_n, and seeing how the sums change; and edge-counting reinterprets the sum of \# E_n as a sum of certain functions (vertex degree) over the points. In a general measure space, there may be infinitely many points, and they may each have measure zero, so those lines of argument don’t look too appealing.

The third proof we had was induction on the number of E_n, which isn’t exactly applicable, as there are denumerably many of them; but a standard move in analysis can help us here, namely, “try to ignore small things and see if simpler patterns emerge”. Since the sum \sum_{n\in\mathbb{N}} \mu(E_n) converges, we know that not only do the individual \mu(E_n) become small as n\to\infty, but a cofinite set of them becomes small.

So let’s assume, for the moment, that there are only finitely many E_1,\ldots,E_N. Maybe later we can disregard the contributions of all but finitely many E_n, and manage an “approximation by \varepsilon” argument.

This looks tractable! For instance, if there are only E_1 and E_2, we have

\begin{array}{rcl} \mu(E_1) + \mu(E_2) & = & \mu(E_1 \cup E_2) + \mu(E_1 \cap E_2) \\ & = & \mu((E_1 \cup E_2) \backslash(E_1 \cap E_2)) + 2\cdot\mu(E_1 \cap E_2). \end{array}

Obviously, F_1 = E_1 \cup E_2 and G_1 = (E_1 \cup E_2) \backslash (E_1 \cap E_2), while F_2 = G_2 = E_1 \cap E_2. Thus the conjecture holds when there are only two sets E_1,E_2; more than that, the relevant structures become obvious. What we should consider, instead of single points, is atoms of the \sigma-algebra generated by E_1,\ldots,E_N — which is just a fancy way of saying “regions of the Venn diagram of the E_n“. More explicitly, the 2^N possible intersections \tilde{E_1} \cap \ldots \cap \tilde{E_N}, where each \tilde{E_n} is either E_n or its complement.

At this point we may even forget the complicated idea of induction (and the myriad ways in which a new E_{N+1} can intersect the old E_n) and focus on proving an analogue of the counting lemma where the elements of S are allowed to have “weights”, and subsets of S are assigned the sum of the weights of their elements, instead of just their number. More precisely:

Weighted counting lemma. Let S be a finite set, E_1,\ldots,E_N some of its subsets, and define F_k,G_k as before. Let w : S \to \mathbb{R} be a “weight function” on elements of S, and for A \subseteq S define w(A) = \sum_{a\in A} w(a). Then

\sum_{n=1}^N w(E_n) = \sum_{k=1}^N w(F_k) = \sum_{k=1}^N k\cdot w(G_k)

First proof. One way to go is induction on the cardinality of S: add a new guy in, call him s, and say he’s in m of the E_n. Observe that the leftmost sum increases by m\cdot w(s) (w(s) for each E_n that s is in). Now, w(G_m) increases by w(s), as does each of F_1,\ldots,F_m, while the other F_k,G_k remain unchanged. Therefore, the middle and rightmost sums also increase by m\cdot w(s). \Box

Second proof. Or we may take the graph-theoretic approach, defining a bipartite graph G=P+Q, the vertices of which are the elements of S on one hand, and the sets E_n on the other (P and Q, respectively). Add an edge between s and E_n of weight w(s) whenever s\in E_n. Calculate the total edge weight in two different ways to get equality between the leftmost and rightmost terms; equality between the middle and rightmost terms is straightforward. \Box

Looking at the atoms of the \sigma-algebra generated by E_1,\ldots,E_N as “weighted elements” of the E_n, w.c.l. trivially gives

Quantitative Borel-Cantelli (finite case). Let E_1,\ldots,E_N be measurable subsets of a measure space X, and define F_k,G_k as before. Then

\sum_{n=1}^N \mu(E_n) = \sum_{k=1}^N \mu(F_k) = \sum_{k=1}^N k\cdot \mu(G_k).

Ideally, we should be able to take some sort of limit as N\to\infty and prove the general case. The more obvious way to do that is disregard all but finitely many E_n, which introduces an arbitrarily small error in the \sum\mu(E_n) term. However, F_k and G_k are sort of spread out over the E_n, and truncating the latter series does not straightforwardly correspond to a truncation of the former.

We take our next clue from writing out, in full, the proof of quantitative B-C via w.c.l., the second proof in particular. The productive thing to do there was split up each E_n along the G_k, in the sense that

E_i \cap G_1 = E_i \cap \bigcap_{j\neq i}E_j^c;

E_i \cap G_2 = E_i \cap \bigcup_{j\neq i} \left( E_j \cap \bigcap_{k\neq i,j}E_j^c \right);

et cetera. So we try that now: let E_{n,k} = E_n \cap G_k. Notice that E_n = \bigcup_{k\in\mathbb{N}}E_{n,k} except for E_n \cap G_\infty, which has measure zero by the original Borel-Cantelli. (Henceforth we shall ignore F_\infty and G_\infty completely.) Since the G_k are disjoint, we get \mu(E_n) = \sum_{k\in\mathbb{N}} \mu(E_{n.k}), and, as the sum of all \mu(E_n) converges absolutely, we get

\sum_{n\in\mathbb{N}} \mu(E_n)  = \sum_{n\in\mathbb{N}} \sum_{k\in\mathbb{N}} \mu(E_{n,k}) = \sum_{k\in\mathbb{N}} \sum_{n\in\mathbb{N}} \mu(E_{n,k}).

Hence, if we can prove the plausible-sounding statement

\sum_{n\in\mathbb{N}} \mu(E_{n,k}) = k\cdot\mu(G_k)

we’ll be halfway done. Here we’re so close to the finitary case that the same combinatorial argument works. Fix k and recall that \mathfrak{E}_k is the (denumerable) set of k-element families of E_n, for instance,

\mathfrak{E}_2 = \{ \ \{E_1,E_2\},\{E_1,E_3\}, \ldots \{E_i,E_j\}, \ldots\}.

For \mathcal{E}\in\mathfrak{E}_k, let E_{\mathcal{E},k} = \bigcap\mathcal{E}\cap G_k. For fixed k, all E_{\mathcal{E},k} are disjoint, and their union is obviously G_k, whence

\sum_{\mathcal{E}\in\mathfrak{E}_k} \mu(E_{\mathcal{E},k}) = \mu(G_k).

Now, each E_{n,k} is also the disjoint union of a few E_{\mathcal{E},k}, namely, those for which E_n\in\mathcal{E}; and further, each E_{\mathcal{E},k} is contained in exactly k of the E_n, one for each member of \mathcal{E}. Therefore,

\sum_{n\in\mathbb{N}} \mu(E_{n,k}) = k \cdot \sum_{\mathcal{E}\in\mathfrak{E}_k} \mu(E_{\mathcal{E},k}) = k\cdot\mu(G_k).

This gets us half our theorem. The other half, as in the finite case, can be established more easily as follows. Since G_k = F_k \backslash F_{k+1},

\begin{array}{rcl}\sum_{k=1}^N k\cdot\mu(G_k) & = & (\mu(F_1) - \mu(F_2)) + 2\cdot (\mu(F_2) - \mu(F_3)) + \ldots + N\cdot(\mu(F_N) - \mu(F_{N+1}))\\ & = & \sum_{k=1}^N \mu(F_k) - N\cdot\mu(F_{N+1})\leq \sum_{k=1}^N \mu(F_k)\\ & \leq & \sum_{k\in\mathbb{N}} \mu(F_k).\end{array}

On the other hand, it is clear that F_k = \bigcup_{l\geq k} G_l, the union being disjoint. It follows that \mu(F_k) = \sum_{l\geq k} \mu(G_l), and so

\sum_{k=1}^N \mu(F_k) = \sum_{k=1}^N k\cdot\mu(G_k) + N\cdot\sum_{k\geq N} \mu(G_k) \leq \sum_{k\in\mathbb{N}} k\cdot\mu(G_k).

Taking limits as N\to\infty in both inequalities above finally gets us the

Quantitative Borel-Cantelli Lemma. Let (X,\mathcal{M},\mu) be a measure space, (E_n)_{n\in\mathbb{N}} a family of measurable sets such that \sum_{n\in\mathbb{N}}\mu(E_n) < \infty, F_k = \{x\in X : x \textrm{ is in at least }k\textrm{ of the }E_n\} and G_k =F_k\backslash F_{k+1}. Then

\sum_{n\in\mathbb{N}} \mu(E_n) = \sum_{k\in\mathbb{N}} \mu(F_k) = \sum_{k\in\mathbb{N}} k\cdot\mu(G_k).

In particular, we have the bounds

\mu(G_k)\leq\mu(F_k) \leq \frac{1}{k}\sum_{n\in\mathbb{N}} \mu(E_n)

\lim_{k\to\infty}k\cdot\mu(F_k) = 0.

Proof. We’ve done everything but make explicit the limit in the previous line. It’s just a consequence of

\lim_{N\to\infty}\sum_{k=1}^N k\cdot\mu(G_k) = \lim_{N\to\infty} \sum_{k=1}^N \mu(F_k) - N\cdot\mu(F_{N+1}) = \lim_{N\to\infty} \sum_{k=1}^N \mu(F_k). \Box


Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Create a free website or blog at WordPress.com.

%d bloggers like this: