The proofs I’ve seen of Lusin’s theorem go through Egorov’s theorem; that’s how Stein–Shakarchi, Folland and Wikipedia do it. I don’t feel it is a particularly elegant proof, and it produces a truncated version of Lusin’s theorem, which holds only for sets of finite measure. A simple σ-finiteness argument takes care of this shortcoming, but one is left with the feeling that Egorov’s theorem is our hammer, and we’re trying to see Lusin’s as a nail. Hence the motivation for a different proof. I hope you will also appreciate how there are no real obstacles in the route below; everything that needs to be done can be done nearly without thinking.

First, let us state the theorem at a fairly low level of generality.

**Theorem. (Lusin)** Let $E \subseteq \mathbb{R}^d$ be measurable, and $f : E \to \mathbb{R}$ a measurable function. Then, for each $\varepsilon > 0$ there is a closed set $F \subseteq E$ such that $E \setminus F$ has measure at most $\varepsilon$ and the restriction of $f$ to $F$ is continuous. (In the induced topology, of course.)

If one weren’t taught to be scared of arbitrary measurable functions as very complex objects, the first thing to attempt would be to remove bits of $E$ where $f$ was discontinuous until none were left. Surprisingly, this very simple-minded approach turns out to work.

What are witnesses to the discontinuity of $f$? One possibility is points $x$ such that there is some sequence $(x_n)$ in $E$ converging to $x$ but for which $f(x_n)$ does not converge to $f(x)$. However, measure theory is fundamentally countable in nature: we are only able to control processes that are repeated a countable number of times. On the other hand, we will not get anywhere by removing a countable number of points.

So what is another witness? Thinking back to the topological definition of continuity, we come upon the following: $f$ is not continuous if there is some open set $V$ such that $f^{-1}(V)$ is not open in $E$. So we would be all right if we could fix up all the inverse images $f^{-1}(V)$, for open sets $V$, to be open. In fact, by basic topology, it’s enough to fix up all the $f^{-1}(V)$ for *basic* open sets $V$, and $\mathbb{R}$ admits a countable base for its standard topology.

Well, how *do* you fix these up? Suppose $f^{-1}(V)$ is not open in $E$, and we want to make it open by removing bits of $E$; when would that work? We need to find a set $S$ of removed points such that $f^{-1}(V)$ is open in $E \setminus S$ — that is, such that there is an open set $U$ with $f^{-1}(V) = U \cap (E \setminus S)$.

That’s easier than it sounds. If we just take any open set $U$ containing $f^{-1}(V)$, the problem is that $U \cap E$ might contain points besides those in $f^{-1}(V)$. We fix this by removing $S = (U \cap E) \setminus f^{-1}(V)$ from $E$. Then we will clearly have $f^{-1}(V) = U \cap (E \setminus S)$ and so $f^{-1}(V)$ is open in $E \setminus S$.

The good news is, since continuity is preserved by restrictions, we are free to run the above procedure for as many open sets $V$ as we want, and each time we do it we won’t be screwing up our previous work. In other words, if $V_1, V_2, \ldots$ is a countable base for the topology of $\mathbb{R}$, and $U_n$ is an open set containing $f^{-1}(V_n)$, then removing $S_1 = (U_1 \cap E) \setminus f^{-1}(V_1)$, $S_2 = (U_2 \cap E) \setminus f^{-1}(V_2)$, etc. from $E$ will make $f$ successively “more continuous”, and finally the restriction of $f$ to $E \setminus \bigcup_n S_n$ will be fully continuous.

Now we just need to make the $S_n$ very small, say of measure less than $\varepsilon / 2^n$; then the sum of the measures of the removed bits will be at most $\varepsilon$. This choice of the $U_n$ is guaranteed by the definition of measurability or a standard theorem, depending on your approach. You can take “good outer approximations by open sets” as the definition of measurability, or you can take Caratheodory’s “good splittings” definition, and get good approximations by open sets as an easy theorem. Either way, the proof is finished.
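To spell out the bookkeeping in one line (a sketch, in my notation): with $V_1, V_2, \ldots$ a countable base for $\mathbb{R}$, choose each open $U_n \supseteq f^{-1}(V_n)$ by outer approximation so that the removed bit $S_n = (U_n \cap E) \setminus f^{-1}(V_n)$ satisfies

```latex
\[
\mu(S_n) \;<\; \frac{\varepsilon}{2^n},
\qquad\text{whence}\qquad
\mu\Bigl(\bigcup_{n \ge 1} S_n\Bigr)
\;\le\; \sum_{n \ge 1} \mu(S_n)
\;<\; \sum_{n \ge 1} \frac{\varepsilon}{2^n}
\;=\; \varepsilon .
\]
```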

One final nitpick: we haven’t proved that, after removing all the junk, we’re left with a closed set. In fact, we usually won’t be. But we’re certainly left with a measurable set, so we can use the standard theorem on good inner approximations by closed sets and be done.


The standard move for infinite $P$ would be Zorn’s lemma. By going carefully through the second part of our proof, where we “concentrate” each color on top of only one chain, we see that it can be adapted to arbitrary $P$ without too much trouble: instead of looking for the squiggle with the smallest dark red element in its upper monochromatic set, we choose one which has “arbitrarily small” dark red elements there, and put all the dark red on top of it. This gives us the incremental part of a standard argument by Zorn’s Lemma, but our increment potentially involves a lot of fiddling with the chains, so we will probably not be able to define a very demanding order on the set of partially-built structures. As we saw, this will probably make it difficult to prove the existence of upper bounds for metachains.

So, let’s scale back the difficulty and consider the case of denumerable $P$; with countable sets, we can usually build stuff “one element at a time”. Suppose, then, that $P$ is a countable poset without antichains bigger than $k$. Let $A = \{a_1, \ldots, a_k\}$ be an antichain of maximum size, and start with a partial decomposition of $P$ into chains $C_1, \ldots, C_k$, where $C_i = \{a_i\}$ initially. Put $P = \{x_1, x_2, x_3, \ldots\}$ for definiteness. Using the finite form of Dilworth’s theorem, whereby each finite subset of $P$ can be arranged into $k$ chains, we hope to find such a decomposition of all of $P$.

Let’s just bulldoze our way forward and see what goes wrong. Stick $x_1$ in whatever $C_i$ it will fit ($x_1$ must be comparable to at least one $a_i$, otherwise we’d have a $(k+1)$-antichain), and continue by putting each $x_n$ in whatever $C_i$ is possible. If this can be done indefinitely, we get a decomposition of $P$. Otherwise, we run into trouble at some $x_n$, meaning each $C_i$ contains, at that point, some element not comparable to $x_n$.

An obvious reaction is to use Dilworth’s: we know that $\{x_1, \ldots, x_n\}$, together with $A$, can in fact be put consistently into $k$ chains. We do that, and continue on our merry way.
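This “bulldoze, then repair” loop is easy to sketch for a finite poset, with a brute-force search standing in for finite Dilworth (all names below are mine, not from the original argument):

```python
from itertools import product

def repair(elems, leq, k):
    """Brute-force 'finite Dilworth': try every assignment of elems to k chains."""
    for colors in product(range(k), repeat=len(elems)):
        chains = [[e for e, c in zip(elems, colors) if c == i] for i in range(k)]
        if all(leq(a, b) or leq(b, a) for ch in chains for a in ch for b in ch):
            return chains
    raise ValueError("some antichain has more than k elements")

def decompose(elems, leq, k):
    """Add each element to a compatible chain; when none fits, redo the prefix."""
    chains = [[] for _ in range(k)]
    placed = []
    for x in elems:
        placed.append(x)
        for ch in chains:
            if all(leq(y, x) or leq(x, y) for y in ch):
                ch.append(x)
                break
        else:  # trouble: every chain holds something incomparable to x
            chains = repair(placed, leq, k)
    return chains

# divisibility on {2, 3, 6, 4}, presented in this order: inserting 4 forces a repair
divides = lambda a, b: b % a == 0
chains = decompose([2, 3, 6, 4], divides, k=2)
```

The demo order is chosen so that the greedy phase paints itself into a corner (6 lands on top of 2, leaving no home for 4) and the repair step has to reshuffle the prefix.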

Unfortunately, it’s not that easy. Decomposing ever larger subsets of $P$ into chains is not enough: we need to harness that information into a *single* decomposition of $P$. Let me spell it out in the simple case.

Suppose we don’t run into trouble adding elements of $P$ to the $C_i$, one at a time. How exactly do we specify a decomposition of $P$ from that? That is, given $x_n \in P$, to which $C_i$ does it belong? What we had in mind was pretty simple: it belongs to whatever $C_i$ it was assigned to, when it was its turn to be added!

Simple as it may be, we must answer a couple of questions. First, is that decomposition well-defined, unambiguous? In this case, yes, since each $x_n$ is assigned a unique $C_i$ in the trouble-free case. Second, is each $C_i$ really a chain when all is said and done? Also yes: suppose $x_m, x_n \in C_i$, with $m < n$. By construction, when $x_n$ was added to $C_i$ (at step $n$), it was comparable to all the elements that had been assigned to $C_i$ at that point. In particular, since $x_m$ was added earlier (at step $m$), $x_n$ must have been comparable to $x_m$.

Contrast this with the case where we do run into trouble, say at $x_n$. Invoking Dilworth’s will cause a few among $x_1, \ldots, x_{n-1}$ to jump around between the $C_i$. If we keep doing this indefinitely, who’s to say that all the elements eventually settle down and can be assigned a definite $C_i$? And if they don’t, what should we assign them?

Maybe if we take more care in assigning each $x_n$ to a $C_i$, we won’t have to change it afterwards. That is, we won’t have to apply Dilworth again and again. So what should we look out for?

It isn’t very clear how to express the obstructions to avoid in a succinct form, e.g. involving only a few kinds of sentence about $P$. Therefore, we resort to another venerable custom of mathematics: just state what you want, flat out, and use the structure arising from that alone.

There are $k$ possible colors for $x_1$: $1, 2, \ldots, k$. We want to choose the one which runs into the least trouble. Suppose coloring $x_1$ with $c$ inevitably runs into trouble at $x_{n_c}$, in the sense that any extension into a coloring of $\{x_1, \ldots, x_{n_c}\}$ has incomparable elements of the same color. A natural choice for the color of $x_1$ would be that with the biggest $n_c$, i.e., that which runs into trouble the latest.

A pleasant fact now asserts itself: there must be a color $c$ without a corresponding $n_c$! That is, a choice of color for $x_1$ which admits arbitrarily large extensions. It’s obvious once you think about it: if all $k$ colors ran into dead-ends, there would be a largest dead-end $x_N$, and we would be unable to color beyond $x_N$. However, we know that all finite subsets, including $\{x_1, \ldots, x_N\}$, admit consistent colorings.

Another way to see it is the following: let $c_1, c_2, c_3, \ldots$ be the sequence of colors assigned to $x_1$ in consistent colorings of $\{x_1\}$, $\{x_1, x_2\}$, etc. Since there are only $k$ possible colors, at least one must be repeated infinitely many times.

That seems as good a color to give $x_1$ as any; say it is $c$. What next? Well, since there are arbitrarily large colorings with $x_1$ painted $c$, we can run the argument again using only those, and choose a color for $x_2$ which admits arbitrarily large extensions. And so forth.

To sum up, here’s how we define a coloring $f$ of $P$. For each $n$, let $f_n$ be a coloring of $\{x_1, \ldots, x_n\}$, that is, a map into $\{1, \ldots, k\}$ such that two equally colored elements are comparable. Let $c_1$ be a color such that arbitrarily large $f_n$ have $f_n(x_1) = c_1$, and let $(f_{n_j})$ be the subsequence of those which do. Inductively, given a subsequence of colorings, all of which assign the same first $m$ colors, choose a further subsequence whose elements also assign the same $(m+1)$-th color. This is possible because there are only $k$ different colors. Finally, let the “actual” color $f(x_m)$ be that assigned by the $m$-th subsequence.

This process assigns each $x_m$ a definite color, and the result is consistent. Indeed, suppose $m < n$ and that $x_m$ and $x_n$ are assigned the same color by $f$. Any coloring $f_j$ in the $n$-th subsequence with $j \ge n$ assigns both $x_m$ and $x_n$ their “actual” colors; thus $x_m$ and $x_n$ are assigned the same color by $f_j$, which, being a consistent coloring, makes $x_m$ and $x_n$ comparable.

So that’s it for countable $P$. For general $P$, much the same idea works, with the caveat that we recast the argument in the language of ordinals (for the “step-by-step” angle) or nets in topological spaces (for the “subsequences” angle). We briefly work out the second approach. Let $\mathcal{F}$ be the set of finite subsets of $P$, and give it the structure of a directed set via inclusion; an upper bound for two elements is just their set-theoretic union. Give $\{1, \ldots, k\}^P$ the product topology (over the discrete topology in $\{1, \ldots, k\}$) and define a net $(f_F)_{F \in \mathcal{F}}$ by setting $f_F$ equal to a consistent coloring of $F$ (given by finite Dilworth) plus some random coloring of $P \setminus F$. Since $\{1, \ldots, k\}^P$ is compact by Tychonoff’s theorem, there must be a convergent subnet $f_{F_\alpha} \to f$. It is straightforward to check that $f$ is a consistent coloring of all of $P$.

There is some crucial finiteness phenomenon underlying the results and arguments above, and it is more clearly seen in the contrapositive form of Dilworth’s theorem: if a poset *cannot* be decomposed into $k$ chains, then it must contain an antichain of size at least $k + 1$. Therefore, to demonstrate the impossibility of such a decomposition, it suffices to pick out a finite *witness*: a large, but finite, antichain of $P$. Such a witness may be expressed by a finite number of statements of the form “$x$ is not comparable to $y$”.

These considerations suggest another approach to the infinite form of Dilworth’s theorem: mathematical logic. In particular, the compactness theorem for first-order logic.

Let $L$ be a language consisting of the usual symbols for variables, logical connectives and quantifiers, plus a family of constants $c_x$, one for each $x \in P$, a binary relation symbol $\le$ and a unary function symbol $f$. We can encode all the information about $P$ in an $L$-theory $T$ consisting of the poset axioms plus the sentences:

- $c_x \le c_y$, for each pair $x, y \in P$ such that $x \le y$;
- $\lnot(c_x \le c_y)$, for each pair $x, y \in P$ such that $x \le y$ is not the case;
- $\forall v \,\bigl( f(v) = c_{a_1} \lor \cdots \lor f(v) = c_{a_k} \bigr)$, where $a_1, \ldots, a_k$ are the elements of some fixed maximum antichain;
- $\forall u \,\forall v \,\bigl( f(u) = f(v) \rightarrow (u \le v \lor v \le u) \bigr)$;

Suppose $T$ has a model $M$, with the constants realized as $c_x^M$. We claim that the map $x \mapsto c_x^M$ is an order-theoretic embedding of $P$ in $M$.

First, all the $c_x^M$ are distinct: if $x \neq y$, then one of $x \le y$, $y \le x$ fails to be the case (by antisymmetry), whence either $\lnot(c_x \le c_y)$ or $\lnot(c_y \le c_x)$ is in $T$. It follows that either $c_x^M \le c_y^M$ or $c_y^M \le c_x^M$ fails; and since $M$ is the model of a poset, $c_x^M = c_y^M$ would imply both.

Next, the first two types of sentence of $T$ obviously imply that $P$ is order-isomorphic to its image under the map $x \mapsto c_x^M$. Therefore, if it is possible to decompose $M$ into the union of $k$ chains, the same goes for $P$. But that’s just what the last two types of sentence of $T$ say, with the aid of the function symbol $f$: its realization $f^M$ splits $M$ up into $k$ subsets (one for each possible value), and each subset is a chain.

To summarize: if $T$ has a model $M$, then $M$ can be decomposed into $k$ chains, and has a copy of $P$ as a subposet; thus $P$ can be decomposed into $k$ chains.

On the other hand, if $T$ *didn’t* have a model, by the compactness theorem in first-order logic there would be some finite subtheory of $T$ that didn’t have a model. However, a finite subtheory, having only a finite number of sentences, can only mention a finite number of the constants $c_x$; and the finite subposet of $P$ corresponding to those constants, endowed with a decomposition into $k$ chains (finite Dilworth), is a model for that subtheory. Therefore, $T$ does have a model, and $P$ is decomposable, as we wished.

Next time, we’ll look at some applications of Dilworth’s theorem in combinatorics, and its relationship to various other famous theorems in that field.


**Borel-Cantelli Lemma.** Let $(X, \mathcal{A}, \mu)$ be a measure space, and $(A_n)_{n \ge 1}$ a sequence of measurable sets such that $\sum_n \mu(A_n) < \infty$. Then the points of $X$ that lie in infinitely many $A_n$ form a set of measure zero.

*Proof.* We present the usual slick proof of the lemma, and postpone intuition until the development of a quantitative statement. Define $A$ to be the set of points lying in infinitely many $A_n$, and notice that $A = \bigcap_{m \ge 1} \bigcup_{n \ge m} A_n$. It follows that $A$ is measurable, and contained in each cofinite union $\bigcup_{n \ge m} A_n$. Now, $\mu\bigl(\bigcup_{n \ge m} A_n\bigr) \le \sum_{n \ge m} \mu(A_n)$, and the latter goes to zero as $m \to \infty$, because $\sum_n \mu(A_n)$ converges. Thus $\mu(A) = 0$, as claimed.
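In my notation, the whole estimate fits in one line (with $A$ the set of points lying in infinitely many $A_n$):

```latex
\[
\mu(A)
\;\le\; \mu\Bigl(\bigcup_{n \ge m} A_n\Bigr)
\;\le\; \sum_{n \ge m} \mu(A_n)
\;\xrightarrow{\; m \to \infty \;}\; 0 .
\]
```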

Alright, so the set of points lying in “very many” $A_n$ is “very small”. A natural quantitative question to ask at this point is: how small is the set of points lying in “moderately many” $A_n$? More precisely, if $N_k$ is the set of points lying in at least $k$ of the $A_n$, and $E_k$ the set of points lying in exactly $k$ of the $A_n$, can we get bounds on $\mu(N_k)$ and $\mu(E_k)$, depending on $k$?

It seems plausible. If a point lies in at least $k$ of the sets $A_n$, adding up all the $\mu(A_n)$ seems to be “counting” that point at least $k$ times. Therefore, intuitively, $\sum_n \mu(A_n)$ should be at least $k\,\mu(N_k)$, for each $k$, which gives an upper bound on $\mu(N_k)$. Since $E_k$ is a subset of $N_k$, the same bound applies.

This is consistent with the usual Borel-Cantelli lemma as $k \to \infty$, and indeed, in the finitary case, we have a

**Counting lemma.** Suppose $X$ is a finite set, endowed with the structure of a measure space $(X, 2^X, \mu)$, where $\mu$ is the counting measure (i.e., $\mu(A)$ is the number of elements in $A$). Let $A_1, \ldots, A_N$ be subsets of $X$, and define $N_k$, $E_k$ as above. Then

\[ \sum_{n=1}^{N} \mu(A_n) \;=\; \sum_{k=1}^{N} k\,\mu(E_k) \;=\; \sum_{k=1}^{N} \mu(N_k) . \]

*Proof.* This is a very proof-friendly lemma. One may use induction on $N$, or on the cardinality of $X$, or even reason visually, by counting edges in an appropriate bipartite graph.

Since $N_1 \supseteq N_2 \supseteq \cdots \supseteq N_k$, we easily get, for each $k$,

\[ k\,\mu(N_k) \;\le\; \sum_{j=1}^{k} \mu(N_j) \;\le\; \sum_{n} \mu(A_n), \qquad \text{i.e.} \qquad \mu(N_k) \;\le\; \frac{1}{k} \sum_{n} \mu(A_n) . \]

The trouble with extending this analysis to general measure spaces is that, typically, $\mu(A)$ has very little to do with the number of elements in $A$, which at any rate might be infinite. In fact, are $N_k$ and $E_k$ even measurable in general? Fortunately, yes.

It suffices to establish measurability of all the $N_k$ (since $E_k = N_k \setminus N_{k+1}$), and the idea is just a finitary version of the argument expressing $A$, above, as an intersection of unions. Let $\mathcal{F}_k$ be the (denumerable) set of $k$-element subsets of $\mathbb{N}$. Obviously, if $F \in \mathcal{F}_k$, then $\bigcap_{n \in F} A_n$ is measurable, whence $\bigcup_{F \in \mathcal{F}_k} \bigcap_{n \in F} A_n$ is measurable. But that’s exactly $N_k$.

But still, it isn’t clear how to adapt the proofs of the counting lemma to the general conjecture

\[ \sum_{n} \mu(A_n) \;=\; \sum_{k} k\,\mu(E_k) \;=\; \sum_{k} \mu(N_k) . \]

Let’s take a look at the three straightforward proofs of the c.l. Two of them somehow rely on the “individuality” of each point: induction on the cardinality of $X$ proceeds by adding a point to some of the $A_n$, and seeing how the sums change; and edge-counting reinterprets the sum of the $\mu(A_n)$ as a sum of certain functions (vertex degrees) over the points. In a general measure space, there may be infinitely many points, and they may each have measure zero, so those lines of argument don’t look too appealing.

The third proof we had was induction on the number of sets $A_n$, which isn’t exactly applicable, as there are denumerably many of them; but a standard move in analysis can help us here, namely, “try to ignore small things and see if simpler patterns emerge”. Since the sum $\sum_n \mu(A_n)$ converges, we know that not only do the individual $\mu(A_n)$ become small as $n \to \infty$, but a *cofinite set* of them has arbitrarily small total measure.

So let’s assume, for the moment, that there are only finitely many $A_n$. Maybe later we can disregard the contributions of all but finitely many $A_n$, and manage an “approximation by finite truncations” argument.

This looks tractable! For instance, if there are only $A_1$ and $A_2$, we have

\[ \mu(A_1) + \mu(A_2) \;=\; \mu(E_1) + 2\,\mu(E_2) \;=\; \mu(N_1) + \mu(N_2) . \]

Obviously, $N_1 = A_1 \cup A_2$ and $N_2 = E_2 = A_1 \cap A_2$, while $E_1 = N_1 \setminus N_2$. Thus the conjecture holds when there are only two sets; more than that, the relevant structures become obvious. What we should consider, instead of single points, is atoms of the $\sigma$-algebra generated by the $A_n$ — which is just a fancy way of saying “regions of the Venn diagram of the $A_n$“. More explicitly, the possible intersections $B_1 \cap B_2 \cap \cdots$, where each $B_n$ is either $A_n$ or its complement.

At this point we may even forget the complicated idea of induction (and the myriad ways in which a new $A_n$ can intersect the old ones) and focus on proving an analogue of the counting lemma where the elements of $X$ are allowed to have “weights”, and subsets of $X$ are assigned the sum of the weights of their elements, instead of just their number. More precisely:

**Weighted counting lemma.** Let $X$ be a finite set, $A_1, \ldots, A_N$ some of its subsets, and define $N_k$, $E_k$ as before. Let $w$ be a “weight function” on elements of $X$, and for $A \subseteq X$ define $w(A) = \sum_{x \in A} w(x)$. Then

\[ \sum_{n=1}^{N} w(A_n) \;=\; \sum_{k=1}^{N} k\,w(E_k) \;=\; \sum_{k=1}^{N} w(N_k) . \]

*First proof.* One way to go is induction on the cardinality of $X$: add a new guy in, call him $x$, and say he’s in $j$ of the $A_n$. Observe that the leftmost sum increases by $j\,w(x)$ ($w(x)$ for each $A_n$ that $x$ is in). Now, $w(E_j)$ increases by $w(x)$, as does each of $w(N_1), \ldots, w(N_j)$, while the other terms remain unchanged. Therefore, the middle and rightmost sums also increase by $j\,w(x)$.

*Second proof*. Or we may take the graph-theoretic approach, defining a bipartite graph $G$, the vertices of which are the elements of $X$ on one hand, and the sets $A_n$ on the other (P and Q, respectively). Add an edge between $x \in P$ and $A_n \in Q$, of weight $w(x)$, whenever $x \in A_n$. Calculate the total edge weight in two different ways to get equality between the leftmost and rightmost terms; equality between the middle and rightmost terms is straightforward.
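A quick numerical sanity check of the weighted identity (the names are mine; taking all weights equal to 1 recovers the plain counting lemma):

```python
import random

def weighted_sums(universe, sets, w):
    """The three quantities of the weighted counting lemma:
    sum_n w(A_n),  sum_k k*w(E_k),  sum_k w(N_k)."""
    count = {x: sum(x in A for A in sets) for x in universe}  # in how many A_n is x?
    lhs = sum(w[x] for A in sets for x in A)
    mid = sum(k * sum(w[x] for x in universe if count[x] == k)
              for k in range(1, len(sets) + 1))
    rhs = sum(sum(w[x] for x in universe if count[x] >= k)
              for k in range(1, len(sets) + 1))
    return lhs, mid, rhs

random.seed(0)
X = range(30)
sets = [{x for x in X if random.random() < 0.4} for _ in range(5)]
w = {x: random.randint(1, 9) for x in X}  # integer weights, so equality is exact
lhs, mid, rhs = weighted_sums(X, sets, w)
assert lhs == mid == rhs
```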

Looking at the atoms of the $\sigma$-algebra generated by the $A_n$ as “weighted elements”, the w.c.l. trivially gives

**Quantitative Borel-Cantelli (finite case).** Let $A_1, \ldots, A_N$ be measurable subsets of a measure space $X$, and define $N_k$, $E_k$ as before. Then

\[ \sum_{n=1}^{N} \mu(A_n) \;=\; \sum_{k=1}^{N} k\,\mu(E_k) \;=\; \sum_{k=1}^{N} \mu(N_k) . \]

Ideally, we should be able to take some sort of limit as $N \to \infty$ and prove the general case. The more obvious way to do that is to disregard all but finitely many $A_n$, which introduces an arbitrarily small error in the $\sum_n \mu(A_n)$ term. However, $E_k$ and $N_k$ are sort of spread out over all the $A_n$, and truncating the latter series does not straightforwardly correspond to a truncation of the former.

We take our next clue from writing out, in full, the proof of quantitative B-C via w.c.l., the second proof in particular. The productive thing to do there was split up each $A_n$ along the $E_k$, in the sense that

\[ A_1 = (A_1 \cap E_1) \cup (A_1 \cap E_2) \cup (A_1 \cap E_3) \cup \cdots ; \]

\[ A_2 = (A_2 \cap E_1) \cup (A_2 \cap E_2) \cup (A_2 \cap E_3) \cup \cdots ; \]

et cetera. So we try that now: let $A_{n,k} = A_n \cap E_k$. Notice that $A_n = \bigcup_k A_{n,k}$ except for the points of $A_n$ lying in infinitely many of the sets, which form a set of measure zero by the original Borel-Cantelli. (Henceforth we shall ignore those points completely.) Since, for fixed $n$, the $A_{n,k}$ are disjoint, we get $\mu(A_n) = \sum_k \mu(A_{n,k})$, and, as the sum of all the $\mu(A_{n,k})$ converges absolutely, we get

\[ \sum_{n} \mu(A_n) \;=\; \sum_{n} \sum_{k} \mu(A_{n,k}) \;=\; \sum_{k} \sum_{n} \mu(A_n \cap E_k) . \]

Hence, if we can prove the plausible-sounding statement

\[ \sum_{n} \mu(A_n \cap E_k) \;=\; k\,\mu(E_k) , \]

we’ll be halfway done. Here we’re so close to the finitary case that the same combinatorial argument works. Fix $k$ and recall that $\mathcal{F}_k$ is the (denumerable) set of $k$-element families of indices, for instance,

\[ \{1, 3, 7\} \in \mathcal{F}_3 . \]

For $F \in \mathcal{F}_k$, let $E_F$ be the set of points lying in $A_n$ for each $n \in F$ and in no other $A_n$. For fixed $k$, all $E_F$ are disjoint, and their union is obviously $E_k$, whence

\[ \mu(E_k) \;=\; \sum_{F \in \mathcal{F}_k} \mu(E_F) . \]

Now, each $A_n \cap E_k$ is also the disjoint union of a few $E_F$, namely, those for which $n \in F$; and further, each $E_F$ is contained in exactly $k$ of the sets $A_n \cap E_k$, one for each member of $F$. Therefore,

\[ \sum_{n} \mu(A_n \cap E_k) \;=\; \sum_{n} \sum_{F \ni n} \mu(E_F) \;=\; k \sum_{F \in \mathcal{F}_k} \mu(E_F) \;=\; k\,\mu(E_k) . \]

This gets us half our theorem. The other half, as in the finite case, can be established more easily as follows. Since $\mu(N_k) = \mu(E_k) + \mu(N_{k+1})$, a simple induction gives

\[ \sum_{k=1}^{K} \mu(N_k) \;=\; \sum_{k=1}^{K} k\,\mu(E_k) \;+\; K\,\mu(N_{K+1}) . \]

On the other hand, it is clear that $N_{K+1} = \bigcup_{j > K} E_j$ up to a set of measure zero, the union being disjoint. It follows that $0 \le K\,\mu(N_{K+1}) \le \sum_{j > K} j\,\mu(E_j)$, and so

\[ \sum_{k=1}^{K} k\,\mu(E_k) \;\le\; \sum_{k=1}^{K} \mu(N_k) \;\le\; \sum_{k=1}^{\infty} k\,\mu(E_k) . \]

Taking limits as $K \to \infty$ in both inequalities above finally gets us the

**Quantitative Borel-Cantelli Lemma.** Let $(X, \mathcal{A}, \mu)$ be a measure space, $(A_n)_{n \ge 1}$ a family of measurable sets such that $\sum_n \mu(A_n) < \infty$, and define $N_k$, $E_k$ as above. Then

\[ \sum_{n} \mu(A_n) \;=\; \sum_{k} k\,\mu(E_k) \;=\; \sum_{k} \mu(N_k) . \]

In particular, we have the bounds

\[ \mu(E_k) \;\le\; \mu(N_k) \;\le\; \frac{1}{k} \sum_{n} \mu(A_n) . \]

*Proof*. We’ve done everything but make explicit the bound in the previous line. It’s just a consequence of

\[ k\,\mu(N_k) \;\le\; \sum_{j=1}^{k} \mu(N_j) \;\le\; \sum_{j} \mu(N_j) \;=\; \sum_{n} \mu(A_n) . \]


Here’s the statement:

**Theorem 1.** Let $P$ be a partially ordered set (a *poset*) in which the largest antichain has size $k$. Then it is possible to decompose $P$ into the union of $k$ chains.

The key terms are chain and antichain; here are quick definitions. We say that two elements $x, y$ are *comparable* if either $x \le y$ or $y \le x$ is the case. Recall that, in a partial order, not all pairs of elements are required to be comparable — hence the term, partial.

A *chain* in $P$ is a subset of $P$ where any two elements are comparable. An *antichain*, on the other hand, is a subset of $P$ such that no two of its elements are comparable. If we give the divisibility order to the set of natural numbers (whereby $x \le y$ iff $x$ divides $y$), then the set of powers of $2$ is a chain, whereas any set of distinct primes is an antichain.
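The divisibility example is easy to test mechanically (a small sketch; the helper names are mine):

```python
def divides(a, b):
    """The divisibility order on positive integers: a <= b iff a divides b."""
    return b % a == 0

def is_chain(s):
    return all(divides(a, b) or divides(b, a) for a in s for b in s)

def is_antichain(s):
    return all(not divides(a, b) and not divides(b, a)
               for a in s for b in s if a != b)

assert is_chain({1, 2, 4, 8, 16})      # powers of 2: each divides the next
assert is_antichain({2, 3, 5, 7, 11})  # distinct primes: none divides another
assert not is_chain({2, 3})            # 2 and 3 are incomparable
```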

Intuitively, the chains of a poset are long “noodles”, like below:

Antichains, on the other hand, are independent sets, and the largest antichain is a sort of “diameter” of the poset. In the above picture, each “floor” of the diagram is an antichain.

If we would like to decompose $P$ into a union of chains, it’s obvious we need at least as many chains as there are elements in the largest antichain; any fewer, and two incomparable elements would have to be in the same chain. Trying out a few cases like the above, it’s plausible to conjecture we don’t need any more:

There are a few routes we may choose in our search for a proof. First, as our initial optimistic conjecture (and the statement of Theorem 1) places no restriction on $P$, it may be an infinite set. A very large infinite set. And for building structures on infinite sets there’s nothing like Zorn’s lemma.

A prototype proof along these lines would need

- a suitable set of “partially-built” structures, e.g. the set of decompositions of *subsets* of $P$ into chains;
- a notion of ordering for those, which could be a straightforward “extends” relation, e.g. “each chain in one decomposition is contained in a chain of the other decomposition”;
- an upper bound on each chain of partial structures, e.g. the decomposition into the union of the chains;
- an incremental argument, which, given a partial structure which does not encompass all of $P$, adds something to it.

To avoid confusion of the senses of “chain” and “order” — as applied to the original poset, and to the set of partially-built structures — I will use the terms *metachain* and *metaorder* for the latter.

Getting back on track: there is always a “law of conservation of difficulty” in such situations. If we choose the metaorder in (2) to be very demanding, so that the upper bound in (3) is e.g. a simple union of the elements of the metachain (like the hypothetical (2) would allow), we become severely restricted in our incremental arguments: tweak a partial structure a little, and the result winds up not extending, but being incomparable with the original version. On the other hand, if the metaorder is very permissive, increments become easy, but even elements of the same metachain start to have little in common, and one can’t harness them (as in a simple union) to produce an upper bound.

Let’s go with the idea in (1) and try out different metaorders. To formalize, let $A = \{a_1, \ldots, a_k\}$ be a maximum antichain in $P$, and define

\[ \mathcal{D} \;=\; \bigl\{ (C_1, \ldots, C_k) \;:\; \text{each } C_i \text{ is a chain in } P \text{ and } a_i \in C_i \bigr\} . \]

Our hope is that $\mathcal{D}$ contains some element whose chains cover $P$.

We define two orders on $\mathcal{D}$:

$(C_1, \ldots, C_k) \preceq_1 (C'_1, \ldots, C'_k)$ iff, for each $i$, $C_i \subseteq C'_i$;

$(C_1, \ldots, C_k) \preceq_2 (C'_1, \ldots, C'_k)$ iff $C_1 \cup \cdots \cup C_k \subseteq C'_1 \cup \cdots \cup C'_k$.

Obviously, the first order is a lot more demanding than the second. In particular, the easy union argument applies to the former: if $\{(C_1^\alpha, \ldots, C_k^\alpha)\}_\alpha$ is a metachain under the first order, then $(\bigcup_\alpha C_1^\alpha, \ldots, \bigcup_\alpha C_k^\alpha)$ is an upper bound for it. On the other hand, given a tuple of chains whose union *doesn’t* cover $P$, it’s hard to see how to extend it.

Suppose $x$ doesn’t belong to any $C_i$, and can’t be straightforwardly inserted into one — say there’s at least one element in each $C_i$ that isn’t comparable with $x$. Then we’re pretty much lost: if we remove elements from any $C_i$, to make way for the new $x$, the resulting tuple of chains will fail to be metacomparable with the original one.

To avoid “bad choices” leading up to such a dead-end, we could restrict the set $\mathcal{D}$ to “good partial choices” in some way — but how? Let’s save this approach for later and try something else.

For the easygoing second metaorder, it looks as if we could more easily increment a partial structure: as long as we add a new element to some $C_i$, and just move elements around between them, without removing any, we’re guaranteed to get a larger partial structure. Even if we make some “bad choices” along the way, we may be able to cleverly rearrange the elements among the $C_i$ and manage to fit the new guy in. However, a metachain under this order can contain very diverse elements, and the straightforward union idea will likely break the chain property of each $C_i$, so it’s unclear how to get upper bounds on general metachains.

There is one case where our inability to get upper bounds is not a problem: when $P$ is a finite set. Then we can just run the incremental part of the argument a bunch of times and prove the theorem. Besides, proving it first for finite $P$ would be taking Pólya’s advice: “if there’s a problem you can’t solve, there is an easier problem you *can* solve.”

One other piece of insight that might prod us in this direction is the observation that partial orders can be incredibly intricate: many non-trivial structures, from a lattice of subgroups to the state space of a distributed system, can be represented by a partial order. Therefore, we may expect the construction of an explicit decomposition of a partial order into chains to be very algorithmic in nature, that is, to proceed by building and changing partial structures that cannot be described simply (in the Kolmogorov complexity sense, for instance).

So let’s jump right in. Let $P$ be our finite poset, $A = \{a_1, \ldots, a_k\}$ one of its maximum antichains, and suppose we have a tuple of chains $(C_1, \ldots, C_k)$, with $a_i \in C_i$, whose union is not quite all of $P$. Let $x$ belong to none of the chains. If there is a $C_i$ such that $x$ is comparable to all of its elements, we just stick $x$ in there and move on. Otherwise, in each $C_i$ there are elements not comparable to $x$.

And they have usable structure, too. In each $C_i$, among the elements not comparable to $x$, there is a greatest and a least, because $C_i$ is a chain. Moreover, no element of $C_i$ between them is comparable to $x$, otherwise transitivity would give a contradiction. We present the situation schematically below. Each color (red, green, blue) represents a chain, and dots further up represent greater elements. Dark hues are elements not comparable to $x$.

Now, notice that any dark antichain (composed only of dark elements) can be extended by adding in $x$. Since our poset has no antichains bigger than $k$, it follows that there are no dark antichains with more than $k - 1$ elements. If this doesn’t scream inductive hypothesis, I don’t know what does.

Heeding the screams, we decompose the subposet of dark elements into $k - 1$ chains, which we picture as black squiggles:

Notice that this breaks each $C_i$ into an upper and a lower half, the elements above and below the dark part, respectively. (Either, or both, may be empty.)

We want to somehow rearrange this mess into tidy chains. Recall that, by construction, $x$ is comparable to all the brightly-colored elements, being less than the ones above it, and greater than the ones below. This means we can stick $x$ in a chain in many different ways; just choose any upper half, any lower half, and connect them through $x$:

The next step is almost too obvious to be wrong. Here we have $k$ “upper halves” of chains, $k$ “lower halves”, and precisely $k$ candidates for connecting stuff: $x$ itself, and the $k - 1$ dark chains. We just have to prove the ends can be tied up properly.

But we run into a little trouble. Choose one of the dark chains at random; where can we tie its top and bottom ends? Offhand, the most obvious thing to do is match colors: tie the top end to the upper half which is the same color as the dark chain’s greatest element, and similarly for the bottom end. By construction, we’re guaranteed to wind up with a chain:

But what happens if the top ends of the squiggles repeat colors? That is, what if two or more dark chains have largest elements of the same color? Then we can’t do the obvious tie-ups, and there don’t seem to be any clever tie-ups, either. This is where the proof becomes “algorithmic” for the second time (the first was the recursive bit, at the inductive hypothesis).

The main observation is that we can “concentrate” all the similarly-colored top ends in just one place — wherever the top end is smallest. Indeed, suppose two squiggles have “dark red” tops, and consider in each the longest run of consecutive dark red elements, starting from the top. That is, we look not only at the top element, but see how far down the dark red goes, before some other color appears. Call these “monochromatic top sets”. Since they are finite and of the same color, their union is a chain and has a least element, say $m$. Now just move both top sets to whichever squiggle contained $m$ in the first place. Since $m$ was in a monochromatic top set, it is larger than all the non-dark-red elements of its squiggle; and since it was the least element of the union, everything we added is also larger than the non-dark-red stuff. In other words, the result is still a chain. As they say, a picture is worth a thousand words:

We may iterate this procedure until all the “dark red” is sitting on top of a single squiggle (dark chain). Then we do it for the other colors, and also for the bottom ends. In the end, each color will sit on top of at most one squiggle, and lie at the bottom of at most one squiggle. This allows us to perform the obvious tie-ups discussed earlier, and still have upper and lower half-chains left over. Finally, we connect one of each through $x$ (remember him?), and tie up any remaining halves directly. Voilà! We managed to stick one more element into a $k$-chain decomposition of $P$, and the inductive step is complete.

So we’ve managed to prove Theorem 1 for finite $P$. In the following post, we’ll see how the infinite case drops out easily, and discuss the relationship of Dilworth’s theorem with other famous theorems from combinatorics.


- ( is open in );
- ( is closed in );
- (the interior of A);
- (the closure of A).

They’ve been saving me a lot of time and thought since then, like notation’s supposed to. Witness . The closure symbol, in particular, has ended the ambiguity with , which often denotes the complement of in other contexts. It’s easy to know which is meant if you think about it, but this sort of thing should be run by the cerebellum.


In any case, we have to develop some new intuitions if we’re going to expect, rather than be surprised by, the Koch snowflake, nonmeasurable sets, and families of curves that have positive area but only at the endpoints.

The story begins with the naïve picture of a closed plane curve (a loop) we had in our minds before we learned analysis. Most of us thought of it as something similar to a circle, with a dent or two thrown in for generality perhaps. Our “general” plane curve looked a bit like a kidney, the body of a guitar, or a mix of both:

A few intuitive facts suggest themselves upon inspection of this picture:

- The curve itself, the boundary of the kidney, has no area; more precisely, it has zero plane measure.
- However, it seems to have a *length*. We imagine the curve as a piece of string we can lift up, straighten out, and measure against a ruler.
- If the curve doesn’t cross itself (like a figure eight does), then it separates the plane into two parts, the “interior” and “exterior” of the curve.

Are these observations *true*?

The one correct answer to every single question in the world, including this one, is: it depends on what we mean by the terms employed. Interestingly, different meanings of the phrase “closed plane curve”, all of them reasonable, give different answers to each of the three questions above.

One standard meaning is “the image of a continuously differentiable map from the unit circle into the plane”. Here, “circle” refers to the outer edge of the familiar figure, that is, the round outline that the pencil traces out. The interior of the figure I call a “disk”. (Thus, a circle is to a disk as a ring is to a frisbee.) I prefer to have the domain be a circle instead of the more traditional closed unit interval in order to avoid repeating, every time I want the curve to be closed (which will be pretty much always), the condition that the two endpoints map to the same place. Also, special issues of differentiability at the endpoints don’t arise: the whole description is more symmetrical and elegant.

If we use this strong meaning of closed plane curve, 1–3 are answered in the affirmative, proving which is a nifty calculus exercise (especially #3). Even #2, which logically demands a definition of length before it can be approached, yields to the most natural definition that I know of, and which will be the topic of a future post. Not wishing to spoil the reader’s fun, we concentrate on somewhat more general meanings of “closed plane curve”.

**Definition.** A *closed plane curve* is the image of a continuous map from the unit circle into the plane, which we will have occasion to write as $f: S^1 \to \mathbb{R}^2$. A *simple* closed plane curve is one for which the map is injective: these are curves which don’t cross themselves. It’s usual to denote them by $f: S^1 \hookrightarrow \mathbb{R}^2$. (Notice how the arrow is different.) Since we shall be working exclusively in $\mathbb{R}^2$, I’ll drop the adjective “plane” from now on.

We might hope that these curves possess properties 1–3. After all, what’s a continuous curve? It’s just a differentiable curve with a few creases, right? How could it not have a length, and worse still, how could it have an area?

Well, it turns out that continuous functions can do some pretty unexpected things. Without differentiability, the map $f$ can stretch arbitrarily small arcs of the circle into disproportionately large segments of the curve, making its total length infinite, while still lying in a bounded region of the plane. (A technical way to put this is that, without differentiability, there need be no Lipschitz condition.) The Koch snowflake is an example of this situation, and the reader may find some enjoyment in proving that the snowflake really is a closed plane curve, according to our definition.
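To see the lengths diverge concretely, here is a quick numeric sketch (the function name and the choice of a unit side are mine, not the post’s): each Koch step replaces every edge by four edges a third as long, multiplying the perimeter by 4/3, while the whole figure stays inside a fixed bounded region.

```python
from fractions import Fraction

# Perimeter of the n-th Koch snowflake iteration, starting from an
# equilateral triangle of unit side.  Each step replaces every edge
# by four edges a third as long, so the perimeter is multiplied by
# 4/3 per step; exact fractions avoid any rounding.
def koch_perimeter(n):
    return 3 * Fraction(4, 3) ** n

perims = [koch_perimeter(n) for n in range(6)]
```

The perimeters grow without bound, even though every iteration fits inside the same bounded region of the plane.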

Even more amazing is that the circle can be ingeniously stretched to cover an entire square. Not just the four edges, mind you, but the whole thing. If you have not heard about this, I recommend reading about space-filling curves, preferably in Hans Sagan’s wonderful book. (The price is a bit steep for this smallish book; I read it at the library.)

Since this topic is extensively discussed on many websites and in an entire textbook, I shall not take it up in the traditional manner; I will simply assume that you have a passing acquaintance with it, and try out some variations. Nor will I elaborate on the fact that the third property — that of separating the plane into two regions — is the only one that remains valid in the general setting of continuous $f$. That proof alone could fill a post thrice this size.

Alright. Suppose we are baffled by the phenomenon of space-filling curves, and would like to put our finger on where exactly our intuition went wrong. How is it that something so *one-dimensional* as the circle can be stretched and deformed into something *two-dimensional*?

One natural candidate for suspicion is the map $f$, described in Wikipedia, that does the trick. After all, the image of the circle is only as one-dimensional as $f$ makes it. Indeed, we know that an arbitrary map can turn a circle (or line) into almost anything else, from a plane square to a nine-dimensional ball, since all these sets of points have the cardinality of $\mathbb{R}$, and are thus in bijective correspondence with each other. Therefore, if we are to be legitimately surprised, we must inquire into the nature of $f$: in what sense does it *preserve* the one-dimensional character of a circle?

Since dimensionality of curves and plane regions sounds like a topological concept, we may start by examining how faithfully $f$ preserves the topology of $S^1$. Continuity alone is not delicate enough, since continuous maps may collapse an arbitrary object to a single point. It’s a bit surprising that one can “build” stuff with continuous maps, instead of just “collapsing” it, but we’d be *flabbergasted* if the space-filling map turned out to be that most faithful of topological morphisms: a homeomorphism. Our intuitive notion of dimension would be all but chucked out the window.

Fortunately, that isn’t the case. It’s easy to see that $f$ isn’t a homeomorphism, simply because it’s quite strongly non-injective. What’s more, it’s *impossible* for any homeomorphism to turn $S^1$ into a plane square. This is also kind of unexpected: we can’t map the “thin” space $S^1$ continuously onto the “fat” square without repeating values — which, intuitively, only “spends” our already very “scarce” domain set.

What happens is that, for a curve to fill up a square without repeating values (ie, without intersecting itself), it would have to dodge itself in ingenious ways. One can imagine that, after filling up half the square, the curve may sort of paint itself into a corner, and after filling up 99%, this will almost surely happen. Theorem 2, below, shows that such a situation would actually occur as soon as the curve filled up *any* tiny disk, no matter how small. In preparation, we need the simple

**Lemma 1.** If $X, Y$ are topological spaces, with $X$ compact and $Y$ Hausdorff, and $f: X \to Y$ is a continuous bijection, then $f$ is a homeomorphism.

**Proof.** We show that $f$ is a closed map. Being bijective, it will then be a homeomorphism. Indeed, if $C$ is a closed subset of $X$, then $C$ is compact. Therefore, $f(C)$ is compact. Since $Y$ is Hausdorff, $f(C)$ is closed in $Y$.
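Written as a single chain of standard implications, the whole proof is:

```latex
C \subseteq X \text{ closed}
\;\Rightarrow\; C \text{ compact (closed subset of a compact space)}
\;\Rightarrow\; f(C) \text{ compact (continuous image of a compact set)}
\;\Rightarrow\; f(C) \text{ closed in } Y \text{ (compact subset of a Hausdorff space)}.
```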

**Theorem 2.** Consider a continuous injective map $f: S^1 \to \mathbb{R}^2$. The image curve, $f(S^1)$, has empty interior.

**First proof.** Since $S^1$ is compact and $\mathbb{R}^2$ is Hausdorff, lemma 1 applies: $f$ is an embedding, ie a homeomorphism between $S^1$ and $f(S^1)$ in the induced topology.

Suppose $f(S^1)$ contains an open disk $D$, centered at $p$. Let $C$ be a circle inside $D$.

Since $C$ is closed and connected, its inverse image under $f$ is either a point, a closed arc, or all of $S^1$; since $C$ is uncountable and $f$ is injective, it can’t be a point. This means that, as $f$ “traces out” its curve, it can never leave a circle “unfinished”; ie, if $f$ touches $C$, it must trace out the whole of $C$ before moving on. We then have two easy contradictions.

First, by tracing out $C$, $f$ comes back to the point where it started the tracing. If the inverse image of $C$ is a closed arc in $S^1$, then the two endpoints of that arc map to the same starting/ending point in $C$, which hurts injectivity. On the other hand, if the inverse image is all of $S^1$, then the image curve is simply $C$, which has empty interior. Either way, we’re done.

Alternatively, let $C_1 \subset C_2 \subset C_3$ be concentric circles inside $D$, each contained in the next. Suppose we start drawing $f$’s curve and, after tracing out $C_2$, touch $C_1$ before touching $C_3$. Then we’re on the “inside” of $C_2$ but $C_3$ is still missing from the image. However, by the Jordan curve theorem, we can’t go outside $C_2$ to finish the job. (This proof ending is circular, because the Jordan curve theorem already contains the statement of our theorem. Still, it gives some intuition.)

**Second proof.** Suppose $f(S^1)$ contains the open disk $D$, centered at $p$, of radius $R$. Let $C_r$ be the circle of center $p$ and radius $r$. Clearly, $D$ is the disjoint union of all the $C_r$ with $0 \le r < R$. Since the $C_r$ are connected, their inverse images in $S^1$ must be single points or arcs. The ones which are not points contain *open* arcs.

Now, there are uncountably many inverse images $f^{-1}(C_r)$, and only one of them consists of a single point (the one with $r = 0$). Therefore, there are uncountably many pairwise disjoint open sets in $S^1$. But $S^1$ is second-countable, a contradiction.

**Third proof.** Suppose $f(S^1)$ contains an open disk. Then one can remove two points of the disk from it (and thus from $f(S^1)$) while still keeping it path-connected. This is impossible for $S^1$, which any two removed points disconnect; and, by lemma 1, $f(S^1)$ is homeomorphic to $S^1$.

An easy consequence of theorem 2 is that space-filling curves can never be homeomorphic to a circle, which assuages our suspicions that the intuitive notion of dimension might be topologically inadequate. (In fact, there are purely topological notions of the dimension of a space.) We may even conjecture that *any* plane set homeomorphic to a circle has zero area. This is a bold conjecture, since topology and measure theory often don’t mix all that well, and it doesn’t follow from theorem 2, because a plane set may have empty interior but still have nonzero area, e.g. the points of $[0,1]^2$ with both coordinates irrational.

Lemma 1 suggests where to continue our search. If we want a bona fide one-dimensional object with positive area, we should look at the simple closed plane curves: injectivity is enough to guarantee that a continuous map will be a homeomorphism, and thus faithfully preserve dimensionality. So an initial idea would be to fiddle with a space-filling curve and try to remove its self-intersections, in a way that leaves it still occupying “most” of the unit square.

Let’s look at the first few iterations of one of the simpler such curves, to see what kind of self-intersections it has:

The iteration mechanism should be clear: at the $n$-th step, we picture the unit square as being divided into $4^n$ tiny subsquares, each of which contains a triangle, as in the leftmost figure (possibly turned on its side, or upside-down). We then replace the triangle with a (scaled and possibly rotated) copy of the middle figure. Note that the bottom of the outer square is also part of the curve: we’re mapping a full circle, not an interval, into the unit square.

Like most other space-filling curve constructions, ours divides the unit square into smaller subregions at each step, and subsequent iterations only refine the curve inside individual subregions. This is what allows us to prove the existence of a limit curve through completeness of the space $C(S^1, [0,1]^2)$ of continuous functions from the unit circle into the square, with the uniform metric: increasingly local changes to a curve yield a Cauchy sequence of curves.

For example, in the middle figure above, we have an approximation to the final curve, which will change in the next steps. However, the first quarter of the curve, corresponding to the bottom-left triangle and half the base of the outer square, will always remain within the bottom-left subsquare.

This gives us a clue: intersections between different quarters, sixteenths, … of the curve can only occur on the boundaries of the quarter, sixteenth, … subsquares! So what if we “push” the next iteration away from the boundary, at each step, forming “windows” like so?

This procedure removes self-intersections at each iteration, and a simple argument shows that the limit curve is also free of self-intersections. The only problem is that, unless we choose the “pushing” mechanism carefully, the limit curve may end up having zero area. For example, in the pictures above, I made the “pushing” as simple as I could: instead of splitting each square into four pieces half the size, I split them into four pieces two-fifths the size, centralized. This means that the sum of the areas of the subsquares shrinks by a factor of $4 \cdot (2/5)^2 = 16/25$ after each step. Thus the limit curve, which lies in the intersection of all the subsquares — except for the denumerably many “connecting segments”, each of which has zero area — has zero area itself.
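A sketch of the bookkeeping (the function name is mine; exact fractions avoid rounding): at each step, four subsquares of two-fifths the side replace each square, so the covered area is multiplied by $4 \cdot (2/5)^2 = 16/25$ and tends to zero.

```python
from fractions import Fraction

# Area covered by the subsquares after n steps of the simplest
# "pushing": each square is replaced by four centralized squares
# two-fifths its side, so the covered area is multiplied by
# 4 * (2/5)**2 = 16/25 per step.
def covered_area(n):
    return Fraction(16, 25) ** n

areas = [covered_area(n) for n in range(40)]
```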

What we need is a way of “pushing” which becomes very small, very quickly, so that the intersection of all the subsquares still has positive area. Since we are very free in this respect — any amount of pushing at all will remove self-intersections — we may just do it. For instance, at the $n$-th step, we split each subsquare into four pieces $\frac{1 - a_n}{2}$ times the size, where the $a_n$ are small positive reals with $\sum_n a_n < 1$. It is easily established that, if a sequence $(a_n)$ of non-negative reals has sum less than 1, then the infinite product $\prod_n (1 - a_n)$ converges and is at least $1 - \sum_n a_n > 0$.

Therefore, the latter method of pushing yields a limit curve that occupies area at least $\prod_n (1 - a_n)^2 \ge \left(1 - \sum_n a_n\right)^2 > 0$. (Technically, we would have to show that every point in the intersection of all the subsquares is also a point in the limit curve; but this follows by the standard arguments that are used to prove the space-fillingness of the usual curves, e.g. Peano’s and Hilbert’s.)
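As a numeric sanity check, here is one hypothetical choice of pushing amounts (all names and constants below are my choices, not the post’s): shrinking each subsquare by a factor $\frac{1-a_n}{2}$ with $a_n = 2^{-(n+1)}$, so that $\sum_n a_n = 1/2 < 1$, keeps the covered area above $(1 - 1/2)^2 = 1/4$ at every step.

```python
# Hypothetical pushing amounts: a_n = 2**-(n+1), so the a_n sum to
# 1/2 < 1.  After N steps the subsquares cover
#   prod_{n<=N} 4 * ((1 - a_n)/2)**2 = prod_{n<=N} (1 - a_n)**2
# of the unit square's area.
def covered_area(N):
    area = 1.0
    for n in range(1, N + 1):
        area *= (1 - 2.0 ** -(n + 1)) ** 2
    return area

# The easy product lemma guarantees a positive floor: (1 - 1/2)**2.
lower_bound = 0.25
```

The areas decrease at every step but never drop below the floor, so the intersection of all the subsquares keeps positive area.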

When it is first suggested, the mind boggles to think of the Jordan curve theorem applied to a curve of positive area. However, the pictures above give a nice intuition of what goes on: any point which is not *on* the curve gets left behind on some “windowsill”, and on those the theorem is rather understandable.

So there we have it: a homeomorphism of the unit circle onto a subset of the plane of positive area. In a follow-up post we will investigate possible definitions of the *length* of a curve, and show that, fortunately, a curve of finite length must have null plane measure.

---

Think of Pelé. When somebody asks “who is Pelé”, you might reply, “the best football player ever”. Notice that you are implicitly defining Pelé in terms of his relationship to other footballers; an observation which is often obscured by the fact that there is also an actual person you can point to, and say “that’s Pelé”.

Now suppose a toy company came out with a football trading card game, where each card represents a famous player, and contains various numerical ratings like precision, speed, stamina etc. You might now explain that Pelé (meaning the Pelé card) is “the card with highest speed and precision”. Even though there is still an actual card you call “Pelé” (and children might think of it this way), it is now clear that the thrust of the definition is quite another: what is important about the Pelé card is its relationship to the other cards, not its particular shape, or whether it’s made of paper or plastic.

Similarly, if you ask a child what is a chess pawn, they will most likely point to the actual, physical chess piece: “that’s a pawn”. Which is fine if we had asked “what is a chess pawn, in the physical world?” However, the deeper content of our question was “what is a chess pawn, *in the game of chess*?” Since grandmasters are able to play entire games in their minds, a pawn can’t be just a wooden piece. In fact, even novice players experience no confusion if pawns are replaced with (say) beans, as long as they *agree to it beforehand*. So we see that the main defining feature of chess pawns is not their wooden incarnation, nor their color, but their relation to other pieces; that is, the rules they obey in the game of chess. The physical pawn is just a memory aid.

Many of these points are obscured in everyday life because we usually have material counterparts to various concepts which would be better described in abstract terms: a card for Pelé, a wooden piece for the pawn, coins and bills for money. In mathematics, however, such counterparts are fewer in number; decimal representations are an example, but you won’t find many more. Furthermore, sticking to naïve concrete views (e.g. that ‘1’ actually *is* the number one) quickly leads to conceptual conundrums which would not arise if one took a proper view of things.

I will now give a simple mathematical example of the method, exemplified above, of defining something by its relationship to other things. Consider the sequence

S = (0.9, 0.99, 0.999, 0.9999, …)

It is, implicitly, an infinite set of numbers (that’s what ‘…’ means), but one whose structure is easily grasped. Given such a sequence S, there are two obvious numbers that one can try to define, in terms of their relationship to the members of S:

**Minimum.** The smallest number in all of S.

**Maximum.** The biggest number in all of S.

These are very like the Pelé definition. One of them is easy: the smallest number is 0.9. The other, however, is no good. Each member of the sequence is strictly larger than the previous one; therefore, there is no largest element. Thus the second definition does not pick out any number. And that’s fine! Remember how we thought about the expression ‘1/0’: it tries to specify a number by a sequence of steps, but fails. ‘The biggest of them all’ tries to specify a number by its relationship to some others, and it also fails. No big deal.

(The maximum is not 1 because 1 is not a number in the sequence. Look at it again. It’s quite explicitly composed only of *terminating* decimals less than one. It is of the utmost importance, all through our reasoning, to hold off the impulse of saying things about “infinity”. We argue only about particular elements of the sequence; only they are “really there” for us to see. If we want to come up with new ways of speaking about numbers, we must base them firmly on what is “really there”. Otherwise all speech will be devoid of meaning.)
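A small sketch of this in exact arithmetic (the helper name `s` is mine): the $n$-th element of S is $1 - 10^{-n}$, so each element is beaten by the next, and none equals 1.

```python
from fractions import Fraction

# The n-th element of S = (0.9, 0.99, 0.999, ...), as an exact
# fraction: 0.99...9 with n nines is exactly 1 - 10**-n.
def s(n):
    return 1 - Fraction(1, 10 ** n)
```

Since `s(n) < s(n + 1)` for every n, no element of S is the biggest: the “maximum” definition picks out nothing.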

Alright. Here’s a not-so-obvious number we can define by its relationship to our sequence S above:

**Popular number.** We say a number $L$ is “popular with our sequence” if, for every distance $\epsilon > 0$, however small, all elements of our sequence are less than $\epsilon$ units away from $L$, except maybe for a finite number of them. (Intuitively, “almost all” elements of S are close to $L$.)

Let’s see what this means. Pick a random number, say -1. How far from -1 are the elements of our sequence S? Well, the first element, 0.9, is at a distance of 1.9. The next, 0.99, is 1.99 units away; the third, 1.999; and so on:

distances from -1 to elements of S: (1.9, 1.99, 1.999, …)

Is -1 popular with our sequence? It seems not: the terms get farther away from -1 as we progress into S, which is not what we would expect if S “liked” the number -1. Let’s check this intuition against the definition above. It says that, in order for -1 to be popular with S, all but finitely many elements of S must be within a distance $\epsilon$ of -1, no matter what tiny $\epsilon$ we choose. To really test the popularity of -1, let’s pick an extremely tiny $\epsilon$, say 0.01. Is it the case that all but finitely many elements of S are within 0.01 of -1? Well, no. In fact, *no* elements of S are that close to -1. Therefore, -1 is not a popular number with S. If you ask whether -2, 0 or 0.5 are popular with S, the same kind of reasoning shows that they’re not.
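The $\epsilon$-check can be played with in code. The sketch below examines only a finite window of S rather than the whole infinite tail, so it illustrates the definition without proving anything; all names are mine.

```python
from fractions import Fraction

def s(n):
    # n-th element of S = (0.9, 0.99, 0.999, ...): exactly 1 - 10**-n.
    return 1 - Fraction(1, 10 ** n)

def passes_check(candidate, eps, window=1000):
    """Finite-window version of the eps-check: all but finitely many
    of the first `window` elements must be within eps of `candidate`,
    and the far-away ones must die out before the window ends.
    An illustration, not a proof."""
    far = [n for n in range(1, window + 1)
           if abs(s(n) - candidate) >= eps]
    return far == [] or far[-1] < window
```

For example, `passes_check(-1, Fraction(1, 100))` comes out false: every element of S is far from -1.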

For practice, and to understand how “all but finitely” really works, let’s try 0.998. Is it popular? Here are the distances from it to S: (0.098, 0.008, 0.001, 0.0019, 0.00199, …)

Notice how S starts out far from 0.998, gets pretty close, but then moves away. This suggests that S likes 0.998 a bit more than -1, but not enough to make it **popular**. Let’s prove that it’s not, like we did before. Choose $\epsilon = 0.005$. Are all but finitely many elements of S less than 0.005 units away from 0.998? Whoops. Yes they are. The first two elements of S are farther than that: their distances are 0.098 and 0.008, respectively. But all the others are no further than 0.002 units away, which means 0.998 does pass the check for $\epsilon = 0.005$: *all but finitely many* (in this case, all but two) elements of S are less than 0.005 units away from 0.998.

Even so, we shouldn’t give up. Recall that, for a number to be popular, it must pass the $\epsilon$-check for *any* positive $\epsilon$. Just because 0.998 passed *one* $\epsilon$-check, doesn’t mean it’s popular. (It’s just like the trading card game: just because a card has higher ratings than *one* other card, doesn’t mean it’s Pelé; it must have higher ratings than *every* other card.) In fact, as you can check, 0.998 already fails the $\epsilon$-check for $\epsilon = 0.001$.
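To see the failure at $\epsilon = 0.001$ concretely (exact fractions, names mine): from the third element on, the distance from 0.998 to $s(n) = 1 - 10^{-n}$ is $0.002 - 10^{-n}$, which is always at least 0.001.

```python
from fractions import Fraction

def s(n):
    # n-th element of S: 1 - 10**-n.
    return 1 - Fraction(1, 10 ** n)

target = Fraction(998, 1000)
eps = Fraction(1, 1000)

# For every n >= 3, |s(n) - 0.998| = 0.002 - 10**-n >= 0.001 = eps,
# so infinitely many elements of S stay at least eps away from 0.998:
# the eps-check fails.
distances = [abs(s(n) - target) for n in range(1, 30)]
```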

At this point we might worry that a popular number doesn’t exist. Given our experience with meaningless expressions like ‘1/0’, this wouldn’t be too surprising. It’s entirely possible to write something down, like an expression or a definition, which doesn’t actually stand for any number at all; we already saw this happen when we defined the **maximum** of S.

There are, in fact, some sequences other than our S for which there is no corresponding popular number. For example, the sequence (1, 2, 3, 4, …) eventually moves past any given number, and never comes back. In fact, any candidate popular number fails *all* possible $\epsilon$-checks, even for $\epsilon$ as large as a billion. This sequence really hates every single number there is. (Again, it is important to resist talking about “infinity”. The sequence (1, 2, 3, 4, …) does not “go to infinity” in this context, simply because we have not defined what “going to infinity” means. We might as well say that it goes to the beach.)

Fortunately, our initial S isn’t as misanthropic: the number 1 is quite popular with it. You may like to prove this for yourself. Here are the distances from 1 to S:

(0.1, 0.01, 0.001, …)

and the same, in a helpful way:

($10^{-1}$, $10^{-2}$, $10^{-3}$, …)

The proof consists in showing that 1 passes all $\epsilon$-checks. This happens because, for each $\epsilon > 0$, there is some negative power of ten, say $10^{-n}$, which is less than $\epsilon$. This means that all elements of S beyond the first n (finitely many) are less than $10^{-n}$, and hence less than $\epsilon$, units away from 1.
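The same argument in code (helper names mine): the elements failing the $\epsilon$-check for 1 are exactly those with $10^{-n} \ge \epsilon$, a finite initial stretch of the sequence.

```python
from fractions import Fraction

def s(n):
    # n-th element of S: 1 - 10**-n, so its distance to 1 is 10**-n.
    return 1 - Fraction(1, 10 ** n)

def exceptions(eps):
    """Indices n whose element s(n) is NOT within eps of 1.

    Since |1 - s(n)| = 10**-n, these are exactly the n with
    10**-n >= eps: always a finite initial stretch."""
    out = []
    n = 1
    while Fraction(1, 10 ** n) >= eps:
        out.append(n)
        n += 1
    return out
```

However small an $\epsilon$ we feed in, only finitely many indices come back, so 1 passes every check.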

As you probably noticed, a popular number is nothing but the limit of a sequence. Compare its definition with those for the Pelé card and the chess pawn.

Based on all this, I’ll introduce a new way of referring to numbers. Given a sequence

$A = (a_1, a_2, a_3, \ldots)$

I *define* the symbol

$\lim A$

to mean “the number which is popular with the sequence A”. In the particular case where A is a sequence of growing decimal expansions (like our original S), ie

$A = (0.d_1,\ 0.d_1 d_2,\ 0.d_1 d_2 d_3,\ \ldots)$

(each $d_i$ a digit between 0 and 9) we use the alternative symbol ‘$0.d_1 d_2 d_3 \ldots$’. Though the symbol is different, we define it to stand for the exact same thing: the number which is popular with A, or, in more standard terms, the *limit* of A.

If you’re with me so far, you probably already have a much better understanding of limits than most beginning calculus students worldwide. (And we hardly used any fancy jargon!) For example, now that I’ve defined ‘0.999…’ to be the limit of the sequence ‘(0.9, 0.99, 0.999, …)’, you would never make B’s mistake:

Person A: 0.999… = 1.

Person B: It’s not equal. The left-hand side gets infinitely close to 1, but never equals 1.

Of course, this makes as much sense as saying that Pelé isn’t the best player because there are other players which are not the best. Or that the pawn can’t be traded in for a queen, because the other pieces can’t. This kind of confusion is the result of conflating two very different things:

- A collection of objects (the sequence ‘0.9, 0.99, 0.999, …’, football players, chess pieces);
- An object defined by its relationship to that collection (the number ‘1’, Pelé, the pawn).

Alright, we’re pretty much done. If you understand everything above well enough to explain it to somebody else, then all that’s left to do is go back into the mainstream. Understanding definitions in a deep way is a vital part of mathematics, but it is also important to see why they are interesting, where they lead. Indeed, the definition of limit is an unusually fruitful one. There are

- Foundational theorems, like existence and uniqueness: a certain kind of sequence (monotonic and bounded) always has a limit, and any sequence has at most one limit. You can prove these from the definition, and uniqueness justifies our constantly saying *the* limit instead of *a* limit.
- “Niceness” theorems: if you take two sequences, S and S’, which have limits, and you define a new sequence S+S’ from the addition of corresponding terms of S and S’, then the limit of S+S’ is the sum of the limits of S and S’ separately. The same goes for multiplication (SS’).
- Interesting uses of the definition: starting with derivatives and integrals, up to nets and filters in topological spaces, the basic idea of a “limit” underlies a lot of modern analysis.
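A quick numeric illustration of the first “niceness” theorem (the sequences and names below are my choices): with $s_n \to 1$ and $t_n \to 2$, the termwise sums crowd around $1 + 2 = 3$.

```python
from fractions import Fraction

# Two sequences with known limits:
#   s_n = 1 - 10**-n  -> 1        t_n = 2 + 1/n  -> 2
def s(n):
    return 1 - Fraction(1, 10 ** n)

def t(n):
    return 2 + Fraction(1, n)

# The termwise sums should be popular with (converge to) 3,
# even though no single term equals 3.
def sum_term(n):
    return s(n) + t(n)
```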

Enjoy!

---

I don’t know about usefulness, but one can certainly say something *different*. For all their numerosity, calculus texts are remarkably homogeneous in their treatment of this elementary topic, and I see at least one point that could stand clarification.

Limits — especially expressions involving “dot dot dot”, such as 0.999… — still seem to be a source of confusion for many beginning calculus students. Since all the popular treatises give thorough mathematical definitions of the concept, the problem must lie somewhere other than mathematics. I conjecture that what’s missing is not a more easily understandable definition, but instead a straightforward discussion of how definitions work in mathematics.

To get into definitions, I’m going to start by talking about *symbols* — in particular, symbols that represent numbers.

I will assume that numbers are something that can be coherently referred to, just like tables and chairs. This does not entail that numbers exist in the same sense as tables and chairs, since it is also possible to refer coherently to Santa Claus: everyone agrees that he wears red and owns a sled.

Nevertheless, the content of today’s post depends only on the weak assumption that people can talk about numbers — via decimal representations, or fingers raised up in one hand, for instance — and agree with each other. In fact, today’s post merely clarifies that limits are a way of referring to numbers. *How* they refer to numbers is a bit subtler than, say, how decimal representations refer to numbers; and in a way their mechanism is, I daresay, almost unique to mathematics. Which may explain the difficulty encountered by many new students: very few people, in their daily life, refer to objects in the subtle way that limits refer to numbers.

Let’s start with an example: what number does the following symbol stand for?

1

Yes, the number one. People use it to count how many heads the average human has, or the number of stars that the Earth orbits around. In general, if an object is of a certain kind, and there are no other objects of the same kind, we say: “there is **one** object of that kind”.

Compare how ‘1’ refers to the number one, and how ‘ॐ’ refers to the sound ‘om’. (I know it refers to a lot more, but focus on the sound right now, for argument’s sake.) In a certain sense, each symbol just sort of “stands for” something else, and provides a way for people to talk about that something else.

Now, what number do the following symbols stand for?

1+1

Yes, the number two. Notice how this is qualitatively different from the simple ‘1’ above. It is a *composite* symbol, so to speak: it’s made up of smaller symbols, namely a couple ‘1’s and the plus sign. I will call such composite symbols *expressions*.

One crucial point is that it is not at all obvious when a few pencil strokes count as a symbol, or as an expression. There is nothing in the expression itself to tell you how to read it; it’s a convention which you must be taught. Someone uninformed, like a small child or an alien, might view ‘1+1’ as a single symbol with a few separated pieces, just like ‘i’ is a single letter, even though the dot is separated from the rest. If you can’t read Japanese, you might make the same mistake! The Japanese writing system, though staggeringly large at first sight, actually contains many symbols which are combinations of other symbols, often drawn a bit smaller and on top of each other. For instance, here is the saying “onna sannin yoreba kashimashii”, meaning “when three women gather, it is noisy”:

**女 三人 寄れば 姦しい**

The symbol for ‘noisy’ contains three miniatures of the symbol for ‘woman’. Can you spot them?

A peripheral remark: we might find it undesirable to leave any part of mathematics to pure convention, let alone a part as important as the meaning of formulas. We might wish to write formulas in such a way that it would be *clear* how to read them. However, some level of arbitrariness is unavoidable: even if expressions came with instructions for proper reading, you’d need further instructions on how to read *those*; and so on, ad infinitum. If we’re going to communicate, it has to stop somewhere, and at that point we’ll have to assume that everyone just* gets it*. (This is a point forcefully made by Wittgenstein in his *Philosophical Investigations*.)

All right. How do we find out what number ‘1+1’ stands for? We were taught in school that ‘1+1’ is made up of a ‘1’, a ‘+’, and another ‘1’; that each ‘1’ stands for a number, and ‘+’ stands for addition, which is a way to combine numbers into new numbers; and that ‘+’ combines two ‘1’s into a number that everyone calls ‘two’.

Compare how ‘1+1’ refers to the number two, and how ‘pineapple juice, coconut milk and rum, shaken with ice’ refers to the piña colada. They both sort of give us some starting ingredients and a way to mix them. The same goes for more complicated expressions like

(1 + 3 × 5) / 2.

The distinguishing feature of this method of referring to numbers is the following: we start with an expression containing symbols for numbers and rules; follow the rules for a while; and end up with the number that the expression refers to. Each step of following the rules is basically replacing a rule symbol, and its associated numbers, with the “result”. For example:

(1 + 3 × 5) / 2

(1 + 15) / 2

16 / 2

8
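The same computation can be checked mechanically: every intermediate expression stands for the same number.

```python
# The evaluation above, one rule application per step: each line
# replaces one operator and its operands with the result.
steps = [
    "(1 + 3 * 5) / 2",  # multiply first
    "(1 + 15) / 2",     # then add
    "16 / 2",           # then divide
    "8",
]
values = [eval(expr) for expr in steps]
```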

There is a subtlety implicit already in this very simple way of referring to numbers. Suppose I want to tell you a number that I’m thinking of, and the way I do it is by giving you a sequence of steps to follow. I say, “the end result of these steps, that’s the number I’ve got in mind”. For this promise to be held, that is, for the sequence of steps to actually represent a number, it is absolutely essential that the steps *can* be carried out.

For instance, suppose I tell you “1/0”. This is shorthand for “take the number 1, which you know, and the number 0, which you also know, and divide the former by the latter; the result of that division is the number I want to talk about”. It’s obvious that I haven’t told you any number at all, since the division rule doesn’t handle a 0 in the denominator. There’s nothing mysterious about this: I promised you a number at the end of some calculations, but they can’t be done. That just means I’m a flake. It doesn’t say anything deep about mathematics.

OK. So far we’ve agreed to communicate numbers to each other in various ways, one of which is giving a sequence of steps to “build” a number. We’ve seen that sometimes one of the steps is meaningless, and therefore the expression stands for no number at all. Another way that communication can go wrong is if I give you a list of steps, each of which is fine on its own, but the list is infinitely long. It is also obvious that such a list can’t denote a number, since the very essence of naming numbers by sequences of steps is each party’s unspoken meaning: “the end result of these steps, that’s the number I’ve got in mind”. If there’s no end result, there’s no number being communicated.

Enter the bane of online math forums everywhere:

0.999…

How is one to make sense of this? Worse yet, how can it ever equal 1?

One natural way to go, and many people do go this way, is to establish a connection with something already familiar. For example, the simpler expression ‘0.99’. Here is a perfectly well understood sequence of symbols: it stands for the number

$\frac{9}{10} + \frac{9}{100}$.

We also understand ‘0.999’, ‘0.9999’ and ‘0.99999’; each means a sum of fractions like the above, only with more terms.
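In code (exact fractions, helper name mine): ‘0.99…9’ with n nines stands for the finite sum $\sum_{k=1}^{n} 9/10^k = 1 - 10^{-n}$, which is a perfectly ordinary number, less than 1 for every n.

```python
from fractions import Fraction

# The number that '0.99...9' (n nines) stands for: the finite sum
# 9/10 + 9/100 + ... + 9/10**n, which works out to 1 - 10**-n.
def partial(n):
    return sum(Fraction(9, 10 ** k) for k in range(1, n + 1))
```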

Great! We are rarely in such good shape when solving a problem. We usually have to scratch our heads to come up with just *one* helpful analogy, but here is an endless supply of them. This gives us confidence to assert that the expression ‘0.999…’ stands for that number which is obtained by the *infinite sum*

$\frac{9}{10} + \frac{9}{100} + \frac{9}{1000} + \cdots$

And boom, just like that, everything has gone to hell. We have committed the cardinal sin of trying to express a number by a sequence of steps which *does not end*; which, therefore, does not express *any number at all*.

Yet mathematicians still seem able to talk about ‘0.999…’ and agree with each other, just as they do with simpler number representations like ‘1’. To understand how this is possible — ultimately, to understand what ‘0.999…’ actually means — we must come to grips with a different kind of representation, a different *way of talking* about things. It is more abstract than the two fairly direct ways described above (the straightforward symbol, like ‘1’, and the symbol-with-rules, like ‘1+1’). Nevertheless, it is well within the grasp of any human being, once it is brought to their attention. For some reason, however, mathematics textbooks never come out and say it, which I suspect to be the main source of difficulty.

---

In this blog, I hope to share interesting things I come across in logic and math, which may not be widely known. Also, I may occasionally share a different way of thinking about well-known things, if I feel I have developed it enough, and it is uncommonly enough seen in mainstream sources like textbooks and papers. I find it enormously important for people to exchange their “inner ways” of thinking, rather than just the “outer” results they reach; communication becomes much clearer, more interesting, and conducive to progress. I thank Terry Tao, Tim Gowers, the n-Category Café and the Catsters for proving this point so forcefully.

Though I’m Brazilian, I will be writing in English, simply because I will be understood by a larger number of people that way. It’s fine to be protective of one’s culture and language, but I feel this is simply not the place. If my experience on orkut is any indication, for every Portuguese-speaking reader I lose, I will gain five readers from India.

Finally, the “kc” in this blog’s URL are my other initials.
