The Construction of Cantor Sets

By Ng Tze Beng

Cantor sets play an important role in real analysis, particularly in furnishing counter examples and exotic or pathological functions. Most frequently, we meet Cantor set of zero measure but the construction is canonical enough to apply to give Cantor set of positive measure. These are subsets of the closed interval [0,1] and of measure greater or equal to 0 but less than 1.  For the use of Cantor sets see Composition and Riemann Integrability, Lebesgue Integration and Composition and Change of Variable or Subsitution in Riemann Integration.  

The first Cantor set we shall construct is the Cantor set of measure zero. This is the set left over after repeatedly deleting the middle third open intervals.

The Cantor set C0

We shall start from the closed unit interval [0,1]. At the first stage, we delete the middle open interval (1/3, 2/3) from [0, 1]. We shall enumerate the open intervals to be deleted. We denote (1/3, 2/3) by I(1,1). That is I(1,1) = (1/3,2/3). The first construction gives us the remaining 2 closed intervals

[0, 1] - I(1,1) = [0,1] - (1/3,2/3) = [0, 1/3] È[2/3, 1].

Then at the second stage we delete the middle third open interval from each of the closed intervals. Thus there are 2 open intervals to be deleted and they are

I(2,1) = 1/3 I(1, 1) =(1/9, 2/9) and

I(2, 2) = I(2, 1) + 2/3 , the translation of I(2,1) by 2/3.

Hence we are left with 4 = 22 closed intervals, two in [0, 1/3] and two in [2/3, 1]. By rescaling we see that the middle open third intervals of each of these 4 closed intervals are given by

I(3,1) = 1/3 I(2,1), I(3,2) =1/3 I(2,2) and

I(3, 3) = I(3,1) + 2/3, I(3, 4) = I(3,2) +2/3.

Repeating this construction, at the k-th stage, there are 2k-1 open intervals, each of length 1/3k, to be deleted from [0, 1] and for k ³ 3, they are given by

I(k, j) = 1/3 I(k-1, j), j = 1, 2, ¼, 2k-2 ;

I(k, 2k-2+ j) = I(k, j) + 2/3, j = 1, 2, ¼, 2k-2.

Let G be the union of these countable collection of disjoint open intervals,

.

Thus G is open. The Cantor set C0 is defined to be the complement of G in [0, 1]. That is C0 = [0, 1] - G and is therefore a closed subset of [0, 1]. Note that at the k-th stage the total length of the intervals to be deleted is 2k-1 (1/3k). Therefore, the measure of G, since the intervals in G are all disjoint, is given by

Thus the measure of C0 is equal to m([0, 1]) - m(G) = 1-1 = 0.

In order to define the Cantor function, we shall consider another representation of the Cantor set C0 . This will also show that C0 has the same cardinality as [0, 1] and so it is uncountable. We shall make use of the ternary representation of numbers in [0, 1]. There is a unique way of representing real numbers in (0, 1] by non-terminating ternary expansion.

We shall redefine these real numbers as the supremum of a sequence of rational numbers. Take any 1 ³ x > 0 in [0,1]. Choose the greatest integer a1 such that a1/3 < x £. a1/3 + 1/3. Consider the rational numbers 0/3, 1/3 , 2/3. Then take the largest of these which is < x. That is we choose a1 in {0,1,2} such that < x £ + . Thus a1 = 0 Û 0/3 < x £ 1/3, a1 = 1 Û 1/3 < x £ 2/3 and a1 = 2 Û 2/3 < x £ 3/3=1. Let d1 = 0 + . Then d1 < x £ d1 + . Now choose integer a2 in {0,1,2} such that d1 + < x £ d1 + . Let now d2 = d1 + = 0 + + . We have d2 < x £ d2 + . Continuing like this we shall obtain a sequence a0 , a1 , a2 , ¼, of integers, 0 £ ai £ 2, such that if dn = dn-1 + = 0+ + + ¼+ then dn < x £ dn + . The symbol 0 × a1 a2 a3 ¼ is called the non-terminating ternary expansion of x . So the set {d1 , d2 , d3 , d4 ¼ } is bounded above by x. Then the least upper bound or supremum of this set is x. We deduce this as follows. If x ¹ sup{d1 , d2 , d3 , d4 ¼ } = M, then x > M . Then by the Archimedean property of the set of real numbers, there exists a counting number m such that < x - M and so M + < x. Now take a non-terminating ternary expansion 0 × c1 c2 c3 ¼ for . Let L be the first integer such that cL ¹ 0. Then . Therefore, M + < M + < x. Hence since M = sup{d1 , d2 , d3 , d4 ¼ }, dL + £ M + < x and this contradicts x £ dL + . Hence x = sup{d1 , d2 , d3 , d4 ¼ }.

This representation of the real number x in (0, 1] is unique. Suppose two non-terminating ternary expressions 0× a1 a2 a3 ¼ and 0× b1 b2 b3 ¼ are such that for some integer j ³ 0, aj ¹ bj and ai = bi for i £ j - 1. If 0× a1 a2 a3 ¼ represents x and 0× b1 b2 b3 ¼ represents y we shall show that x ¹ y. Suppose that aj < bj . By the hypothesis we have

0 + + + ¼+ + < x £ 0 + + + ¼+ + .

But 0 + + + ¼+ += 0 + + + ¼+ +

£ 0 + + + ¼+ + < y.

Therefore, x < y and so x ¹ y. Similarly if aj > bj, we can show that x ¹ y. Thus this way of representing any real number in (0, 1] is unique. Let 0 be represented by the terminating ternary expansion 0.00. Notice that for x = , the corresponding non-terminating expansion is the expression 0.022222¼¼ which is a non-terminating expression with recurring '2'. is represented by 0.122222....... a non-terminating expression with recurring '2'. 1 is represented by 0.22222........ a non-terminating expression with recurring '2'. We are going to change our representation a little. We are going to use some terminating expansion in our representation as follows. If the non-terminating expansion of x consists of exactly one '1' in its expansion, say 0× a1 a2 a3 .... has precisely aj = 1 and that ak = 2 for k > j , then we replace it by the terminating expansion 0× b1 b2 b3 ... bj -1 2 with bk = ak for k £ j-1, bj = 2 and bk = 0 for k > j, hence this representation has no '1' in it. Thus we are going to use the following convention: if a number x in [0, 1] has two ternary expansion, one with no 1's and one with at least one '1', then it is the one that has no 1's that is to be used. We refer to this as our system of expansion. We are going to use this representation to describe the set G in [0, 1].

x Î I(1,1) = (1/3, 2/3) Û the non terminating expansion of x has a1 = 1 and a2 and subsequent aj are not all equal to 2 Û the system of expansion used for x has a1 = 1 (because the terminating expansion excluded here has no '1', namely 2/3 which has terminating expansion 0.2 and recurring expansion 0.12 , where the underscore denotes repeating infinitely many times the number underscored. )

Now dividing by 3 has the effect of shifting the expansion by one place to the right and introducing a zero, i.e. it has the effect of moving the ternary point to the left. Thus

x Î I(2,1) = 1/3 I(1,1) = (1/9, 2/9)

Û the system of expansion used for x has a1 = 0 and a2 = 1

Therefore, x Î I(2,2) = I(2,1) + 2/3

Û the system of expansion used for x has a1 = 2 and a2 = 1.

Hence x Î I(2,1)È I(2,2) Û the system of expansion used for x has the first '1' occurring in the second ternary place.

Then since I(3,1) = 1/3 I(2,1) and I(3,2) =1/3 I(2,2), x Î I(3,1)È I(3,2) Û the system of expansion used for x has a1 = 0 and has the first '1' occurring in the third ternary place. Since I(3, 3) = I(3,1) + 2/3 and I(3, 4) = I(3,2) +2/3, x Î I(3, 3)È I(3, 4) Û the system of expansion used for x has a1 = 2 and has the first '1' occurring in the third ternary place. Therefore, x Î I(3,1)È I(3,2)ÈI(3, 3)È I(3, 4) Û the system of expansion used for x has the first '1' occurring in the third ternary place. Thus by induction we see that Û the system of expansion used for x has the first '1' occurring in the k-th ternary place. Therefore, Û the system of expansion used for x has at least one '1'. Hence x Î C0 = [0, 1] - G Û the system of expansion used for x has no 1's. We have thus proved the following:

, where bk = 0 or 1.

Next we shall see how our system of representation also gives the same conclusion. Observe that the convention used in our system of representation deviates from a non-terminating expansion only if the number is zero or has two possible ternary expansions with one of them involving no 1's. Thus if y has a terminating expansion either y = 0 or the terminating expansions has no 1's and ends in the number '2'. Thus if x < y , and y has a terminating ternary expansion,

Then we can write Thus if , x < y implies that there exists integer k ³ 1 such that ai = ci for all 1 £ i £ k -1 and ak < ck or a1 < c1, if k = 1 . If k £ l -1, then we have ai = ci = bi' for all 1 £ i £ k -1 and ak < bk'. If k = l , then we have ai = bi' for all 1 £ i £ l -1 and al < cl = 1< 2 = bl'. If k > l , then we have ai = bi' for all 1 £ i £ l -1 and al = cl = 1 < 2 = bl'. Hence in any case we obtain that x < y implies that there exists integer l ³ k > 1 such that ai = bi' for all 1 £ i £ k -1 and ak < bk' or a1 < b1' .

Now if x has the terminating ternary expansion , we can write it as Hence by what we have just proved, there exists integer l ³ k > 1 such that di = bi' for all 1 £ i £ k -1 and dk < bk' or d1 < b1' . If p =1, then d1 = 1 and if d1 < b1' , b1' =2 and l must be greater than 1, otherwise x = y. Thus if p =1, then a1' = 2 =b1' and there exists an integer l ³ k > 1 such that ai = bi' for all 1 £ i £ k -1 and ak '< bk' . We now assume that p > 1. If k < p, then ai'= di = bi' for all 1 £ i £ k -1 and ak'= dk < bk'. If If k = p, then ai'= di = bi' for all 1 £ i £ k -1 and dk = dp = 1 < bp'. Hence bp' = 2 and so ap'=2 = bp'. Then it follows that l > p and that there exists q such that p < q £ l and ai' = bi' for all 1 £ i £ q -1 and aq < bq'. Now we shall see that k £ p. if k > p, then we have ai'= di = bi' for all 1 £ i £ k -1 and dk < bk'. Therefore, ai'= di = bi' for 1 £ i £ p -1, 1 = dp = bp' , hence contradicting that bp' is even. Hence we have shown that the same conclusion for Lemma 3 is true also for our system of representation of real numbers in [0, 1] by ternary expansion. The converse of the statement is obvious.

Now we are ready to investigate the monotonicity of the function g:[0, 1] ®[0, 1]. We have shown that g is a surjective map.

Next we shall state a general result concerning bounded monotone function whose range is an interval.

Now we turn our attention towards constructing a Cantor set of positive measure in [0, 1]. Indeed the construction also applies to give a different Cantor set of measure zero. The procedure of deleting the middle portion of each of the remaining closed sets is followed here but using a different specified length. (This idea can be generalised by not requiring the intervals to be deleted to be the middle portion, to give a generalised Cantor set.) Hence the construction is canonical. Let k be any real number with 0 £ k < 1. Let d = (1 - k) > 0. Start with the closed unit interval [0, 1]. Let I(1,1) be the middle open interval of length d/2. Then if we write I(1,1) = (a(1,1), b(1,1)), then a(1,1) = 1/2 - d/4. Then let G1 = I(1,1) and F1 = [0,1] - G1. The first stage is to form F1 . Then F1 is the disjoint union of 2 closed intervals each of length equal to a(1,1) = 1/2 - d/4. Let I(2,1) and I(2,2) be respectively the middle open intervals, each of length d/23, of the 2 closed intervals of F1 . The open intervals are ordered from the left to the right by the second indices. Let G2 = I(1,1)ÈI(2,1)ÈI(2,2). Suppose I(2,1) = (a(2,1), b(2,1)). Then a(2,1) = 1/2 a(1,1) - d/24 = 1/22 -d(1/22 - 1/24). The second stage is to form F2 = [0,1] - G2 = [0,1] - G1 - I(2,1)ÈI(2,2). Then F2 is the disjoint union of 22 closed intervals each of length equaling a(2,1) = 1/22 -d(1/22 - 1/24). Then let I(3, j), j =1,..,22 be respectively the middle portion of length d/25 of the 22 closed intervals again ordered by the second indices, from left to right in the sense that I(3, j) < I(3, k) if and only if j < k if and only if there exist x in I(3, j) and y in I(3, k) such that x < y. Stage three is to remove from F2 these 22 open intervals, that is to form F3 = F2 - È{ I(3, j), j =1,..,22 } = [0, 1] - G3 , where . Thus F3 consists of 23 disjoint closed intervals, each of length equaling a(3,1), where I(3,1) = (a(3,1), b(3,1)). It is easily seen that a(3,1) = 1/23 -d(1/23 - 1/26). Let , where I(4, j), j =1,..,23 are the middle open intervals each of length d/27, of each of the disjoint closed intervals. Continuing in this way, at the n-th stage we have , where I(n, j) = (a(n, j), b(n, j)) is an open interval of length d/22n-1 and the I(n, j)'s are ordered from left to right by the second indices according to the natural ordering of elements in the intervals. Note that a(n,1) = 1/2 a(n-1,1)-d(1/22n) =1/2n -d(1/2n-1/22n). We obtain Fn by deleting from Fn-1 the uion of open intervals Hn. Hence Fn = Fn-1- Hn = [0,1] - Gn , where and Fn is a disjoint union of 2n closed intervals each of length equaling a(n,1) = 1/2n -d(1/2n-1/22n). To obtained Fn+1 , we shall delete from Fn open intervals each of length equaling d/2 2n+1 from the 2n closed intervals. That is if we let , where I(n+1, j), j =1,..,2n are these open intervals to be deleted, then Fn+1 = Fn - Hn+1 =[0, 1] - Gn+1 , where . Obviously Fn+1 is a collection of disjoint closed intervals, each of length equal to a(n+1,1). Note also that for each n ³ 1, Fn+1 Í Fn.

Let . Then G is a disjoint union of open intervals. The length or measure of Hk is given by . Thus the measure or length of G,

The Cantor set Ck is defined to be the complement of G in [0, 1]. That is, . Thus the measure of Ck is equal to m([0,1]) - m(G) = 1 - d = k.

Then Ck satisfies the properties stated in Theorem 8.

Proposition 10. There is a continuous strictly increasing bijective map f : [0, 1] ® [0,1] mapping the Cantor set Ck onto the the Cantor set C0 defined earlier using the "deleted middle third intervals" construction.

Proof. This is given by Lemma 1 of Composition and Riemann Integrability. Note that the proof given there applies also to the case k = 0.

Since C0 constructed using the "deleted middle third intervals" construction is uncountable and the function f given by Proposition 10 maps Ck bijectively onto C0, Ck is also uncountable.

ã Ng Tze Beng 2001