Hyperbinary representations and amateur-hour number theory

I’ve recently become rather obsessed with a math problem, and thought I’d try out my new LaTeX plugin (this one) by talking about it a bit. I came across this great article series on a non-standard proof of the countability of the rationals (\mathbb{Q}? yes!) that gives a list of the rationals, each appearing once and only once, in reduced form. I originally saw it mentioned on Division by Zero here (along with a bunch of other great non-standard proofs), Calkin and Wilf’s original published paper is here, and the series in question was on The Math Less Traveled. I’m going to provide something of an introduction here, but the posts and comments there are extremely accessible, and I can’t recommend them more.

The interesting aspects of this sequence are pretty simple to prove, and I really wish I had had this paper when I was teaching about infinite sets; a well-defined sequence is an absolute joy compared to putting the rationals in a grid and skipping the ones already enumerated. Just try and tell me the nth entry in that sequence (I think I even gave that as a bonus question: is there a pattern in the position of reducible fractions?).

A sequence of all rationals

Calkin and Wilf’s sequence, on the other hand, is easy to define. The (n+1)th numerator is equal to the nth denominator, and the nth denominator is equal to the number of ways you can write n as the sum of powers of two, using each power of two no more than twice. Calkin and Wilf term this a hyperbinary representation of n, though I think that term was used before their paper made the connection to a list of the rationals. One way to think about a hyperbinary representation is as a distinct pair of binary numbers that sum to n, where a bit is set in the second number if and only if the corresponding bit is set in the first. e.g.

2 = 2^1 = 2 \cdot 2^0

or, in binary

10_2 = 10_2 = 01_2 + 01_2

Two has two hyperbinary representations, and, lo and behold, the second rational number in the sequence has a denominator of 2. The sequence itself has a nice recurrence relation, which becomes most clear when you see Calkin and Wilf’s original derivation of it from a binary tree. If we define the number of hyperbinary representations of a x as h(x), then

h(0) = 1 h(2k + 1) = h(k) \quad \text{[for odd x]} h(2k) = h(k) + h(k - 1) \quad \text{[for even x]}

If we define a sequence as h(x-1)/h(x), and start with x=1, we get the sequence starting: 1/1, 1/2, 2/1, 1/3, 3/2, 2/3, 3/1, 1/4, 4/3, 5/2, 2/5, 3/4, 4/1,…. You can verify that this list will contain every rational number, in reduced form, once and only once. The rationals are indeed enumerable, and if you want to know the nth one, you can easily calculate it (edit: h(x) also turns out to be “Stern’s diatomic series,” which is integer sequence A00248. More interesting research done on this sequence can be found on that page).

The Inverse of h

Brent Yorgey at The Math Less Traveled was curious about the inverse relation of h(x). That relations would, given any integer n, produce the positions where n occurs as the denominator of a rational in Calkin and Wilf’s sequence of all rationals. There are an infinite number of occurrences, of course, but it turns out there are an infinite number only at odd positions in the sequence; n occurs only a finite number of times at even positions, the set of which Brent terms n’s “primary occurrences” (edit: in fact it seems, as mentioned here, there seem to be exactly \phi (n) primary occurrences of n, where phi in this case is Euler’s totient function). You’ll find a an introduction to the problem that is probably more persuasively interesting on his blog in the third section of this page.

I, at least, found myself distracted by it for some time and while there was plenty of other work for me to be doing, I started scribbling binary numbers everywhere. I found a few patterns in how to construct primary occurrences of n, guaranteeing that you can always produce at least two of those occurrences. The proofs are in the comments of the article here, but, in short, I showed that

n = h(2^n - 2) \quad \text{[1]}


n = h(2^{n-1}) \quad \text{[2]}

That is, n always has a primary occurrence at at least two positions, represented in binary by (n-1) 1s followed by one 0, and one 1 followed by (n-1) 0s.

As an example, if n=3, we can construct two of its primary occurrences by first using two 1s and one 0, and then using one 1 and two 0s, making 110_2 and 100_2, also known as 4 and 6. Looking at the rationals in positions 4 and 6  in the the sequence given above, 3 is indeed the denominator of both (they are 1/3 and 2/3, respectively). Since this pattern works for any n, you can always find two primary occurrences for any n (greater than 2, since, for n = 2, the two constructed numbers are the same: 2).

To generalize this finding, I wanted to calculate the values of n that had primary occurrences at binary strings of p 1s and q 0s, the previously shown occurrences being special cases when either p=1 or q=1. The pattern ended up being fairly simple, but I find the result kind of pretty.

The following proof depends on the fact that

h(2^k - 1) = 1 \quad \text{[3]} h(m \cdot 2^k - 1) = h(m - 1) \quad \text{[4]}

both talked about here. Equation (4), incidentally, allows us to define all of the infinite occurrences of n at odd positions in terms of the finite number of “primary” (even) occurrences.

A string of p 1s and q 0s, in binary, is equal to 2^{p+q} - 2^q. This somewhat awkward formulation keeps the final result simple.

Proof (by induction)

We will show

h^{-1}(n) \supseteq \{2^{p+q} - 2^q : \exists p, q \in \mathbb{N} (pq = (n-1)) \}

or, equivalently,

n = 1 + pq = h(2^{p+q} - 2^q) \;p, q \in \mathbb{N}

Our basis is q = 0, p \in \mathbb{N}:

h(2^{p+0} - 2^0) = h(2^p - 1) = 1 + p \cdot 0 = 1

which is exactly what we would expect from equation (3).

We’ve settled the case for all p as long as q=0, and now just need to take care of every q larger than 0. We do this by assuming h(2^{p+q} - 2^q) = 1 + pq, and showing that if this relationship is true for q, then it must be true for (q+1). If you’re new to induction, or just haven’t visited in a while: if we can show that our equation is true for (q+1) as long as it’s true for q, and we’ve already shown that it’s true for q=0, we can deduce (induce? =) that it must be true for (0+1)=1. And, if its true for 1, it must be true for (1+1). And if its true for 2, it must be true for (2+1). And so on, covering all the natural numbers.

To tackle this, we insert (q+1) wherever there is a q. Since, by definition, our constructed argument for h is an even number, we use the recurrence relation for even numbers:

h(2^{p+(q+1)} - 2^{(q+1)}) = h(2^{p+q} - 2^q) + h(2^{p+q} - 2^q - 1)

the first term is simply our assumption, so is equal to 1 + pq. The second term:

h(2^{p+q} - 2^q - 1) = h(2^p \cdot 2^q - 2^q - 1) = h((2^p - 1) \cdot 2^q - 1) = h((2^p - 1) - 1) \quad \text{[by (4)]} = h(2^p - 2) = p \quad \text{[by (1)]}

summing the two terms:

h(2^{p+q+1} - 2^{q+1}) = (1 + pq) + p = 1 + p(q + 1) \quad \square

(The usual Latex alignment commands doesn’t seem to be working…I’ll have to figure that out)

We will begin with the firemen, then the math teachers…

As I note in my original comment, the most interesting thing about this result to me is that it sets a lower limit to the cardinality of h^{-1}(n) at the number of divisors of (n-1) (though an exact expression of the cardinality has already been suggested =). Moreover, you can construct that number of primary occurrences by taking p 1s (\forall p : p|(n-1)), appending (n-1)/p 0s, and interpreting the string as a binary number.

It would be interesting to take this further by investigating the connection between \phi (n) and the divisors of (n-1). I’ve also started playing with strings of p 1s, q 0s, r 1s, and s 0s, but it gets unwieldy quickly (you have to have an even number of components to the string, since you have to start with a 1 (leading 0s don’t count) and end with a 0 (primary occurrences are at even positions)). There is a hint, though, that using a technique similar to the one used above (and some serious trickery) could take that binary string and get some nice cascading simplifications that remove each term in turn. By extension, you then take that technique to the limit, and before you know it, you can find a closed form for

h(2^{p_1+p_2+\ldots+p_{k-1}+p_k} - 2^{p_2+p_3+\ldots+p_{k-1}+p_k} + \ldots + 2^{p_{k-1}+p_k} - 2^{p_k})

No problem, right? I’ll try and have something more soon.

This entry was posted in Uncategorized and tagged , , , , , , , , , . Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *


You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong> <pre lang="" line="" escaped="">