Let these datatypes represent unary and binary natural numbers, respectively:

```haskell
data UNat = Succ UNat | Zero

data BNat = One BNat | Zero BNat | End

u0 = Zero
u1 = Succ Zero
u2 = Succ (Succ Zero)
u3 = Succ (Succ (Succ Zero))
u4 = Succ (Succ (Succ (Succ Zero)))

b0 = End                    -- 0
b1 = One End                -- 1
b2 = One (Zero End)         -- 10
b3 = One (One End)          -- 11
b4 = One (Zero (Zero End))  -- 100
```

(Alternatively, one could use `Zero End` as b1, `One End` as b2, `Zero (Zero End)` as b3, and so on.)
My question is: is there any way to implement the function

```haskell
toBNat :: UNat -> BNat
```

that works in O(N), doing only one pass through the `UNat`?
To increment a binary number, you have to flip the lowest 0 bit and all the 1s below it. The cost of this operation is proportional to the number of trailing 1s of your input (for this, you should represent the number as a right-to-left list, e.g. the list [1;0;1;1] encodes 13).

Let a(n) be the number of trailing 1s of n:

a(n) = 0, 1, 0, 2, 0, 1, 0, 3, 0, 1, 0, 2, 0, 1, 0, 4, ...
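As a quick sanity check of this sequence, here is a small sketch (the name `trailingOnes` is mine, not from the answer) computing a(n) for ordinary machine integers:

```haskell
-- trailingOnes n counts the 1-bits at the least significant end of n,
-- i.e. how many bits an increment of n flips besides the final 0 -> 1.
trailingOnes :: Int -> Int
trailingOnes n
  | odd n     = 1 + trailingOnes (n `div` 2)
  | otherwise = 0

-- map trailingOnes [0..15] == [0,1,0,2,0,1,0,3,0,1,0,2,0,1,0,4]
```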
Let

s(k) = a(2^k) + a(2^k + 1) + ... + a(2^(k+1) - 1)

be the sum of the terms between two consecutive powers of 2. You should be able to convince yourself that s(k+1) = 2*s(k) (with s(0) = a(1) = 1) by noticing that

a(2^(k+1)), ..., a(2^(k+2) - 1)

is obtained by concatenating

a(0), ..., a(2^k - 1) and a(2^k), ..., a(2^(k+1) - 1)

with the very last term incremented by 1: adding a leading power of 2 does not change the trailing 1s, except for the all-ones number 2^(k+2) - 1, which gains one. By the same observation, the first part a(0), ..., a(2^k - 1) sums to s(k) - 1, so s(k+1) = (s(k) - 1) + s(k) + 1 = 2*s(k). Therefore, as a geometric sequence, s(k) = 2^k.

Now the cost of incrementing a counter N times should be proportional to

a(0) + a(1) + ... + a(N) <= s(0) + s(1) + ... + s(log(N)) = 2^0 + 2^1 + ... + 2^log(N) = 2^(log(N) + 1) - 1 = 2N - 1
Therefore, if you take care to represent your numbers right-to-left (least significant bit first), then the naive algorithm is linear. (Note that you can perform a final list reversal and stay linear if you really need your numbers the other way around.)
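The naive right-to-left algorithm described above can be sketched like this (the bit-list representation and the names `inc`/`toBits` are my own, not the question's `BNat`):

```haskell
-- Bits stored least significant first, e.g. [1,0,1,1] encodes 13.
inc :: [Int] -> [Int]
inc []       = [1]
inc (0 : bs) = 1 : bs      -- flip the lowest 0; done
inc (1 : bs) = 0 : inc bs  -- carry: flip a trailing 1 and continue
inc _        = error "bits must be 0 or 1"

-- Converting a count n is then just n successive increments,
-- which by the analysis above costs O(n) in total.
toBits :: Int -> [Int]
toBits n = iterate inc [] !! n
```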
I like the other answers, but I find their asymptotic analyses complicated. I therefore propose another answer that has a very simple asymptotic analysis. The basic idea is to implement
`divMod2` for unary numbers. Thus:

```haskell
data UNat = Succ UNat | Zero

data Bit = I | O

divMod2 :: UNat -> (UNat, Bit)
divMod2 Zero            = (Zero, O)
divMod2 (Succ Zero)     = (Zero, I)
divMod2 (Succ (Succ n)) = case divMod2 n of
  ~(div, mod) -> (Succ div, mod)
```
Now we can convert to binary by iterating:

```haskell
toBinary :: UNat -> [Bit]
toBinary Zero = []
toBinary n    = case divMod2 n of
  ~(div, mod) -> mod : toBinary div
```
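Putting the two definitions together gives a self-contained check; the helpers `fromInt` and `fromBits` are my own test scaffolding, not part of the answer:

```haskell
data UNat = Succ UNat | Zero

data Bit = I | O deriving (Eq, Show)

divMod2 :: UNat -> (UNat, Bit)
divMod2 Zero            = (Zero, O)
divMod2 (Succ Zero)     = (Zero, I)
divMod2 (Succ (Succ n)) = case divMod2 n of ~(d, m) -> (Succ d, m)

-- bits come out least significant first
toBinary :: UNat -> [Bit]
toBinary Zero = []
toBinary n    = case divMod2 n of ~(d, m) -> m : toBinary d

-- test helpers (mine): build a UNat from an Int, decode an LSB-first bit list
fromInt :: Int -> UNat
fromInt 0 = Zero
fromInt n = Succ (fromInt (n - 1))

fromBits :: [Bit] -> Int
fromBits = foldr (\b acc -> bit b + 2 * acc) 0
  where bit I = 1
        bit O = 0
```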
The asymptotic analysis is now pretty simple. Given a number n in unary notation, `divMod2` takes O(n) time to produce a number half as big -- say, it takes at most c*n time for large enough n. Iterating this procedure therefore takes this much time:

c*(n + n/2 + n/4 + n/8 + ...)

As we all know, the geometric series in the parentheses converges to 2*n, so `toBinary` is also in O(n), with witness constant 2*c.
If we have a function to increment a `BNat`, we can do this quite easily by running along the `UNat`, incrementing a `BNat` at each step:

```haskell
toBNat :: UNat -> BNat
toBNat = toBNat' End
  where
    toBNat' :: BNat -> UNat -> BNat
    toBNat' c Zero     = c
    toBNat' c (Succ n) = toBNat' (increment c) n
```
Now, this is O(N*M), where M is the worst-case cost of `increment`. So if we can do `increment` in O(1), then the answer is yes.
Here's my attempt at implementing `increment`:

```haskell
increment :: BNat -> BNat
increment = reverse End . inc' . reverse End
  where
    inc' :: BNat -> BNat
    inc' End      = One End
    inc' (Zero n) = One n
    inc' (One n)  = Zero (inc' n)

    reverse :: BNat -> BNat -> BNat
    reverse c End      = c
    reverse c (One n)  = reverse (One c) n
    reverse c (Zero n) = reverse (Zero c) n
```
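Here is a self-contained sketch combining `toBNat` with `increment`. Note that the question's `UNat` and `BNat` both declare a constructor named `Zero`, which cannot coexist in one module, so the unary constructors are renamed `UZero`/`USucc` here; `rev` also handles `Zero` digits so the reversal is total:

```haskell
data UNat = USucc UNat | UZero  -- renamed: the question's Zero clashes with BNat's

data BNat = One BNat | Zero BNat | End deriving (Eq, Show)

-- increment via double reversal: flip to LSB-first, add one, flip back
increment :: BNat -> BNat
increment = rev End . inc' . rev End
  where
    inc' End      = One End
    inc' (Zero n) = One n
    inc' (One n)  = Zero (inc' n)

    rev c End      = c
    rev c (One n)  = rev (One c) n
    rev c (Zero n) = rev (Zero c) n

-- one increment per unary Succ
toBNat :: UNat -> BNat
toBNat = go End
  where
    go c UZero     = c
    go c (USucc n) = go (increment c) n
```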
This implementation of `increment` is O(M) in the number of bits M, because you have to reverse the `BNat` to look at the least significant bits; that gives you O(N*M) overall. If we instead consider the `BNat` type to represent reversed binary numbers (least significant bit first), we don't need to reverse the `BNat` at all, and, as @augustss says, `increment` becomes amortized O(1), which gives you O(N) overall.
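That reversed reading can be sketched as follows (`incrementLSB` is my name; it reuses the `BNat` constructors but interprets the outermost digit as the least significant bit, so no reversal is ever needed):

```haskell
-- BNat read least-significant-bit first: One (Zero (One End))
-- is the bit string 1,0,1 read upward, i.e. 1 + 0*2 + 1*4 = 5.
data BNat = One BNat | Zero BNat | End deriving (Eq, Show)

-- Amortized O(1): each call only walks past the initial run of Ones,
-- exactly the trailing-1s cost analyzed in the first answer.
incrementLSB :: BNat -> BNat
incrementLSB End      = One End
incrementLSB (Zero n) = One n
incrementLSB (One n)  = Zero (incrementLSB n)
```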