New state of the art in unlimited generation of Hamming sequence


Question


(this is exciting!) I know the subject matter is well known. The state of the art (in Haskell as well as other languages) for the efficient generation of the unbounded, increasing sequence of Hamming numbers, without duplicates and without omissions, has long been the following (AFAIK - and btw, it is equivalent to Edsger Dijkstra's original code, too):

hamm :: [Integer]
hamm = 1 : map (2*) hamm `union` map (3*) hamm `union` map (5*) hamm
  where
    union a@(x:xs) b@(y:ys) = case compare x y of
        LT -> x : union  xs  b
        EQ -> x : union  xs  ys
        GT -> y : union  a   ys
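
For reference, the first few values at the GHCi prompt:

> take 12 hamm
[1,2,3,4,5,6,8,9,10,12,15,16]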

The question I'm asking is, can you find a way to make it more efficient in any significant measure? Is it still the state of the art, or is it in fact possible to improve this to run twice as fast, and with better empirical orders of growth to boot?

If your answer is yes, please show the code and discuss its speed and empirical orders of growth in comparison to the above (it runs at about n^1.05 .. n^1.10 for the first few hundred thousand numbers produced). Also, if it exists, can this efficient algorithm be extended to producing a sequence of smooth numbers for any given set of primes?


Answer 1:


If a constant factor(1) speedup counts as significant, then I can offer a significantly more efficient version:

hamm :: [Integer]
hamm = mrg1 hamm3 (map (2*) hamm)
  where
    hamm5 = iterate (5*) 1              -- the powers of 5
    hamm3 = mrg1 hamm5 (map (3*) hamm3) -- numbers of the form 3^j * 5^k
    -- plain merge; the two arguments never share an element,
    -- so no duplicate check is needed
    merge a@(x:xs) b@(y:ys)
        | x < y     = x : merge xs b
        | otherwise = y : merge a ys
    -- the head of the first list is known to be the smaller one
    mrg1 (x:xs) ys = x : merge xs ys

You can easily generalise it to smooth numbers for a given set of primes:

import Data.List (foldl', sortBy)

hamm :: [Integer] -> [Integer]
hamm [] = [1]
hamm [p] = iterate (p*) 1
hamm ps = foldl' next (iterate (q*) 1) qs
  where
    -- start with the largest prime's powers, then fold in the smaller primes
    (q:qs) = sortBy (flip compare) ps
    -- tie the knot: res is the previous sequence merged with m times res itself
    next prev m = let res = mrg1 prev (map (m*) res) in res
    merge a@(x:xs) b@(y:ys)
        | x < y     = x : merge xs b
        | otherwise = y : merge a ys
    mrg1 (x:xs) ys = x : merge xs ys
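
For example, a quick check at the GHCi prompt (the second set of primes is an arbitrary choice of mine):

> take 12 $ hamm [2,3,5]
[1,2,3,4,5,6,8,9,10,12,15,16]
> take 10 $ hamm [3,5,7]
[1,3,5,7,9,15,21,25,27,35]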

It's more efficient because that algorithm doesn't produce any duplicates, and it uses less memory. In your version, when a Hamming number near h is produced, the part of the list between h/5 and h has to be in memory. In my version, only the part between h/2 and h of the full list, and the part between h/3 and h of the 3-5-list, need to be in memory. Since the 3-5-list is much sparser, and the density of k-smooth numbers decreases, those two list parts need much less memory than the larger part of the full list.
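
To put rough numbers on that claim, here is a small self-contained sketch (the names and the sample cutoff are mine) that counts the list elements alive in each band, using the classic duplicate-removing merge:

merge3 :: Ord a => [a] -> [a] -> [a]
merge3 a@(x:xs) b@(y:ys) = case compare x y of
    LT -> x : merge3 xs b
    EQ -> x : merge3 xs ys
    GT -> y : merge3 a  ys

hamm235, hamm35 :: [Integer]    -- classic 2-3-5 list and the 3-5 list
hamm235 = 1 : merge3 (map (2*) hamm235)
                     (merge3 (map (3*) hamm235) (map (5*) hamm235))
hamm35  = 1 : merge3 (map (3*) hamm35) (map (5*) hamm35)

-- how many elements of xs lie in the band (lo, hi]
bandCount :: [Integer] -> Integer -> Integer -> Int
bandCount xs lo hi = length . takeWhile (<= hi) . dropWhile (<= lo) $ xs

Comparing, e.g., bandCount hamm235 (h `div` 5) h against bandCount hamm235 (h `div` 2) h + bandCount hamm35 (h `div` 3) h for h = 10^9 shows the saving directly.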

Some timings for the two algorithms to produce the k-th Hamming number, with the empirical order of growth of each step relative to the previous one, computed excluding (MUT) and including (MUT+GC) garbage-collection time:

   k         Yours (MUT/GC)   MUT    MUT+GC    Mine (MUT/GC)   MUT    MUT+GC
  10^5         0.03/0.01                         0.01/0.01      -- too short to say much, really
2*10^5         0.07/0.02                         0.02/0.01
5*10^5         0.17/0.06      0.968  1.024       0.06/0.04      1.199  1.314
  10^6         0.36/0.13      1.082  1.091       0.11/0.10      0.874  1.070
2*10^6         0.77/0.27      1.097  1.086       0.21/0.21      0.933  1.000
5*10^6         1.96/0.71      1.020  1.029       0.55/0.59      1.051  1.090
  10^7         4.05/1.45      1.047  1.043       1.14/1.25      1.052  1.068
2*10^7         8.73/2.99      1.108  1.091       2.31/2.65      1.019  1.053
5*10^7        21.53/7.83      0.985  1.002       6.01/7.05      1.044  1.057
  10^8        45.83/16.79     1.090  1.093      12.42/15.26     1.047  1.084

As you can see, the factor between the MUT times is about 3.5, but the GC time is not much different.

(1) Well, it looks constant, and I think both variants have the same computational complexity, but I haven't pulled out pencil and paper to prove it, nor do I intend to.




Answer 2:


So basically, now that Daniel Fischer has given his answer, I can say that I came across this recently, and I think it is an exciting development, since the classical code had been known for ages, since Dijkstra.

Daniel correctly identified the redundancy of the classical version: it generates duplicates which must then be removed.

The credit for the original discovery (AFAIK) goes to Rosettacode.org contributor Ledrug, as of 2012-08-26, and of course to the independent discovery by Daniel Fischer, here (2012-09-18).

Re-written slightly, that code is:

import Data.Function (fix)

hamm = 1 : foldr (\n s -> fix (merge s . (n:) . map (n*))) [] [2,3,5]

with the usual implementation of merge,

merge a@(x:xs) b@(y:ys) | x < y     = x : merge xs b
                        | otherwise = y : merge a ys
merge [] b = b
merge a [] = a
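
As a sanity check at the GHCi prompt, it produces the same sequence:

> take 10 hamm
[1,2,3,4,5,6,8,9,10,12]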

It gives a speedup of about 2.0x - 2.5x vs. the classical version.




Answer 3:


Well, this was easier than I thought. This will do 1000 Hammings in 0.05 seconds on my slow PC at home. This afternoon at work, on a faster PC, anything under 600 numbers was coming out as zero seconds.

This takes Hammings from Hammings. It's based on the approach that proved fastest in Excel.

I was getting wrong numbers after 250,000 with Int. The numbers grow very big very fast, so Integer must be used to be sure, because Int is bounded.
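
For instance, on a 64-bit GHC, Int silently wraps around where Integer does not:

> 2^63 :: Int
-9223372036854775808
> 2^63 :: Integer
9223372036854775808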

mkHamm :: [Integer] -> [Integer] -> [Integer] -> [Integer]
       -> Int -> (Integer, [Int])
mkHamm ml (x:xs) (y:ys) (z:zs) n =
    if n <= 1
        then (last ml, map length [(x:xs), (y:ys), (z:zs)])
        else mkHamm (ml ++ [m]) as bs cs (n-1)
      where
        m  = minimum [x, y, z]   -- the next Hamming number
        -- drop the chosen value from its own queue; every queue
        -- gets m times its factor appended at the end
        as = if x == m then xs ++ [m*2] else (x:xs) ++ [m*2]
        bs = if y == m then ys ++ [m*3] else (y:ys) ++ [m*3]
        cs = if z == m then zs ++ [m*5] else (z:zs) ++ [m*5]

Testing,

> mkHamm  [1] [2] [3] [5] 5000
(50837316566580,[306,479,692])        -- (0.41 secs)

> mkHamm  [1] [2] [3] [5] 10000
(288325195312500000,[488,767,1109])   -- (1.79 secs)

> logBase 2 (1.79/0.41)     -- log of times ratio = 
2.1262637726461726          --   empirical order of growth

> map (logBase 2) [488/306, 767/479, 1109/692] :: [Float]
[0.6733495, 0.6792009, 0.68041545]     -- leftovers sizes ratios

This means that this code's run time's empirical order of growth is above quadratic (~n^2.13, as measured at the GHCi prompt).

Also, the sizes of the three dangling overproduced segments of the sequence are each ~n^0.67, i.e. ~n^(2/3). This is to be expected: there are ~(log N)^3 Hamming numbers below N overall, while a band of fixed multiplicative width below N holds only ~(log N)^2, i.e. ~n^(2/3), of them.

Additionally, this code is non-lazy: the resulting sequence's first element can be accessed only after the very last one is calculated.

The state-of-the-art code in the question is linear, overproduces exactly 0 elements past the point of interest, and is properly lazy: it starts producing its numbers immediately.

So, though an immense improvement over this poster's previous answers, it is still significantly worse than the original, let alone the improved versions appearing in the top two answers.

12.31.2018

Only the very best people educate. @Will Ness has also authored or co-authored 19 chapters in GoalKicker.com's "Haskell for Professionals". The free book is a treasure.

I had carried around the idea of a function that would do this, like this one. I was apprehensive because I thought it would be convoluted and involve logic like that in some modern languages. I decided to start writing and was amazed how easy Haskell makes the realization of even bad ideas.

I've not had difficulty generating unique lists. My problem is that the lists I generate do not end well. Even when I use diagonalization they leave residual values, making their use unreliable at best.

Here is a reworked 3's and 5's list with nothing residual at the end. The diagonalization is there to reduce residual values, not to eliminate duplicates, which are never included anyway.

import Data.List (sort)

g3s5s n = [ t*b | (a,b) <- [ ((d+n) - d*2, 5^d) | d <- [0..n] ],
                  t     <- [ 3^e | e <- [0..a+8] ],
                  t*b <= 3^(n+6) + a ]

ham2 n = take n $ ham2' (drop 1 . sort . g3s5s $ 48) [1]

ham2' o@(f:fo) e@(h:hx) = if h == min h f
                          then h : ham2'  o (hx ++ [h*2])
                          else f : ham2' fo ( e ++ [f*2])

The twos list can be generated by multiplying all the 2^e's by each of the 3s5s numbers; when the identity 2^0 is included, the total is the Hammings.
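
A tiny finite check of that claim at the GHCi prompt (the exponent bound of 5 is an arbitrary cutoff of mine, large enough to cover everything up to 12):

> import Data.List (sort)
> take 10 . sort $ [2^e * m | e <- [0..5], m <- [3^a * 5^b | a <- [0..5], b <- [0..5]]]
[1,2,3,4,5,6,8,9,10,12]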

3/25/2019

Well, finally. I knew this some time ago but could not implement it without excess values at the end. The problem was how not to generate the excess that results from a Cartesian product. I use Excel a lot and could not see the pattern of values to exclude from the Cartesian-product worksheet. Then, eureka! The functions generate lists of each lead factor. The value limiting the values in each list is the end point of the first list. When this is done, all Hammings are produced, with no excess.

Two functions for Hammings. The first is a new 3's & 5's list, which is then used to create multiples with the 2's. The multiples are Hammings.

h35r  x = h3s5s x (5^x)
h3s5s x c = [t| n<-[3^e|e<-[0..x]],
                m<-[5^e|e<-[0..x]], 
                t<-[n*m],
             t <= c ]

a2r n = sort $ a2s n (2^n)
a2s n c =   [h| b<-h35r n, 
                a<-[2^e| e<-[0..n]],
                h<-[a*b],
             h <= c ] 

last $ a2r 50

1125899906842624

(0.16 secs, 321,326,648 bytes)

2^50

1125899906842624

(0.00 secs, 95,424 bytes)

This is an alternate implementation, cleaner and faster, with less memory usage.

gnf n f = scanl (*) 1 $ replicate f n

mk35   n = (\c->      [m| t<- gnf 3 n, f<- gnf 5  n,    m<- [t*f], m<= c]) (2^(n+1))
mkHams n = (\c-> sort [m| t<- mk35  n, f<- gnf 2 (n+1), m<- [t*f], m<= c]) (2^(n+1))

last $ mkHams 50

2251799813685248

(0.03 secs, 12,869,000 bytes)

2^51

2251799813685248

5/6/2019

Well, I tried limiting differently, but I always come back to what is simplest. I am opting for the least memory usage, as it also seems to be the fastest.

I also opted to use map with an implicit parameter.

I also found that mergeAll from Data.List.Ordered is faster than sort, or sort and concat.

I also like it when sublists are created, so that I can analyze the data much more easily.

Then, because of @Will Ness, I switched to iterate instead of scanl, making for much cleaner code. Also because of @Will Ness, I stopped using the last of the 2s list and switched to one value determining all lengths.

I do think recursively defined lists are more efficient: the previous number is simply multiplied by a factor.
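
To illustrate (the names are mine): both lists below are the same, but the first performs one multiplication per element, while the second recomputes the whole power from scratch for every index:

powers3iter, powers3expn :: [Integer]
powers3iter = iterate (*3) 1        -- 1, 3, 9, 27, ... one (*) per step
powers3expn = [3^e | e <- [0..]]    -- same values, 3^e recomputed each time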

Just separating the function into two doesn't make a difference, so the 3 and 5 multiples would be

import Data.List.Ordered (mergeAll)   -- from the data-ordlist package

m35 lim = mergeAll $
          map (takeWhile (<= lim) . iterate (*3)) $
              takeWhile (<= lim) . iterate (*5) $ 1

And the 2s, each multiplied by the products of the 3s and 5s:

ham n = mergeAll $
        map (takeWhile (<= lim) . iterate (*2)) $ m35 lim
  where lim = 2^n

After editing the function, I ran it:

last $ ham 50

1125899906842624

(0.00 secs, 7,029,728 bytes)

then

last $ ham 100

1267650600228229401496703205376

(0.03 secs, 64,395,928 bytes)

It is probably better to use 10^n, but for comparison I again used 2^n.

5/11/2019

Because I so prefer infinite and recursive lists, I became a bit obsessed with making these infinite.

I was so impressed and inspired by @Daniel Wagner and his Data.Universe.Helpers that I started using +*+ and +++, but then added my own infinite list. I had to mergeAll my list to make it work, but then realized the infinite 3-and-5 multiples were exactly what they should be. So, I added the 2s, mergeAll'd everything, and they came out. Before, I stupidly thought mergeAll could not handle infinite lists, but it does so most marvelously.

When a list is infinite in Haskell, Haskell calculates just what is needed; that is, it is lazy. The adjunct is that it does calculate from the start.
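
A tiny demonstration (my own example) that mergeAll copes with infinitely many infinite lists, as long as the heads are in order:

> import Data.List.Ordered (mergeAll)
> take 10 $ mergeAll [ [n, n+10 ..] | n <- [1..] ]
[1,2,3,4,5,6,7,8,9,10]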

Now, since Haskell multiplies only up to the limit of what is wanted, no limit is needed in the function; that is, no more takeWhile. The speed-up is incredible, and the memory usage is lower, too.

The following is on my slow home PC with 3GB of RAM.

tia = mergeAll.map (iterate (*2)) $
      mergeAll.map (iterate (*3)) $ iterate (*5) 1

last $ take 10000 tia

288325195312500000

(0.02 secs, 5,861,656 bytes)

6.5.2019

I learned how to use ghc -O2. So the following is for 50,000 Hammings, up to 2.38E+30. And this is further proof my code is garbage.

INIT    time    0.000s  (  0.000s elapsed)
MUT     time    0.000s  (  0.916s elapsed)
GC      time    0.047s  (  0.041s elapsed)
EXIT    time    0.000s  (  0.005s elapsed)
Total   time    0.047s  (  0.962s elapsed)

Alloc rate    0 bytes per MUT second
Productivity   0.0% of total user, 95.8% of total elapsed

6.13.2019

@Will Ness rawks. He provided a clean and elegant revision of tia above, and it proved to be five times as fast in GHCi. When I compiled both with ghc -O2 +RTS -s and ran his against mine, mine was several times as fast. There had to be a compromise.

So, I started reading about fusion, which I had encountered in R. Bird's Thinking Functionally with Haskell, and almost immediately tried this.

mai n = mergeAll . map (iterate (*n))
mai 2 $ mai 3 $ iterate (*5) 1

It matched Will's at 0.08 for 100K Hammings in GHCi, but what really surprised me (also for 100K Hammings) is the following report, and especially the elapsed times. 100K is up to 2.9e+38.

 TASKS: 3 (1 bound, 2 peak workers (2 total), using -N1)

  SPARKS: 0 (0 converted, 0 overflowed, 0 dud, 0 GC'd, 0 fizzled)

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    0.000s  (  0.002s elapsed)
  GC      time    0.000s  (  0.000s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    0.000s  (  0.002s elapsed)

  Alloc rate    0 bytes per MUT second

  Productivity 100.0% of total user, 90.2% of total elapsed


Source: https://stackoverflow.com/questions/12480291/new-state-of-the-art-in-unlimited-generation-of-hamming-sequence
