Simplification Algorithm for Reverse Polish Notation

温柔的废话 2021-01-16 11:34

A couple of days ago I played around with Befunge which is an esoteric programming language. Befunge uses a LIFO stack to store data. When you write programs the digits fro

4 answers
  •  佛祖请我去吃肉
    2021-01-16 12:17

    There are 44 solutions for 999 with length 9:

    39149*+**
    39166*+**
    39257*+**
    39548*+**
    39756*+**
    39947*+**
    39499**+*
    39669**+*
    39949**+*
    39966**+*
    93149*+**
    93166*+**
    93257*+**
    93548*+**
    93756*+**
    93947*+**
    93269**+*
    93349**+*
    93366**+*
    93439**+*
    93629**+*
    93636**+*
    93926**+*
    93934**+*
    93939+*+*
    93948+*+*
    93957+*+*
    96357**+*
    96537**+*
    96735**+*
    96769+*+*
    96778+*+*
    97849+*+*
    97858+*+*
    97867+*+*
    99689+*+*
    956*99*+*
    968*79*+*
    39*149*+*
    39*166*+*
    39*257*+*
    39*548*+*
    39*756*+*
    39*947*+*
    

    Edit:

    I have been working on some search-space pruning improvements, so sorry that I did not post this immediately. Here is the script in Erlang. The original one takes 14s for 999, but this one does it in around 190ms.

    Edit2:

    There are 1074 solutions of length 13 for 9999. It takes 7 minutes; some of them are below:

    329+9677**+**
    329+9767**+**
    338+9677**+**
    338+9767**+**
    347+9677**+**
    347+9767**+**
    356+9677**+**
    356+9767**+**
    3147789+***+*
    31489+77***+*
    3174789+***+*
    3177489+***+*
    3177488*+**+*
    

    Here is a version in C with more aggressive pruning of the state space that returns only one solution. It is way faster.

    $ time ./polish_numbers 999
    Result for 999: 39149*+**, length 9
    
    real    0m0.008s
    user    0m0.004s
    sys     0m0.000s
    
    $ time ./polish_numbers 99999
    Result for 99999: 9158*+1569**+**, length 15
    
    real    0m34.289s
    user    0m34.296s
    sys     0m0.000s
    

    harold reported that his C# brute-force version finds the same number in 20s, so I was curious whether I could improve mine. I tried better memory utilization by refactoring the data structure. The search algorithm mostly works with the length of a solution and whether one exists, so I separated this information into one structure (best_rec_header). I also stored the solution itself, as tree branches, separately in another (best_rec_args). Those data are used only when a new, better solution is found for a given number. Here is the code.

    Result for 99999: 9158*+1569**+**, length 15
    
    real    0m31.824s
    user    0m31.812s
    sys     0m0.012s
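
The split between the frequently read length and the rarely touched solution tree can be sketched as follows. This is only an illustration of the idea: the class `BestTable`, its fields and the `NO_SOLUTION` sentinel are hypothetical names, loosely modelled on the `best_rec_header` / `best_rec_args` description, and do not reproduce the actual C structures.

```python
NO_SOLUTION = 255  # assumed sentinel: "no expression known yet"


class BestTable:
    """Hot data (lengths) kept apart from cold data (operator and operands)."""

    def __init__(self, limit: int):
        # Hot part: one byte per number, scanned constantly during search.
        self.length = bytearray([NO_SOLUTION]) * (limit + 1)
        # Cold part: only written when a better solution is found,
        # only read when the final expression is rebuilt.
        self.op = bytearray(limit + 1)  # 0 means "plain digit"
        self.left = [0] * (limit + 1)
        self.right = [0] * (limit + 1)
        for d in range(min(9, limit) + 1):
            self.length[d] = 1

    def improve(self, n: int, op: str, a: int, b: int) -> bool:
        """Record n = a op b if it is shorter than what is stored for n."""
        if NO_SOLUTION in (self.length[a], self.length[b]):
            return False
        cand = self.length[a] + self.length[b] + 1
        if cand >= self.length[n]:
            return False
        self.length[n] = cand
        self.op[n] = ord(op)
        self.left[n], self.right[n] = a, b
        return True

    def rebuild(self, n: int) -> str:
        """Walk the stored tree branches and emit the postfix string."""
        if self.length[n] == NO_SOLUTION:
            raise KeyError(f"no solution recorded for {n}")
        if self.op[n] == 0:
            return str(n)
        return self.rebuild(self.left[n]) + self.rebuild(self.right[n]) + chr(self.op[n])


t = BestTable(999)
t.improve(36, "*", 4, 9)
t.improve(37, "+", 1, 36)
t.improve(333, "*", 9, 37)
t.improve(999, "*", 3, 333)
print(t.rebuild(999), t.length[999])
```

Feeding in the four steps by hand rebuilds `39149*+**` with length 9, the first solution listed for 999. A real search would call something like `improve` from its inner loop and touch the cold arrays only on success.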
    

    It was still far too slow, so I tried some other versions. First I added some statistics to demonstrate that my code is not computing all smaller numbers.

    Result for 99999: 9158*+1569**+**, length 15, (skipped 36777, computed 26350)
    

    Then I tried changing the code to compute + solutions for bigger numbers first.

    Result for 99999: 1956**+9158*+**, length 15, (skipped 0, computed 34577)
    
    real    0m17.055s
    user    0m17.052s
    sys     0m0.008s
    

    That was almost twice as fast. Another idea was that I may sometimes give up on finding a solution for some number because of the current best_len limit. So I tried making small numbers (up to half of n) unlimited (note 255 as the best_len limit when finding the first operand).

    Result for 99999: 9158*+1569**+**, length 15, (skipped 36777, computed 50000)
    
    real    0m12.058s
    user    0m12.048s
    sys     0m0.008s
    

    A nice improvement, but what if I limit solutions for those numbers by the best solution found so far? That needs some sort of global computation state. The code becomes more complicated, but the result is even faster.

    Result for 99999: 97484777**+**+*, length 15, (skipped 36997, computed 33911)
    
    real    0m10.401s
    user    0m10.400s
    sys     0m0.000s
    

    It was even able to compute a number ten times bigger.

    Result for 999999: 37967+2599**+****, length 17, (skipped 440855)
    
    real    12m55.085s
    user    12m55.168s
    sys     0m0.028s
    

    Then I decided to also try a brute-force method, and it was even faster.

    Result for 99999: 9158*+1569**+**, length 15
    
    real    0m3.543s
    user    0m3.540s
    sys     0m0.000s
    
    Result for 999999: 37949+2599**+****, length 17
    
    real    5m51.624s
    user    5m51.556s
    sys     0m0.068s
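
A brute-force search of this kind can be sketched as a bottom-up table: compute the shortest expression for every number up to the target, trying every sum and every factorization. This is a minimal sketch under the assumption that only `+` and `*` are used, as in all the solutions shown; the function name `shortest_rpn` is made up, and this is not the author's C code.

```python
def shortest_rpn(target: int) -> str:
    """Shortest postfix expression (digits, '+', '*') for a non-negative target."""
    best = [None] * (target + 1)
    for d in range(min(target, 9) + 1):
        best[d] = str(d)

    for n in range(10, target + 1):
        cand = None
        # Every split n = a + (n - a) with 1 <= a <= n / 2.
        for a in range(1, n // 2 + 1):
            length = len(best[a]) + len(best[n - a]) + 1
            if cand is None or length < len(cand):
                cand = best[a] + best[n - a] + "+"
        # Every factorization n = a * (n / a) with 2 <= a <= sqrt(n).
        a = 2
        while a * a <= n:
            if n % a == 0:
                length = len(best[a]) + len(best[n // a]) + 1
                if length < len(cand):
                    cand = best[a] + best[n // a] + "*"
            a += 1
        best[n] = cand

    return best[target]


expr = shortest_rpn(999)
print(expr, len(expr))
```

For 999 this yields an expression of length 9, matching the length reported above; which of the equally short expressions is printed depends on the order in which ties are tried. The sum loop makes the whole thing quadratic in the target, which is why plain Python is only comfortable for small inputs.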
    

    Which shows that constants matter. This is especially true on modern CPUs, where a brute-force approach benefits from better vectorization, better CPU cache utilization and less branching.

    Anyway, I think there is a better approach based on a better understanding of number theory, or on state-space search algorithms such as A*. And for really big numbers it may be a good idea to use genetic algorithms.

    Edit3:

    harold came up with a new idea to avoid trying too many sums. I have implemented it in this new version. It is an order of magnitude faster.

    $ time ./polish_numbers 99999
    Result for 99999: 9158*+1569**+**, length 15
    
    real    0m0.153s
    user    0m0.152s
    sys     0m0.000s
    $ time ./polish_numbers 999999
    Result for 999999: 37949+2599**+****, length 17
    
    real    0m3.516s
    user    0m3.512s
    sys     0m0.004s
    $ time ./polish_numbers 9999999
    Result for 9999999: 9788995688***+***+*, length 19
    
    real    1m39.903s
    user    1m39.904s
    sys     0m0.032s
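
The text does not spell out how the sums are cut down. One possible reading, which is purely an assumption here and not necessarily harold's actual idea, is to try a sum only when one addend is a single digit. The sketch below applies that restriction to a bottom-up search; the name `shortest_rpn_few_sums` and the parameter `max_addend` are hypothetical, and the restriction is a heuristic that is not guaranteed to find the minimal length in general.

```python
def shortest_rpn_few_sums(target: int, max_addend: int = 9) -> str:
    """Bottom-up search that only tries sums with a small addend (heuristic)."""
    best = [None] * (target + 1)
    for d in range(min(target, 9) + 1):
        best[d] = str(d)

    for n in range(10, target + 1):
        cand = None
        # Assumed restriction: at most max_addend sums per number
        # instead of n / 2 of them.
        for a in range(1, min(max_addend, n // 2) + 1):
            length = len(best[a]) + len(best[n - a]) + 1
            if cand is None or length < len(cand):
                cand = best[a] + best[n - a] + "+"
        # Products are still tried exhaustively.
        a = 2
        while a * a <= n:
            if n % a == 0:
                length = len(best[a]) + len(best[n // a]) + 1
                if length < len(cand):
                    cand = best[a] + best[n // a] + "*"
            a += 1
        best[n] = cand

    return best[target]


expr = shortest_rpn_few_sums(999)
print(expr, len(expr))
```

For 999 the restricted search still reaches length 9, because a solution such as `39149*+**` only ever adds the digit 1. The work per number drops from roughly n/2 sums to at most nine, which is the kind of saving that could plausibly explain a large speedup.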
    
