Simplification Algorithm for Reverse Polish Notation

后端未结

关注

 4  811

温柔的废话 2021-01-16 11:34

A couple of days ago I played around with Befunge which is an esoteric programming language. Befunge uses a LIFO stack to store data. When you write programs the digits fro

4条回答

佛祖请我去吃肉 (楼主)

2021-01-16 12:17
There is 44 solutions for 999 with lenght 9:
```
39149*+**
39166*+**
39257*+**
39548*+**
39756*+**
39947*+**
39499**+*
39669**+*
39949**+*
39966**+*
93149*+**
93166*+**
93257*+**
93548*+**
93756*+**
93947*+**
93269**+*
93349**+*
93366**+*
93439**+*
93629**+*
93636**+*
93926**+*
93934**+*
93939+*+*
93948+*+*
93957+*+*
96357**+*
96537**+*
96735**+*
96769+*+*
96778+*+*
97849+*+*
97858+*+*
97867+*+*
99689+*+*
956*99*+*
968*79*+*
39*149*+*
39*166*+*
39*257*+*
39*548*+*
39*756*+*
39*947*+*
```
Edit:

I have working on some search space pruning improvements so sorry I have not posted it immediately. There is script in Erlnag. Original one takes 14s for 999 but this one makes it in around 190ms.

Edit2:

There is 1074 solutions of length 13 for 9999. It takes 7 minutes and there is some of them below:
```
329+9677**+**
329+9767**+**
338+9677**+**
338+9767**+**
347+9677**+**
347+9767**+**
356+9677**+**
356+9767**+**
3147789+***+*
31489+77***+*
3174789+***+*
3177489+***+*
3177488*+**+*
```
There is version in C with more aggressive pruning of state space and returns only one solution. It is way faster.
```
$ time ./polish_numbers 999
Result for 999: 39149*+**, length 9

real    0m0.008s
user    0m0.004s
sys     0m0.000s

$ time ./polish_numbers 99999
Result for 99999: 9158*+1569**+**, length 15

real    0m34.289s
user    0m34.296s
sys     0m0.000s
```
harold was reporting his C# bruteforce version makes same number in 20s so I was curious if I can improve mine. I have tried better memory utilization by refactoring data structure. Searching algorithm mostly works with length of solution and it's existence so I separated this information to one structure (best_rec_header). I have also make solution as tree branches separated in another (best_rec_args). Those data are used only when new better solution for given number. There is code.
```
Result for 99999: 9158*+1569**+**, length 15

real    0m31.824s
user    0m31.812s
sys     0m0.012s
```
It was still too much slow. So I tried some other versions. First I added some statistics to demonstrate that mine code is not computing all smaller numbers.
```
Result for 99999: 9158*+1569**+**, length 15, (skipped 36777, computed 26350)
```
Then I have tried change code to compute + solutions for bigger numbers first.
```
Result for 99999: 1956**+9158*+**, length 15, (skipped 0, computed 34577)

real    0m17.055s
user    0m17.052s
sys     0m0.008s
```
It was almost as twice faster. But there was another idea that may be sometimes I give up find solution for some number as limited by current best_len limit. So I tried to make small numbers (up to half of n) unlimited (note 255 as best_len limit for first of operands finding).
```
Result for 99999: 9158*+1569**+**, length 15, (skipped 36777, computed 50000)

real    0m12.058s
user    0m12.048s
sys     0m0.008s
```
Nice improvement but what if I limit solutions for those numbers by best solution found so far. It needs some sort of computation global state. Code becomes more complicated but result even faster.
```
Result for 99999: 97484777**+**+*, length 15, (skipped 36997, computed 33911)

real    0m10.401s
user    0m10.400s
sys     0m0.000s
```
It was even able to compute ten times bigger number.
```
Result for 999999: 37967+2599**+****, length 17, (skipped 440855)

real    12m55.085s
user    12m55.168s
sys     0m0.028s
```
Then I decided to try also brute force method and this was even faster.
```
Result for 99999: 9158*+1569**+**, length 15

real    0m3.543s
user    0m3.540s
sys     0m0.000s

Result for 999999: 37949+2599**+****, length 17

real    5m51.624s
user    5m51.556s
sys     0m0.068s
```
Which shows, that constant matter. It is especially true for modern CPU when brute force approach gets advantage from better vectorization, better CPU cache utilization and less branching.

Anyway, I think there is some better approach using better understanding of number theory or space searching by algorithms as A* and so. And for really big numbers there may be good idea to use genetic algorithms.

Edit3:

harold came with new idea to eliminate trying to much sums. I have implemented it in this new version. It is order of magnitude faster.
```
$ time ./polish_numbers 99999
Result for 99999: 9158*+1569**+**, length 15

real    0m0.153s
user    0m0.152s
sys     0m0.000s
$ time ./polish_numbers 999999
Result for 999999: 37949+2599**+****, length 17

real    0m3.516s
user    0m3.512s
sys     0m0.004s
$ time ./polish_numbers 9999999
Result for 9999999: 9788995688***+***+*, length 19

real    1m39.903s
user    1m39.904s
sys     0m0.032s
```
0 讨论(0)

查看其它4个回答
发布评论:

提交评论
- 加载中...