awk: negative exponential is not correctly interpreted

Submitted by 孤者浪人 on 2020-06-16 03:37:38

Question


I have this table:

a   0
b   0
c   1.6149e-315
d   5.2587e-265
e   8.2045e-227
f   8.2045e-227

If I type

$ awk '($2<1){print}' my_file.txt

it returns

a   0
b   0
d   5.2587e-265
e   8.2045e-227
f   8.2045e-227

but it considers the value in the third row, 1.6149e-315, to be larger than 1:

$ awk '($2>1){print}' my_file.txt
c   1.6149e-315

What is the reason for this behaviour? Is a number with a negative exponent below 1e-300 too small, so that awk drops the "e-315" part? It looks that way, since

$ awk '($2>1.6149){print}' my_file.txt
c   1.6149e-315

but if I run

$ awk '($2>1.615){print}' my_file.txt

nothing is output.

How can I overcome this problem?


Answer 1:


Run your awk like this:

awk '($2+0) < 1' file

This will output:

a   0
b   0
c   1.6149e-315
d   5.2587e-265
e   8.2045e-227
f   8.2045e-227

$2+0 converts $2 into a numeric value.

By the way, on GNU Awk 5.0.1 I get the correct output even without this trick.
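The results in the question are consistent with $2 being compared as a string rather than as a number on the affected awk. A minimal sketch of that string comparison, using string constants so that it behaves the same on any awk (the variable names are only illustrative):

```shell
# Compare the text "1.6149e-315" against the two thresholds from the question.
# Both operands are strings here, so awk compares them character by character.
cmp_result=$(awk 'BEGIN {
    s  = "1.6149e-315"
    gt = (s > "1.6149")   # equal prefix, longer string sorts after the shorter one
    lt = (s < "1.615")    # "1.614..." sorts before "1.615"
    print gt, lt
}')
echo "$cmp_result"
```

Both comparisons come out true, which matches the question: the row is printed for `$2>1.6149` but not for `$2>1.615`.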




Answer 2:


Reproduced the OP's issue with GNU Awk 4.2.1.


First of all, adding 0 ($2+0, which is the same as $NF+0 here) does not seem to solve the issue, as this example shows:
> cat file
a   0
b   0
c   1.6149e-315
d   5.2587e-265
e   8.2045e-227
f   8.2045e-227

> awk '$2+0>0' file
d   5.2587e-265
e   8.2045e-227
f   8.2045e-227

Again, the third number of the sample input is not printed, although it is greater than zero.

And here only zeros are printed for the third number:

awk '{printf "%.320f\n",$2+0}' file

The above indicates that 1.6149e-315 is not being represented in the expected way.


It seems you have gone below the limit for double-precision floating point, which is an exponent of about -308. The smallest positive normal double is roughly 2.2e-308, and this awk does not handle smaller values as expected.

https://www.gnu.org/software/gawk/manual/gawk.html#Computer-Arithmetic
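To see where that limit sits, here is a small sketch that assumes awk uses IEEE 754 doubles, as the usual implementations do. It prints the smallest positive normal double, 2^-1022, and then a product whose true value (1e-400) is too small for a double at all:

```shell
# Smallest positive normal double: 2^-1022, shown with 4 decimals.
smallest_normal=$(awk 'BEGIN { printf "%.4e", 2^-1022 }')
echo "$smallest_normal"    # 2.2251e-308

# 1e-200 * 1e-200 would be 1e-400, far below the double range: it underflows to 0.
underflow=$(awk 'BEGIN { print 1e-200 * 1e-200 }')
echo "$underflow"          # 0
```

Only arithmetic on values well inside the normal range is used here, so the result should not depend on how a particular awk version parses tiny input fields.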


Furthermore, if your GNU Awk is compiled with MPFR support, you can use arbitrary-precision numbers via the -M option, which seems to be the only way to represent a positive number less than 10^-308.

https://www.gnu.org/software/gawk/manual/html_node/MPFR-features.html


One last argument, a simple test:

> cat file
a   1.1e-312
b   1.1e-311
c   1.1e-310
d   1.1e-309
e   1.1e-308
f   1.1e-307
g   1.1e-306
h   1.1e-305
> awk '$2+0>0' file
f   1.1e-307
g   1.1e-306
h   1.1e-305

Values with an exponent of -308 or lower (here 1.1e-308 and smaller) are not treated as expected.

> awk '{print($2+0)}' file
0
0
0
0
0
1.1e-307
1.1e-306
1.1e-305

This shows that $NF+0 turns such a field into zero rather than into the exponential number. Numbers below about 10^-308 cannot be represented because of that limit, which applies to awk builds that use double precision and have no multiple-precision support.
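If -M is not available, one possible workaround (not taken from the answers above, only a sketch) is to avoid converting the tiny value at all. It splits the field into mantissa and exponent as text and compares in log10 space. The function name lt_one and its local variables are hypothetical, and the input is assumed to be in plain "mantissa e exponent" notation:

```shell
# Feed the table from the question to awk and keep the rows whose value is below 1.
below_one=$(printf '%s\n' 'a   0' 'b   0' 'c   1.6149e-315' \
    'd   5.2587e-265' 'e   8.2045e-227' 'f   8.2045e-227' | awk '
# lt_one(s): true if the number written in s is below 1.
# Mantissa and exponent are separated as text, so the tiny value
# itself is never converted to a double.
function lt_one(s,    part, m, e) {
    split(s, part, /[eE]/)      # part[1] = mantissa, part[2] = exponent (may be absent)
    m = part[1] + 0
    e = part[2] + 0             # a missing exponent counts as 0
    if (m <= 0) return 1        # zero and negative values are below 1
    return (log(m) / log(10) + e < 0)   # log10(m * 10^e) < 0
}
lt_one($2) { print }
')
echo "$below_one"
```

On the sample table this prints all six rows, including c. Values that sit exactly on the threshold may be affected by rounding in log(), so treat the comparison as approximate near 1.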



Source: https://stackoverflow.com/questions/62196591/awk-negative-exponential-is-not-correctly-interpreted
