Question
I have this table:
a 0
b 0
c 1.6149e-315
d 5.2587e-265
e 8.2045e-227
f 8.2045e-227
If I type
$ awk '($2<1){print}' my_file.txt
it returns
a 0
b 0
d 5.2587e-265
e 8.2045e-227
f 8.2045e-227
but it considers the value in the third row, 1.6149e-315, to be larger than 1:
$ awk '($2>1){print}' my_file.txt
c 1.6149e-315
What is the reason for this behaviour? Is a number with a negative exponent below 1e-300 so small that awk drops the "e-" part? It looks that way, since
$ awk '($2>1.6149){print}' my_file.txt
c 1.6149e-315
but if I run
$ awk '($2>1.615){print}' my_file.txt
nothing is output.
How can I overcome this problem?
Answer 1:
Run your awk like this:
awk '($2+0) < 1' file
This will output:
a 0
b 0
c 1.6149e-315
d 5.2587e-265
e 8.2045e-227
f 8.2045e-227
$2+0 converts $2 into a numeric value.
Btw, on GNU Awk 5.0.1 I get the correct output even without this trick.
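The pattern reported in the question (a match for >1.6149 but none for >1.615) is what a character-by-character string comparison would give. The thread does not confirm that this is what awk did here, so the following is only a minimal sketch of how string and numeric comparisons differ. The function names are hypothetical, and the values are string constants rather than fields so that the result does not depend on the awk version.

```shell
# String constants are always compared as strings, character by character.
string_compare_demo() {
  awk 'BEGIN {
    a = ("1.6149e-315" > "1.6149")   # shared prefix, longer string sorts later
    b = ("1.6149e-315" > "1.615")    # first difference is "4" versus "5"
    print a, b
  }'
}

# Appending "" forces a string comparison; adding 0 forces a numeric one.
forced_compare_demo() {
  echo "10 9" | awk '{
    s = (($1 "") < ($2 ""))     # "10" < "9" as strings
    n = (($1 + 0) < ($2 + 0))   # 10 < 9 as numbers
    print s, n
  }'
}

string_compare_demo   # prints: 1 0
forced_compare_demo   # prints: 1 0
```

The first function reproduces the asymmetry from the question; the second shows why adding 0 changes the kind of comparison awk performs.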
Answer 2:
Reproduced the OP's issue with GNU Awk 4.2.1.
First of all, $NF+0 (here the same as $2+0) does not seem to solve this issue, as this example shows:
> cat file
a 0
b 0
c 1.6149e-315
d 5.2587e-265
e 8.2045e-227
f 8.2045e-227
> awk '$2+0>0' file
d 5.2587e-265
e 8.2045e-227
f 8.2045e-227
The third row of the sample input is again not printed, although its value is greater than zero.
And printing the values with 320 decimal places shows only zeros for the third number:
awk '{printf "%.320f\n",$2+0}' file
This indicates that 1.6149e-315 is not being represented as expected.
It seems you have gone below the limit for double-precision floating point. About 10^-308 (more precisely, roughly 2.2e-308) is the smallest positive normal value that can be represented; anything smaller is subnormal at best, and this gawk version evidently ends up with zero.
https://www.gnu.org/software/gawk/manual/gawk.html#Computer-Arithmetic
Furthermore, if your GNU awk is compiled with MPFR support, you can use arbitrary-precision numbers via the -M option, which seems to be the only way to represent a positive number less than 10^-308.
https://www.gnu.org/software/gawk/manual/html_node/MPFR-features.html
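If a gawk build with -M is not available, one possible workaround is to decide from the text of the field rather than from its converted value. This is not from the thread; it is an illustrative sketch that assumes the second field is written as a plain decimal or in mantissa-e-exponent form. The names filter_lt_one and sci_lt_one are hypothetical.

```shell
# Hypothetical helper: print rows whose second field is below 1,
# deciding from the decimal exponent instead of the converted double.
filter_lt_one() {
  awk '
    function sci_lt_one(s,    p, n, m, x) {
      n = split(s, p, /[eE]/)            # p[1] = mantissa, p[2] = exponent (if any)
      m = p[1] + 0
      x = (n > 1) ? p[2] + 0 : 0
      if (m <= 0) return 1               # zero or negative is always below 1
      while (m >= 10) { m /= 10; x++ }   # normalise mantissa into [1, 10)
      while (m < 1)   { m *= 10; x-- }
      return x < 0                       # m * 10^x < 1 exactly when x < 0
    }
    sci_lt_one($2)
  ' "$@"
}

# Usage: the tiny value in row c is kept, the value 2.5 is dropped.
printf '%s\n' 'a 0' 'c 1.6149e-315' 'd 5.2587e-265' 'g 2.5' | filter_lt_one
```

Only the mantissa and the exponent are converted to numbers, and both are well within double range, so the result does not depend on how the awk build treats values below 10^-308. The sketch only handles the "< 1" test from the question; other thresholds would need a fuller comparison of exponent and mantissa.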
One last argument, a simple test:
> cat file
a 1.1e-312
b 1.1e-311
c 1.1e-310
d 1.1e-309
e 1.1e-308
f 1.1e-307
g 1.1e-306
h 1.1e-305
> awk '$2+0>0' file
f 1.1e-307
g 1.1e-306
h 1.1e-305
Exponents of -308 and below are not treated as expected.
> awk '{print($2+0)}' file
0
0
0
0
0
1.1e-307
1.1e-306
1.1e-305
This is the proof that $NF+0 yields zero rather than the exponential number: values below about 10^-308 cannot be represented because of that limit, which applies to awk builds that use double precision and have no multi-precision support.
Source: https://stackoverflow.com/questions/62196591/awk-negative-exponential-is-not-correctly-interpreted