floating-accuracy

How to avoid less precise sum for numpy-arrays with multiple columns

ぐ巨炮叔叔 submitted on 2019-12-01 21:28:52
I've always assumed that numpy uses a kind of pairwise summation, which ensures high precision for float32 operations as well:

    import numpy as np
    N = 17*10**6  # float32 precision is no longer enough to hold the whole sum
    print(np.ones((N,1),dtype=np.float32).sum(axis=0))
    # [17000000.], kind of expected

However, it looks as if a different algorithm is used if the matrix has more than one column:

    print(np.ones((N,2),dtype=np.float32).sum(axis=0))
    # [16777216. 16777216.] the error is just too big
    print(np.ones((2*N,2),dtype=np.float32).sum(axis=0))
    # [16777216. 16777216.] error is bigger

Probably sum
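The value 16777216 in the output above is 2^24, the point past which float32 can no longer represent every integer. The following is a minimal sketch of why a running sum stalls there while a pairwise split does not. It emulates float32 with the standard `struct` module instead of numpy, and the helper `f32` is a name made up for this illustration. It does not reproduce numpy's actual summation code.

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

LIMIT = 2.0 ** 24  # 16777216.0, last point where float32 still counts by ones

# A running float32 sum of ones stalls here: adding 1 rounds back to LIMIT.
stalled = f32(LIMIT + 1.0)

# Pairwise summation adds two half-sized partial sums instead,
# and 17000000 is even, so it is representable in float32.
half = f32(8_500_000.0)
pairwise_total = f32(half + half)

print(stalled)         # 16777216.0
print(pairwise_total)  # 17000000.0
```

The stalled value matches what the two-column case prints, which is consistent with a plain element-by-element accumulation along that axis.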

C/C++: 1.00000 <= 1.0f = False

牧云@^-^@ submitted on 2019-12-01 19:40:45
Question: Can someone explain why 1.000000 <= 1.0f is false? The code:

    #include <iostream>
    #include <stdio.h>
    using namespace std;

    int main(int argc, char **argv) {
        float step = 1.0f / 10;
        float t;
        for (t = 0; t <= 1.0f; t += step) {
            printf("t = %f\n", t);
            cout << "t = " << t << "\n";
            cout << "(t <= 1.0f) = " << (t <= 1.0f) << "\n";
        }
        printf("t = %f\n", t);
        cout << "t = " << t << "\n";
        cout << "(t <= 1.0f) = " << (t <= 1.0f) << "\n";
        cout << "\n(1.000000 <= 1.0f) = " << (1.000000 <= 1.0f) << "\n";
    }
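A likely explanation is that `t` is not exactly 1 when the loop ends, and the six decimals printed by `%f` hide the difference. The sketch below replays the loop's arithmetic in Python, emulating float32 rounding with `struct`. The helper `f32` is a name invented for this illustration, and the emulation assumes each addition is rounded to float32, which a particular C++ compiler may or may not do.

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

step = f32(1.0 / 10)   # mirrors: float step = 1.0f / 10;
t = 0.0
for _ in range(10):    # ten additions of step, each rounded to float32
    t = f32(t + step)

print("%f" % t)        # six decimals hide the excess over 1
print(repr(t))         # full precision shows t is slightly above 1
print(t <= 1.0)        # the loop condition, which is now false
```

Under these assumptions the comparison that fails is `t <= 1.0f`, not the literal comparison `1.000000 <= 1.0f` on the last line of the program.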

Is there a gcc flag to catch integer truncation?

若如初见. submitted on 2019-12-01 18:36:34
Question: Is there a gcc flag to signal a warning/error when I try to put a double value into an int variable? I currently have -Wall -Wextra -Werror set, but I still don't get warned when I (for instance) pass a double to an int parameter, even though I'm losing information.

Answer 1: You can use the -Wconversion option. From GCC's manual (emphasis mine): Warn for implicit conversions that may alter a value. This includes conversions between real and integer, like abs (x) when x is double; conversions

Will (int)pow(n,m) be wrong for some positive integers n,m?

旧巷老猫 submitted on 2019-12-01 18:31:07
Question: Assuming n and m are positive integers, and n^m is within the range of an integer, will (int)pow(n,m) ever give a wrong answer? I have tried many n for m=2 and have not gotten any wrong answers so far.

Answer 1: The C standard does not impose any requirements on the accuracy of floating point arithmetic. The accuracy is implementation-defined, which means that implementations are required to document it. However, implementations are left with a significant "out": (§5.2.4.2.2 paragraph 6, emphasis
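To see why an inaccurate pow() would matter, consider what a truncating cast does to a result that is off by a single unit in the last place. This is a minimal sketch with a made-up inaccurate value; nothing here shows that any particular math library actually returns it. It needs Python 3.9 or later for `math.nextafter`.

```python
import math

n, m = 5, 3
exact = n ** m  # integer exponentiation, no floating point involved

# Hypothetical pow() result that is one ulp below the true value 125.0.
slightly_low = math.nextafter(float(exact), 0.0)

truncated = int(slightly_low)   # like the (int) cast: drops the fraction
rounded = round(slightly_low)   # rounding to nearest recovers the integer

print(exact, truncated, rounded)  # 125 124 125
```

Rounding before converting, or using integer-only exponentiation, sidesteps the problem regardless of how accurate the floating-point routine is.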

Can floating-point precision be thread-dependent?

半城伤御伤魂 submitted on 2019-12-01 17:14:46
I have a small 3D vector class in C# 3.0 based on struct that uses double as basic unit. An example: One vector's y-value is -20.0 straight. I subtract a vector with a y-value of 10.094999999999965. The value for y I would expect is -30.094999999999963 (1). Instead I get -30.094999313354492 (2).

When I'm doing the whole computation in one single thread, I get (1). Also the debugger and VS quick-watch return (1). But when I run a few iterations in one thread and then call the function from a different thread, the result is (2). Now the debugger returns (2) as well! We have to keep in mind the
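Value (2) happens to be exactly what the expected result becomes once it is rounded to single precision. That hints, but does not prove, that the subtraction ran at reduced precision on the second thread. The check below is a sketch using Python's `struct` module to emulate float32 rounding; the helper `f32` is a name invented for this illustration.

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

y1 = -20.0
y2 = 10.094999999999965

in_double = y1 - y2          # subtraction at double precision, close to (1)
in_single = f32(in_double)   # the same result squeezed into 24 bits of mantissa

print(in_double)
print(in_single)
print(in_single == -30.094999313354492)  # True: matches value (2)
```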

Can floating-point precision be thread-dependent?

[亡魂溺海] submitted on 2019-12-01 17:13:28
Question: I have a small 3D vector class in C# 3.0 based on struct that uses double as basic unit. An example: One vector's y-value is -20.0 straight. I subtract a vector with a y-value of 10.094999999999965. The value for y I would expect is -30.094999999999963 (1). Instead I get -30.094999313354492 (2).

When I'm doing the whole computation in one single thread, I get (1). Also the debugger and VS quick-watch return (1). But when I run a few iterations in one thread and then call the function from a

Why does CLng produce different results?

本秂侑毒 submitted on 2019-12-01 15:48:41
Here's a little gem directly from my VBE (MS Excel 2007 VBA):

    ?clng(150*0.85)
    127
    x = 150*0.85
    ?clng(x)
    128

Can anybody explain this behaviour? IMHO the first expression should yield 128 (.5 rounded to nearest even), or at least both results should be equal.

Rick Regan: I think wqw is right, but I'll give the details. In the statement clng(150 * 0.85), 150 * 0.85 is calculated in extended precision:

    150 = 1.001011 x 2^7
    0.85 in double precision = 1.1011001100110011001100110011001100110011001100110011 x 2^-1

Multiply these by hand and you get 1
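The two results can be reproduced with exact rational arithmetic. In this sketch the exact product stands in for the extended-precision intermediate, which is an assumption: the real intermediate has 64 bits of mantissa and is not exact, but it is precise enough to stay on the same side of 127.5. Python's `round`, like CLng, rounds ties to even.

```python
from fractions import Fraction

# Exact product of 150 and the double nearest to 0.85 (stand-in for the
# extended-precision intermediate; see the assumption noted above).
exact = 150 * Fraction(0.85)

# The same product rounded to double precision, as when stored in x.
as_double = 150 * 0.85

print(exact < Fraction(255, 2))  # True: the exact product is just below 127.5
print(as_double == 127.5)        # True: double rounding lands exactly on 127.5
print(round(exact))              # 127
print(round(as_double))          # 128 (tie rounds to even)
```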

Why is my number being rounded incorrectly?

不羁的心 submitted on 2019-12-01 15:28:09
Question: This feels like the kind of code that only fails in situ, but I will attempt to adapt it into a code snippet that represents what I'm seeing.

    float f = myFloat * myConstInt; /* Where myFloat==13.45, and myConstInt==20 */
    int i = (int)f;
    int i2 = (int)(myFloat * myConstInt);

After stepping through the code, i==269, and i2==268. What's going on here to account for the difference?

Answer 1: Float math can be performed at higher precision than advertised. But as soon as you store it in float f, that
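The effect the answer describes can be replayed in Python, where every float is a double and `struct` can emulate the rounding that happens on a store to float. This is a sketch under the assumption that the unstored product is kept at double precision; the actual intermediate precision depends on the compiler and platform. The helper `f32` is a name invented for this illustration.

```python
import struct

def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

my_float = f32(13.45)   # 13.45 is not exactly representable in float32
my_const_int = 20

product = my_float * my_const_int  # kept at higher (double) precision
f = f32(product)                   # stored into a float: rounded to float32

i = int(f)         # truncates the rounded value
i2 = int(product)  # truncates the unrounded value

print(i, i2)  # 269 268
```

The stored value rounds up to exactly 269.0, while the higher-precision product stays just below 269 and truncates to 268.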

Java float unexpectedly rounded

淺唱寂寞╮ submitted on 2019-12-01 14:28:12
I'm using a float constant and setting an object's private float variable to the float constant below, but when the object outputs the value it was set to, it's rounding up the last digit in the float.

    private final float RF_FREQUENCY = 956.35625f;
    Object o = new Object();
    o.setRFFrequency(RF_FREQUENCY);
    System.out.println(o.getRFFrequency());

Output: 956.35626

The variable in the object is declared as protected float rfFrequency; and below are the getters and setters.

    public float getRFFrequency() { return rfFrequency; }
    public void setRFFrequency(float value) { this.rfFrequency = value; }

Any
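The getter and setter are probably not involved: the constant itself cannot be stored exactly in a float. The sketch below uses Python's `struct` and `decimal` modules to show the value a float32 actually holds for this literal; the helper `f32` is a name invented for this illustration. Java's `Float.toString` prints only as many digits as are needed to distinguish the value from adjacent floats, which would account for the printed 956.35626.

```python
import struct
from decimal import Decimal

def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

stored = f32(956.35625)

# Exact decimal expansion of the float32 value that gets stored.
print(Decimal(stored))           # 956.35626220703125

# Both decimal literals round to the same float32 value.
print(stored == f32(956.35626))  # True
```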

Java float unexpectedly rounded

泪湿孤枕 submitted on 2019-12-01 12:33:03
Question: I'm using a float constant and setting an object's private float variable to the float constant below, but when the object outputs the value it was set to, it's rounding up the last digit in the float.

    private final float RF_FREQUENCY = 956.35625f;
    Object o = new Object();
    o.setRFFrequency(RF_FREQUENCY);
    System.out.println(o.getRFFrequency());

Output: 956.35626

The variable in the object is declared as protected float rfFrequency; and below are the getters and setters.

    public float getRFFrequency