precision

Getting the fractional part of a double value in integer without losing precision

霸气de小男生 submitted on 2019-12-22 10:36:43
Question: I want to convert the fractional part of a double value, with precision up to 4 digits, into an integer, but when I do it I lose precision. Is there any way to get the precise value? #include<stdio.h> int main() { double number; double fractional_part; int output; number = 1.1234; fractional_part = number-(int)number; fractional_part = fractional_part*10000.0; printf("%lf\n",fractional_part); output = (int)fractional_part; printf("%d\n",output); return 0; } I am expecting output to be
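
The excerpt above is cut off, but the symptom it describes is a common one: 1.1234 has no exact binary representation, so the scaled fractional part can land slightly below the intended integer, and a plain cast then truncates it. A minimal Python sketch of one possible remedy (rounding to nearest instead of truncating) follows. The helper name `fractional_digits` is hypothetical, and Python floats are assumed to be IEEE 754 doubles like the C `double` in the question.

```python
import math


def fractional_digits(number: float, digits: int = 4) -> int:
    """Return the first `digits` decimal digits of the fractional part as an int.

    Hypothetical helper for illustration. Rounding to nearest absorbs the
    tiny binary representation error that a truncating cast would expose.
    Note: the result can equal 10**digits when the fraction rounds up
    (for example 0.99996 with digits=4).
    """
    fractional_part, _integer_part = math.modf(number)
    scaled = abs(fractional_part) * 10 ** digits
    return round(scaled)


number = 1.1234
# Same steps as the C code in the question: subtract, scale, truncate.
truncated = int((number - int(number)) * 10000.0)
rounded = fractional_digits(number)
print("truncating cast:", truncated)
print("round to nearest:", rounded)
```

Rounding is only safe here because the caller knows the value has at most four meaningful decimal digits. If the input carries more digits, rounding changes the result by design.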

How to generate random double numbers with high precision in C++?

非 Y 不嫁゛ submitted on 2019-12-22 10:28:08
Question: I am trying to generate several series of random double numbers with high precision, for example 0.856365621 (9 digits after the decimal point). I've found some methods on the internet; they do generate random doubles, but the precision is not as good as I need (only 6 digits after the decimal point). How can I achieve my goal? Answer 1: In C++11 you can use the <random> header, and in this specific example, using std::uniform_real_distribution, I am able to generate random
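
The answer above is truncated. As a rough illustration in Python rather than C++, and assuming the six-digit limit in the question comes from default output formatting rather than from the generator itself, the sketch below prints the same random double with the default `%f` precision and with nine fractional digits. The seed value is arbitrary.

```python
import random

random.seed(12345)  # arbitrary seed so the run is repeatable

x = random.random()  # a double in [0.0, 1.0)

six_digits = "%f" % x     # %f defaults to 6 digits after the decimal point
nine_digits = "%.9f" % x  # explicit precision: 9 digits after the decimal point

print(six_digits)
print(nine_digits)
```

The stored value is identical in both cases; only the number of printed digits changes.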

Taking logs and adding versus multiplying

你离开我真会死。 submitted on 2019-12-22 10:22:20
Question: If I want to take the product of a list of floating-point numbers, what is the worst-case/average-case precision lost by adding their logs and then taking exp of the sum, as opposed to just multiplying them? Is there ever a case when this is actually more precise? Answer 1: Absent any overflow or underflow shenanigans, if a and b are floating-point numbers, then the product a*b will be computed to within a relative error of 1/2 ulp. A crude bound on the relative error after multiplying a chain of N
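
The comparison can be tried empirically. The sketch below is a small experiment, not a proof: it multiplies an arbitrary sample list directly, computes the same product through logs, and measures both against an exact rational product built with `fractions.Fraction`. The sample values and the helper `relative_error` are assumptions made for illustration, and `math.prod` needs Python 3.8 or later.

```python
import math
from fractions import Fraction


def relative_error(approx: float, exact: Fraction) -> float:
    """Relative error of a float against an exact rational reference."""
    return float(abs(Fraction(approx) - exact) / exact)


# Arbitrary sample data, chosen so that no overflow or underflow occurs.
values = [1.0 + k / 7.0 for k in range(1, 51)]

# Exact product of the stored doubles, using rational arithmetic.
exact = math.prod(Fraction(v) for v in values)

# Direct floating-point product: one rounding per multiplication.
direct = math.prod(values)

# Log route: round each log, sum them accurately, then round exp.
via_logs = math.exp(math.fsum(math.log(v) for v in values))

print("direct   relative error:", relative_error(direct, exact))
print("via logs relative error:", relative_error(via_logs, exact))
```

An absolute error in the summed logs turns into a relative error of about the same size after `exp`, so the log route tends to degrade as the magnitude of the sum grows.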

Does a floating-point reciprocal always round-trip?

五迷三道 submitted on 2019-12-22 09:23:35
Question: For IEEE-754 arithmetic, is there a guarantee of 0 or 1 units in the last place accuracy for reciprocals? From that, is there a guaranteed error bound on the reciprocal of a reciprocal? Answer 1: [Everything below assumes a fixed IEEE 754 binary format, with some form of round-to-nearest as the rounding mode.] Since reciprocal (computed as 1/x) is a basic arithmetic operation, 1 is exactly representable, and the arithmetic operations are guaranteed correctly rounded by the standard, the
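
The answer is truncated here, so the sketch below does not reproduce its conclusion. It is only an empirical check, assuming Python floats are IEEE 754 binary64 with round-to-nearest: it measures how far 1/(1/x) lands from x, in ulps of x, over a random sample from [1, 2). The helper name, the sample range, and the seed are arbitrary choices, and `math.ulp` needs Python 3.9 or later.

```python
import math
import random


def round_trip_error_ulps(x: float) -> float:
    """Distance between x and 1/(1/x), measured in ulps of x."""
    return abs(1.0 / (1.0 / x) - x) / math.ulp(x)


random.seed(0)  # arbitrary seed so the run is repeatable
samples = [random.uniform(1.0, 2.0) for _ in range(100_000)]

errors = [round_trip_error_ulps(x) for x in samples]
mismatches = sum(1 for e in errors if e != 0.0)
worst = max(errors)

print("values that did not round-trip:", mismatches, "of", len(samples))
print("largest observed error in ulps:", worst)
```

A sample like this can show that the round trip is not always exact, but it says nothing about values outside the sampled range, such as subnormals or values whose reciprocal overflows.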

Converting a precision double to a string

别等时光非礼了梦想. submitted on 2019-12-22 09:21:50
Question: I have a large number in C++ stored as a precise double value (assuming the input 'n' is 75): 2.4891e+109 Is there any way to convert this to a string or an array of each individual digit? Here's my code so far, although it's not entirely relevant to the question: int main(){ double n = 0; cout << "Giz a number: "; cin >> n; double val = 1; for(double i = 1; i <= n; i++){ val = val * i; } //Convert val to string/array here? } Answer 1: std::stringstream str; str << fixed << setprecision( 15 ) <<

“possible loss of precision” is Java going crazy or I'm missing something?

这一生的挚爱 submitted on 2019-12-22 08:59:28
Question: I'm getting a "loss of precision" error when there should be none, AFAIK. This is an instance variable: byte move=0; This happens in a method of this class: this.move=(this.move<<4)|(byte)(Guy.moven.indexOf("left")&0xF); move is a byte, move is still a byte, and the rest is being cast to a byte. I get this error: [javac] /Users/looris/Sviluppo/dumdedum/client/src/net/looris/android/toutry/Guy.java:245: possible loss of precision [javac] found : int [javac] required: byte [javac] this.move=

How do printf and scanf handle floating point precision formats?

戏子无情 submitted on 2019-12-22 07:08:29
Question: Consider the following snippet of code: float val1 = 214.20; double val2 = 214.20; printf("float : %f, %4.6f, %4.2f \n", val1, val1, val1); printf("double: %f, %4.6f, %4.2f \n", val2, val2, val2); Which outputs: float : 214.199997, 214.199997, 214.20 | <- the correct value I wanted double: 214.200000, 214.200000, 214.20 | I understand that 214.20 has an infinite binary representation. The first two elements of the first line have an approximation of the intended value, but the last one
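
The output quoted in the question can be reproduced in Python, which helps show where the digits come from. The sketch below assumes Python floats are IEEE 754 binary64 and uses `struct` to round 214.20 to the nearest binary32 value, standing in for the C `float`. The helper name `to_float32` is hypothetical. The precision field in `%.2f` rounds the stored value's decimal expansion to two digits, which is why the single-precision approximation still prints as 214.20.

```python
import struct


def to_float32(value: float) -> float:
    """Round a binary64 value to the nearest binary32 value (hypothetical helper)."""
    return struct.unpack("f", struct.pack("f", value))[0]


val1 = to_float32(214.20)  # stands in for `float val1 = 214.20;`
val2 = 214.20              # stands in for `double val2 = 214.20;`

print("float : %f, %4.6f, %4.2f" % (val1, val1, val1))
print("double: %f, %4.6f, %4.2f" % (val2, val2, val2))

# More digits reveal that neither variable holds exactly 214.2.
print("float  stored value: %.20f" % val1)
print("double stored value: %.20f" % val2)
```

In both cases the formatter rounds the same stored binary value. The float differs from 214.2 within the first six decimals, while the double differs only much further out, so six digits are enough to hide the error.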

scipy eigh gives negative eigenvalues for positive semidefinite matrix

假如想象 submitted on 2019-12-22 05:12:02
Question: I am having some issues with scipy's eigh function returning negative eigenvalues for positive semidefinite matrices. Below is a MWE. The hess_R function returns a positive semidefinite matrix (it is the sum of a rank one matrix and a diagonal matrix, both with nonnegative entries). import numpy as np from scipy import linalg as LA def hess_R(x): d = len(x) H = np.ones(d*d).reshape(d,d) / (1 - np.sum(x))**2 H = H + np.diag(1 / (x**2)) return H.astype(np.float64) x = np.array([ 9.98510710e-02

x86 80-bit floating point type in Java

僤鯓⒐⒋嵵緔 submitted on 2019-12-22 04:49:38
Question: I want to emulate the x86 extended precision type and perform arithmetic operations and casts to other types in Java. I could try to implement it using BigDecimal, but covering all the special cases around NaNs, infinity, and casts would probably be a tedious task. I am aware of some libraries that provide other floating types with a higher precision than double, but I want to have the same precision as the x86 80-bit float. Is there a Java library that provides such a floating point type? If

Calculating a round order of magnitude

青春壹個敷衍的年華 submitted on 2019-12-21 23:02:38
Question: For a simple project I have to make large numbers (e.g. 4294967123) readable, so I'm writing only the first digits with a postfix (4294967123 -> 4.29G, 12345 -> 12.34K etc.) The code (simplified) looks like this: const char* postfixes=" KMGT"; char postfix(unsigned int x) { return postfixes[(int) floor(log10(x))]; } It works, but I think that there's a more elegant/better solution than computing the full precision logarithm, rounding it and casting it down to an int again. Other solutions I
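
One way to avoid the logarithm altogether is to count groups of three digits with integer arithmetic. The sketch below is a Python illustration of that idea, not the asker's or any answerer's method. The function name `humanize` is hypothetical, and it truncates rather than rounds, so that it matches the two examples in the question.

```python
POSTFIXES = " KMGT"  # one entry per group of three decimal digits


def humanize(x: int) -> str:
    """Format a non-negative integer with a K/M/G/T postfix, truncated to two decimals."""
    index = 0
    divisor = 1
    # Grow the divisor by 1000 until the leading part has at most three digits.
    while x // divisor >= 1000 and index < len(POSTFIXES) - 1:
        divisor *= 1000
        index += 1
    if index == 0:
        return str(x)  # below 1000: print the number unchanged
    whole, remainder = divmod(x, divisor)
    hundredths = remainder * 100 // divisor  # truncated, not rounded
    return f"{whole}.{hundredths:02d}{POSTFIXES[index]}"


print(humanize(4294967123))
print(humanize(12345))
print(humanize(999))
```

Because everything stays in integers, there is no floating-point rounding at the boundaries between postfixes, for example at exactly 1000 or 1000000.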