precision

How do I quickly calculate a large positive, or negative, power of 2 in Java?

柔情痞子 submitted on 2020-01-03 04:49:13
Question: I want to calculate powers of two larger than 2^62, so I must store the result in a double and can't use the (1L << exp) trick. I also want to store fractions representing negative powers of two. Answer 1: Java provides java.lang.Math.scalb(float f, int scaleFactor) for this. It multiplies f by 2^scaleFactor. Answer 2: Since the IEEE 754 standard specifies a hidden bit, you can simply leave the 52-bit significand portion as 0 and only need to change the exponent portion, which is a biased unsigned
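Both answers can be sketched outside Java as well. The following is a minimal Python illustration, not the Java API itself: math.ldexp plays the role that Math.scalb plays in Java, and the second function writes the biased exponent into an otherwise zero bit pattern. The function names are hypothetical, and the bias of 1023 and the 52-bit significand field are the standard IEEE 754 binary64 values.

```python
import math
import struct


def pow2_ldexp(exp: int) -> float:
    """Return 2**exp as a double by scaling 1.0 with math.ldexp."""
    return math.ldexp(1.0, exp)


def pow2_from_bits(exp: int) -> float:
    """Build 2**exp directly from the IEEE 754 binary64 bit layout.

    Only normal numbers are handled, so -1022 <= exp <= 1023.
    """
    if not -1022 <= exp <= 1023:
        raise ValueError("exponent outside the normal binary64 range")
    biased_exponent = exp + 1023      # binary64 exponent bias
    bits = biased_exponent << 52      # sign bit 0, significand field 0
    # Reinterpret the 64-bit integer as a double.
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

The bit-level version covers only normal values; subnormal results (exponents below -1022) would need a nonzero significand field, which this sketch deliberately leaves out.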

Does floating point sqrt() function guarantee order relation

橙三吉。 submitted on 2020-01-03 02:23:06
Question: Given two floating-point numbers x and y, suppose all floating-point arithmetic conforms to the IEEE 754 standard, and consider a certain implementation of the square root function sqrt(). If x < y, must sqrt(x) <= sqrt(y) hold? If sqrt(x) < sqrt(y), must x <= y hold? Let a and b be two (exact) real numbers, and let x = op(a), y = op(b), where op() denotes rounding a real number to its floating-point representation. Then the following question: (* means floating point
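A small experiment can make the two questions concrete. The sketch below assumes that Python's math.sqrt is correctly rounded, which IEEE 754 requires of its square root operation and which holds on mainstream platforms. Under that assumption sqrt is non-decreasing, because rounding preserves order, but strict order can be lost: two adjacent doubles may share the same rounded root. The helper name is hypothetical.

```python
import math
import random
import sys

# Two adjacent doubles: x = 1.0 and y = the next double above 1.0.
x = 1.0
y = 1.0 + sys.float_info.epsilon

# The exact root of y lies just below the midpoint between 1.0 and y,
# so a correctly rounded sqrt returns 1.0 for both inputs.
strict_order_lost = (x < y) and (math.sqrt(x) == math.sqrt(y))


def is_monotonic_on_sample(samples: int = 10_000, seed: int = 0) -> bool:
    """Check sqrt(a) <= sqrt(b) for random pairs with a <= b."""
    rng = random.Random(seed)
    for _ in range(samples):
        a, b = sorted((rng.uniform(0.0, 1e6), rng.uniform(0.0, 1e6)))
        if math.sqrt(a) > math.sqrt(b):
            return False
    return True
```

A random sample is only evidence, not a proof. The order-preservation argument is what actually carries the claim, and it also gives the second direction by contraposition: if sqrt(x) < sqrt(y), then x >= y is impossible.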

Convert between Degree and Milliseconds

只谈情不闲聊 submitted on 2020-01-03 02:21:14
Question: I know the formula for converting from degrees to milliseconds and vice versa. It can be implemented like this: protected function decimal_to_milisecond($dec) { if (!empty($dec)) { $vars = explode(".",$dec); if (count($vars) == 2) { $deg = $vars[0]; $tempma = "0.".$vars[1]; $tempma = $tempma * 3600; $min = floor($tempma / 60); $sec = $tempma - ($min*60); return round((((($deg * 60) + $min) * 60 + $sec) * 1000)); } else return false; } else return false; } function milisecond_to_decimal($sec)
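The arithmetic in the PHP snippet reduces to a single scale factor: one degree is 3600 arcseconds, so it is 3,600,000 milliseconds of arc. The following Python sketch uses that factor directly with decimal arithmetic to avoid binary rounding. It is not a port of the original: the function names are hypothetical, and unlike the PHP version it accepts inputs without a fractional part and handles negative values.

```python
from decimal import Decimal, ROUND_HALF_UP

# 3600 arcseconds per degree, 1000 milliseconds per arcsecond.
MILLISECONDS_PER_DEGREE = 3600 * 1000


def decimal_to_millisecond(degrees: str) -> int:
    """Convert decimal degrees (given as a string) to whole milliseconds of arc."""
    scaled = Decimal(degrees) * MILLISECONDS_PER_DEGREE
    return int(scaled.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def millisecond_to_decimal(milliseconds: int) -> Decimal:
    """Convert milliseconds of arc back to decimal degrees."""
    return Decimal(milliseconds) / MILLISECONDS_PER_DEGREE
```

Passing the input as a string keeps the decimal digits exactly as written, which is the same reason the PHP code splits the string on "." instead of multiplying a float.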

How to put Rmpfr values into a function in R?

一笑奈何 submitted on 2020-01-02 19:29:21
Question: I am calculating the inverse of a Vandermonde matrix. I have written code to calculate the inverse explicitly from its formula, as below: library(gtools) #input is the generation vector of terms of Vandermonde matrix. FMinv <- function(base){ n=length(base) inv=matrix(nrow=n,ncol=n) for (i in 1:n){ for (j in 1:n){ if(j<n){ a=as.matrix(combinations(n,n-j,repeats.allowed = F)) arow.tmp=nrow(a) #this is in fact a[,1] b=which(a==i)%%length(a[,1]) nrowdel=length(b) b=replace(b,b==0,length(a[,1])
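The underlying concern, keeping precision while inverting a Vandermonde matrix, can be illustrated without Rmpfr. The sketch below is a swapped-in technique and not the poster's explicit formula: it uses Python's exact rational type and plain Gauss-Jordan elimination. The row convention (row i is 1, x_i, x_i^2, ...) and both function names are assumptions made for illustration.

```python
from fractions import Fraction


def vandermonde(base):
    """Return the Vandermonde matrix whose row i is [1, x_i, x_i**2, ...]."""
    n = len(base)
    return [[Fraction(x) ** j for j in range(n)] for x in base]


def invert_exact(matrix):
    """Invert a square matrix of Fractions by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment each row with the matching row of the identity matrix.
    aug = [list(row) + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]
```

Exact rationals avoid rounding entirely, at the cost of growing numerators and denominators. An arbitrary-precision float type such as the one Rmpfr provides trades that exactness for bounded size.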

Range of floats with a given precision

孤者浪人 submitted on 2020-01-02 07:49:52
Question: I want to create an array of all floating-point numbers in the range [0.000, 1.000], all with 3 decimal places / precision of 4, e.g. >>> np.arange(start=0.000, stop=1.001, decimals=3) [0.000, 0.001, ..., 0.100, 0.101, ..., 0.900, ..., 0.999, 1.000] Can something along the lines of this be done? Answer 1: You could use np.linspace: >>> import numpy as np >>> np.linspace(0, 1, 1001) array([ 0. , 0.001, 0.002, ..., 0.998, 0.999, 1. ]) or np.arange using integers and then dividing: >>> np.arange(0,
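The "integers first, then divide" idea from the answer also works without NumPy. This is a minimal standard-library sketch with a hypothetical function name. Each element comes from a single division of two exactly representable integers, so no error accumulates from repeatedly adding a step of 0.001.

```python
def decimal_grid(decimals: int = 3) -> list:
    """Return 0, 10**-decimals, 2 * 10**-decimals, ..., 1 as floats."""
    steps = 10 ** decimals
    # One correctly rounded division per element, no running sum.
    return [i / steps for i in range(steps + 1)]


values = decimal_grid(3)
```

With decimals=3 the list has 1001 entries, and each entry equals the double nearest to its three-decimal value, for instance values[1] == 0.001.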

Best IEEE 754-1985 representation for X3.9-1978 based standard

吃可爱长大的小学妹 submitted on 2020-01-02 06:00:38
Question: As per the DICOM standard, a floating-point value can be stored using a Value Representation of Decimal String. See Table 6.2-1. DICOM Value Representations: Decimal String: A string of characters representing either a fixed point number or a floating point number. A fixed point number shall contain only the characters 0-9 with an optional leading "+" or "-" and an optional "." to mark the decimal point. A floating point number shall be conveyed as defined in ANSI X3.9, with an "E" or "e" to
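One way to approach the formatting problem is to try decreasing precision until the text fits the field. The sketch below assumes a 16-character maximum for a Decimal String value. That limit is not stated in the excerpt above and is taken here as an assumption about the DICOM DS definition. The function name is hypothetical.

```python
import math


def to_decimal_string(value: float, max_len: int = 16) -> str:
    """Format a double as the most precise decimal text that fits max_len."""
    if not math.isfinite(value):
        raise ValueError("NaN and infinity have no decimal string form")
    # 17 significant digits always round-trip a double; back off until it fits.
    for precision in range(17, 0, -1):
        text = format(value, f".{precision}g")
        if len(text) <= max_len:
            return text
    raise ValueError("value cannot be formatted within max_len characters")
```

For math.pi this yields 3.14159265358979, which is 16 characters and therefore drops the last digits a full round trip would need. Values with a short exact form, such as 0.5, come back unchanged.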

Best IEEE 754-1985 representation for X3.9-1978 based standard

痴心易碎 submitted on 2020-01-02 06:00:30
Question: As per the DICOM standard, a floating-point value can be stored using a Value Representation of Decimal String. See Table 6.2-1. DICOM Value Representations: Decimal String: A string of characters representing either a fixed point number or a floating point number. A fixed point number shall contain only the characters 0-9 with an optional leading "+" or "-" and an optional "." to mark the decimal point. A floating point number shall be conveyed as defined in ANSI X3.9, with an "E" or "e" to

Precision nightmare in Java and SQL Server

别来无恙 submitted on 2020-01-02 05:46:20
Question: I've been struggling with a precision nightmare in Java and SQL Server, to the point where I am at a loss. Personally, I understand the issue and the underlying reason for it, but explaining that to a client halfway across the globe is infeasible (at least for me). The situation is this: I have two columns in SQL Server, Qty INT and Price FLOAT. Their values are 1250 and 10.8601, so the total value is Qty * Price, and the result is 13575.124999999998
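The effect described here comes from 10.8601 having no exact binary representation, and it can be reproduced in any language with binary doubles. The Python sketch below contrasts a float product with a decimal product, using the two values from the question. It illustrates the general remedy (decimal types, which in the poster's stack would presumably mean BigDecimal in Java and DECIMAL in SQL Server) and is not the poster's code. The variable names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

qty = 1250
price_float = 10.8601             # stored as the nearest binary double
price_exact = Decimal("10.8601")  # stored exactly as written

# Binary floating point: inherits the representation error of the price.
float_total = qty * price_float

# Decimal arithmetic: the product is exact, Decimal('13575.1250').
exact_total = qty * price_exact

# Round to two places only at the end, with an explicit rounding rule.
rounded_total = exact_total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

The decimal total rounds to 13575.13 under half-up rounding. Rounding the float total of 13575.124999999998 reported in the question to two places would instead give 13575.12, which is the kind of one-cent discrepancy that is hard to explain to a client.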

Qt binary reading error in qDatastream

一笑奈何 submitted on 2020-01-02 05:43:28
Question: I am reading a binary file that is produced by a sensor. I am having problems reading floats with different precision (32 or 64 bit). I can read them in MATLAB (64-bit version), but Qt (32-bit version on Windows) gives wrong values. I can read up to dtmth (please refer to the structure below). After that I get a value of Inf for baseline. This value is in fact 0. As you can see, I changed the byte order (LittleEndian). If I keep BigEndian, I get 0 for baseline but the other values are wrong. My desktop is
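Symptoms like these usually mean that the reader disagrees with the file about byte order, field width, or both. The Python sketch below shows the effect on a made-up two-field record. The layout (a 32-bit float followed by a 64-bit float) and the function name are hypothetical and are not the sensor's actual structure, which the excerpt does not show.

```python
import struct


def read_record(data: bytes, byte_order: str = "<"):
    """Decode one hypothetical record: a 32-bit float, then a 64-bit float.

    byte_order is "<" for little-endian or ">" for big-endian. With an
    explicit byte order, struct uses standard sizes and adds no padding.
    """
    return struct.unpack(byte_order + "fd", data)


# Write a sample record as little-endian, the way the file is assumed to be.
sample = struct.pack("<fd", 0.0, 1.5)

correct = read_record(sample, "<")   # matches what was written
swapped = read_record(sample, ">")   # same bytes, wrong byte order
```

A zero survives a byte-order mix-up because all of its bytes are zero, while 1.5 does not. That matches the observation that one setting makes baseline read as 0 while the other fields break. In Qt, the corresponding settings appear to be QDataStream::setByteOrder and QDataStream::setFloatingPointPrecision. The latter matters because a stream left in double-precision mode reads 64 bits even into a float.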

Qt binary reading error in qDatastream

强颜欢笑 submitted on 2020-01-02 05:42:24
Question: I am reading a binary file that is produced by a sensor. I am having problems reading floats with different precision (32 or 64 bit). I can read them in MATLAB (64-bit version), but Qt (32-bit version on Windows) gives wrong values. I can read up to dtmth (please refer to the structure below). After that I get a value of Inf for baseline. This value is in fact 0. As you can see, I changed the byte order (LittleEndian). If I keep BigEndian, I get 0 for baseline but the other values are wrong. My desktop is