fixed-point

Floating Point Algorithms in C

☆樱花仙子☆ submitted on 2019-12-12 11:35:07
Question: I have been thinking recently about how floating-point math works on computers, and it is hard for me to understand all the technical details behind the formulas. I would need to understand the basics of addition, subtraction, multiplication, division and remainder. With these I would be able to build trig functions and formulas. I can guess something about it, but it's a bit unclear. I know that a fixed-point number can be made by separating a 4-byte integer into a sign flag, a radix and a mantissa. With this we have
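The fixed-point layout described above can be sketched concretely. Below is a minimal Q16.16 type (an assumed format: 1 sign bit, 15 integer bits, 16 fraction bits); the key point is that addition and subtraction need no special handling because both operands carry the same scale factor:

```c
#include <stdint.h>

/* A value in Q16.16 is just an integer holding x * 2^16: the top bit is
   the sign, the next 15 bits the integer part, the low 16 bits the
   fraction. Add/sub work unchanged because both operands share a scale. */
typedef int32_t fix16;

static fix16 fix16_from_int(int n)       { return (fix16)(n * 65536); } /* scale by 2^16 */
static int   fix16_to_int(fix16 x)       { return x >> 16; }            /* floor */
static fix16 fix16_add(fix16 a, fix16 b) { return a + b; }
static fix16 fix16_sub(fix16 a, fix16 b) { return a - b; }
```

Multiplication and division need extra scaling steps, which some of the later questions on this page deal with.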

Fixed Point to Floating Point and Backwards

柔情痞子 submitted on 2019-12-11 14:46:02
Question: Is converting fixed-point (a fixed n bits for the fraction) to IEEE double safe? That is, can the IEEE double format represent all numbers a fixed-point format can represent? The test: a number goes to floating-point format and then back to its original fixed-point format. Answer 1: Assuming your fixed-point numbers are stored as 32-bit integers, yes, IEEE double precision can represent any value representable in fixed point. This is because double has a 53-bit mantissa, while your fixed-point values only have 32 bits of
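The round trip described in the answer can be checked directly. A minimal sketch, assuming a Q16.16 format (any 32-bit fixed-point format works the same way): casting the raw integer to double is exact because a double's 53-bit significand holds any 32-bit integer, and dividing or multiplying by a power of two is also exact.

```c
#include <stdint.h>

/* Q16.16 -> double -> Q16.16 is lossless: the raw 32-bit integer fits
   in the 53-bit significand, and scaling by 2^16 only changes the
   exponent, never the significand. */
static double  q16_to_double(int32_t q) { return (double)q / 65536.0; }
static int32_t double_to_q16(double d)  { return (int32_t)(d * 65536.0); }
```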

Float to fixed conversion

旧城冷巷雨未停 submitted on 2019-12-10 11:47:31
Question: This is a basic question, but I am confused. I have a register with the format 1.4.12. Does that mean it takes a float in the range -15.9999 to 15.9999, or how many nines? I am confused by the range. I need to convert a C++ float to fixed point and put it in the register. Are there any std:: libraries to do that in C? If not, is there any standard code that someone could point me to? Also, how to convert fixed back to float would be good. Answer 1: It's fairly simple to do this
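A hedged sketch of the two conversions, assuming the 1.4.12 format means 1 sign bit, 4 integer bits, and 12 fraction bits packed as sign-and-magnitude (the register's actual layout may differ). Note the largest magnitude is not 15.9999 but 15 + 4095/4096 ≈ 15.99976, which answers the "how many nines" question:

```c
#include <stdint.h>

/* Hypothetical helpers for a sign + 4.12 register layout. */
static uint32_t float_to_reg(float f)
{
    uint32_t sign = 0;
    if (f < 0) { sign = 1u << 16; f = -f; }
    if (f > 16.0f) f = 16.0f;                     /* avoid cast overflow */
    int32_t raw = (int32_t)(f * 4096.0f + 0.5f);  /* scale by 2^12, round */
    if (raw > 0xFFFF) raw = 0xFFFF;               /* saturate at 15+4095/4096 */
    return sign | (uint32_t)raw;
}

static float reg_to_float(uint32_t r)
{
    float mag = (float)(r & 0xFFFF) / 4096.0f;    /* undo the 2^12 scale */
    return (r & (1u << 16)) ? -mag : mag;
}
```

There is no std:: facility for this in C; C++ gained `std::ratio`-style helpers but no standard fixed-point type, so hand-written scaling like the above is the usual approach.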

Power of 2 approximation in fixed point

六月ゝ 毕业季﹏ submitted on 2019-12-09 23:36:16
Question: Currently I am using a small lookup table and linear interpolation, which is quite fast and also accurate enough (the maximum error is less than 0.001). However, I was wondering if there is an approximation that is even faster. Since the integer part of the exponent can be extracted and handled with bit shifts, the approximation only needs to work in the range [-1,1]. I have tried to find a Chebyshev polynomial, but could not achieve good accuracy for polynomials of low order. I could live with a

Inverse sqrt for fixed point

≡放荡痞女 submitted on 2019-12-09 18:20:28
Question: I am looking for the best inverse-square-root algorithm for 16.16 fixed-point numbers. The code below is what I have so far (basically it takes the square root and divides by the original number, and I would like to get the inverse square root without a division). If it changes anything, the code will be compiled for armv5te. uint32_t INVSQRT(uint32_t n) { uint64_t op, res, one; op = ((uint64_t)n<<16); res = 0; one = (uint64_t)1 << 46; while (one > op) one >>= 2; while (one != 0) { if (op
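One division-free alternative (not the questioner's code, which is truncated above) is Newton-Raphson on f(y) = 1/y² - x, whose update y ← y·(3 - x·y²)/2 uses only multiplies and shifts. A sketch for unsigned Q16.16, using a power-of-two first guess from the leading bit (`__builtin_clz` is a GCC/Clang builtin; iteration count and the zero-input behavior are assumptions):

```c
#include <stdint.h>

/* Newton-Raphson inverse square root for unsigned Q16.16. The crude
   power-of-two guess is within a factor sqrt(2) of the answer, inside
   Newton's convergence region for this iteration. */
static uint32_t qinvsqrt(uint32_t x)
{
    if (x == 0) return 0xFFFFFFFFu;              /* saturate on 1/sqrt(0) */
    int k = (31 - __builtin_clz(x)) - 16;        /* ~log2 of the value */
    uint64_t y = (uint64_t)1 << (16 - ((k + 1) >> 1)); /* ~2^(-k/2), Q16.16 */
    for (int i = 0; i < 5; i++) {
        uint64_t y2  = (y * y) >> 16;            /* y^2     in Q16.16 */
        uint64_t xy2 = ((uint64_t)x * y2) >> 16; /* x*y^2   in Q16.16 */
        y = (y * ((3u << 16) - xy2)) >> 17;      /* y*(3 - x*y^2)/2   */
    }
    return (uint32_t)y;
}
```

On armv5te, which has 32x32→64 multiplies but no hardware divide, trading the division for a few extra multiplies is usually the right direction.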

Advantages and disadvantages of floating point and fixed point representations [closed]

非 Y 不嫁゛ submitted on 2019-12-09 07:23:12
Question: Closed. This question needs to be more focused. It is not currently accepting answers. Want to improve this question? Update the question so it focuses on one problem only by editing this post. Closed 2 years ago. I have been trying for the last three days to understand the exact differences between floating-point and fixed-point representations. I am confused by the material and I'm unable to decide what is right and what is wrong. One of the problems is with the meaning of a few technical

Log2 approximation in fixed-point

岁酱吖の submitted on 2019-12-08 07:51:05
Question: I've already implemented a fixed-point log2 function using a lookup table and a low-order polynomial approximation, but I am not quite happy with the accuracy across the entire 32-bit fixed-point range [-1,+1). The input format is s0.31 and the output format is s15.16. I'm posting this question here so that another user can post his answer (some comments were exchanged in another thread, but they preferred to provide a comprehensive answer in a separate thread). Any other answers are welcome; I would much
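One exact, table-free approach (a sketch, not the referenced answer) is bit-by-bit log2 by repeated squaring, often attributed to Clay Turner: normalize the input to a mantissa m in [1,2), then each squaring of m yields one fraction bit of log2(m). Shown here for an unsigned 0.31 input in (0,1) with s15.16 output; the caller must ensure the input is nonzero:

```c
#include <stdint.h>

/* Bit-by-bit log2 by repeated squaring. */
static int32_t qlog2(uint32_t x)   /* x: u0.31, nonzero; returns s15.16 */
{
    int32_t ip = 0;
    while (x < (1u << 30)) { x <<= 1; ip--; }  /* m = x/2^30 in [1,2) */
    ip -= 1;                       /* account for the 0.31 input scale */
    uint64_t m = x;                /* Q2.30 mantissa */
    int32_t frac = 0;
    for (int i = 1; i <= 16; i++) {
        m = (m * m) >> 30;         /* square in Q2.30 */
        if (m >= (2u << 30)) {     /* m crossed 2: this log2 bit is 1 */
            m >>= 1;
            frac |= 1 << (16 - i);
        }
    }
    return (int32_t)(ip * 65536) + frac;
}
```

The result is exact in the integer part and truncated (not rounded) in the 16 fraction bits; computing a few guard bits and rounding would tighten the last-bit error.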

C++: Emulated Fixed Point Division/Multiplication

与世无争的帅哥 submitted on 2019-12-08 06:44:43
Question: I'm writing a Fixedpoint class, but have run into a bit of a snag... I am not sure how to emulate the multiplication and division portions. I took a very rough stab at the division operator, but I am sure it's wrong. Here's what it looks like so far: class Fixed { Fixed(short int _value, short int _part) : value(long(_value + (_part >> 8))), part(long(_part & 0x0000FFFF)) {}; ... inline Fixed operator -() const // example of some of the bitwise it's doing { return Fixed(-value - 1, (~part)
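The core of fixed-point multiply and divide is scale bookkeeping: multiplying two Q16.16 values yields a Q32.32 product (the scale doubles), while dividing cancels the scale entirely. A sketch in C rather than the questioner's C++ class, assuming a Q16.16 layout:

```c
#include <stdint.h>

typedef int32_t fix16;   /* Q16.16: raw = value * 2^16 */

/* Widen to 64 bits, then shift the extra 16 fraction bits back out;
   adding half a ULP first rounds to nearest instead of truncating. */
static fix16 fix16_mul(fix16 a, fix16 b)
{
    int64_t p = (int64_t)a * b;
    return (fix16)((p + 0x8000) >> 16);
}

/* Pre-shift the dividend left by 16 so the quotient keeps the Q16.16
   scale instead of losing it. */
static fix16 fix16_div(fix16 a, fix16 b)
{
    return (fix16)(((int64_t)a << 16) / b);
}
```

Storing the number as a single 32-bit raw value, rather than separate `value`/`part` fields as in the class above, makes both operators one-liners and avoids the sign-handling bugs that split representations invite.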

Add saturate 32-bit signed ints intrinsics?

落爺英雄遲暮 submitted on 2019-12-07 16:31:14
Question: Can someone recommend a fast way to add 32-bit signed integers with saturation using Intel intrinsics (AVX, SSE4, ...)? I looked at the intrinsics guide and found _mm256_adds_epi16, but this seems to only add 16-bit ints. I don't see anything similar for 32 bits. The other calls seem to wrap around. Answer 1: A signed overflow will happen if (and only if): the signs of both inputs are the same, and the sign of the sum (when added with wrap-around) is different from the inputs. Using C operators: overflow =
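The overflow rule quoted in the answer translates directly into scalar C (the SSE/AVX vector mapping, which the answer presumably continues with, is omitted here). Overflow occurs exactly when the inputs agree in sign but the wrapped sum does not, and on overflow the result clamps toward the offending sign:

```c
#include <stdint.h>

/* Saturating signed 32-bit add: compute the wrap-around sum in unsigned
   arithmetic (well-defined), then test the sign-bit overflow condition. */
static int32_t adds_i32(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t us = ua + ub;                    /* wrap-around sum */
    uint32_t overflow = ~(ua ^ ub) & (ua ^ us); /* top bit set iff overflow */
    if (overflow >> 31)
        return (a < 0) ? INT32_MIN : INT32_MAX;
    return (int32_t)us;
}
```

In vector form the same three bitwise operations select, per lane, between the wrapped sum and a lane-wise INT32_MIN/INT32_MAX constant.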

Square root of s15.16 fixed point number in Java

|▌冷眼眸甩不掉的悲伤 submitted on 2019-12-07 13:48:12
Question: I want to write a function to calculate the square root of an s15.16 fixed-point number. I know it's a signed number with a 15-bit integer part and a 16-bit fraction. Is there any way to do it without any libraries? Any other language is fine too. Answer 1: I assume you are asking this question because the platform you are on does not provide floating point; otherwise you can implement 15.16 fixed-point square root via the floating-point square root as follows (this is C code, I assume Java code will look