fpu

Performance comparison of FPU with software emulation

Submitted by 安稳与你 on 2019-12-06 02:13:35
Question: While I know (or so I have been told) that floating-point coprocessors work faster than any software implementation of floating-point arithmetic, I completely lack a gut feeling for how large this difference is, in orders of magnitude. The answer probably depends on the application and on where you work, anywhere between microprocessors and supercomputers. I am particularly interested in computer simulations. Can you point out articles or papers on this question? Answer 1: A general answer will obviously be very vague,

Floating point rounding when truncating

Submitted by 和自甴很熟 on 2019-12-05 18:24:37
This is probably a question for an x86 FPU expert: I am trying to write a function which generates a random floating-point value in the range [min,max]. The problem is that my generator algorithm (the floating-point Mersenne Twister, if you're curious) only returns values in the range [1,2), i.e., I want an inclusive upper bound, but my "source" generated value has an exclusive upper bound. The catch here is that the underlying generator returns an 8-byte double, but I only want a 4-byte float, and I am using the default FPU rounding mode of round-to-nearest. What I want to know is whether the
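To make the rounding concern concrete, here is a minimal sketch (not the asker's actual generator; the mapping function is illustrative only): a double just below 2.0 carries more mantissa bits than any float, so converting it under the default round-to-nearest mode can round it up to exactly 2.0f, which is precisely what turns the exclusive upper bound into an inclusive one.

    #include <cmath>
    #include <cstdio>

    // Map a double in [1,2) to a float in [min,max]. Because float has fewer
    // mantissa bits, round-to-nearest may round a double just below 2.0 up to
    // exactly 2.0f, so the result can reach max.
    float map_to_range(double r /* in [1,2) */, float min, float max) {
        float f = static_cast<float>(r);        // may round up to 2.0f
        return min + (f - 1.0f) * (max - min);  // equals max when f == 2.0f
    }

    int main() {
        double almost_two = std::nextafter(2.0, 0.0);            // largest double below 2.0
        std::printf("%.9g\n", static_cast<float>(almost_two));   // prints 2 under round-to-nearest
        return 0;
    }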

Fast float to int conversion and floating point precision on ARM (iPhone 3GS/4)

Submitted by 时光怂恿深爱的人放手 on 2019-12-05 10:05:44
I read the article ( http://www.stereopsis.com/FPU.html ) mentioned in ( What is the fastest way to convert float to int on x86 ). Does anyone know whether the slow simple cast (see the snippet below) applies to the ARM architecture too? inline int Convert(float x) { int i = (int) x; return i; } To apply some of the tricks mentioned in the FPU article you have to set the precision for floating-point operations. How do I do that on ARM? What is the fastest float-to-int conversion on the ARM architecture? Thanks! Short version: "no". That article is ancient and doesn't even apply to modern x86 systems, let alone ARM. A simple
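For reference, a hedged sketch of the two usual conversion idioms; on ARMv7/AArch64 (and on SSE2-era x86) both typically compile to one or two instructions, so the old x87 bit-trick from the stereopsis article is not needed. Note the semantic difference between them.

    #include <cmath>

    // The plain cast truncates toward zero; lrintf() converts using the current
    // floating-point rounding mode (round-to-nearest by default).
    inline int convert_trunc(float x) { return static_cast<int>(x); }
    inline int convert_rint(float x)  { return static_cast<int>(std::lrintf(x)); }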

What algorithms do FPUs use to compute transcendental functions?

Submitted by 半腔热情 on 2019-12-05 04:19:25
What methods would a modern FPU use to compute transcendental functions? For example, Intel CPUs provide instructions such as FSIN, FCOS, FYL2X, etc. I am curious as to what algorithms would be used to actually implement these in hardware. My naïve guess would be Taylor series, perhaps combined with some lookup tables, but that's nothing more than a wild guess. Please enlighten me. P.S. This question is more general than just Intel hardware. One place to start could be "New Algorithms for Improved Transcendental Functions on IA-64" by Shane Story and Ping Tak Peter Tang, both from Intel.
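As a purely illustrative sketch (not what any particular FPU's microcode actually does): hardware and library implementations typically combine argument range reduction, small lookup tables, and a short polynomial, usually a minimax fit rather than a raw Taylor series. The core idea, shown with plain Taylor-style coefficients for sin on a reduced argument, looks like this:

    // Evaluate sin(x) for a reduced argument |x| <= pi/4 with an odd polynomial
    // (Horner form). Real implementations use minimax coefficients and carry
    // extra precision through the range reduction.
    double sin_poly(double x) {
        const double x2 = x * x;
        return x * (1.0 + x2 * (-1.0/6 + x2 * (1.0/120
                    + x2 * (-1.0/5040 + x2 * (1.0/362880)))));
    }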

Pow implementation for double

Submitted by 社会主义新天地 on 2019-12-05 02:04:35
Question: I am developing code that will be used for motion control and I am having an issue with the pow function. I am using VS2010 as my IDE. Here is my issue. I have: double p = 100.0000; double d = 1000.0000; t1 = pow(p/(8.0000*d), 1.00/4.000); When evaluating this last expression, I don't get the best approximation as the result. I am getting 7 correct decimal digits, and the subsequent digits are all garbage. I am guessing that the pow function just casts any input variable to float and proceeds with
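One commonly suggested workaround, sketched below under the assumption that the exponent really is 1/4: two calls to sqrt, which IEEE 754 requires to be correctly rounded, usually give a tighter result than pow with a non-integer exponent. (Variable names mirror the question but are otherwise illustrative.)

    #include <cmath>
    #include <cstdio>

    int main() {
        double p = 100.0, d = 1000.0;
        double x = p / (8.0 * d);
        double via_pow  = std::pow(x, 0.25);
        double via_sqrt = std::sqrt(std::sqrt(x));   // often accurate to the last ulp
        std::printf("%.17g\n%.17g\n", via_pow, via_sqrt);
        return 0;
    }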

Detect FPU rounding mode on a GPU

Submitted by 余生长醉 on 2019-12-04 20:14:34
I was delving into multi-precision arithmetic, and there is a nice, fast class of algorithms described in Jonathan Richard Shewchuk, "Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates", 1997, Discrete & Computational Geometry, pages 305–363. However, these algorithms rely on the FPU using round-to-nearest with round-to-even tie-breaking. On a CPU this would be easy: one would just check or set the FPU control word and be sure. However, there is no such instruction (yet?) for GPU programming. That is why I was wondering whether there is a dependable way of detecting (not setting) the
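One candidate probe (a sketch, meant to be run inside a kernel on the GPU in question rather than on the host): add carefully chosen half-ulp offsets to 1.0 and inspect which way the ties round. The volatile qualifier keeps the compiler from folding the sums at compile time.

    #include <cfloat>
    #include <cstdio>

    int main() {
        volatile double half_ulp = DBL_EPSILON * 0.5;  // 2^-53, so 1 + half_ulp is an exact tie
        double t1 = 1.0 + half_ulp;                    // halfway between 1 and 1 + 2^-52
        double t2 = 1.0 + 3.0 * half_ulp;              // halfway between 1 + 2^-52 and 1 + 2^-51
        // Round-to-nearest-even sends t1 down to 1.0 (even) and t2 up to 1 + 2^-51 (even);
        // round-up fails the first test, round-down/toward-zero fail the second.
        bool nearest_even = (t1 == 1.0) && (t2 == 1.0 + 2.0 * DBL_EPSILON);
        std::printf("round-to-nearest-even: %d\n", nearest_even ? 1 : 0);
        return 0;
    }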

Third party code is modifying the FPU control word

Submitted by *爱你&永不变心* on 2019-12-04 18:23:24
Question: The introduction - the long and boring part (the question is at the end). I am getting severe headaches over a third-party COM component that keeps changing the FPU control word. My development environment is Windows and Visual C++ 2008. The normal FPU control word specifies that no exceptions should be thrown under various conditions. I have verified this both by looking at the _CW_DEFAULT macro found in float.h and by inspecting the control word in the debugger at startup.
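A sketch of the usual containment strategy in this situation, using the documented MSVC CRT calls _controlfp_s and _clearfp (the RAII wrapper itself is hypothetical): snapshot the control word before calling into the offending component and restore it afterwards.

    #include <float.h>

    class FpuGuard {
        unsigned int saved_;
    public:
        FpuGuard() : saved_(0) { _controlfp_s(&saved_, 0, 0); }  // mask 0 = read only, no change
        ~FpuGuard() {
            _clearfp();                                          // discard pending exception flags first
            unsigned int cw;
            _controlfp_s(&cw, saved_, _MCW_EM | _MCW_RC | _MCW_PC | _MCW_DN);  // restore saved word
        }
    };

    // Usage: { FpuGuard guard; pComObject->DoSomething(); }  // control word restored on scope exit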

Performance comparison of FPU with software emulation

Submitted by 老子叫甜甜 on 2019-12-04 07:58:26
While I know (or so I have been told) that floating-point coprocessors work faster than any software implementation of floating-point arithmetic, I completely lack a gut feeling for how large this difference is, in orders of magnitude. The answer probably depends on the application and on where you work, anywhere between microprocessors and supercomputers. I am particularly interested in computer simulations. Can you point out articles or papers on this question? A general answer will obviously be very vague, because performance depends on so many factors. However, based on my understanding, in processors that do
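For a gut feeling on a specific machine, the measurement is cheap to make yourself; here is a rough sketch. Build the same loop once normally and once with the compiler's software floating-point mode (for example gcc's -msoft-float or ARM's -mfloat-abi=soft, on targets that support them) and compare the timings; the ratio you get is the answer for your own operation mix.

    #include <chrono>
    #include <cstdio>

    // Some arbitrary floating-point work to time.
    double kernel(int n) {
        double acc = 0.0;
        for (int i = 1; i <= n; ++i)
            acc += 1.0 / (static_cast<double>(i) * i);
        return acc;
    }

    int main() {
        auto t0 = std::chrono::steady_clock::now();
        double r = kernel(10000000);
        auto t1 = std::chrono::steady_clock::now();
        long long us = std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count();
        std::printf("result %f in %lld us\n", r, us);
        return 0;
    }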

Pow implementation for double

Submitted by China☆狼群 on 2019-12-03 20:21:09
I am developing code that will be used for motion control and I am having an issue with the pow function. I am using VS2010 as my IDE. Here is my issue. I have: double p = 100.0000; double d = 1000.0000; t1 = pow(p/(8.0000*d), 1.00/4.000); When evaluating this last expression, I don't get the best approximation as the result. I am getting 7 correct decimal digits, and the subsequent digits are all garbage. I am guessing that the pow function just casts any input variable to float and proceeds with the calculation. Am I right? If so, is there any code I can get "inspired" by to reimplement pow for a better
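If the goal is just a better fourth root rather than a general pow replacement, one hedged sketch is to polish the pow result with a single Newton step on f(y) = y^4 - x, which usually recovers full double precision from an almost-correct starting guess (the function name is illustrative):

    #include <cmath>

    double fourth_root(double x) {
        double y = std::pow(x, 0.25);           // initial guess
        y = 0.75 * y + 0.25 * x / (y * y * y);  // Newton step: y - (y^4 - x) / (4*y^3)
        return y;
    }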

Benefits of x87 over SSE

Submitted by 你。 on 2019-12-03 01:19:11
I know that x87 has higher internal precision, which is probably the biggest difference that people see between it and SSE operations. But I have to wonder: is there any other benefit to using x87? I have a habit of typing -mfpmath=sse automatically in any project, and I wonder if I'm missing anything else that the x87 FPU offers. Nils Pipenbrinck: For hand-written asm, x87 has some instructions that don't exist in the SSE instruction set. Off the top of my head, it's all trigonometric stuff like fsin, fcos, fatan, fatan2 and some exponential/logarithm stuff. With gcc -O3 -ffast-math -mfpmath
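To make the hand-written-asm point concrete, here is a GCC-specific sketch (the "t" constraint means the top of the x87 register stack); SSE has no fsin counterpart, so with -mfpmath=sse the compiler instead calls libm's sin(), which on modern CPUs is generally at least as fast and more accurate for large arguments.

    #include <cmath>

    static inline double x87_sin(double x) {
    #if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
        __asm__ ("fsin" : "+t" (x));   // x87-only instruction, operates on st(0)
        return x;
    #else
        return std::sin(x);            // fallback on targets without x87
    #endif
    }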