fpu

FLD floating-point instruction

◇◆丶佛笑我妖孽 posted on 2019-12-31 05:44:10
Question: According to http://cs.smith.edu/~thiebaut/ArtOfAssembly/CH14/CH14-4.html#HEADING4-5 (14.4.4.1 The FLD Instruction): fld mem_32, fld mem_64[bx]. My objective is to store the constant 10 on my FPU stack. Why can't I do this?

    __asm {
        move bx, 0x0004;
        fld dword ptr[bx] or fld bx;
        //-------
        fld 0x004; //Since it is 32 bits?
        fild 0x004;
    }

Answer 1: At least three things can go wrong here. One is the syntax of the assembler. The second is the instruction set architecture. The third is the memory model (16 bit vs 32

MSVC win32: convert extended precision float (80-bit) to double (64-bit)

女生的网名这么多〃 posted on 2019-12-29 04:54:24
Question: What is the most portable and "right" way to convert an extended precision float (an 80-bit value, also known as "long double" in some compilers) to double (64-bit) in MSVC win32/win64? MSVC currently (as of 2010) treats "long double" as a synonym for "double". I could probably write an fld/fstp pair in inline asm, but inline asm is not available for win64 code in MSVC. Do I need to move this assembler code to a separate .asm file? Is there really no good solution?

Answer 1:
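One approach that needs no inline asm at all is to decode the 80-bit value by hand. The sketch below is an illustration only, not necessarily what the truncated answer proposes. It assumes the value is stored as 10 little-endian bytes in the x87 layout (64-bit significand with an explicit integer bit, then a 15-bit exponent biased by 16383 and a sign bit). The function name `extended_to_double` is hypothetical.

```c
#include <math.h>
#include <stdint.h>

/* Decode a little-endian x87 80-bit extended value into a double.
   Assumed layout: bytes 0..7 = 64-bit significand (explicit integer bit),
                   bytes 8..9 = sign bit + 15-bit exponent (bias 16383). */
double extended_to_double(const unsigned char bytes[10])
{
    /* Assemble the significand from the low 8 bytes. */
    uint64_t significand = 0;
    for (int i = 7; i >= 0; --i)
        significand = (significand << 8) | bytes[i];

    unsigned sign_exp = (unsigned)bytes[8] | ((unsigned)bytes[9] << 8);
    int negative = (sign_exp & 0x8000u) != 0;
    int exponent = (int)(sign_exp & 0x7FFFu);

    double result;
    if (exponent == 0x7FFF) {
        /* All-ones exponent: infinity if the fraction bits are zero, else NaN. */
        result = ((significand << 1) == 0) ? INFINITY : NAN;
    } else {
        /* Exponent field 0 (zero and denormals) uses the same scale as field 1. */
        int e = (exponent == 0) ? 1 : exponent;
        /* value = significand * 2^(e - bias - 63) */
        result = ldexp((double)significand, e - 16383 - 63);
    }
    return negative ? -result : result;
}
```

The conversion of the 64-bit significand to double rounds to 53 bits, so results in the normal double range are rounded once. Values that land in the subnormal double range may be rounded a second time by `ldexp`, and non-canonical (unnormal) encodings are not treated specially here.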

iPhone 4 and iPad 2: Advantages of fixed point arithmetic over floating point

怎甘沉沦 posted on 2019-12-24 01:25:49
Question: I've heard that the iPhone 4 and the iPad have an FPU called the VFP that in some way optimizes floating point arithmetic, even allowing SIMD (though whether GCC takes advantage of that is doubtful). However, I've read that for some Android devices, using fixed point instead of floating point can give speedups of as much as 20x. What would be the advantages of implementing a floating point-intensive part of my code using fixed point arithmetic over floating point in those

division as multiply and LUT ? / fast float division reciprocal

本小妞迷上赌 posted on 2019-12-22 14:59:23
Question: Is it possible to compute the reciprocal for float division with a lookup table (such as 1/f -> 1*inv[f])? How could it be done? I think some AND mask and shift should be applied to the float to turn it into an index? How exactly would that work?

Answer 1: You can guess an approximate inverse like this:

    int x = bit_cast<int>(f);
    x = 0x7EEEEEEE - x;
    float inv = bit_cast<float>(x);

In my tests, 0x7EF19D07 was slightly better (tested with the effects of 2 Newton-Raphson refinements included). Which you
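The bit trick and the two Newton-Raphson refinements mentioned in the answer can be sketched in C roughly as follows. This is a minimal sketch under assumptions: `memcpy` stands in for `bit_cast`, the function name `approx_reciprocal` is hypothetical, the refinement step used is the standard iteration inv = inv * (2 - f * inv), and the input is assumed to be a positive, normal float that is not close to the ends of the float range.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/f for a positive, normal float f (assumption: f is not
   near the smallest or largest representable magnitudes). */
float approx_reciprocal(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* reinterpret the float as an integer */

    /* Subtracting from the magic constant roughly negates the exponent.
       0x7EF19D07 is the constant the answer reports as slightly better. */
    bits = 0x7EF19D07u - bits;

    float inv;
    memcpy(&inv, &bits, sizeof inv);  /* reinterpret the integer as a float */

    /* Two Newton-Raphson refinements for the reciprocal. */
    inv = inv * (2.0f - f * inv);
    inv = inv * (2.0f - f * inv);
    return inv;
}
```

Each refinement roughly squares the relative error of the initial guess, so the result is an approximation rather than a correctly rounded quotient; whether that is acceptable depends on the caller.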

Floating point rounding when truncating

六眼飞鱼酱① posted on 2019-12-22 10:58:58
Question: This is probably a question for an x86 FPU expert: I am trying to write a function which generates a random floating point value in the range [min,max]. The problem is that my generator algorithm (the floating-point Mersenne Twister, if you're curious) only returns values in the range [1,2), i.e., I want an inclusive upper bound, but my "source" generated value has an exclusive upper bound. The catch here is that the underlying generator returns an 8-byte double, but I only want a 4-byte
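One effect that is relevant to this setup: under the default round-to-nearest mode, narrowing a double from [1,2) to a float can round up to exactly 2.0f, so an exclusive bound in double does not stay exclusive in float. The sketch below is only an illustration of that effect, with a hypothetical function name, and it assumes IEEE 754 binary32/binary64 with the default rounding mode.

```c
#include <float.h>

/* Returns 1 if narrowing the largest double below 2.0 to float yields
   exactly 2.0f (expected under round-to-nearest), 0 otherwise. */
int narrowing_reaches_upper_bound(void)
{
    /* In [1,2) doubles are spaced DBL_EPSILON apart, so this is the
       largest double strictly below 2.0. volatile blocks constant folding. */
    volatile double just_below_two = 2.0 - DBL_EPSILON;
    volatile float narrowed = (float)just_below_two;

    return just_below_two < 2.0 && narrowed == 2.0f;
}
```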

How can I set and restore FPU CTRL registers?

别说谁变了你拦得住时间么 posted on 2019-12-22 07:07:21
Question: I can reset the FPU's CTRL registers with this: http://support.microsoft.com/kb/326219 But how can I save the current registers and restore them later? This is from .NET code. What I'm doing is calling a .NET DLL as a COM module from Delphi. Checking the CTRL registers in Delphi yields one value; checking with controlfp in the .NET code gives another value. What I need, in essence, is to do this: _controlfp(_CW_DEFAULT, 0xfffff); So my floating-point calculations in the .NET code do not crash,

Detect FPU rounding mode on a GPU

给你一囗甜甜゛ posted on 2019-12-21 23:22:37
Question: I was delving into multi-precision arithmetic, and there is a nice fast class of algorithms, described in Jonathan Richard Shewchuk, "Adaptive Precision Floating-Point Arithmetic and Fast Robust Geometric Predicates", 1997, Discrete & Computational Geometry, pages 305–363. However, these algorithms rely on the FPU using round-to-even tiebreaking. On a CPU it would be easy: one would just check or set the FPU state word and be sure. However, there is no such instruction (yet?) for GPU
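When the rounding mode cannot be queried directly, one option is to probe it with arithmetic whose result depends on the tie-breaking rule. The sketch below is written in plain C as an illustration; the same two additions could be placed in a GPU kernel, which is an assumption and not something the excerpt confirms. The function name is hypothetical, and the code assumes single-precision operations are evaluated in single precision.

```c
#include <float.h>

/* Returns 1 if float addition behaves like round-to-nearest, ties-to-even. */
int rounds_ties_to_even(void)
{
    /* volatile keeps the compiler from folding the sums at compile time. */
    volatile float one = 1.0f;
    volatile float odd = 1.0f + FLT_EPSILON;       /* last significand bit is 1 */
    volatile float half_ulp = FLT_EPSILON / 2.0f;  /* half the spacing in [1,2) */

    /* Exact tie between 1 (even) and 1+eps (odd): ties-to-even gives 1. */
    volatile float a = one + half_ulp;
    /* Exact tie between 1+eps (odd) and 1+2eps (even): ties-to-even gives 1+2eps. */
    volatile float b = odd + half_ulp;

    return a == 1.0f && b == 1.0f + 2.0f * FLT_EPSILON;
}
```

The two probes together separate ties-to-even from the directed modes and from ties-away: rounding toward zero or downward fails the second check, while rounding upward or ties-away fails the first.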

Benefits of x87 over SSE

此生再无相见时 posted on 2019-12-20 11:15:28
Question: I know that x87 has higher internal precision, which is probably the biggest difference that people see between it and SSE operations. But I have to wonder, is there any other benefit to using x87? I have a habit of typing -mfpmath=sse automatically in any project, and I wonder if I'm missing anything else that the x87 FPU offers.

Answer 1: For hand-written asm, x87 has some instructions that don't exist in the SSE instruction set. Off the top of my head, it's all trigonometric stuff like fsin,

Detecting FPU presence on Android

孤人 posted on 2019-12-17 20:34:45
Question: I want to get the most performance out of my mobile application on Android. I would like to know if someone is aware of a trick to check whether the phone has an FPU. After some research it seems that using the FloatMath class is slower on a unit that has an FPU, so I would like to have the best of both worlds. Most newer phones have an FPU, but I would like to get the most performance the device can offer.

Answer 1: It's a Linux kernel underneath, and at least the default Android configuration will

How to input and output real numbers in assembly language

烈酒焚心 posted on 2019-12-11 15:39:18
Question: We solve problems with real numbers in assembly language using the FPU. Usually we write the input and output code in C or with ready-made functions. For example:

    ; Receiving input and output descriptors for the console
    invoke GetStdHandle, STD_INPUT_HANDLE
    mov hConsoleInput, eax
    invoke GetStdHandle, STD_OUTPUT_HANDLE
    mov hConsoleOutput, eax
    invoke ClearScreen
    ;input X
    invoke WriteConsole, hConsoleOutput, ADDR aszPromptX,\
        LENGTHOF aszPromptX - 1, ADDR BufLen, NULL
    invoke ReadConsole, hConsoleInput,