I have been googling this question for the past hour, but the results only point to Taylor series or to sample code that is either too slow or does not compile at all. Well, most answe
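Just use the FPU with inline x86 assembly for Wintel apps. The CPU's native sqrt instruction reportedly still beats other algorithms in speed. My custom x86 math library code targets standard MSVC++ 2005 and later; note that MSVC accepts inline __asm only in 32-bit x86 builds. You need separate float/double versions if you want more precision, which I have covered. Each function leaves its result in ST(0), which is where the 32-bit x86 calling conventions expect a floating-point return value, so no explicit return statement is needed. Sometimes the compiler's "__inline" strategy goes bad, so to be safe, you can remove it. With experience, you can switch to macros to avoid a function call entirely.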
extern __inline float __fastcall fs_sin(float x);
extern __inline double __fastcall fs_Sin(double x);
extern __inline float __fastcall fs_cos(float x);
extern __inline double __fastcall fs_Cos(double x);
extern __inline float __fastcall fs_tan(float x);
extern __inline double __fastcall fs_Tan(double x);
extern __inline float __fastcall fs_atan(float x);
extern __inline double __fastcall fs_Atan(double x);
extern __inline float __fastcall fs_sqrt(float x);
extern __inline double __fastcall fs_Sqrt(double x);
extern __inline float __fastcall fs_log(float x);
extern __inline double __fastcall fs_Log(double x);
extern __inline float __fastcall fs_sqrt(float x) { __asm {
FLD x ;// Load/Push input value
FSQRT
}}
extern __inline double __fastcall fs_Sqrt(double x) { __asm {
FLD x ;// Load/Push input value
FSQRT
}}
extern __inline float __fastcall fs_sin(float x) { __asm {
FLD x ;// Load/Push input value
FSIN
}}
extern __inline double __fastcall fs_Sin(double x) { __asm {
FLD x ;// Load/Push input value
FSIN
}}
extern __inline float __fastcall fs_cos(float x) { __asm {
FLD x ;// Load/Push input value
FCOS
}}
extern __inline double __fastcall fs_Cos(double x) { __asm {
FLD x ;// Load/Push input value
FCOS
}}
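extern __inline float __fastcall fs_tan(float x) { __asm {
FLD x ;// Load/Push input value
FPTAN ;// ST(1) = tan(x), and 1.0 is pushed into ST(0)
FSTP ST(0) ;// Pop the 1.0 so that tan(x) is returned in ST(0)
}}
extern __inline double __fastcall fs_Tan(double x) { __asm {
FLD x ;// Load/Push input value
FPTAN ;// ST(1) = tan(x), and 1.0 is pushed into ST(0)
FSTP ST(0) ;// Pop the 1.0 so that tan(x) is returned in ST(0)
}}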
extern __inline float __fastcall fs_log(float x) { __asm {
FLDLN2 ;// Push ln(2)
FLD x ;// Load/Push input value
FYL2X ;// ST(1) = ln(2) * log2(x) = ln(x), then pop: result in ST(0)
}}
extern __inline double __fastcall fs_Log(double x) { __asm {
FLDLN2 ;// Push ln(2)
FLD x ;// Load/Push input value
FYL2X ;// ST(1) = ln(2) * log2(x) = ln(x), then pop: result in ST(0)
}}