Test code:
#include
#include
const int N = 4096;
const float PI = 3.1415926535897932384626;
float cosine[N][N];
float sine[N][
I guess the difference is that there are overloads for std::sin() for float and for double, while sin() only takes double. Inside std::sin() for floats, there may be a conversion to double, then a call to std::sin() for doubles, and then a conversion of the result back to float, making it slower.
You're using a different overload:
Try
double angle = i*j*2*PI/N;
cosine[i][j] = cos(angle);
sine[i][j] = sin(angle);
it should perform the same with or without using namespace std;
Use -S flag in compiler command line and check the difference between assembler output. Maybe using namespace std; gives a lot of unused stuff in executable file.
I did some measurements using clang with -O3 optimization, running on an Intel Core i7. I found that:
std::sin on float has the same cost as sinfstd::sin on double has the same cost as sindouble are 2.5x slower than on float (again, running on an Intel Core i7).Here is the full code to reproduce it:
#include <chrono>
#include <cmath>
#include <iostream>
template<typename Clock>
struct Timer
{
using rep = typename Clock::rep;
using time_point = typename Clock::time_point;
using resolution = typename Clock::duration;
Timer(rep& duration) :
duration(&duration) {
startTime = Clock::now();
}
~Timer() {
using namespace std::chrono;
*duration = duration_cast<resolution>(Clock::now() - startTime).count();
}
private:
time_point startTime;
rep* duration;
};
template<typename T, typename F>
void testSin(F sin_func) {
using namespace std;
using namespace std::chrono;
high_resolution_clock::rep duration = 0;
T sum {};
{
Timer<high_resolution_clock> t(duration);
for(int i=0; i<100000000; ++i) {
sum += sin_func(static_cast<T>(i));
}
}
cout << duration << endl;
cout << " " << sum << endl;
}
int main() {
testSin<float> ([] (float v) { return std::sin(v); });
testSin<float> ([] (float v) { return sinf(v); });
testSin<double>([] (double v) { return std::sin(v); });
testSin<double>([] (double v) { return sin(v); });
return 0;
}
I'd be interested if people could report, in the comments on the results on their architectures, especially regarding float vs. double time.