I am new to the SSE instructions and I was trying to learn them from this site: http://www.codeproject.com/Articles/4522/Introduction-to-SSE-Programming
I am using t
_aligned_malloc and _aligned_free are Microsoft-isms. Use posix_memalign or memalign on Linux et al. For Mac OS X you can just use malloc, as it is always 16 byte aligned. For portable SSE code you generally want to implement wrapper functions for aligned memory allocations, e.g.
void * malloc_simd(const size_t size)
{
#if defined WIN32 // WIN32
return _aligned_malloc(size, 16);
#elif defined __linux__ // Linux
return memalign(16, size);
#elif defined __MACH__ // Mac OS X
return malloc(size);
#else // other (use valloc for page-aligned memory)
return valloc(size);
#endif
}
Implementation of free_simd
is left as an exercise for the reader.