Vector Sum using AVX Inline Assembly on XeonPhi
问题 I am new to use XeonPhi Intel co-processor. I want to write code for a simple Vector sum using AVX 512 bit instructions. I use k1om-mpss-linux-gcc as a compiler and want to write inline assembly. Here it is my code: #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/time.h> #include <assert.h> #include <stdint.h> void* aligned_malloc(size_t size, size_t alignment) { uintptr_t r = (uintptr_t)malloc(size + --alignment + sizeof(uintptr_t)); uintptr_t t = r + sizeof(uintptr