I'm looking to optimize this linear search:
static int
linear (const int *arr, int n, int key)
{
int i = 0;
while (i < n) {
this might get the compiler to emit vector instructions (suggested by Gman):
for (int i = 0; i < N; i += 4) {
bool found = false;
found |= (array[i+0] >= key);
...
found |= (array[i+3] >= key);
// slight variation would be to use max intrinsic
if (found) return i;
}
...
// quick search among four elements
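For reference, here is a minimal, self-contained sketch of the block-of-four idea. It rests on a few assumptions that are not spelled out above: the array is sorted in ascending order, the function returns the index of the first element >= key (or n if there is none), and the name linear_block4 is made up for illustration. Whether the compiler actually vectorizes the inner comparisons depends on the compiler and flags, so check the generated assembly.

```c
/* Hypothetical helper: index of the first element >= key in an array
 * assumed to be sorted ascending, or n if no such element exists. */
static int
linear_block4 (const int *arr, int n, int key)
{
  int i = 0;

  /* Coarse pass: test four elements per iteration, one branch per block. */
  for (; i + 4 <= n; i += 4) {
    int found = 0;
    found |= (arr[i + 0] >= key);
    found |= (arr[i + 1] >= key);
    found |= (arr[i + 2] >= key);
    found |= (arr[i + 3] >= key);
    if (found)
      break;
  }

  /* Fine pass: quick search among the (at most four) candidates.
   * This also handles the tail when n is not a multiple of four. */
  while (i < n && arr[i] < key)
    i++;
  return i;
}
```

The fine pass scans at most four elements after a hit, so the extra work stays small, and it removes the need for n to be a multiple of four.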
the blocked loop also executes fewer branch instructions. you may help the compiler by ensuring the input array is aligned to a 16-byte boundary
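One possible way to get 16-byte alignment, sketched under the assumption that a C11 compiler and library are available. The names table, alloc_aligned_ints, and is_aligned16 are hypothetical and only serve the illustration.

```c
#include <stdint.h>
#include <stdlib.h>

/* Statically allocated array aligned to 16 bytes (C11 _Alignas). */
static _Alignas (16) int table[1024];

/* Heap allocation aligned to 16 bytes via C11 aligned_alloc.
 * The byte count is rounded up to a multiple of 16 because the
 * original C11 wording requires size to be a multiple of alignment. */
static int *
alloc_aligned_ints (size_t count)
{
  size_t bytes = count * sizeof (int);
  bytes = (bytes + 15) & ~(size_t) 15;
  return aligned_alloc (16, bytes);
}

/* Returns nonzero if p sits on a 16-byte boundary. */
static int
is_aligned16 (const void *p)
{
  return ((uintptr_t) p % 16) == 0;
}
```

Memory from alloc_aligned_ints is released with plain free(). Whether alignment buys anything measurable depends on the target CPU and on the instructions the compiler picks, so it is worth benchmarking.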
another thing that may help vectorization (doing vertical max comparison):
for (int i = 0; i < N; i += 8) {
bool found = false;
found |= max(array[i+0], array[i+4]) >= key;
...
found |= max(array[i+3], array[i+7]) >= key;
if (found) return i;
}
// have to search eight elements
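A compilable sketch of the vertical-max variant, with the same assumptions as before (ascending sorted input, return value is the index of the first element >= key or n). The helper max_int and the name linear_block8 are made up here, and the two middle comparisons are filled in by analogy with the first and last ones rather than taken from the snippet above.

```c
/* Hypothetical helper standing in for a max intrinsic. */
static inline int
max_int (int a, int b)
{
  return a > b ? a : b;
}

/* Index of the first element >= key in an array assumed to be sorted
 * ascending, or n if no such element exists. */
static int
linear_block8 (const int *arr, int n, int key)
{
  int i = 0;

  /* Coarse pass: pair element j with element j + 4 (a "vertical" max
   * across two groups of four), then compare the four maxima with key. */
  for (; i + 8 <= n; i += 8) {
    int found = 0;
    found |= (max_int (arr[i + 0], arr[i + 4]) >= key);
    found |= (max_int (arr[i + 1], arr[i + 5]) >= key);
    found |= (max_int (arr[i + 2], arr[i + 6]) >= key);
    found |= (max_int (arr[i + 3], arr[i + 7]) >= key);
    if (found)
      break;
  }

  /* Fine pass: search the (at most eight) candidates and any tail. */
  while (i < n && arr[i] < key)
    i++;
  return i;
}
```

Note that for ascending sorted input the max of arr[i+j] and arr[i+j+4] is always the latter, so any gain comes from how the pairing maps onto vector lanes, not from doing fewer comparisons. Measure before adopting it.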