icc

What are good heuristics for inlining functions?

十年热恋 submitted on 2019-12-01 04:41:56
Considering that you're trying solely to optimize for speed, what are good heuristics for deciding whether to inline a function or not? Obviously code size should be important, but are there any other factors typically used when (say) gcc or icc is determining whether to inline a function call? Has there been any significant academic work in the area? Wikipedia has a few paragraphs about this, with some links at the bottom: In addition to memory size and cache issues, another consideration is register pressure. From the compiler's point of view "the added variables from the inlined procedure

Deleted Function in std::pair when using a unique_ptr inside a map

邮差的信 submitted on 2019-12-01 03:37:28
I have a piece of C++ code and I am not sure whether it is correct. Consider the following code. #include <memory> #include <vector> #include <map> using namespace std; int main(int argc, char* argv[]) { vector<map<int, unique_ptr<int>>> v; v.resize(5); return EXIT_SUCCESS; } GCC compiles this code without a problem. The Intel compiler (version 19), however, stops with an error: /usr/local/ [...] /include/c++/7.3.0/ext/new_allocator.h(136): error: function "std::pair<_T1, _T2>::pair(const std::pair<_T1, _T2> &) [with _T1=const int, _T2=std::unique_ptr<int, std::default_delete
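The copy constructor named in the error message can be inspected directly with type traits. The following is a minimal sketch, assuming a C++11 or later compiler; the alias `MapValue` and the two helper functions are hypothetical names introduced here for illustration, not part of the original question. It only shows that the map's element type is movable but not copyable; it does not settle which compiler is right about `resize`.

```cpp
#include <map>
#include <memory>
#include <type_traits>
#include <utility>

// Element type stored by std::map<int, std::unique_ptr<int>> (illustrative alias).
using MapValue = std::map<int, std::unique_ptr<int>>::value_type;

// True if the element type can be copy-constructed (hypothetical helper).
constexpr bool map_value_is_copyable() {
    return std::is_copy_constructible<MapValue>::value;
}

// True if the element type can be move-constructed (hypothetical helper).
constexpr bool map_value_is_movable() {
    return std::is_move_constructible<MapValue>::value;
}

// The element type is the pair that appears in the compiler error.
static_assert(std::is_same<MapValue,
                           std::pair<const int, std::unique_ptr<int>>>::value,
              "map element type is pair<const int, unique_ptr<int>>");

// unique_ptr is move-only, so the pair's copy constructor is deleted.
static_assert(!map_value_is_copyable(), "pair copy constructor is deleted");
static_assert(map_value_is_movable(), "pair can still be moved");
```

If a code path ends up copying the map's elements rather than moving them, it would hit exactly this deleted copy constructor.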

Are compilers allowed to remove infinite loops like Intel C++ Compiler with -O2 does?

落爺英雄遲暮 submitted on 2019-11-30 18:21:14
The following test code runs correctly in VS in both debug and release builds, and also in GCC. It also runs correctly with ICC in debug, but not when optimization is enabled (-O2). #include <cstdio> class tClassA{ public: int m_first, m_last; tClassA() : m_first(0), m_last(0) {} ~tClassA() {} bool isEmpty() const {return (m_first == m_last);} void updateFirst() {m_first = m_first + 1;} void updateLast() {m_last = m_last + 1;} void doSomething() {printf("should not reach here\r\n");} }; int main() { tClassA q; while(true) { while(q.isEmpty()) ; q.doSomething(); } return 1; } It is supposed to

How to set icc color profile in Java and change colorspace

一曲冷凌霜 submitted on 2019-11-30 10:25:50
First, I would like to say I'm not an image processing specialist. I would like to convert an image from one colorspace to another and change the ICC color profile at the same time. I managed to do it using JMagick (the ImageMagick Java port), but found no way in pure Java (even using JAI). Use ColorConvertOp, which will do the color space conversion. You have several options for setting an ICC color profile. Either you use a predefined profile by calling getInstance with the correct color space constant, or you specify a file that contains a profile. Here is an example: ICC_Profile ip = ICC_Profile

How to allocate 16-byte memory aligned data

拥有回忆 submitted on 2019-11-30 08:52:15
I am trying to implement SSE vectorization on a piece of code for which I need my 1D array to be 16-byte memory aligned. However, I have tried several ways to allocate 16-byte memory aligned data, but it ends up being 4-byte memory aligned. I have to work with the Intel icc compiler. This is the sample code I am testing with: #include <stdio.h> #include <stdlib.h> void error(char *str) { printf("Error:%s\n",str); exit(-1); } int main() { int i; //float *A=NULL; float *A = (float*) memalign(16,20*sizeof(float)); //align // if (posix_memalign((void **)&A, 16, 20*sizeof(void*)) != 0) // error("Cannot
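One way to request 16-byte alignment and then confirm it is to call `posix_memalign` and test the returned address numerically. The following is a minimal sketch, assuming a POSIX system where `posix_memalign` is declared in `<stdlib.h>`; the helper names `alloc_aligned_floats`, `is_aligned_16`, and `aligned_alloc_demo` are hypothetical and not taken from the original question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign and free (POSIX; assumed available)

// Hypothetical helper: allocate `count` floats on a 16-byte boundary.
// Returns nullptr if the allocation fails.
float* alloc_aligned_floats(std::size_t count) {
    void* p = nullptr;
    if (posix_memalign(&p, 16, count * sizeof(float)) != 0) {
        return nullptr;
    }
    return static_cast<float*>(p);
}

// Hypothetical helper: check the address itself, not the address of the pointer variable.
bool is_aligned_16(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// Example use: 20 floats, the same element count as in the question.
void aligned_alloc_demo() {
    float* a = alloc_aligned_floats(20);
    assert(a != nullptr);
    assert(is_aligned_16(a));
    free(a);  // memory from posix_memalign is released with free
}
```

Calling `aligned_alloc_demo()` from `main` exercises both helpers. Checking the address as an integer avoids misreading the alignment from a printed pointer value.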

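[Repost] Setting up the SPEC CPU 2006 environment and test procedure on Red Hat Enterprise Linux V7 (RHEL7), Part 1: introduction, installation preparation, installation, config files, and run scripts

房东的猫 submitted on 2019-11-30 02:32:10
Other, 2018-05-30 13:27:18. https://www.codetd.com/article/1137423 "Copyright notice: this is the blogger's original article and may not be reproduced without the blogger's permission." This work uses the SPECCPU2006 benchmark tool to test and tune an Intel Xeon E7-**** v4 CPU, with the tests planned on an I840-G** machine. The tuning covers two areas: hardware tuning and operating system tuning. In the final tests, both the SPECint_rate_base and SPECfp_rate_base results exceeded Intel's expectations. The tuning process is especially important because it lays the foundation for later testing, so the intermediate tuning steps are recorded below. Contents: introduction to SPECCPU2006; installing and using SPECCPU2006; config files and run scripts; test preparation and baseline testing; hardware tuning; OS tuning; problems encountered when submitting results; FAQ; automated test scripts; notes on NUMA, memory interleaving, cgroup, and related topics; use of common monitoring tools (top, sar, vmstat, oprofile, pcp, and so on), ideally scripted so that they produce log files for later inspection. 1. Introduction to SPECCPU2006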

Setting the Search Path for Plug In (Bundle / DyLib)

五迷三道 submitted on 2019-11-29 17:48:07
I'm creating a Photoshop Plug In on OS X (basically a Bundle / DyLib). I'm using the Intel Compiler and use OpenMP by linking against the Intel OpenMP runtime (libiomp5). When I use static linking, it crashes Photoshop (only on OS X; on Windows it works). So I tried dynamic linking. The host, Photoshop, itself uses libiomp5.dylib, which is available in its Frameworks folder. So, in Xcode, under Linking, I set Runpath Search Paths to @executable_path/../Frameworks/, yet when I try to load the plug-in in Photoshop it won't work. I also tried to set Runpath Search Paths to Intel Run Time Redistributable Libraries

Intel C++ compiler, ICC, seems to ignore SSE/AVX settings

我只是一个虾纸丫 submitted on 2019-11-29 11:25:31
I have recently downloaded and installed the Intel C++ compiler, Composer XE 2013, for Linux, which is free to use for non-commercial development. http://software.intel.com/en-us/non-commercial-software-development I'm running on an Ivy Bridge system (which has AVX). I have two versions of a function that do the same thing. One does not use SSE/AVX. The other version uses AVX. In GCC the AVX code is about four times faster than the scalar code. However, with the Intel C++ compiler the performance is much worse. With GCC I compile like this: gcc m6.cpp -o m6_gcc -O3 -mavx -fopenmp -Wall -pedantic

Missing AVX-512 intrinsics for masks?

非 Y 不嫁゛ submitted on 2019-11-29 10:42:38
Intel's intrinsics guide lists a number of intrinsics for the AVX-512 K* mask instructions, but there seem to be a few missing: KSHIFT{L/R}, KADD, KTEST. The Intel developer manual claims that intrinsics are not necessary as they are auto generated by the compiler. How does one do this though? If it means that __mmask* types can be treated as regular integers, it would make a lot of sense, but testing something like mask << 4 seems to cause the compiler to move the mask to a regular register, shift it, then move it back to a mask register. This was tested using Godbolt's latest GCC and ICC with -O2

RDRAND and RDSEED intrinsics GCC and Intel C++

时光怂恿深爱的人放手 submitted on 2019-11-29 08:04:59
Do the Intel C++ compiler and/or GCC support the following intrinsics, as MSVC has since 2012 / 2013? int _rdrand16_step(uint16_t*); int _rdrand32_step(uint32_t*); int _rdrand64_step(uint64_t*); int _rdseed16_step(uint16_t*); int _rdseed32_step(uint32_t*); int _rdseed64_step(uint64_t*); And if these intrinsics are supported, since which version are they supported (with a compile-time constant, please)? Both GCC and the Intel compiler support them. GCC support was introduced at the end of 2010. They require the header <immintrin.h>. GCC support has been present since at least version 4.6, but there