Why not mark everything inline?

长发绾君心

First off, I am not looking for a way to force the compiler to inline the implementation of every function.

To reduce the level of misguided answers make s

11 answers

庸人自扰

2020-12-08 13:48

This is semi-related, but note that Visual C++ can do cross-module optimization, including inlining across modules. See http://msdn.microsoft.com/en-us/library/0zza0de8%28VS.80%29.aspx for info.

To add an answer to your original question, I don't think there would be a downside at run time, assuming the optimizer was smart enough (hence why it was added as an optimization option in Visual Studio). Just use a compiler smart enough to do it automatically, without creating all the problems you mention. :)

梦如初夏

2020-12-08 13:49

SQLite uses this idea. During development it uses a traditional source structure, but for actual use it ships as one huge C file (112k lines). They do this for maximum optimization and claim about a 5-10% performance improvement.

http://www.sqlite.org/amalgamation.html

夕颜

2020-12-08 13:50

The assumption here is that the compiler cannot optimize across functions. That is a limitation of specific compilers, not a general problem, so using this as a general solution to a specific problem might be bad. The compiler may very well just bloat your program: a function that could have been one reusable copy at a single memory address (and so stayed in cache) ends up compiled into many places, and you lose performance to cache misses.

Big functions in general carry an optimization cost; there is a balance between the overhead of local variables and the amount of code in the function. If the number of variables a function uses (parameters, locals, and globals) stays within the number of registers available on the platform, most everything can stay in registers and does not have to be evicted to RAM. A stack frame may also not be required (depends on the target), so function call overhead is noticeably reduced. This is hard to do all the time in real-world applications, but with the alternative, a small number of big functions with lots of local variables, the code is going to spend a significant amount of time evicting variables from registers to RAM and loading them back (depends on the target).

Try LLVM; it can optimize across the entire program, not just function by function. Release 2.7 had caught up to gcc's optimizer, at least for a test or two; I didn't do exhaustive performance testing. And 2.8 is out, so I assume it is better. Even with a few files, there are too many combinations of tuning knobs to mess with. I find it best not to optimize at all until you have the whole program in one file, then perform your optimization, giving the optimizer the whole program to work with. That is basically what you are trying to do with inlining, but without the baggage.

别跟我提以往

2020-12-08 13:51

Did you really mean #include everything? That would give you only a single module and let the optimizer see the entire program at once.

Actually, Microsoft's Visual C++ does exactly this when you use the /GL (Whole Program Optimization) switch: it doesn't actually compile anything until the linker runs and has access to all the code. Other compilers have similar options.
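To make the cross-module case concrete, here is a rough sketch of the situation whole-program optimization is meant to help with. The file names (scale.cpp, sum.cpp) and both functions are made up for illustration, and the two "files" are shown in one listing so it compiles on its own. The build lines in the trailing comment are typical invocations as far as I know, not something taken from the answer above.

```cpp
#include <cassert>
#include <cstddef>

// ---- hypothetical scale.cpp ----
// A small function that lives in its own translation unit.
int scale(int value, int factor) { return value * factor; }

// ---- hypothetical sum.cpp ----
// Only this declaration would be visible here. When the files are compiled
// separately, the compiler cannot see the body of scale() and has to emit a
// real call inside the loop below.
int scale(int value, int factor);

int scaled_sum(const int* values, std::size_t count, int factor) {
    assert(values != nullptr || count == 0);  // precondition of this sketch
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += scale(values[i], factor);  // inlining candidate once the
                                            // optimizer sees both files
    }
    return total;
}

// Possible build lines (assumed; add whichever file provides main):
//   MSVC:      cl /O2 /GL scale.cpp sum.cpp /link /LTCG
//   GCC/Clang: g++ -O2 -flto scale.cpp sum.cpp
```

With link-time optimization enabled the optimizer is free to inline scale() into the loop even though no function was marked inline and nothing was moved into a header. Whether it actually does so is up to the compiler's heuristics.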

谎友^

2020-12-08 13:51
It is done already in some cases. It is very similar to the idea of unity builds, and the advantages and disadvantages are not far from what you describe:
- more potential for the compiler to optimize
- link time basically goes away (if everything is in a single translation unit, there is nothing to link, really)
- compile time goes, well, one way or the other. Incremental builds become impossible, as you mentioned. On the other hand, a complete build is going to be faster than it would be otherwise, as every line of code is compiled exactly once (in a regular build, code in headers ends up being compiled in every translation unit where the header is included)
But in cases where you already have a lot of header-only code (for example if you use a lot of Boost), it might be a very worthwhile optimization, both in terms of build time and executable performance.

As always though, when performance is involved, it depends. It's not a bad idea, but it's not universally applicable either.

As far as build time goes, you have basically two ways to optimize it:
- minimize the number of translation units (so your headers are included in fewer places), or
- minimize the amount of code in headers (so that the cost of including a header in multiple translation units decreases)
C code typically takes the second option, pretty much to its extreme: almost nothing apart from forward declarations and macros is kept in headers. C++ often lies around the middle, which is where you get the worst possible total build time (though PCHs and/or incremental builds may shave some time off it again). Going further in the other direction, minimizing the number of translation units, can really do wonders for the total build time.
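For anyone who has not seen a unity build, a minimal sketch of what the single translation unit ends up looking like is below. The file names (geometry.cpp, report.cpp) and functions are hypothetical. In a real unity build the two sections would be pulled in with #include lines, as noted in the leading comment; they are pasted inline here only so the sketch compiles by itself.

```cpp
// unity.cpp: hypothetical single translation unit.
// In a real unity build the sections below would be
//   #include "geometry.cpp"
//   #include "report.cpp"
#include <cassert>
#include <string>

// ---- contents of hypothetical geometry.cpp ----
namespace {  // internal linkage: no other translation unit needs this symbol
int rect_area(int width, int height) {
    assert(width >= 0 && height >= 0);  // precondition of this sketch
    return width * height;
}
}  // namespace

// ---- contents of hypothetical report.cpp ----
// rect_area() is fully visible here, so the compiler may inline it without
// any help from the linker.
std::string describe_rect(int width, int height) {
    return std::to_string(width) + "x" + std::to_string(height) +
           " area=" + std::to_string(rect_area(width, height));
}
```

Calling describe_rect(3, 4) yields the string "3x4 area=12". The trade-off is the one described above: any edit to either section recompiles the whole file.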
无人及你

2020-12-08 13:58

The problem with inlining is that you want high performance functions to fit in cache. You might think function call overhead is the big performance hit, but on many architectures a cache miss will blow a couple of pushes and pops out of the water. For example, if you have a large (maybe deep) function that is called very rarely from your main high performance path, inlining it could cause your main high performance loop to grow to the point where it doesn't fit in the L1 icache. That will slow your code down way, way more than the occasional function call.
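A small sketch of keeping a rare path out of the hot loop, assuming GCC, Clang, or MSVC. The NOINLINE_COLD macro and both function names are invented for this example; the attributes behind the macro are the compilers' own (noinline and cold on GCC/Clang, __declspec(noinline) on MSVC). The rare path is tiny here and only stands in for a large body.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper macro: ask the compiler not to inline the function.
#if defined(__GNUC__) || defined(__clang__)
#define NOINLINE_COLD __attribute__((noinline, cold))
#elif defined(_MSC_VER)
#define NOINLINE_COLD __declspec(noinline)
#else
#define NOINLINE_COLD
#endif

// Rarely taken path. Imagine a large body here; keeping it out of line
// means its code does not inflate the loop that calls it.
NOINLINE_COLD std::int64_t handle_rare_case(std::int64_t value) {
    assert(value < 0);  // only negative samples are routed here
    return -value;
}

// Hot loop: the common case stays small, the rare case costs a real call.
std::int64_t sum_samples(const std::vector<std::int64_t>& samples) {
    std::int64_t total = 0;
    for (std::int64_t sample : samples) {
        if (sample < 0) {
            total += handle_rare_case(sample);
        } else {
            total += sample;
        }
    }
    return total;
}
```

Marking everything inline pushes in the opposite direction: it invites the compiler to copy handle_rare_case into every caller. Whether that hurts in practice depends on the target and on how big the real function is, so measure before deciding.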
