Understanding and controlling gcc’s inlining decisions

Michael Stone, October 7, 2012

Here are two cute tricks for understanding and controlling GCC’s inlining decisions that I learned recently while preparing to help mentor MIT’s Performance Engineering class (2010 OCW lectures).

  1. GCC has a debugging option called -fdump-ipa-inline. When given, GCC will output a log file that describes, on a call-site-by-call-site basis, what inlining decision it made and why.

    (As a result, with this flag set, one can write unit tests to check the outcome of critical inlining decisions!)

  2. If you’ve found an inlining decision that makes you unhappy and your normal static inline ... declaration isn’t cutting the mustard, then GCC’s always_inline and noinline function attributes might be what you’re looking for!
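To make the two tricks concrete, here is a minimal sketch in C. The file name inline_demo.c and the functions square, cube, and sum_of_powers are made up for illustration; only the two attributes and the dump flag come from GCC itself.

```c
/* inline_demo.c: hypothetical file name used only for this sketch. */

/* always_inline: ask GCC to inline this at every direct call site,
   even where its own heuristics would decline. */
static inline __attribute__((always_inline)) int square(int x)
{
    return x * x;
}

/* noinline: ask GCC to keep this as a real out-of-line call,
   for example so it shows up separately in a profile. */
static __attribute__((noinline)) int cube(int x)
{
    return x * x * x;
}

/* Calls both helpers, so the inline dump has two call sites to report on. */
int sum_of_powers(int x)
{
    return square(x) + cube(x);
}
```

Compiling this with something like `gcc -O2 -fdump-ipa-inline -c inline_demo.c` should leave a dump file next to the output whose name ends in `.inline`. The exact file name and the wording inside it vary between GCC versions, so a unit test that greps the dump is probably best pinned to one compiler version.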

Other tricks:

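  1. Per Auto-vectorization in GCC and SO, something like -ftree-vectorizer-verbose=7 (-vec-report3 with ICC) or -fdump-tree-vect will give you lots of information about which loops were vectorized and why others were not. (Newer GCC releases replace -ftree-vectorizer-verbose with -fopt-info-vec.)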
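A small loop to try the vectorizer reports on might look like the sketch below. The function saxpy and the file name saxpy.c are assumptions chosen for illustration, not something from the linked pages.

```c
#include <stddef.h>

/* Computes y[i] += a * x[i] for i in [0, n).
   restrict promises that x and y do not overlap, which removes one
   common obstacle to vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

Building with something like `gcc -O3 -fdump-tree-vect -c saxpy.c` should produce a dump file whose name ends in `.vect`. Whether this particular loop is actually vectorized depends on the GCC version and target, which is exactly what the report is there to tell you.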

Other links:

  1. Agner Fog’s Software Optimization Resources