Effects of Inlining on Sather Programs


Introduction

This document describes the effect of inlining on Sather 1.0.7. The impact of inlining on running time, executable size, and compilation time was investigated for three programs: the FFT code that comes with the Sather distribution, the Salishan paraffins problem, and the Sather compiler itself. FFT and paraffins are medium-size, compute-bound programs. All measurements were taken on a 66 MHz Sparc-10.

1. Effects of Inlining on FFT

The following measurements were done for a number of FFT classes in the Sather distribution. The test program performs a series of FFT transforms.

The inlining threshold is the maximum weight a function or iterator may have and still be considered for inlining. In all measurements, the same threshold was used for both functions and iterators. The weight is computed by traversing the AM representation of a function and accumulating the weights assigned to its expressions and statements.
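The weight computation described above can be pictured with a short sketch. It is written in Python rather than Sather, and everything in it is an assumption made for illustration: the `Node` type, the node kinds, and the per-node weights in `NODE_WEIGHTS` are hypothetical and are not taken from the Sather compiler.

```python
from dataclasses import dataclass, field

# Hypothetical per-node weights; the compiler's actual table is not given here.
NODE_WEIGHTS = {"local": 0, "const": 0, "assign": 1, "return": 1,
                "call": 2, "if": 2, "loop": 4}


@dataclass
class Node:
    """A stand-in for one expression or statement in the intermediate form."""
    kind: str
    children: list = field(default_factory=list)


def weight(node):
    """Accumulate the weight of a node and of everything beneath it."""
    own = NODE_WEIGHTS.get(node.kind, 1)  # unknown kinds cost 1 (assumption)
    return own + sum(weight(child) for child in node.children)


def should_inline(body, threshold):
    """A routine is a candidate when its weight does not exceed the threshold."""
    return weight(body) <= threshold


# A trivial accessor: "return some_local"
getter = Node("return", [Node("local")])

# A routine whose body is a loop containing several calls
busy = Node("loop", [Node("call", [Node("call"), Node("call")])
                     for _ in range(3)])
```

With these made-up weights the accessor weighs 1 and the loop weighs 22, so a threshold of 16 would admit the first and reject the second.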



2. Effects of Inlining on Salishan Paraffins problem

The Paraffins Problem: Given an integer n, output the chemical structure of all paraffin molecules for i<=n, without repetition and in order of increasing size. Include all isomers, but no duplicates. The chemical formula for paraffin molecules is C(i)H(2i+2). Any representation for the molecules could be chosen, as long as it clearly distinguishes among isomers.
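For readers unfamiliar with the problem, the following Python sketch enumerates the isomers for a given size. It is not the Sather benchmark code. It uses the common centroid decomposition, in which every molecule is built either around a central carbon or around a central bond, and this may differ from the approach in the measured program. The nested-tuple representation and the function names are choices made here for illustration.

```python
from itertools import combinations_with_replacement


def radicals_up_to(max_size):
    """List radicals grouped by carbon count; a radical is (size, children)."""
    hydrogen = (0, ())  # a "radical" with no carbon atoms
    by_size = [[hydrogen]]
    for k in range(1, max_size + 1):
        smaller = [r for group in by_size for r in group]
        # One carbon plus an unordered triple of smaller radicals.
        by_size.append([
            (k, kids)
            for kids in combinations_with_replacement(smaller, 3)
            if sum(size for size, _ in kids) == k - 1
        ])
    return by_size


def paraffins(n):
    """Return one structure per isomer of C(n)H(2n+2), without duplicates."""
    by_size = radicals_up_to(n // 2)
    molecules = []
    # Carbon-centered: four branches, each holding fewer than n/2 carbons.
    small = [r for group in by_size for r in group if 2 * r[0] < n]
    for kids in combinations_with_replacement(small, 4):
        if sum(size for size, _ in kids) == n - 1:
            molecules.append(("C", kids))
    # Bond-centered: two radicals of exactly n/2 carbons joined by a bond.
    if n % 2 == 0:
        for pair in combinations_with_replacement(by_size[n // 2], 2):
            molecules.append(("C-C", pair))
    return molecules


def all_paraffins(n):
    """All isomers for every size i <= n, in order of increasing size."""
    return [m for i in range(1, n + 1) for m in paraffins(i)]
```

Because each molecule has exactly one centroid (a carbon or a bond), building from the centroid outward with unordered branches avoids generating the same isomer twice.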

The methodology of this experiment was the same as for the FFT. The program, compiled with different levels of inlining, was run for three different problem sizes. As in the FFT case, the results are quite consistent, and the change in executable size is practically unobservable due to the page granularity of Sparc object code.

3. Effects of Inlining on compilation of Sather compiler

Finally, the Sather compiler itself was compiled with different levels of inlining, and the resulting compilers were used as boot compilers to compile the Sather 1.0.6 compiler. Only compilation of Sather code was timed (C compilation was not counted). Interestingly, in the best case there is a reduction in size: from 1310720 bytes for the non-inlined version to 1277952 bytes for the inlined one (a 32K or 2.5% reduction), with the resulting compiler being about 12% faster than the original one.
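The size figures above can be checked with a few lines of arithmetic. The Python snippet below uses only the two sizes quoted in the text. The 4096-byte unit in the last two lines is an assumption used to illustrate page-granular sizes, not a figure stated in this report.

```python
noninlined = 1310720  # size of the compiler built without inlining
inlined = 1277952     # size of the compiler built with inlining

saved = noninlined - inlined
saved_k = saved / 1024               # reduction expressed in K
percent = 100 * saved / noninlined   # reduction relative to the original

# Assumed 4096-byte unit: both sizes are whole multiples of it.
pages_before = noninlined // 4096
pages_after = inlined // 4096
```

Here `saved` is 32768, which is exactly 32K and exactly 2.5% of the original size.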


Data Evaluation and Comparison

All graphs show a similar influence of inlining on Sather code. In particular, the FFT and paraffins results are very close: starting with an inlining threshold of about 16, no further improvement in run time is observable. The peak speedups in both cases are about 20-25%.

The Sather compiler experiment shows a very similar trend, although the speedups are somewhat lower. The maximum speedup is about 12% at an inlining threshold of 16. Moreover, this threshold also corresponds to the 32K reduction in executable size. Compilation times at the peak are about 15-20% longer than for compilation without inlining. If C compilation were included in the measurements, these percentages would be less than half as large.

It is interesting that all measurements seem to agree that the optimum inlining level for Sparcs is about 16, in spite of the differences in algorithms, program sizes, coding styles, etc. This level of inlining also corresponds to the reduction in executable size in the case of the Sather compiler. The slowdown in compilation due to the additional complexity is reasonable and perhaps could be reduced even more. Moreover, if the compiler itself is built with inlining, it is about 10-12% faster, and thus it can generate faster inlined code in the same time that the old compiler took to generate slower code with no inlining. As expected, the benefits for the Sather compiler are somewhat less significant than for the other programs, due to the large amount of I/O it performs.

Command Option Proposal

-inline provides default inlining with both thresholds set to 16.

To override default inlining for exotic architectures or programs, two other options could be used:

-inline_functions level

-inline_iterators level

As the measurements show, the best speedups are achieved for level < 100.

The -fast option provides default inlining for functions and iterators.

The -optimize and -O options provide default inlining for functions and iterators.
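One way to read the proposal is as a small rule for turning command-line options into a pair of thresholds. The Python sketch below models that rule and is not code from the Sather compiler. The function name `resolve_thresholds` is hypothetical. Using `None` for "no inlining requested", and letting an explicit level override the default regardless of option order, are assumptions the proposal does not spell out.

```python
DEFAULT_LEVEL = 16
# Options that, per the proposal, switch on default inlining.
DEFAULT_FLAGS = {"-inline", "-fast", "-optimize", "-O"}


def resolve_thresholds(args):
    """Return (function_threshold, iterator_threshold) for an option list."""
    functions = iterators = None  # None: inlining not requested (assumption)
    if any(arg in DEFAULT_FLAGS for arg in args):
        functions = iterators = DEFAULT_LEVEL
    remaining = iter(args)
    for arg in remaining:
        # Each override option is followed by its level.
        if arg == "-inline_functions":
            functions = int(next(remaining))
        elif arg == "-inline_iterators":
            iterators = int(next(remaining))
    return functions, iterators
```

For example, `resolve_thresholds(["-O", "-inline_iterators", "32"])` gives `(16, 32)`: the default applies to functions, and the explicit level applies to iterators.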