Jeffrey Bosboom's Blog

Papers: Knuth on profiling and optimizing the common case

An Empirical Study of FORTRAN programs (Knuth, 1970; PDF, gratis PDF)

In this paper, Knuth applies static and dynamic analysis to a sample of Fortran programs to understand both how the language is actually used and how the resulting programs perform. To both compiler writers and programmers, Knuth emphasizes the importance of optimizing the common case.

The static analysis results showed that most program statements are simple assignments with no arithmetic or of the form x = x op y; most DO loops (Fortran’s for) are only a few statements long, use an increment of one, and are not deeply nested; most function calls and array accesses (syntactically identical in Fortran) use few arguments/subscripts; most programs have simple control flow; in general, “compilers spend most of their time doing surprisingly simple things”. In this section, Knuth focuses on speed of compilation (compilers should translate simple constructs quickly). In the paper’s introduction, he recalls once being impressed that another algorithm saved two machine instructions over his own on an 11-operand expression, but he now knows that the average expression has only about two operands. The implication, of course, is that a small savings on programs users don’t actually write does not merit the complexity of the new algorithm, and in general optimization effort should be focused on constructs that actually appear in programs.
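
To make that concrete, here is a rough C rendering (my own toy example, not code from the paper) of the shape that dominated Knuth’s static counts: a shallow loop with a step of one whose body is a couple of simple assignments of the form x = x op y.

{% codeblock lang:c %}
/* Hypothetical example of the dominant pattern Knuth measured:
 * a short, singly-nested loop with an increment of one, whose body
 * consists of simple assignments of the form x = x op y. */
void scale_and_sum(int n, double a[], double s, double *total)
{
    double sum = 0.0;              /* simple assignment, no arithmetic */
    for (int i = 0; i < n; i++) {  /* DO-style loop, step of 1, not nested */
        a[i] = a[i] * s;           /* x = x op y */
        sum = sum + a[i];          /* x = x op y */
    }
    *total = sum;
}
{% endcodeblock %}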

Dynamic analysis was used to gather “frequency counts”, about which Knuth says, “it is difficult to express in words just how tremendously ‘eye-opening’ they are” and “frequency counts are so important they deserve a special name”, thus coining the term profile. (Knuth notes that frequency counts go back to von Neumann and Goldstine 1948, and were previously used both as a debugging aid and for performance analysis, but he treats profile as an original coinage.) Programs in his sample spent most of their time in 4 percent of their code; by the nature of Fortran programs, these were usually small inner loops. Profiling the profiling instrumenter itself identified two loops as hot spots, and an hour of optimization effort doubled its speed. Knuth also gives an example of profiling prompting a change in data structure: a program was spending most of its time inserting into a sorted array, but not much time performing binary search on that array, so it was modified to use a hash table instead.
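
As a minimal sketch of the frequency-count idea (a toy C example of my own, not Knuth’s instrumenter), one can attach a counter to each statement and print the counts afterwards; the inner-loop counter dwarfs the others, which is exactly the concentration a profile makes visible.

{% codeblock lang:c %}
#include <stdio.h>

/* One counter per instrumented statement; a profiler bumps these
 * automatically, but doing it by hand shows the idea. */
static long count[3];

static long sum_squares(int n)
{
    count[0]++; long s = 0;
    for (int i = 0; i < n; i++) {
        count[1]++; s += (long)i * i;   /* inner-loop body: runs n times per call */
    }
    count[2]++; return s;
}

int main(void)
{
    long total = 0;
    for (int k = 0; k < 1000; k++)
        total += sum_squares(k);
    printf("result: %ld\n", total);
    for (int j = 0; j < 3; j++)
        printf("statement %d executed %ld times\n", j, count[j]);
    return 0;
}
{% endcodeblock %}

Running this shows the middle counter near half a million while the other two sit at a thousand; the analogous report for a real program points straight at the inner loops worth optimizing (or, as in Knuth’s sorted-array example, worth replacing with a better data structure).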

To determine how advanced compiler optimizations affect hot spots, Knuth hand-translated some of the inner loops at various (notional) levels of optimization, concluding that the then-known optimizations performed well enough on the programs in his sample. Knuth also notes that compilers can use profiling data to spend more time on the most frequently executed portions of the program. (Today’s profile-guided optimizations go further than just working harder; for example, a function called from both hot and cold regions might be cloned so that different sets of optimizations can be applied to each copy.)
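
As a hand-written illustration of that kind of cloning (my own sketch, not the output of any particular compiler), suppose profiling showed that one call site always passes a length of 3 while another rarely runs at all; the hot site gets a clone specialized for the common case.

{% codeblock lang:c %}
#include <stddef.h>

/* General version, kept for the rarely executed (cold) caller. */
static double dot(const double *a, const double *b, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i] * b[i];
    return s;
}

/* Clone for the hot caller: profiling (hypothetically) showed n == 3 there,
 * so the loop can be fully unrolled and the length parameter dropped. */
static double dot_3(const double *a, const double *b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

double hot_path(const double v[3], const double w[3])
{
    return dot_3(v, w);     /* hot region: specialized clone */
}

double cold_path(const double *a, const double *b, size_t n)
{
    return dot(a, b, n);    /* cold region: general version is fine */
}
{% endcodeblock %}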

{% blockquote Donald Knuth http://www.cs.sjsu.edu/~mak/CS185C/KnuthStructuredProgrammingGoTo.pdf Structured Programming with go to Statements %} We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified. {% endblockquote %}