Topic: code optimization

topics > computer science > programming > Group: code generation

code optimization by advice and statistics
code optimization by code rewrite
code optimization by flow analysis
code optimization by global analysis
code optimization by special case analysis
compile-time execution
dependency analysis
execution profile
flavor analysis and typestates for supplementary type checking
in-line code
intermediate representation of code
machine code and assembly language
memory cache
no need for efficiency
register allocation


A compiler can generate efficient code through code optimization, user advice, careful code generation, and program representation. The process is expensive because efficient code is dependent on manipulated data and program context. An interesting alternative is seen in ESL where the high level set representation is optimized before code generation.

Compilers have been compared using Ackermann's function. Generally system implementation languages do better than high-level languages on general purpose machines. The best were Wulf's Bliss language with explicit register declaration, and Burrough's Algol on Algol machines. (cbb 5/80)

Subtopic: most code seldom executed up

Quote: it is hard for code to take a lot of time if it isn't executed a lot; typically in a loop [»bentJL_1982]

Subtopic: program optimization up

Quote: how to optimize RSA encryption and decryption; software often faster than hardware implementations [»wienMJ2_2000]
Quote: formal semantics enables compile-time optimization, analysis, and verification
Quote: when optimizing a program, measure performance before and after each change [»blocJ_2001]

Subtopic: optimization design up

Quote: optimize complex operations, then expand the operation and reoptimize; e.g., max and min, boolean expressions, and function calls [»auslM6_1982]
Quote: simplify an optimization by depending on common subexpression elimination and other standard optimizations
Quote: instead of moving code out of a loop, insert a duplicate copy; common subexpression elimination to remove the original copy [»auslM6_1982]
Quote: graph coloring allocator developed for PL.8; simplified the compiler [»auslM6_1982]

Subtopic: memory safety up

Quote: compositional verification of memory safety in optimized code; encode low-level safety information via static single-assignment (SSA) proof variables [»menoVS1_2006]

Subtopic: measure efficiency by counting memory accesses up

Quote: evaluate code by number of copies and memory-based arguments [»cbb_1990, OK]
Quote: algorithms should reduce memory stalls as well as instructions; sequential access is faster than random access; using registers wins on both counts [»argeL9_2000]

Subtopic: benchmarks up

Quote: current compilers can generate object code within 15% of assembly code if the source exploits the strengths of the compiler's optimizer [»goodJB8_1976]
Quote: measure compiler optimizations by equivalent programs, one of which will not optimize any further [»klerM3_1988]
Quote: used Ackermann's function to compare code generators; Bliss/11 did as well as assembly code [»wichBA5_1977, OK]
Quote: compared compilers using Ackermann's function [»wichBA_1976, OK]
Quote: performance testing of cryptographic algorithms written in optimized assembly code [»prenB12_1998]

Subtopic: regression test up

Quote: PL.8 optimizer bugs produce catastrophic results since optimizer applies to all code; regression test of 150 self-checking programs [»auslM6_1982]

Subtopic: structured languages up

Quote: with structured languages, loops in intermediate code usually via 'DO' construct
QuoteRef: zelkMV1_1974 ;;optimized a structured goto-less language and found do very little (1-3%) very much like Data General's Algol 5

Subtopic: intermediate language up

Quote: for optimization, the intermediate representation should be tree-based with control flow, dominator, and use-def information included [»franM11_1998]
Quote: PL.8 used intermediate language for optimization
Quote: ESL is split into a set language and a Pascal-like base language; the compiler can optimize the set language in terms of the base language [»katzJ5_1979]

Subtopic: strings up

Quote: PL.8 intermediate language has string operations and symbolic registers; optimized [»auslM6_1982]

Subtopic: calling sequence, prologue, epilogue up

Quote: calling sequences must be compact
Quote: UNIX programs tend to use many small functions; 20% of time and space can be in function prologue, epilogue and calling sequence [»johnSC7_1978a]
Quote: final assembly generates function prologs and epilogs; handles stack frame, preserved registers [»auslM6_1982]
Quote: a calling convention is a contract between caller, callee, run-time system, and operating system; must agree on parameters, results, space allocation, reserved stack, stack inspection/modification; hard to get right [»olinR1_2006]
Quote: specify a calling convention with staged allocation; composes tiny allocators called stages; precise, formal semantics for compiler implementation and testing [»olinR1_2006]
Quote: staged allocation maps high-level types to a width, kind, and alignment, e.g., float [»olinR1_2006]
Quote: staged allocation assigns locations to a combination of registers and memory

Subtopic: annotation up

Quote: the key instruction is the first instruction executed that causes a source-level, visible state change for an atom; compiler may annotate [»ticeC4_2001]

Subtopic: static single assignment up

Quote: static single assignment (SSA) is useful for optimization

Subtopic: shared memory up

Quote: compilers can violate sequential consistency by reordering shared memory operations or allocating shared memory to registers; like relaxed consistency, it is difficult to analyze [»adveSV12_1996]
Quote: relaxed memory consistency models often provide a fence instruction or other safety net to force synchronization; these allow compilers to safely optimize large regions of code [»adveSV12_1996]

Subtopic: cache performance up

Quote: how to optimize programs for cache memory; unless hit rate 98% programs run at memory speed instead of processor speed [»searCB10_2000]
Quote: improve cache performance with locality grouping, dynamic data packing, pointer updates, and array alignment; up to 5x faster [»dingC5_1999]
Quote: to prevent memory bank thrashing, array processing programs should use sequential access to storage [»rymaJW3_1982]
Quote: improved prediction of cache behavior by analyzing calling contexts and separating first loop invocation from successive invocations [»martF3_1998]
Quote: use run-time data maps and data indirections to ensure correctness of run-time data transformations
Quote: automatically inline single-reference objects; 25% fewer cache misses; 14% faster programs [»dolbJ6_2000]
Quote: compress pointer structures by replacing pointers with computed offsets; e.g., implicit heap [»chilTM12_2000]

Subtopic: dead store elimination up

Quote: dead store elimination eliminates stored values that are not used [»auslM6_1982]

Subtopic: data fields up

Quote: field analysis for inexpensive interprocedural information about a class or small set of classes [»ghemS6_2000]
Quote: field properties refine type information for reference fields; e.g., does not reference subclasses of C, contains fixed-size array [»ghemS6_2000]

Subtopic: array bounds up

Quote: ABCD-eliminate array bounds checks on demand; sparse SSA analysis; removes 45% of bounds checks [»bodiR6_2000]
Quote: trap elimination moves bounds tests from loops by modifying the loop termination test [»auslM6_1982]
Quote: PL.8 guarantees that all array and offset references are within bounds; optimizes run-time checks; checking costs 5-10% [»auslM6_1982]

Subtopic: floating-point up

Quote: to bound numeric error, a programmer must be able to predicate how the compiler generates floating-point operations [»farnC7_1988]
Quote: an optimizer should not rearrange the order of floating-point evaluation in any way that changes the computed value or side effects

Subtopic: macro processing up

Quote: AUTOCODER's macro processors generated code that was data-sensitive and context-sensitive [»halpMI1_1968]

Subtopic: energy consumption up

Quote: can reduce energy consumption by assigning register labels to minimize switching costs; also compiler optimizations and better algorithms [»mehtR8_1997]
Quote: energy efficiency and price-performance are the primary factors for Google clusters
Quote: survey of power reduction techniques for circuits to architectures, system software, and applications [»venkV9_2005]
Quote: the power density and heat generation of computers is approaching that of nuclear reactors

Related Topics up

Topic: code optimization by advice and statistics (8 items)
Topic: code optimization by code rewrite (30 items)
Topic: code optimization by flow analysis (47 items)
Topic: code optimization by global analysis (24 items)
Topic: code optimization by special case analysis (22 items)
Topic: compile-time execution (17 items)
Topic: compiler (18 items)
Topic: dependency analysis (34 items)
Topic: efficiency (96 items)
Topic: execution profile (43 items)
Topic: flavor analysis and typestates for supplementary type checking (68 items)
Topic: in-line code (7 items)
Topic: intermediate representation of code (31 items)
Topic: machine code and assembly language (49 items)
Topic: memory cache (29 items)
Topic: no need for efficiency (28 items)
Topic: register allocation
(28 items)

Updated barberCB 6/05
Copyright © 2002-2008 by C. Bradford Barber. All rights reserved.
Thesa is a trademark of C. Bradford Barber.