In computer programming, a one-pass compiler is a compiler that processes each compilation unit only once, sequentially translating each source statement or declaration into something close to its final machine code. This is in contrast to a multi-pass compiler, which converts the program into one or more intermediate representations in steps between source code and machine code, and which reprocesses the entire compilation unit in each successive pass.
A one-pass compiler initially has to leave addresses for forward jumps unresolved. These can be handled in various ways (e.g. tables of forward jumps and targets) without needing to have another complete pass. Some nominally one-pass compilers effectively 'pass the buck' by generating assembly language and letting the assembler sort out the forward references, but this requires one or more passes in the assembler.
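The table-based handling of forward jumps can be sketched as follows. This is a minimal illustration, not the method of any particular compiler: the toy source language (`label`, `goto`, and bare instruction words), the instruction format, and the name `compile_one_pass` are all assumptions made for the example. Each forward jump is emitted with an empty target, its address is recorded in a table, and the target is filled in when the label is finally seen.

```python
def compile_one_pass(source_lines):
    """Translate a toy language in one pass, backpatching forward jumps.

    Hypothetical toy language: "label NAME", "goto NAME", or any other
    single word, which is emitted as an instruction of that name.
    """
    code = []      # emitted instructions, stored as mutable lists
    labels = {}    # label name -> instruction address, once defined
    pending = {}   # label name -> addresses of jumps waiting for that label

    for line in source_lines:
        words = line.split()
        if not words:
            continue
        if words[0] == "label":
            name = words[1]
            labels[name] = len(code)
            # Patch every earlier jump that was waiting for this label.
            for addr in pending.pop(name, []):
                code[addr][1] = labels[name]
        elif words[0] == "goto":
            name = words[1]
            if name in labels:
                # Backward jump: the target address is already known.
                code.append(["JMP", labels[name]])
            else:
                # Forward jump: emit a hole and remember where it is.
                pending.setdefault(name, []).append(len(code))
                code.append(["JMP", None])
        else:
            code.append([words[0].upper()])

    if pending:
        raise ValueError(f"undefined labels: {sorted(pending)}")
    return code


program = ["goto end", "nop", "label top", "nop", "goto top", "label end", "halt"]
print(compile_one_pass(program))
```

The source is read exactly once; only the already emitted jump instructions are revisited, which is far cheaper than a second pass over the whole compilation unit.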
One-pass compilers are generally smaller and faster than multi-pass compilers, since they do not build intermediate representations or reprocess the compilation unit.
One-pass compilers cannot generate programs as efficient as those produced by multi-pass compilers, because only a limited scope of information is available at each point in the translation. Many effective compiler optimizations require multiple passes over a basic block, loop (especially nested loops), subroutine, or entire module. Some require passes over an entire program.
Difficulties
A core requirement for any programming language intended for one-pass compilation is that all user-defined identifiers, except possibly labels, must be declared before use:
- Pascal provides a forward declaration so that routines can be mutually recursive.
- PL/I allows data declarations to be placed anywhere within a program, including after references to the not-yet-declared items, so one pass is needed to resolve data declarations, followed by one or more passes to generate code.
Any language that allows preprocessing, such as PL/I, C, or C++, must have at least two passes: one pass for preprocessing and one or more passes for translation.
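The declare-before-use rule, and the role of a forward declaration in mutual recursion, can be illustrated with a small single-pass check. This is a sketch under stated assumptions: the event tuples (`"forward"`, `"define"`, `"use"`) and the function name `check_declared_before_use` are hypothetical stand-ins for what a real parser would produce.

```python
def check_declared_before_use(events):
    """Scan (kind, name) events once and report uses of unknown names.

    Hypothetical event kinds: "forward" (forward declaration),
    "define" (full definition), and "use" (reference to a name).
    """
    known = set()
    errors = []
    for position, (kind, name) in enumerate(events):
        if kind in ("forward", "define"):
            known.add(name)
        elif kind == "use" and name not in known:
            # A one-pass compiler has no information about this name yet.
            errors.append((position, name))
    return errors


# Mutually recursive routines "even" and "odd": the body of "even" uses "odd".
without_forward = [("define", "even"), ("use", "odd"),
                   ("define", "odd"), ("use", "even")]
with_forward = [("forward", "odd")] + without_forward

print(check_declared_before_use(without_forward))  # [(1, 'odd')]
print(check_declared_before_use(with_forward))     # []
```

Without the forward declaration, the first routine refers to a name the compiler has not yet seen; with it, every reference can be resolved the moment it is read.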
Some computer hardware (e.g. x86) may have short and long branch instructions: short if the destination lies within the range of a signed 8-bit displacement (-128 to +127 bytes), and long otherwise. A one-pass compiler must assume that all forward jumps are long, because the distance to a target that has not yet been emitted is unknown, whereas a multi-pass compiler can check the jump distance and generate possibly shorter code.
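The difference in code size can be sketched with a toy layout model. The byte sizes below match the 32-bit x86 encodings of JMP rel8 (2 bytes) and JMP rel32 (5 bytes), with the displacement measured from the end of the jump instruction; the item format and the function names are assumptions made for this example. The one-pass version reserves the long form for any jump whose target it has not yet seen, while the multi-pass version starts with short jumps and widens only those that do not fit, repeating until the layout is stable.

```python
SHORT, LONG = 2, 5  # sizes in bytes of x86 JMP rel8 and JMP rel32 (32-bit mode)


def fits_short(displacement):
    return -128 <= displacement <= 127


def size_one_pass(items):
    """Total code size when jump sizes must be fixed as each jump is emitted."""
    addr, labels = 0, {}
    for item in items:
        if item[0] == "label":
            labels[item[1]] = addr
        elif item[0] == "jmp":
            target = labels.get(item[1])
            # Only a backward jump has a known distance at this point.
            if target is not None and fits_short(target - (addr + SHORT)):
                addr += SHORT
            else:
                addr += LONG
        else:  # ("code", nbytes)
            addr += item[1]
    return addr


def size_multi_pass(items):
    """Total code size after repeatedly widening jumps that do not fit."""
    sizes = {i: SHORT for i, item in enumerate(items) if item[0] == "jmp"}
    while True:
        # Lay out the code with the current jump sizes.
        addr, labels, ends = 0, {}, {}
        for i, item in enumerate(items):
            if item[0] == "label":
                labels[item[1]] = addr
            elif item[0] == "jmp":
                addr += sizes[i]
                ends[i] = addr
            else:
                addr += item[1]
        # Widen every short jump whose displacement is out of range.
        too_far = [i for i, size in sizes.items()
                   if size == SHORT and not fits_short(labels[items[i][1]] - ends[i])]
        if not too_far:
            return addr
        for i in too_far:
            sizes[i] = LONG


items = [("jmp", "near"), ("code", 10), ("label", "near"),
         ("jmp", "far"), ("code", 200), ("label", "far")]
print(size_one_pass(items), size_multi_pass(items))  # 220 217
```

Starting from short jumps and only ever widening them guarantees that the loop terminates, since each jump can change size at most once.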
Languages which rely on context for statement identification rather than using reserved or stropped keywords may also cause problems. The following examples are from Fortran 77:
IF (B) l1,l2 ! two-way branch, where B is a boolean/logical expression
IF (N) l1,l2,l3 ! three-way branch, where N is a numeric expression
IF (B) THEN ! start conditional block
IF (B) THEN = 3.1 ! conditional assignment to variable THEN
IF (B) X = 10 ! single conditional statement
IF (B) GOTO l4 ! conditional jump
IF (N) = 2 ! assignment to a subscripted variable named IF
DO 12 I = 1,15 ! start of a count-controlled loop
DO 12 I = 1.15 ! assignment of the value 1.15 to a variable called DO12I
The entire statement must be scanned to determine what sort of statement it is; only then can translation take place. Hence any potentially ambiguous statements must be processed at least twice.
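The need to scan the whole statement can be illustrated for the two DO examples. The sketch below is a simplified classifier, not a real Fortran front end: the function name `classify_do` is hypothetical, and it assumes fixed-form source in which blanks are not significant, with no comments or character literals in the statement. A statement of this shape is a loop only if a comma appears outside parentheses after the equals sign.

```python
import re


def classify_do(statement):
    """Classify a fixed-form Fortran 77 statement that begins with DO."""
    text = statement.replace(" ", "").upper()  # blanks are not significant
    match = re.match(r"DO\d+[A-Z][A-Z0-9]*=", text)
    if match:
        depth = 0
        # Everything after "=" must be examined before deciding.
        for ch in text[match.end():]:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch == "," and depth == 0:
                return "do-loop"
    return "assignment" if "=" in text else "unknown"


print(classify_do("DO 12 I = 1,15"))      # do-loop
print(classify_do("DO 12 I = 1.15"))      # assignment
print(classify_do("DO 5 K = MAX(1, 2)"))  # assignment
```

The two DO statements differ by a single character that appears near the end of the line, so nothing can be emitted for the leading `DO 12 I` until the rest of the statement has been read.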