Compiler Design and Construction - Old Questions
1. Explain briefly the different phases of a compiler, with a block diagram.
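A compiler operates in phases. A phase is a logically cohesive operation that takes the source program in one representation and produces output in another representation. In order, the phases are: source program -> lexical analysis -> syntax analysis -> semantic analysis -> intermediate code generation -> code optimization -> code generation -> target program. All phases interact with the symbol table and the error handler. The phases fall into two groups: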
1. The analysis phase breaks the source program into its constituent pieces and creates an intermediate representation of it. It comprises lexical analysis, syntax analysis and semantic analysis.
2. The synthesis phase constructs the desired target program from the intermediate representation. It comprises intermediate code generation, code optimization and target code generation.
1. Lexical Analysis:
In this phase, the lexical analyzer reads the source program and returns its tokens. A token is a sequence of characters that can be treated as a single logical entity, such as an identifier, operator, keyword or constant. In the example below, each identifier is replaced by an id token numbered in order of first appearance.
Input String: c = a + b * 3
Tokens: id1 = id2 + id3 * 3
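The tokenization above can be sketched in a few lines of Python. This is a minimal illustration, not a production lexer: the token classes in TOKEN_SPEC and the function name tokenize are assumptions made for this example, and identifiers are simply numbered in order of first appearance.

```python
import re

# Hypothetical token classes for the tiny expression language of the example.
TOKEN_SPEC = [
    ("NUM", r"\d+(?:\.\d+)?"),   # integer or decimal constant
    ("ID", r"[A-Za-z_]\w*"),     # identifier
    ("OP", r"[=+\-*/()]"),       # single-character operators and parentheses
    ("SKIP", r"\s+"),            # whitespace, discarded
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, symbols); each identifier becomes idN on first appearance."""
    tokens, symbols = [], {}
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue
        if kind == "ID":
            # Reuse the existing number if the identifier was seen before.
            index = symbols.setdefault(text, len(symbols) + 1)
            tokens.append(f"id{index}")
        else:
            tokens.append(text)
    return tokens, symbols


tokens, symbols = tokenize("c = a + b * 3")
print(" ".join(tokens))  # id1 = id2 + id3 * 3
```

The symbols dictionary returned alongside the tokens plays the role of a very small symbol table, mapping each name to its id number.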
2. Syntax Analysis:
In this phase, the syntax analyzer (parser) takes the tokens produced by the lexical analyzer as input and generates a parse tree as output. The parser checks whether the expression formed by the tokens is syntactically correct according to the rules that define the syntax of the source language.
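As an illustration, the sketch below is a small recursive-descent parser for assignment statements such as id1 = id2 + id3 * 3. The grammar (stmt -> id = expr, expr -> term {(+|-) term}, term -> factor {(*|/) factor}, factor -> id | number | ( expr )) and the nested-tuple tree format are assumptions chosen for this example, not something fixed by the text.

```python
def parse(tokens):
    """Parse `id = expr`; return a nested-tuple tree or raise SyntaxError."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expect(expected):
        token = advance()
        if token != expected:
            raise SyntaxError(f"expected {expected!r}, found {token!r}")

    def factor():
        token = advance()
        if token == "(":
            node = expr()
            expect(")")
            return node
        if token is not None and (token.startswith("id") or token[:1].isdigit()):
            return token  # identifier or numeric constant is a leaf
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    target = advance()
    if target is None or not target.startswith("id"):
        raise SyntaxError(f"expected an identifier, found {target!r}")
    expect("=")
    tree = ("=", target, expr())
    if peek() is not None:
        raise SyntaxError(f"unexpected trailing token {peek()!r}")
    return tree


print(parse(["id1", "=", "id2", "+", "id3", "*", "3"]))
# ('=', 'id1', ('+', 'id2', ('*', 'id3', '3')))
```

Because term is called from expr, multiplication binds tighter than addition, which is why id3 * 3 ends up as a subtree of the + node.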
3. Semantic Analysis:
In this phase, the semantic analyzer checks the source program for semantic errors and collects type information for code generation. It checks whether the constructs in the parse tree form a sensible set of instructions in the programming language. Type checking is an important part of semantic analysis.
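A minimal type-checking sketch follows. It assumes the parse tree is a nested tuple of the form (operator, left, right), that declared types are available in a dictionary, and that mixed int/float operands are widened with a hypothetical inttofloat node. With all three identifiers declared as float, this is one way the integer 3 of the running example could become 3.0 in the later phases.

```python
def check(node, types):
    """Return (typed_node, type), inserting inttofloat where operand types differ."""
    if isinstance(node, str):
        if node.startswith("id"):
            if node not in types:
                raise TypeError(f"undeclared identifier {node}")
            return node, types[node]
        # Numeric constant: a decimal point marks a float, otherwise an int.
        return node, ("float" if "." in node else "int")

    op, left, right = node
    left, left_type = check(left, types)
    right, right_type = check(right, types)

    if op == "=":
        if left_type == "int" and right_type == "float":
            raise TypeError("cannot assign a float value to an int variable")
        if left_type == "float" and right_type == "int":
            right = ("inttofloat", right)
        return (op, left, right), left_type

    if left_type != right_type:
        # Widen the int operand so both sides are float.
        if left_type == "int":
            left = ("inttofloat", left)
        else:
            right = ("inttofloat", right)
        return (op, left, right), "float"
    return (op, left, right), left_type


tree = ("=", "id1", ("+", "id2", ("*", "id3", "3")))
types = {"id1": "float", "id2": "float", "id3": "float"}  # assumed declarations
typed_tree, result_type = check(tree, types)
print(typed_tree)
# ('=', 'id1', ('+', 'id2', ('*', 'id3', ('inttofloat', '3'))))
```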
4. Intermediate Code Generation:
If the program is syntactically and semantically correct, the intermediate code generator produces a simple, machine-independent intermediate language. The intermediate code should be generated in such a way that it can easily be translated into the target machine code. For the example statement, a three-address form is:
t1 = 3.0;
t2 = id3 * t1;
t3 = id2 + t2;
id1 = t3;
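The three-address code above can be produced by a post-order walk over the tree. The sketch below assumes a nested-tuple tree of the form (operator, left, right) in which an integer constant that needs conversion is wrapped in a hypothetical ("inttofloat", constant) node; the function name generate_tac is likewise an assumption.

```python
def generate_tac(tree):
    """Translate an assignment tree into a list of three-address instructions."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def walk(node):
        if isinstance(node, str):
            return node  # identifiers and constants are used directly
        if node[0] == "inttofloat":
            operand = walk(node[1])
            temp = new_temp()
            if operand.isdigit():
                # Constant operand: perform the conversion at compile time.
                code.append(f"{temp} = {float(operand)}")
            else:
                code.append(f"{temp} = inttofloat({operand})")
            return temp
        op, left, right = node
        left_place = walk(left)
        right_place = walk(right)
        temp = new_temp()
        code.append(f"{temp} = {left_place} {op} {right_place}")
        return temp

    _, target, value = tree
    code.append(f"{target} = {walk(value)}")
    return code


tree = ("=", "id1", ("+", "id2", ("*", "id3", ("inttofloat", "3"))))
for line in generate_tac(tree):
    print(line)
# t1 = 3.0
# t2 = id3 * t1
# t3 = id2 + t2
# id1 = t3
```

Each interior node of the tree gets its own temporary, which is why the output contains more instructions than strictly necessary; removing that redundancy is left to the optimizer.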
5. Code Optimization:
This phase improves the intermediate code so that the resulting program runs faster and takes less space. It removes unnecessary instructions and rearranges the sequence of statements to speed up program execution without wasting resources. For the example, the optimized code is:
t2 = id3 * 3.0;
id1 = id2 + t2;
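The reduction from four instructions to two can be reproduced with two simple rewrites: constant propagation (replace a temporary that holds a constant by the constant itself) and removal of a final copy (store the result directly in the variable instead of going through a temporary). The sketch below is a simplified illustration; it assumes temporaries are named tN, that instructions are strings of the form "x = y" or "x = y op z", and that a copied temporary is not used again later.

```python
def is_number(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


def is_temp(name):
    return name[:1] == "t" and name[1:].isdigit()


def optimize(code):
    """Apply constant propagation and final-copy removal to three-address code."""
    constants = {}  # temporaries known to hold a constant
    result = []
    for line in code:
        target, expr = [part.strip() for part in line.split("=", 1)]
        operands = [constants.get(token, token) for token in expr.split()]

        # tN = constant: remember the value and drop the instruction.
        if len(operands) == 1 and is_number(operands[0]) and is_temp(target):
            constants[target] = operands[0]
            continue

        # x = tN right after tN was computed: write the result into x directly.
        if (len(operands) == 1 and is_temp(operands[0]) and result
                and result[-1].startswith(operands[0] + " = ")):
            previous_expr = result.pop().split("=", 1)[1].strip()
            result.append(f"{target} = {previous_expr}")
            continue

        result.append(f"{target} = {' '.join(operands)}")
    return result


code = ["t1 = 3.0", "t2 = id3 * t1", "t3 = id2 + t2", "id1 = t3"]
for line in optimize(code):
    print(line)
# t2 = id3 * 3.0
# id1 = id2 + t2
```

A real optimizer would verify with liveness analysis that the removed temporary is dead before retargeting the instruction; here that is simply assumed.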
6. Code Generation:
Code generation is the final stage of the compilation process. It takes the optimized intermediate code as input and maps it to the target machine language.
MOV R1, id3
MUL R1, #3.0
MOV R2, id2
ADD R1, R2
MOV id1, R1
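The mapping from optimized three-address code to the assembly listing above can be sketched as follows. The instruction set (MOV, ADD, SUB, MUL, DIV with two operands, the first being the destination), the unbounded supply of registers R1, R2, ... and the tN naming of temporaries are all assumptions made for illustration; only instructions of the form "x = y op z" are handled.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}
COMMUTATIVE = {"+", "*"}


def generate_assembly(code):
    """Translate 'x = y op z' instructions into two-address assembly."""
    asm = []
    location = {}      # temporary name -> register currently holding it
    next_register = 1

    def load(name):
        """Return a register holding `name`, emitting a MOV if needed."""
        nonlocal next_register
        if name in location:
            return location[name]
        register = f"R{next_register}"
        next_register += 1
        source = f"#{name}" if name[0].isdigit() else name
        asm.append(f"MOV {register}, {source}")
        return register

    def operand(name):
        """Constants can be used as immediates; everything else is loaded."""
        return f"#{name}" if name[0].isdigit() else load(name)

    for line in code:
        target, expr = [part.strip() for part in line.split("=", 1)]
        left, op, right = expr.split()
        left_place = load(left)
        right_place = operand(right)
        # Prefer accumulating into the register of a temporary, which is
        # assumed to be dead after this use.
        if op in COMMUTATIVE and right in location and left not in location:
            left_place, right_place = right_place, left_place
        asm.append(f"{OPCODES[op]} {left_place}, {right_place}")
        if target[:1] == "t" and target[1:].isdigit():
            location[target] = left_place   # keep the temporary in its register
        else:
            asm.append(f"MOV {target}, {left_place}")
    return asm


for line in generate_assembly(["t2 = id3 * 3.0", "id1 = id2 + t2"]):
    print(line)
# MOV R1, id3
# MUL R1, #3.0
# MOV R2, id2
# ADD R1, R2
# MOV id1, R1
```

A real code generator must also deal with a limited number of registers and with spilling values to memory, which this sketch ignores.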
Symbol tables are data structures that compilers use to hold information about source-program constructs. The information is collected incrementally by the analysis phases of the compiler and used by the synthesis phases to generate the target code. Each entry in the symbol table holds information about an identifier, such as its type, its position in storage, and any other relevant information.
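A symbol table of this kind can be sketched as a dictionary keyed by identifier name. The class and method names below, the flat single-scope layout and the type sizes in SIZES are assumptions for illustration only.

```python
SIZES = {"int": 4, "float": 8}  # assumed storage sizes in bytes


class SymbolTable:
    """Single-scope symbol table recording each identifier's type and offset."""

    def __init__(self):
        self.entries = {}
        self.next_offset = 0

    def declare(self, name, type_name):
        if name in self.entries:
            raise NameError(f"{name} is already declared")
        entry = {"type": type_name, "offset": self.next_offset}
        self.next_offset += SIZES[type_name]  # reserve storage for the variable
        self.entries[name] = entry
        return entry

    def lookup(self, name):
        if name not in self.entries:
            raise NameError(f"{name} is not declared")
        return self.entries[name]


table = SymbolTable()
for name in ("c", "a", "b"):
    table.declare(name, "float")
print(table.lookup("b"))  # {'type': 'float', 'offset': 16}
```

Languages with nested scopes usually chain several such tables together, one per scope, and search them from the innermost scope outwards.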
Whenever an error is encountered in any phase during the compilation of the source program, the error handler is invoked. The error handler generates a suitable message that reports the error encountered.
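One simple way to organize error reporting is a shared object that every phase calls when it detects a problem. The sketch below is an assumption about how such a handler might look; the class name, the report signature and the message format are all hypothetical.

```python
class ErrorHandler:
    """Collects error messages reported by the compiler phases."""

    def __init__(self):
        self.messages = []

    def report(self, phase, line, message):
        # Record the error instead of stopping, so later errors are found too.
        self.messages.append(f"line {line}: {phase} error: {message}")

    def has_errors(self):
        return bool(self.messages)


errors = ErrorHandler()
errors.report("lexical", 3, "unexpected character '$'")
errors.report("semantic", 7, "undeclared identifier x")
for message in errors.messages:
    print(message)
# line 3: lexical error: unexpected character '$'
# line 7: semantic error: undeclared identifier x
```

Collecting messages rather than aborting on the first one lets the compiler report several errors in a single run.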