Compiler Design and Construction - Question Paper 2074

Attempt all questions. (10x6=60)

Official Answer

AI Generated Answer

1. Draw block diagram of compiler. Explain different steps in synthesis phase.

6 marks

Official Answer

A compiler is a program that takes a program written in a source language and translates it into an equivalent program in a target language. As an important part of a compiler is error showing to the programmer.

Block diagram for phases of compiler:

COMPILER DESIGN: Phases of a compiler

Synthesis phase construct the desired target program from the intermediate representation. The synthesis part of compiler consists of the following phases: Intermediate code generation, Code optimization and Target code generation.

1. Intermediate Code Generation:

If the program syntactically and semantically correct then intermediate code generator generates a simple machine independent intermediate language. The intermediate code should be generated in such a way that it can easily translated into the target machine code.

Example:

Intermediate code for: id1 = id2 + id3 * 3.0

t1 = 3.0;

t2 = id3 * t1;

t3 = id2 + t2;

id1 = t3;

2. Code Optimization:

It is used to improve the intermediate code so that the output of the program could run faster and takes less space. It removes the unnecessary lines of the code and arranges the sequence of statements in order to speed up the program execution without wasting resources.

Example:

t2 = id3 * 3.0;

id1 = id2 + t2;

3. Code Generation:

Code generation is the final stage of the compilation process. It takes the optimized intermediate code as input and maps it to the target machine language.

Example:

MOV R1, id3

MUL R1, #3.0

MOV R2, id2

ADD R1, R2

MOV id1, R1

AI Generated Answer

2. What is the role of regular expression in lexical analysis? Also give example.

6 marks

Details

Official Answer

The lexical analyzer needs to scan and identify only a finite set of valid string/token/lexeme that belong to the language in hand. It searches for the pattern defined by the language rules.

Regular expressions have the capability to express finite languages by defining a pattern for finite strings of symbols. The grammar defined by regular expressions is known as regular grammar. The language defined by regular grammar is known as regular language.

An important notation for specifying the patterns in known as Regular expression. Set of strings is matched by each pattern and the names of the set of strings are served by the regular expressions. Regular languages describe the programming language tokens.

The various operations on languages are:

Union of two languages L and M is written as

L U M = {s | s is in L or s is in M}

Concatenation of two languages L and M is written as

LM = {st | s is in L and t is in M}

The Kleene Closure of a language L is written as

L* = Zero or more occurrence of language L.

Representing valid tokens of a language in regular expression:

If x is a regular expression, then:
x* means zero or more occurrence of x.
x⁺ means one or more occurrence of x.
x? means at most one occurrence of x
[a-z] is all lower-case alphabets of English language.
[A-Z] is all upper-case alphabets of English language.
[0-9] is all natural digits used in mathematics.

Representing occurrence of symbols using regular expressions:

letter = [a – z] or [A – Z]
digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 or [0-9]
sign = [ + | - ]

Representing language tokens using regular expressions:

Decimal = (sign)?(digit)+
Identifier = (letter)(letter | digit)*

AI Generated Answer

3. What is shift reduce parsing technique? Show shift reduce parsing action for the string a*(a+a), given the grammar

6 marks

Details

Official Answer

The process of reducing the given input string into the starting symbol is called shift-reducing parsing.

A shift reduce parser uses a stack to hold the grammar symbol and an input buffer to hold the input string W.

Sift reduce parsing performs the two actions: shift and reduce.

At the shift action, the current symbol in the input string is pushed to a stack.
At each reduction, the symbols will replaced by the non-terminals. The symbol is the right side of the production and non-terminal is the left side of the production.

Now,

Given grammar;

Input String: a*(a+a)

Shift reduce parsing action for given input string is shown below:

AI Generated Answer

4. Find first and follow for the non-terminals in the following grammar.

6 marks

Details

Official Answer

Given grammar;

Now, the FIRST and FOLLOW for the non-terminal symbols are:

Non-terminals	FIRST()	FOLLOW()
S	{a, b, d}	{$}
A	{d}	{a, c}
B	{a}	{a, c}

AI Generated Answer

5. What is semantic analysis? Why is it necessary to perform semantic analysis? Explain.

6 marks

Details

Official Answer

Semantic Analysis is the third phase of compiler. Semantic Analysis makes sure that declarations and statements of program are semantically correct. It is a collection of procedures which is called by parser as and when required by grammar. Both syntax tree of previous phase and symbol table are used to check the consistency of the given code.

Semantic analysis is necessary due to the following reasons:

1. In this phase, the semantic analyzer checks the source program for semantic errors and collects the type information for the code generation. Semantic analyzer checks whether they form a sensible set of instructions in the programming language or not.

2. Semantic analyzer recognizes following types of errors:

Type mismatch
Undeclared variables
Reserved identifier misuse
Multiple declaration of variable in a scope.
Accessing an out of scope variable.
Actual and formal parameter mismatch.

3. Semantic analysis performs the following function:

Type Checking : Ensures that data types are used in a way consistent with their definition.
Label Checking : A program should contain labels references.
Flow Control Check : Keeps a check that control structures are used in a proper manner.(example: no break statement outside a loop)

Example:

float x = 10.1;

float y = x*30;

In this example integer 30 will be typecasted to float 30.0 before multiplication, by semantic analyzer.

AI Generated Answer

6. Write Syntax Directed Definition to carry out type checking for the following expression.

E -> id | E1 op E2 | E1 relop E2 | E1[E2] | E1 ↑

6 marks

Details

Official Answer

E → id { E.type = lookup(id.entry) }

E → E1 op E2 { if (E1.type = integer and E2.type = integer) then E.type = integer

else if (E1.type = integer and E2.type = real) then E.type = real

else if (E1.type = real and E2.type = integer) then E.type = real

else if (E1.type = real and E2.type = real) then E.type = real

else E.type = type-error }

E → E1 relop E2 { if E1.type = boolean and E2.type = boolean then E.type = boolean

else E.type = error}

E → E1 [E2] { if (E2.type = int and E1.type = array(s, t)) then E.type = t

else E.type = type-error }

E →E1 ↑ { if E1.type = pointer(t) then E.type = t

else type-error}

AI Generated Answer

7. What is code optimization? What is its importance?

6 marks

Details

Official Answer

Code optimization is used to improve the intermediate code so that the output of the program could run faster and takes less space. It removes the unnecessary lines of the code and arranges the sequence of statements in order to speed up the program execution without wasting resources. It tries to improve the code by making it consume less resources (i.e. CPU, Memory) and deliver high speed. For e.g.

Intermediate code

Optimized Intermediate Code

t1 = 3.0;

t2 = id3 * t1;

t3 = id2 + t2;

id1 = t3;

t2 = id3 * 3.0;

id1 = id2 + t2;

A code optimizing process must follow the three rules given below:

The output code must not, in any way, change the meaning of the program.
Optimization should increase the speed of the program and if possible, the program should demand less number of resources.
Optimization should itself be fast and should not delay the overall compiling process.

Code optimization is important because:

Code optimization enhances the portability of the compiler to the target processor.
Code optimization allows consumption of fewer resources (i.e. CPU, Memory).
Optimized code has faster execution speed.
Optimized code utilizes the memory efficiently.
Optimized code gives better performance.
Optimized code often promotes re-usability.

AI Generated Answer

8. Explain different types of loop optimization techniques.

6 marks

Details

Official Answer

Loop Optimization is the process of increasing execution speed and reducing the overheads associated with loops. Loop Optimization is a machine independent optimization.

Loop Optimization techniques are given below:

1. Frequency Reduction (Code Motion): In frequency reduction, the amount of code in loop is decreased. A statement or expression, which can be moved outside the loop body without affecting the semantics of the program, is moved outside the loop. For e.g.

Before optimization:

while(i<100)

{

a = Sin(x)/Cos(x) + i;

i++;

}

After optimization:

t = Sin(x)/Cos(x);

while(i<100)

{

a = t + i;

i++;

}

2. Loop Unrolling: In this technique the number of jumps and tests can be optimized by writing the code to times without changing the meaning of a code. For e.g.

Before optimization:

int i = 1;

while(i<100)

{

a[i] = b[i];

i++;

}

After optimization:

int i = 1;

while(i<100)

{

a[i] = b[i];

i++;

a[i] = b[i];

i++;

}

The first code loop repeats 50 times whereas second code loop repeats 25 times. Hence, optimization is done.

3. Loop Fusion/Loop Jamming: This technique combines the bodies of two loops whenever the same index variable and number of iterations are shared. For e.g.

Before optimization:

for(i=0;i<=10;i++)

{

Printf(“Jayanta”);

}

for(i=0;i<=10;i++)

{

Printf(“Poudel”);

}

After optimization:

for(i=0;i<=10;i++)

{

Printf(“Jayanta”);

Printf(“Poudel”);

}

4. Reduction in Strength: The strength of certain operators is higher than other operators. For example, strength of * is higher than +. Usually, compiler takes more time for higher strength operators and execution speed is less. Replacement of higher strength operator by lower strength operator is called a strength reduction technique

Optimization can be done by applying strength reduction technique where higher strength can be replaced by lower strength operators. For e.g.

Before optimization:

for (i=1;i<=10;i++)

{

sum = i * 7;

printf(“%d”, sum);

}

After optimization:

temp = 7;

for(i=1;i<=10;i++)

{

temp = temp + 7;

sum = temp;

printf(“%d”, sum)

}

AI Generated Answer

9. Define three address codes. Write three address code for

a = -b*(c+d)

6 marks

Details

Official Answer

The address code that uses at most three addresses, two for operands and one for result is called three address code. Each instruction in three address code can be described as a 4-tuple: (operator, operand1, operand2, result).

A quadruple (Three address code) is of the form: x = y op z where x, y and z are names, constants or compiler-generated temporaries and op is any operator.

Followings are the 3-address code statements for different statements.

Assignment statement:

x = y op z ( binary operator op)

x = op y ( unary operator op)

Copy statement: x = z
Unconditional jump: goto L ( jump to label L)
Conditional jump: if x relop y goto L
Procedure call:

param x1

param x2

. .

..

param xn

call p, n i.e. call procedure P(x1,x2,….. xn)

Indexed assignment:

x = y[i]

x[i] = y

Address and pointer assignments:

x = &y

x = *y

*x = y

Given expression

a = -b*(c+d)

The three address code of given expression is given below:

t₁ = -b

t₂ = c + d

t₃ = t₁ * t₂

a = t₃

AI Generated Answer

10. Discuss the importance of error handler in compiler. How is it manipulated in the different phases of compilation.

6 marks

Details

Official Answer

Error detection and reporting of errors are important functions of the compiler. Whenever an error is encountered during the compilation of the source program, an error handler is invoked. Error handler generates a suitable error reporting message regarding the error encountered. The error reporting message allows the programmer to find out the exact location of the error. Errors can be encountered at any phase of the compiler during compilation of the source program for several reasons such as:

In lexical analysis phase, errors can occur due to misspelled tokens, unrecognized characters, etc. These errors are mostly the typing errors.
In syntax analysis phase, errors can occur due to the syntactic violation of the language.
In intermediate code generation phase, errors can occur due to incompatibility of operands type for an operator.
In code optimization phase, errors can occur during the control flow analysis due to some unreachable statements.
In code generation phase, errors can occurs due to the incompatibility with the computer architecture during the generation of machine code. For example, a constant created by compiler may be too large to fit in the word of the target machine.
In symbol table, errors can occur during the bookkeeping routine, due to the multiple declaration of an identifier with ambiguous attributes.

After detecting an error, a phase must somehow deal with that error, so that compilation can proceed, allowing further errors in the source program to be detected.

Error handler contains a set of routines for handling error encountered in any phase of the compiler. Every phase of compiler will call an appropriate routine of the error handler as per their specifications.

AI Generated Answer

Course Content

Other Courses

Compiler Design and Construction 2074

Representing valid tokens of a language in regular expression:

Representing occurrence of symbols using regular expressions:

letter = [a – z] or [A – Z]
digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 or [0-9]
sign = [ + | - ]

Representing language tokens using regular expressions:

Decimal = (sign)?(digit)+
Identifier = (letter)(letter | digit)*

Trivhuwan University

Pokhara University

Course Content

Other Courses

Compiler Design and Construction 2074

Representing valid tokens of a language in regular expression:

Representing occurrence of symbols using regular expressions:

letter = [a – z] or [A – Z]digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 or [0-9]sign = [ + | - ]

Representing language tokens using regular expressions:

Decimal = (sign)?(digit)+Identifier = (letter)(letter | digit)*

letter = [a – z] or [A – Z]
digit = 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 or [0-9]
sign = [ + | - ]

Decimal = (sign)?(digit)+
Identifier = (letter)(letter | digit)*