Compiler Design and Construction 2070

Tribhuvan University
Institute of Science and Technology
2070
Bachelor Level / Sixth Semester / Science
Computer Science and Information Technology ( CSC-352 )
( Compiler Design and Construction )
Full Marks: 60
Pass Marks: 24
Time: 3 hours
Candidates are required to give their answers in their own words as far as practicable.
The figures in the margin indicate full marks.

Attempt all questions.                    (10x6=60)

2.  Why are regular expressions used in token specification? Write a regular expression to specify identifiers as in C.

6 marks

The lexical analyzer needs to scan and identify only a finite set of valid strings/tokens/lexemes that belong to the language at hand. It searches for the patterns defined by the language rules. Regular expressions can express such finite languages by defining patterns for finite strings of symbols, so regular expressions are used in token specification.

Regular expressions are algebraic expressions that are used to describe the tokens of a programming language. They are built from three regular operations: union (or), concatenation, and the Kleene star. Parentheses ( and ) are used for grouping.

  • Union of two languages L and M is written as

                L U M = {s | s is in L or s is in M}

  • Concatenation of two languages L and M is written as

                LM = {st | s is in L and t is in M}

  • The Kleene closure of a language L is written as

                L* = {ε} U L U LL U LLL U ...., i.e., zero or more concatenations of L.
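For example, if L = {a, b} and M = {0, 1}, then L U M = {a, b, 0, 1}, LM = {a0, a1, b0, b1}, and L* = {ε, a, b, aa, ab, ba, bb, aaa, ....}.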


Now, in C the regular expression for identifiers can be written using a regular definition as:

    letter → A | B | .... | Z | a | b | .... | z | _

    digit → 0 | 1 | 2 | .... | 9

    identifier → letter(letter | digit)*

The regular expression representing identifiers without using a regular definition:

(A | B | C | .... | Z | a | b | c | .... | z | _) (A | B | C | .... | Z | a | b | c | .... | z | _ | 0 | 1 | .... | 9)*
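As a quick illustration, the following C sketch (the helper name is_c_identifier is my own, not part of the question) checks a string against the pattern letter(letter | digit)*:

    #include <ctype.h>
    #include <stdio.h>

    /* Returns 1 if s matches letter(letter|digit)*, where letter is
       [A-Za-z_] and digit is [0-9], i.e. if s is a valid C identifier. */
    static int is_c_identifier(const char *s)
    {
        if (!(isalpha((unsigned char)*s) || *s == '_'))
            return 0;                 /* must start with a letter or _ */
        for (s++; *s; s++)
            if (!(isalnum((unsigned char)*s) || *s == '_'))
                return 0;             /* later chars: letter, digit or _ */
        return 1;
    }

    int main(void)
    {
        printf("%d %d\n", is_c_identifier("count_1"), is_c_identifier("1count"));
        return 0;                     /* prints: 1 0 */
    }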

3.  Discuss the specification of lexical analyzer generator Lex.

6 marks

A lexical analyzer generator (Lex/Flex) systematically translates regular definitions into C source code for efficient scanning, and the generated code is easy to integrate into C applications. Flex is a later, free implementation of Lex.

  • First, a specification of the lexical analyzer is prepared by creating a program lex.l in the Lex/Flex language.
  • lex.l is run through the Lex compiler to produce a C program lex.yy.c.
  • lex.yy.c consists of a tabular representation of lex.l, together with a standard routine that uses the table to recognize lexemes.
  • lex.yy.c is run through the C compiler to produce an object program a.out, which is a lexical analyzer that transforms an input stream into a sequence of tokens.

A Lex/Flex specification consists of three parts:

1. regular definitions and C declarations enclosed in %{ ... %}

        %%

2. translation rules

        %%

3. user-defined auxiliary procedures

The translation rules are of the form:

    p1     { action1 }

    p2     { action2 }

    …

    pn     { actionn }

Lex/flex regular definitions are of the form:

       name definition

E.g. Digit   [0-9]

Example of a Lex specification:

%{
#include <stdio.h>
%}

%%
[0-9]+    { printf("%s\n", yytext); }
.|\n      { }
%%

int main(void)
{
    yylex();
    return 0;
}
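This scanner prints every run of digits on its own line and silently discards all other characters. It can be built with, for example, flex lex.l && cc lex.yy.c -lfl, where -lfl supplies the default yywrap() routine.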

4.  Consider the grammar:

        

    a. Show that this grammar is ambiguous by constructing two different leftmost derivations for the sentence abab.

    b. Construct the corresponding rightmost derivations for abab.

    c. Construct the corresponding parse trees for abab.

6 marks

Given grammar:
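Assuming the intended grammar is the classic one for this exercise:

    S → aSbS | bSaS | ε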

 

a) Leftmost derivation:
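Two different leftmost derivations of abab, under the assumed grammar:

    Derivation 1:  S ⇒ aSbS ⇒ abS ⇒ abaSbS ⇒ ababS ⇒ abab
    Derivation 2:  S ⇒ aSbS ⇒ abSaSbS ⇒ abaSbS ⇒ ababS ⇒ abab

The two derivations differ in the production applied at the second step (S → ε versus S → bSaS for the first S).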

Since the above grammar produces two different leftmost derivations for the sentence abab, the given grammar is ambiguous.

b) Rightmost derivation:
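The corresponding rightmost derivations, under the assumed grammar:

    Derivation 1:  S ⇒ aSbS ⇒ aSbaSbS ⇒ aSbaSb ⇒ aSbab ⇒ abab
    Derivation 2:  S ⇒ aSbS ⇒ aSb ⇒ abSaSb ⇒ abSab ⇒ abab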

c) The corresponding parse trees for abab are:
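Tree 1 (from the first derivation):

    S
    ├── a
    ├── S → ε
    ├── b
    └── S
        ├── a
        ├── S → ε
        ├── b
        └── S → ε

Tree 2 (from the second derivation):

    S
    ├── a
    ├── S
    │   ├── b
    │   ├── S → ε
    │   ├── a
    │   └── S → ε
    ├── b
    └── S → ε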


5.  Consider the following grammar:

        

    a.  Show steps of shift-reduce parsing for the input string id+id*id.

    b.  Identify conflicts during the parsing.

6 marks

Given grammar:
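Assuming the usual expression grammar:

    E → E + T | T
    T → T * F | F
    F → id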

a)

Input string: id + id * id
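The shift-reduce parse, under the assumed grammar above:

    Stack           Input            Action
    $               id + id * id $   shift
    $ id            + id * id $      reduce F → id
    $ F             + id * id $      reduce T → F
    $ T             + id * id $      reduce E → T
    $ E             + id * id $      shift
    $ E +           id * id $        shift
    $ E + id        * id $           reduce F → id
    $ E + F         * id $           reduce T → F
    $ E + T         * id $           shift (see the conflict in b)
    $ E + T *       id $             shift
    $ E + T * id    $                reduce F → id
    $ E + T * F     $                reduce T → T * F
    $ E + T         $                reduce E → E + T
    $ E             $                accept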


b)

Here a shift-reduce conflict is encountered during the parsing: with $E+T on the stack and * as the next input symbol, the parser can either reduce by E → E + T or shift the *. The conflict is resolved in favour of shifting, which gives * higher precedence than +.

6.  Describe L-attributed definitions. How are L-attributed definitions evaluated?

6 marks

A syntax-directed definition that uses both synthesized and inherited attributes, with the restriction that an attribute may not take values from its right siblings, is called an L-attributed definition.

Mathematically,

A syntax-directed definition is L-attributed if each inherited attribute of Xj on the right side of A → X1 X2 … Xn depends only on

  • The attributes of the symbols X1, X2, …, Xj-1
  • The inherited attributes of A

Consider a production S → ABC. Here, S can take values from A, B, and C (synthesized attributes). A can take values from S only; B can take values from S and A; and C can take values from S, A, and B. No symbol may take values from a sibling to its right.

L-attributed definitions allow a natural order of evaluating attributes: depth-first and left to right. Every S-attributed syntax-directed definition is also L-attributed, since the restriction above applies only to inherited attributes.

Procedure for depth-first evaluation of L-attributed definitions:

    procedure dfvisit(n: node);          /* depth-first evaluation */
    begin
        for each child m of n, from left to right do
        begin
            evaluate inherited attributes of m;
            dfvisit(m)
        end;
        evaluate synthesized attributes of n
    end

Example: for the production A → XY with the semantic rules

    X.i = A.i

    Y.i = X.s

    A.s = Y.s

each inherited attribute depends only on the parent's inherited attribute or on the left sibling's synthesized attribute, so the definition is L-attributed.
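A minimal C sketch of this evaluation (the node layout and attribute names are illustrative, not from the question): the tree hard-codes one production A → XY, each node holds one inherited attribute i and one synthesized attribute s, and dfvisit evaluates them depth-first, left to right:

    #include <stdio.h>

    struct node {
        int i, s;                 /* inherited / synthesized attribute     */
        struct node *child[2];    /* up to two children (X and Y), or NULL */
    };

    static void dfvisit(struct node *a)
    {
        struct node *x = a->child[0], *y = a->child[1];
        if (!x || !y) {           /* leaf: synthesize something from i     */
            a->s = a->i + 1;
            return;
        }
        x->i = a->i;              /* X.i = A.i, then visit X               */
        dfvisit(x);
        y->i = x->s;              /* Y.i = X.s uses the left sibling       */
        dfvisit(y);
        a->s = y->s;              /* finally A.s = Y.s                     */
    }

    int main(void)
    {
        struct node x = {0}, y = {0}, a = {0, 0, {&x, &y}};
        a.i = 10;
        dfvisit(&a);
        printf("A.s = %d\n", a.s);   /* prints: A.s = 12 */
        return 0;
    }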

7.  Write the type expressions for the following types:

a.       An array of pointers to reals, where the array index ranges from 1 to 100.

b.       A function whose domain is two characters and whose range is a pointer to an integer.

6 marks

a) The array elements are pointers to reals, i.e. pointer(real), and the index set is the range 1..100, so the type expression is:

    array(1..100, pointer(real))


b) The domain is a pair of characters, char × char, and the range is a pointer to an integer, pointer(integer), so the type expression is:

    (char × char) → pointer(integer)
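Internally, a compiler represents such type expressions as small trees. The following C sketch (all type and function names are illustrative, not from the question) builds the two expressions above:

    #include <stdlib.h>

    /* Hypothetical representation of type expressions as trees. */
    enum kind { BASIC, POINTER, ARRAY, PAIR, FUNC };

    struct type {
        enum kind k;
        const char *name;        /* for BASIC: "real", "char", "integer" */
        int lo, hi;              /* for ARRAY: the index range           */
        struct type *a, *b;      /* children: element, domain/range      */
    };

    static struct type *mk(enum kind k, const char *name, int lo, int hi,
                           struct type *a, struct type *b)
    {
        struct type *t = malloc(sizeof *t);
        t->k = k; t->name = name; t->lo = lo; t->hi = hi; t->a = a; t->b = b;
        return t;
    }

    int main(void)
    {
        struct type *real = mk(BASIC, "real", 0, 0, NULL, NULL);
        struct type *chr  = mk(BASIC, "char", 0, 0, NULL, NULL);
        struct type *intg = mk(BASIC, "integer", 0, 0, NULL, NULL);

        /* a) array(1..100, pointer(real)) */
        struct type *ta = mk(ARRAY, NULL, 1, 100,
                             mk(POINTER, NULL, 0, 0, real, NULL), NULL);

        /* b) (char x char) -> pointer(integer) */
        struct type *tb = mk(FUNC, NULL, 0, 0,
                             mk(PAIR, NULL, 0, 0, chr, chr),
                             mk(POINTER, NULL, 0, 0, intg, NULL));
        (void)ta; (void)tb;
        return 0;
    }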

8.  What do you mean by three-address code? Write the syntax-directed definition for the following grammar to produce three-address code for assignments

    S -> id = E

    E -> id

6 marks

Three-address code is an intermediate representation in which each instruction uses at most three addresses: two for the operands and one for the result. Each instruction can be described as a 4-tuple: (operator, operand1, operand2, result).

A typical three-address instruction (quadruple) has the form x = y op z, where x, y, and z are names, constants, or compiler-generated temporaries and op is an operator.

For example, the three-address code for a + b*c - d/(b*c) is:

    t1 = b * c
    t2 = a + t1
    t3 = b * c
    t4 = d / t3
    t5 = t2 - t4


Given grammar:

    S -> id = E

    E -> id

The syntax-directed definition for the given grammar to produce three-address code is:

    Production      Semantic Rules
    ------------    ------------------------------------------------
    S → id = E      S.code = E.code || gen(id.place '=' E.place)
    E → id          E.place = id.place; E.code = '' (empty)

Note that S needs no new temporary here; newtemp() would be required only for productions such as E → E1 + E2.
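A minimal C sketch of the two helpers the rules rely on (gen and newtemp are the names used in the SDD; their bodies here are simplified stand-ins that just print the instructions):

    #include <stdio.h>

    static int temp_count = 0;

    /* newtemp(): return a fresh temporary name t1, t2, ...
       (a static buffer is enough for this sketch)          */
    static const char *newtemp(void)
    {
        static char buf[16];
        snprintf(buf, sizeof buf, "t%d", ++temp_count);
        return buf;
    }

    /* gen(lhs, rhs): emit one three-address copy instruction lhs = rhs */
    static void gen(const char *lhs, const char *rhs)
    {
        printf("%s = %s\n", lhs, rhs);
    }

    int main(void)
    {
        /* For the assignment x = y the SDD gives E.place = "y",
           E.code = empty, and S.code = gen("x", E.place):      */
        gen("x", "y");                /* prints: x = y */

        /* newtemp() comes into play for productions such as
           E -> E1 + E2, e.g. for x = y + z:                   */
        const char *t = newtemp();
        printf("%s = y + z\n", t);    /* t1 = y + z */
        gen("x", t);                  /* x = t1     */
        return 0;
    }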

9.  Discuss the issues in the design of a simple code generator.

6 marks

The final phase in the compiler model is the code generator. It takes as input an intermediate representation of the source program and produces as output an equivalent target program. The code generator should be designed so that it can be easily implemented, tested, and maintained.

The issues in code generator design include:

1. Input to the code generator: The input to the code generator is the intermediate representation of the source program, together with the information in the symbol table. There are several choices for the intermediate representation:

  • Postfix notation, 
  • Syntax tree or DAG,
  • Three address code

The code generation phase assumes that its input is complete and free of errors.

2. Target Program: The target program is the output of the code generator. The output can be:

  • Assembly language: it makes the process of code generation somewhat easier.
  • Relocatable machine language: it allows subprograms to be compiled separately.
  • Absolute machine language: it can be placed in a fixed location in memory and executed immediately.

3. Target Machine: Implementing code generation requires thorough understanding of the target machine architecture and its instruction set. 

4. Instruction Selection: Efficient, low-cost instruction selection is important for obtaining efficient code. Instruction speeds and machine idioms are important factors in the efficiency of the target code, and the quality of the generated code is determined by its speed and size (see the example after this list).

5. Register Allocation: Proper utilization of registers improves code efficiency, since computations in registers are much faster than computations in memory. Register use is subdivided into two subproblems:

  • During register allocation, we select the set of variables that will reside in registers at each point in the program.
  • During the subsequent register assignment phase, a specific register is picked for each such variable.

6. Choice of Evaluation order: The efficiency of the target code can be affected by the order in which the computations are performed. 
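For instruction selection, the classic illustration is the statement a = a + 1. A straightforward generator might emit

    MOV  a, R0
    ADD  #1, R0
    MOV  R0, a

whereas a target machine with an increment idiom can do the same work in the single instruction INC a, which is both smaller and faster.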

10.  Define the following optimization techniques:

a.       Unreachable code elimination

b.       Flow-of-control optimization

6 marks

a.       Unreachable code elimination

Unreachable code is a part of the source program that can never be executed because of the program's control flow, for example code that follows an unconditional return or sits under a condition that is always false. Eliminating it shrinks the program without changing its behaviour. For example, see the C sketch below.
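Both marked statements in this small C sketch (illustrative only) can be removed by the optimizer:

    #include <stdio.h>

    static int f(int debug)
    {
        if (0) {                       /* the condition is always false,  */
            printf("debug output\n");  /* so this block is unreachable    */
        }
        return debug + 1;
        printf("after return\n");      /* unreachable: follows a return   */
    }

    int main(void)
    {
        printf("%d\n", f(1));          /* prints 2 */
        return 0;
    }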

b.       Flow-of-control optimization

Using peephole optimization, unnecessary jumps can be eliminated from either the intermediate code or the target code: an unlabeled instruction immediately following an unconditional jump can be removed, and a jump to a jump can be collapsed into a single jump. For example, the sequence

    goto L1
    . . . . . . .
L1: goto L2

can be replaced by

    goto L2
    . . . . . . .
L1: goto L2

If there are no other jumps to L1, the statement L1: goto L2 can then itself be eliminated, provided it is preceded by an unconditional jump.

1.  What do you mean by compiler? Explain the semantic analysis phase of compiler construction.

6 marks

A compiler is a program that takes a program written in a source language and translates it into an equivalent program in a target language. An important part of a compiler is reporting errors in the source program to the programmer.

Block diagram of the phases of a compiler:


In the semantic analysis phase, the semantic analyzer checks the source program for semantic errors and collects type information for code generation. It checks whether the constructs form a sensible set of instructions in the programming language. Type checking is an important part of this phase.

The semantic analyzer performs checks that the parser cannot: determining the types of expressions, checking that all statements are correct with respect to the typing rules, that variables are declared before they are used, that functions are called with the proper number of parameters, and so on. This phase is carried out using information from the parse tree and the symbol table.

Example:
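A few C statements that the scanner and the parser both accept but that semantic analysis rejects (the comments name the error the analyzer reports; the fragment is deliberately erroneous):

    int main(void)
    {
        int x = "hello";   /* type error: a string assigned to an int     */
        y = x + 1;         /* y is used without being declared            */
        return x(3);       /* x is an int, not a function, so no call     */
    }

Each of these lines is syntactically well formed, so only semantic analysis, using the typing rules and the symbol table, can detect the errors.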