Compiler Design and Construction 2072
Attempt all questions. (10x6=60)
1. Explain the analysis phase of compiling a source program with an example.
The analysis phase breaks up the source program into its constituent pieces and creates an intermediate representation of it. Analysis of the source program includes lexical analysis, syntax analysis, and semantic analysis.
Block diagram for phases of compiler:
1. Lexical Analysis:
In this phase, the lexical analyzer reads the source program and returns its tokens. A token is a sequence of characters that can be treated as a single logical entity (such as an identifier, an operator, a keyword, or a constant).
2. Syntax Analysis:
In this phase, the syntax analyzer takes the tokens produced by the lexical analyzer as input and generates a parse tree as output. The parser checks whether the expressions formed by the tokens are syntactically correct, according to the rules that define the syntax of the source language.
3. Semantic Analysis:
In this phase, the semantic analyzer checks the source program for semantic errors and collects type information for code generation. It checks whether the syntactically valid constructs form a sensible set of instructions in the programming language. Type checking is an important part of semantic analysis.
Example of Analysis phase:
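Consider the assignment statement position = initial + rate * 60 (the standard textbook example):
- Lexical analysis groups the characters into the tokens ⟨id, position⟩ ⟨=⟩ ⟨id, initial⟩ ⟨+⟩ ⟨id, rate⟩ ⟨*⟩ ⟨60⟩ and enters the three identifiers into the symbol table.
- Syntax analysis builds a syntax tree with = at the root, id (position) as its left child, and a + node on the right whose operands are id (initial) and a * node over id (rate) and 60.
- Semantic analysis type-checks the tree; if rate is a floating-point variable, the integer constant 60 is converted, e.g. by inserting inttofloat(60), before the multiplication.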
2. What are the functional responsibilities of a lexical analyzer? Write a lexical analyzer that recognizes the identifier of C language.
Lexical Analyzer reads the input characters of the source program, groups them into lexemes, and produces a sequence of tokens for each lexeme. The tokens are then sent to the parser for syntax analysis.
Responsibilities of a lexical analyzer are:
- It reads the input characters and produces the sequence of tokens that the parser uses for syntax analysis.
- It enters the identifiers it recognizes into the symbol table.
- It eliminates comments and whitespace from the source program.
- It reports lexical errors to the user.
- Lexical analyzers are also used by web browsers to format and display web pages, with the help of parsed data from JavaScript, HTML, and CSS.
Second part:
A minimal Lex (flex) program to recognize the identifiers of the C language is sketched below (the printed message is only illustrative):
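%{
/* Lex program to recognize the identifiers of the C language */
#include <stdio.h>
%}
letter    [A-Za-z_]
digit     [0-9]
id        {letter}({letter}|{digit})*
%%
{id}      { printf("identifier: %s\n", yytext); }
{digit}+  { /* a number, not an identifier */ }
.|\n      { /* skip all other characters */ }
%%
int yywrap(void) { return 1; }

int main(void) {
    yylex();
    return 0;
}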
3. What are regular definitions? Given the following CFG, write the regular definitions for the terminals used. Also, construct the transition diagrams for id and num.
If Σ is an alphabet of basic symbols, then a regular definition is a sequence of definitions of the form
d1 → r1
d2 → r2
.....
dn → rn
where each di is a distinct name and each ri is a regular expression over the symbols in Σ ∪ {d1, d2, ..., di-1}.
Now,
Regular definitions for the terminals used in given grammar are:
digit → [0-9]
num → digit+ (. digit+)? (E (+ | -)? digit+)?
letter → [A-Za-z]
id → letter ( letter | digit )*
if → if
then → then
else → else
relop → < | > | <= | >= | = | <>
Transition diagram for id:
Here 9, 10 & 11 are the name of states.
Transition diagram for num:
4. Given a regular expression (a+ε)bc*, construct the DFA recognizing the pattern described by the regular expression using syntax-tree-based reduction.
Given RE:
(a+ε)bc*
The augmented RE of the given RE is:
(a+ε)bc*#
The syntax tree for the augmented RE is:
Calculating followpos (with positions numbered a = 1, b = 2, c = 3, # = 4):
followpos(1) = {2}
followpos(2) = {3, 4}
followpos(3) = {3, 4}
Now,
1) Start state of DFA = firstpos(root) = {1, 2} = S0
Use the followpos sets of the positions in a state to obtain the next states of the DFA.
2) From S0 = {1, 2}
1 represents 'a'
2 represents 'b'
followpos(1) = {2} = S1
δ(s0, a) = S1
followpos(2) = {3, 4} = S2
δ(S0, b) = S2
3) From S1 = {2}
Here 2 represents 'b'
followpos(2) = {3, 4} = S2
δ(S1, b) = S2
4) From S2 = {3, 4}
Here 3 represents 'c'
4 represents '#'
followpos(3) = {3, 4} = S2
δ(S2, c) = S2
No new states occur, so the construction stops.
Since S2 contains position 4 (the position of #), the final state = {S2}.
Now the resulting DFA of the given regular expression is:
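The DFA can also be written out as a transition table and checked with a few strings. A minimal C sketch (illustrative only; -1 stands for the dead state):

#include <stdio.h>
#include <string.h>

/* DFA for (a+ε)bc* obtained above:
 *   S0 --a--> S1,  S0 --b--> S2,  S1 --b--> S2,  S2 --c--> S2
 * S2 is the only accepting state; -1 is the dead state.
 */
int main(void) {
    int delta[3][128];
    memset(delta, -1, sizeof delta);   /* every byte 0xFF gives the int value -1 */
    delta[0]['a'] = 1;
    delta[0]['b'] = 2;
    delta[1]['b'] = 2;
    delta[2]['c'] = 2;

    const char *tests[] = { "b", "bc", "abccc", "ac", "abb" };
    for (int i = 0; i < 5; i++) {
        int state = 0;
        for (const char *p = tests[i]; *p && state != -1; p++)
            state = delta[state][(unsigned char)*p];
        printf("%-6s %s\n", tests[i], state == 2 ? "accepted" : "rejected");
    }
    return 0;
}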
5. Explain in detail about the recursive-descent parsing with example.
Recursive descent parsing is a top-down parsing technique that uses a set of recursive procedures to scan its input from left to right. It may involve backtracking. It starts from the root node (start symbol), applies the first production, and matches the input string against the production rules; if a match fails, it backtracks and applies another rule.
Rules:
1. Use two pointers: iptr, pointing to the next input symbol to be read, and optr, pointing to the current symbol of the output string, which is initially the start symbol S.
2. If the symbol pointed by optr is non-terminal use the first production rule for expansion.
3. While the symbols pointed to by iptr and optr are the same, increment both pointers.
4. The loop in the above step terminates when
a. A non-terminal is encountered in the output.
b. The end of the input is reached.
c. The symbols pointed to by iptr and optr do not match.
5. If 'a' is true, expand the non-terminal using its first production rule and go to step 2.
6. If 'b' is true, terminate with success.
7. If 'c' is true, reset both pointers to the place of the last non-terminal expansion and try the next production rule for that non-terminal.
8. If there are no more productions to try and 'b' is not true, report an error.
Example:
Consider a grammar
S → cAd
A → ab | a
Input string is “cad”
Initially, iptr = optr = 0
Input          | Output          | Rules Fired
iptr(cad)      | optr(S)         | Try S → cAd
iptr(cad)      | optr(cAd)       | Match c
c iptr(ad)     | c optr(Ad)      | Try A → ab
c iptr(ad)     | c optr(abd)     | Match a
ca iptr(d)     | ca optr(bd)     | Dead end, backtrack
c iptr(ad)     | c optr(Ad)      | Try A → a
c iptr(ad)     | c optr(ad)      | Match a
ca iptr(d)     | ca optr(d)      | Match d
cad iptr(eof)  | cad optr(eof)   | Parsing completed
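A backtracking recursive-descent parser for this grammar can be sketched in C as follows (a minimal illustration; the function and variable names are not from any standard library):

#include <stdio.h>

/* Backtracking recursive-descent parser for:
 *   S -> c A d
 *   A -> a b | a
 */
static const char *input;   /* string being parsed        */
static int pos;             /* index of next input symbol */

static int match(char c) {
    if (input[pos] == c) { pos++; return 1; }
    return 0;
}

static int A(void) {
    int save = pos;
    if (match('a') && match('b')) return 1;   /* try A -> a b */
    pos = save;                               /* backtrack    */
    return match('a');                        /* try A -> a   */
}

static int S(void) {
    return match('c') && A() && match('d');   /* S -> c A d */
}

int main(void) {
    input = "cad";
    pos = 0;
    if (S() && input[pos] == '\0')
        printf("%s: parsing completed\n", input);
    else
        printf("%s: syntax error\n", input);
    return 0;
}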
6. Given the following grammar:
Construct the parsing table for this grammar for non-recursive predictive parser.
Given grammar:
E → TE’
E’ → +TE’ | ε
T → FT’
T’ → *FT’ | ε
F → (E) | id
Now, calculating FIRST and FOLLOW:
FIRST
First(E) = First(T) = First(F) = { (, id }
First(E’) = { +, ε }
First(T’) = { *, ε }
FOLLOW
Follow(E) = { ), $}
Follow(E’) = { ), $}
Follow(F) = { *, +, ), $ }
Follow(T) = { +, ) , $}
Follow(T’) = { +, ) , $}
Now, the Parsing table for given grammar is as below:
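      | id      | +         | *         | (       | )      | $
E     | E → TE’ |           |           | E → TE’ |        |
E’    |         | E’ → +TE’ |           |         | E’ → ε | E’ → ε
T     | T → FT’ |           |           | T → FT’ |        |
T’    |         | T’ → ε    | T’ → *FT’ |         | T’ → ε | T’ → ε
F     | F → id  |           |           | F → (E) |        |
(Blank entries denote error.)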
7. Define the term ‘item’, ‘closure’ and ‘goto’ used in LR parsing. Compute the SLR DFA for the following grammar.
Item
An “item” (LR(0) item) is a production rule with a dot (•) somewhere in its right-hand side. For example, A → •αAβ, A → α•Aβ, A → αA•β, and A → αAβ• are items if there is a production A → αAβ in the grammar.
Closure
If I is a set of items for a grammar G, then closure(I) is the set of LR(0) items constructed from I using the following rules:
1. Initially, every item in I is added to closure(I).
2. If A → α•Bβ is in closure(I) and B → γ is a production, then add the item B → •γ to closure(I) if it is not already there.
Apply this rule until no more new items can be added to closure(I).
goto
If I is a set of LR(0) items and X is a grammar symbol (terminal or non-terminal), then goto(I, X) is defined as follows:
If A → α•Xβ is in I, then every item in closure({A → αX•β}) is in goto(I, X).
Now,
The augmented grammar of the given grammar is,
S‘ → S
S → CC
C → aC | b
Next, we obtain the canonical collection of sets of LR(0) items, as follows,
I0 = closure ({S‘ → •S}) = {S‘ → •S, S → •CC, C → •aC, C → •b}
goto (I0, S) = closure ({S‘ → S•}) = {S‘ → S•} = I1
goto (I0, C) = closure ({S → C•C}) = {S → C•C, C → •aC, C→ •b} = I2
goto (I0, a) = closure ({C→ a•C}) = {C → a•C, C → •aC, C → •b} = I3
goto (I0, b) = closure ({C → b•}) = {C → b•} = I4
goto (I2, C) = closure ({S → CC•}) = {S → CC•} = I5
goto (I2, a) = closure ({C → a•C}) = {C → a•C, C → •aC, C → •b} = I3
goto (I2, b) = closure ({C → b•}) = {C → b•} = I4
goto (I3, C) = closure ({C → aC•}) = {C → aC•} = I6
goto (I3, a) = closure ({C → a•C}) = {C→ a•C, C → •aC, C → •b} = I3
goto (I3, b) = closure ({C → b•}) = {C → b•} = I4
Now, the SLR DFA of the given grammar is:
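In summary, the DFA has states I0 through I6 with the transitions
I0 -S-> I1, I0 -C-> I2, I0 -a-> I3, I0 -b-> I4, I2 -C-> I5, I2 -a-> I3, I2 -b-> I4, I3 -C-> I6, I3 -a-> I3, I3 -b-> I4,
where I1 gives accept on $, and I4 (C → b•), I5 (S → CC•), and I6 (C → aC•) contain complete items and give reduce actions.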
8. For the following grammar,
Annotate the grammar with syntax directed definitions using synthesized attributes. Remove left recursion from the grammar and re-write the annotation correspondingly.
First part:
E → E1 + E2 {E.val = E1.val + E2.val}
E → E1 * E2 {E.val = E1.val * E2.val}
E → (E1) {E.val = E1.val}
E → id {E.val = id.lexval}
Second part:
Removing left recursion as,
E → (E) R | id R
R → +ER | *ER | ε
Now add the attributes within this non-left recursive grammar as,
E → (E1) {R.in = E1.val} R {E.val = R.s}
E → id {R.in = id.lexval} R {E.val = R.s}
R → +E {R1.in = E.val + R.in} R1 {R.s = R1.s}
R → *E {R1.in = E.val * R.in} R1 {R.s = R1.s}
R → ε {R.s = R.in}
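The annotated grammar maps directly onto a recursive-descent evaluator in which R.in becomes a parameter and R.s the return value. A minimal C sketch (id is modeled as a single digit purely for illustration; note that this grammar does not encode operator precedence):

#include <stdio.h>

/* Evaluator for the non-left-recursive grammar
 *   E -> ( E ) R | id R
 *   R -> + E R | * E R | epsilon
 * R.in is the parameter 'in'; R.s is the return value.
 */
static const char *p;            /* next input character */

static int E(void);

static int R(int in) {
    if (*p == '+') {
        p++;
        int v = E();
        return R(in + v);        /* R1.in = E.val + R.in, R.s = R1.s */
    }
    if (*p == '*') {
        p++;
        int v = E();
        return R(in * v);        /* R1.in = E.val * R.in, R.s = R1.s */
    }
    return in;                   /* R -> epsilon: R.s = R.in */
}

static int E(void) {
    int v;
    if (*p == '(') {             /* E -> ( E ) R */
        p++;
        v = E();
        if (*p == ')') p++;
    } else {                     /* E -> id R, with id a single digit */
        v = *p - '0';
        p++;
    }
    return R(v);                 /* R.in = E1.val (or id.lexval), E.val = R.s */
}

int main(void) {
    p = "(1+2)*3";
    printf("(1+2)*3 = %d\n", E());   /* prints 9 */
    return 0;
}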
9. Consider the following grammar for arithmetic expressions, where the operator 'op' is applied to integer or real numbers. Write a syntax-directed definition that determines the type of each expression E.
E → E1 op E2 | num.num | num | id
E → id          { E.type = lookup(id.entry) }
E → num         { E.type = integer }
E → num.num     { E.type = real }
E → E1 op E2    { E.type = if (E1.type = integer and E2.type = integer) then integer
                           else if (E1.type = integer and E2.type = real) then real
                           else if (E1.type = real and E2.type = integer) then real
                           else if (E1.type = real and E2.type = real) then real
                           else type_error() }
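The rule for E → E1 op E2 can be expressed as a small C function (an illustrative sketch; the enum and function names are made up for this example):

#include <stdio.h>

typedef enum { TYPE_INTEGER, TYPE_REAL, TYPE_ERROR } Type;

/* E -> E1 op E2: integer op integer is integer, any mix of integer
 * and real is real, anything else is a type error. */
Type op_type(Type t1, Type t2) {
    if (t1 == TYPE_INTEGER && t2 == TYPE_INTEGER) return TYPE_INTEGER;
    if ((t1 == TYPE_INTEGER || t1 == TYPE_REAL) &&
        (t2 == TYPE_INTEGER || t2 == TYPE_REAL)) return TYPE_REAL;
    return TYPE_ERROR;                 /* type_error() in the SDD */
}

int main(void) {
    printf("%s\n", op_type(TYPE_INTEGER, TYPE_REAL) == TYPE_REAL
                       ? "integer op real -> real" : "unexpected");
    return 0;
}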
10. What is three address code? Translate the expression a = b * - c + b * - c into three-address code statements.
Code in which each instruction uses at most three addresses, two for operands and one for the result, is called three-address code. Each instruction in three-address code can be described as a 4-tuple: (operator, operand1, operand2, result).
A quadruple (three-address code) is of the form x = y op z, where x, y, and z are names, constants, or compiler-generated temporaries and op is any operator.
The following are the forms of three-address code statements:
- Assignment statement:
x = y op z ( binary operator op)
x = op y ( unary operator op)
- Copy statement: x = z
- Unconditional jump: goto L ( jump to label L)
- Conditional jump: if x relop y goto L
- Procedure call:
param x1
param x2
...
param xn
call p, n, i.e. call procedure p(x1, x2, ..., xn)
- Indexed assignment:
x = y[i]
x[i] = y
- Address and pointer assignments:
x = &y
x = *y
*x = y
Given expression
a = b * - c + b * - c
The three-address code for the given statement is given below:
t1 = -c
t2 = b * t1
t3 = -c
t4 = b * t3
t5 = t2 + t4
a = t5
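Equivalently, these statements can be stored as quadruples. A minimal C sketch that prints the quadruples above (the struct and field names are illustrative):

#include <stdio.h>

/* A quadruple: (operator, operand1, operand2, result). */
typedef struct {
    const char *op, *arg1, *arg2, *result;
} Quad;

int main(void) {
    /* Three-address code for a = b * - c + b * - c */
    Quad code[] = {
        { "uminus", "c",  "",   "t1" },
        { "*",      "b",  "t1", "t2" },
        { "uminus", "c",  "",   "t3" },
        { "*",      "b",  "t3", "t4" },
        { "+",      "t2", "t4", "t5" },
        { "=",      "t5", "",   "a"  },
    };
    int n = (int)(sizeof code / sizeof code[0]);
    for (int i = 0; i < n; i++)
        printf("(%s, %s, %s, %s)\n",
               code[i].op, code[i].arg1, code[i].arg2, code[i].result);
    return 0;
}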