Compiler Design and Construction - Old Questions

9.  What are the advantages of intermediate code? Describe various representation of intermediate code.

6 marks | Asked in 2073

The given program in a source language is converted into an equivalent program in an intermediate language by the intermediate code generator. Intermediate codes are machine independent codes.

Advantages of intermediate code:

  • If a compiler translates the source language to its target machine language without having the option for generating intermediate code, then for each new machine, a full native compiler is required.
  • Intermediate code eliminates the need of a new full compiler for every unique machine by keeping the analysis portion same for all the compilers.
  • It becomes easier to apply the source code modifications to improve code performance by applying code optimization techniques on the intermediate code.

There are three kinds of  intermediate code representations:

  • Graphical representation (Syntax tree or DAGs)
  • Postfix notations
  • Three Address codes

Syntax Tree

Syntax tree is a graphic representation of given source program and it is also called variant of parse tree. A tree in which each leaf represents an operand and each interior node represents an operator is called syntax tree. For e.g. Syntax tree for the expression a+b*c-d/(b*c)

Directed Acyclic Graph (DAG)

A DAG for an expression identifies the common sub expressions in the expression. It is similar to syntax tree, only difference is that a node in a DAG representing a common sub expression has more than one parent, but in syntax tree the common sub expression would be represented as a duplicate sub tree. For e.g. DAG for the expression a+b*c-d/(b*c)

Postfix Notation

The representation of an expression in operators followed by operands is called postfix notation of that expression. In general if x and y be any two postfix expressions and OP is a binary operator then the result of applying OP to the x and y in postfix notation by "x y OP". For e.g. (a+ b) * c in postfix notation is: a b + c *

Three Address Code

The address code that uses at most three addresses, two for operands and one for result is called three address code. Each instruction in three address code can be described as a 4-tuple: (operator, operand1, operand2, result).

A quadruple (Three address code) is of the form: x = y op z where x, y and z are names, constants or compiler-generated temporaries and op is any operator.

For e.g.

The three-address code for a+b*c-d/(b*c)