Syntax Reading and Exercises for Exam 2

Instructions

Exercises C (Grammars, Parse Trees)

Read sections 2.3 (Lexical Syntax) and 2.4 (Context-Free Grammars) on pp. 33 – 41 of Sethi Do problems 2.4, 2.6, 2.9, and 2.14 on pp. 49 – 51.

Problem 2.4

Give context-free grammars describing the syntax of each of the following:

Part a: Strings of length one or more over the set of terminals {blank, tab, newline}.
Solution
```
<str> ::= <item> | <str> <item>
<item> ::= blank | tab | newline
```

Part b: Sequences of letters or digits, starting with a letter.

Solution

<seq> ::= <letter>
       | <seq> <letter>
       | <seq> <digit>
<letter> ::= a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z
<digit> ::= 0|1|2|3|4|5|6|7|8|9

Part c: Real numbers in which either the integer part or the fractional part can be empty, but not both. Thus, the grammar must allow 31., 3.1, and .13, but not a decimal point by itself.
Solution
```
<real num> ::= <int part> . <frac> | <int part> . | . <frac>
<int part> ::= <digit> | <int part> <digit>
<frac> ::= <digit> | <digit> <frac>
<digit> ::= 0|1|2|3|4|5|6|7|8|9
```

Problem 2.6

For each of the following strings, draw a parse tree with respect to the grammar for arithmetic expressions in Fig. 2.6:

Note: E stands for Expression, T for Term, F for Factor

alt text

Part a: 2 + 3

Solution

           E
         / | \
        E  +  T
       /       \
      T         F
      |         |
      F       number:3
      |
   number:2

Part b: (2 + 3)

Solution

         E
         |
         T
         |
         F
       / | \
      (  E  )
       / | \
      E  +  T
     /       \
    T         F
    |         |
    F      number:3
    |
 number:2

Part c: 2 + 3 * 5

Solution

              E
            / | \
           E  +  T
         /     / | \
        T     T  *  F
       /      |     |
      F       F   number:5
     /        |
  number:2  number:3

Part d: (2 + 3) * 5

Solution

             E
             |
             T
           / | \
          T  *  F
         /      |
        F     number:5
      / | \
    (   E   )
      / | \
     E  +  T
     |     |
     T     F
     |     |
     F   number:3
     |
  number:2

Part e: 2 + (3 * 5)

Solution

          E
        / | \
       E  +   T
      /        \
     T          F
    /         / | \
   F        (   E   )
   |            |
number:2        T
              / | \
             T  *  F
            /      |
           F     number:5
           |
        number:3

Problem 2.9

The BNF rules in each of the following cases describe a construct that consists of a list of elements. Describe the list. How many elements does it have? Does anything appear between or after the elements?

Part a.
```
<name-list> ::= <name>
              | <name> , <name-list>
```
Solution

This describes a list of one or more <name>s, separated by commas.
- Number of elements: One or more
- What appears between elements: Commas (,)
- What appears after elements: Nothing
Part b.
```
<field-list> ::= <field> ;
               | <field> ; <field-list>
```
Solution

This describes one or more occurrences of <field>; (i.e., one or more fields, with each <field> followed by a semicolon).
- Number of elements: One or more
- What appears between elements: Semicolons (;)
- What appears after elements: Each field is followed by a semicolon
Part c.
```
<statement-list> ::= <empty>
                   | <statement> ; <statement-list>
```
Solution

This describes zero or more occurrences of <statement>; (i.e., zero or more statements, each followed by a semicolon).
- Number of elements: Zero or more
- What appears between elements: Semicolons (;)
- What appears after elements: Each statement is followed by a semicolon
Part d.
```
<term> ::= <term> * <factor>
         | <factor>
```
Solution

This describes one or more <factor>s, separated by *.
- Number of elements: One or more
- What appears between elements: Multiplication operators (*)
- What appears after elements: Nothing
Part e.
```
<variables> ::= <empty>
              | var <var-decls>
<var-decls> ::= <name-list> : <type> ;
              | <name-list> : <type> ; <var-decls>
```
Solution

This describes either the empty string, or "var" followed by one or more occurrences of <name-list> : <type>; (thus each occurrence of <name-list> : <type> is followed by a semicolon).
- Number of elements: Zero or more
- What appears between elements: Semicolons (;)
- What appears after elements: Each <name-list> : <type> is followed by a semicolon
Part f.
```
<constants>   ::= <empty>
                | <const-decls>
<const-decls> ::= const <name> = <constant> ;
                | <const-decls> <name> = <constant> ;
```
Solution

This describes either the empty string, or "const" followed by one or more occurrences of <name> = <constant>; (thus each occurrence of <name> = <constant> is followed by a semicolon).
- Number of elements: Zero or more
- What appears between elements: Semicolons (;)
- What appears after elements: Each <name> = <constant> is followed by a semicolon

Problem 2.14

The following grammar is based on the syntax of statements in Pascal:

S  ::= id := expr
    |  if expr then S
    |  if expr then S else S
    |  while expr do S
    |  begin SL end

SL ::= S
    |  S ; SL

Note: There is a misprint in the original problem. The first production with SL on the left should read "SL ::= S" rather than "SL ::= S ;".

Draw parse trees for each of the following:

Part a: while expr do id := expr

Solution

S
  while
  expr
  do
  S
    id
    :=
    expr

Part b: begin id := expr end

Solution

S
  begin
  SL
    S
      id
      :=
      expr
  end

Part c: if expr then if expr then S else S
Solution
```
S
  if
  expr
  then
  S
    if
    expr
    then
    S
    else
    S
```
This demonstrates the “dangling else” problem. In this grammar, the else associates with the closest preceding if (the inner one).

Exercises D (EBNF)

Read the Extended BNF subsection of Sec. 2.6. Do problems 2.9 and 2.16a†† on pp. 50 – 52.

†† Re 2.16: The first line of the EBNF specification on p. 52 uses [ ] to mean an empty string.

Problem 2.16(a)

The following EBNF grammar is based on the syntax of statements in the Modula-2 programming language:

S  ::= [ ]
    |  id := expr
    |  if expr then SL { elsif expr then SL } [ else SL ] end
    |  loop SL end
    |  while expr do SL end

SL ::= S { ; S }

Token id represents a variable and token expr represents an expression. Note that [ ] stands for the empty string.

Part a: Write a BNF version of this grammar.

Solution

Note: In the EBNF grammar, [ ] is used to represent the empty string (ε).

<S> ::= id := expr
      | if expr then SL elsif-part else-part end
      | loop SL end
      | while expr do SL end
      | <empty>

<SL> ::= SL ; S
       | S

<elsif-part> ::= elsif-part elsif expr then SL
               | <empty>

<else-part> ::= else SL
              | <empty>

Exercises E (Derivations, Grammars for Expressions, Syntactic Ambiguity, Syntax Charts)

Do problems 2.5¶, 2.6, 2.10‡, 2.11, 2.12§, 2.13§, 2.15a, 2.15b, and 2.16b on pp. 49 – 52.

¶ Re 2.5(a): A leftmost derivation is a derivation in which the leftmost nonterminal is replaced at each step. A rightmost derivation is a derivation in which the rightmost nonterminal is replaced at each step. The derivation of 2 1 . 8 9 from real-number that is shown on p. 41 is a leftmost derivation:

For example, its second step replaces the leftmost nonterminal (integer-part) in the string integer-part . fraction with integer-part digit, and then its third step replaces the leftmost nonterminal in the string integer-part digit . fraction with digit. The second step of a rightmost derivation of 2 1 . 8 9 from real-number would replace the rightmost nonterminal (fraction) in the string integer-part . fraction with digit fraction

‡ Hint for 2.10(b): Put the unary – operator in the same precedence class as binary + and binary –

§ Even though problem 2.12 does not say this, the grammar you write for that problem must be able to generate expressions that involve ! (the logical negation operator) as well as the binary operators shown in Figure 2.9; otherwise you will not be able to do parts b and c of problem 2.13. ! is a right-associative unary prefix operator and has higher precedence than any binary operator.

Problem 2.5

Consider the grammar for real numbers in Fig. 2.3:

alt text

Part a: Give a leftmost derivation of the string 2.89

Solution

A leftmost derivation replaces the leftmost nonterminal at each step.

<real number> ⇒ <integer part> . <fraction>
              ⇒ <digit> . <fraction>
              ⇒ 2 . <fraction>
              ⇒ 2 . <digit> <fraction>
              ⇒ 2 . 8 <fraction>
              ⇒ 2 . 8 <digit>
              ⇒ 2 . 8 9

Part b: Draw tree snapshots corresponding to the derivation in (a).

Solution

These are snapshots of the parse tree at each step of the leftmost derivation:

Snapshot 1:

<real number>

Snapshot 2:

           <real number>
              /   |   \
<integer part>    .    <fraction>

Snapshot 3:

          <real number>
             /   |   \
<integer part>   .    <fraction>
      |
   <digit>

Snapshot 4:

              <real number>
              /    |      \
<integer part>     .       <fraction>
      |
   <digit>
      |
      2

Snapshot 5:

              <real number>
              /    |      \
<integer part>     .       <fraction>
      |                      /    \
   <digit>              <digit>    <fraction>
      |
      2

Snapshot 6:

              <real number>
              /    |      \
<integer part>     .       <fraction>
      |                      /    \
   <digit>              <digit>    <fraction>
      |                   |
      2                   8

Snapshot 7:

              <real number>
              /    |      \
<integer part>     .       <fraction>
      |                      /    \
   <digit>              <digit>    <fraction>
      |                    |           |
      2                    8        <digit>

Snapshot 8 (Final):

              <real number>
              /    |      \
<integer part>     .       <fraction>
      |                      /    \
   <digit>              <digit>    <fraction>
      |                    |           |
      2                    8        <digit>
                                       |
                                       9

Problem 2.10

The operators of Pascal are as follows (the in keyword tests set membership):

<  <=  =  <>  >=  >  in
+  -  or
*  /  div  mod and
not

All of these operators are left associative. All operators on the same line have the same precedence. The lines are in order of increasing precedence.

Part a: Write an expression grammar for Pascal.

Solution

<class 1 prefix op> ::= not
<class 2 binary op> ::= * | / | div | mod | and
<class 3 binary op> ::= + | - | or
<class 4 binary op> ::= < | > | <= | >= | = | <> | in

<exp0> ::= <constant> | <variable> | ( <exp4> )
<exp1> ::= <exp0> |        <class 1 prefix op> <exp0>
<exp2> ::= <exp1> | <exp2> <class 2 binary op> <exp1>
<exp3> ::= <exp2> | <exp3> <class 3 binary op> <exp2>
<exp4> ::= <exp3> | <exp4> <class 4 binary op> <exp3>

<exp4> is the starting nonterminal.

Part b: Extend your grammar to handle expressions with a leading minus sign, to allow -1 and -(a-b), but not 2 + -3.
Solution

Put unary - into class 3, the same precedence class as binary + and -. Thus we add the production
```
<class 3 prefix op> ::= -
```
and change the <exp3> ::= productions to:
```
<exp3> ::= <exp2>
        | <exp3> <class 3 binary op> <exp2>
        | <class 3 prefix op> <exp2>
```
Note: The solution to (a) is based on the problem as it is stated, rather than on the syntax of Standard Pascal. In Standard Pascal, at most one relational operator may appear at the top level of an expression. However, the problem does not mention this restriction. In Standard Pascal, expressions of the form “not not e” are legal. However, the problem says that NOT is left-associative, which some people interpret as implying that such expressions are illegal–see, e.g., the paragraph of: https://stackoverflow.com/a/14085309

If at most one relational operator is to be allowed at the top level, then the productions for <exp4> should be changed to:
```
<exp4> ::= <exp3> | <exp3> <class 4 binary op> <exp3>
```
To allow expressions of the form “not not e”, the productions for <exp1> should be changed to:
```
<exp1> ::= <exp0> | <class 1 prefix op> <exp1>
```

Problem 2.11

For each of the following expressions, draw parse trees with respect to your grammar for Exercise 2.10.

Part a: i >= 0

Solution

We write <bop4>, <bop2> and <prop1> for <class 4 binary op>, <class 2 binary op> and <class 1 prefix op>, respectively.

          <exp4>
         /   |   \
   <exp4>  <bop4> <exp3>
      |      |      |
    <exp3>   >=   <exp2>
      |             |
    <exp2>        <exp1>
      |             |
    <exp1>        <exp0>
      |             |
    <exp0>     <constant:0>
      |
 <variable:i>

Part b: (i >= 0) and not p

Solution

                  <exp4>
                    |
                  <exp3>
                    |
                  <exp2>
                 /  |   \
           <exp2> <bop2> <exp1>
             |      |     |    \
          <exp1>   and  <prop1> <exp0>
             |            |       |
          <exp0>         not  <variable:p>
         /   |   \
       (  <exp4>   )
         /   |   \
   <exp4>  <bop4> <exp3>
      |     |       |
   <exp3>   >=    <exp2>
      |             |
   <exp2>         <exp1>
      |             |
    <exp1>        <exp0>
      |             |
    <exp0>      <constant:0>
      |
  <variable:i>

Part c: i >= 0 and not p

Solution

                      <exp4>
                    /   |   \
              <exp4> <bop4>  <exp3>
              /        |           \
        <exp3>         >=          <exp2>
          |                      /   |    \
        <exp2>             <exp2>  <bop2>  <exp1>
          |                   |      |      |    \
        <exp1>             <exp1>   and  <prop1>  <exp0>
          |                   |             |       |
        <exp0>              <exp0>         not   <variable:p>
          |                   |
    <variable:i>          <constant:0>

Part d: (i >= 0) and (x <> y)

Solution

                          <exp4>
                            |
                          <exp3>
                            |
                          <exp2>
                        /   |   \
                  <exp2>  <bop2> <exp1>
                    |      |          \
                  <exp1>  and          <exp0>
                    |                 /   |   \
                  <exp0>            (   <exp4>  )
                /   |  \               /  |   \
              (  <exp4>  )       <exp4> <bop4> <exp3>
                /  |   \            |     |        |
         <exp4>  <bop4> <exp3>   <exp3>  <>     <exp2>
          |        |      |         |              |
        <exp3>     >=   <exp2>   <exp2>         <exp1>
          |                |        |              |
        <exp2>           <exp1>   <exp1>         <exp0>
          |                |        |              |
        <exp1>           <exp0>   <exp0>       <variable:y>
          |                |        |
        <exp0>      <constant:0>  <variable:x>
                           |
                    <variable:i>

Problem 2.12

Write an expression grammar using the table of C operators in Fig. 2.9.

alt text

Solution

It is clear from problem 2.13 that the ! operator should be included in the table; in C/C++ ! has higher precedence than all other operators in the table, and it is right-associative.

<class 1 prefix op> ::= !
<class 2 binary op> ::= * | / | %
<class 3 binary op> ::= + | -
<class 4 binary op> ::= << | >>
<class 5 binary op> ::= < | > | <= | >=
<class 6 binary op> ::= == | !=
<class 7 binary op> ::= &
<class 8 binary op> ::= ^
<class 9 binary op> ::= |
<class 10 binary op> ::= &&
<class 11 binary op> ::= ||
<class 12 binary op> ::= =

<exp[0]> ::= <variable> | <constant> | ( <exp[12]> )
<exp[1]> ::= <exp[0]> | <class 1 prefix op> <exp[1]>
For 2 <= i <= 11,
  <exp[i]> ::= <exp[i-1]> | <exp[i]> <class i binary op> <exp[i-1]>
<exp[12]> ::= <exp[11]> | <exp[11]> <class 12 binary op> <exp[12]>

<exp[12]> is the starting nonterminal.

Problem 2.13

For each of the following expressions, draw parse trees with respect to your grammar for Exercise 2.12.

Part a: i >= 0

Solution

We write <bop10>, <bop6>, <bop5> and <prop1> for <class 10 binary op>, <class 6 binary op>, <class 5 binary op> and <class 1 prefix op>, respectively.

           <exp[12]>
               |
           <exp[11]>
               |
           <exp[10]>
               |
            <exp[9]>
               |
            <exp[8]>
               |
            <exp[7]>
               |
            <exp[6]>
               |
            <exp[5]>
          /    |    \
  <exp[5]>   <bop5>  <exp[4]>
     |         |        |
  <exp[4]>    >=     <exp[3]>
     |                  |
  <exp[3]>          <exp[2]>
     |                  |
  <exp[2]>          <exp[1]>
     |                  |
  <exp[1]>          <exp[0]>
     |                  |
  <exp[0]>          <constant:0>
     |
 <variable:i>

Part b: (i >= 0) && ! p
Part c: i >= 0 && ! p
Part d: i >= 0 && x != y

Problem 2.15

The following grammar is motivated by declarations in C:

   Declaration ::= Type Declarator ;
          Type ::= int
                 | char
    Declarator ::= * Declarator
                 | Declarator '[' number ']'
                 | Declarator '(' Type ')'
                 | '(' Declarator ')'
                 | name

Part a: Prove the syntactic ambiguity of this grammar by finding a string that has more than one parse tree. Draw the parse trees.

Solution

Any Declaration that involves * Declarator ( Type ) or * Declarator [ number ] has more than one parse tree.

For example, int * name [ number ] has two parse trees:

Parse Tree 1:

          Declaration
         /            \
     Type            Declarator
    /               /  |    |   \
  int      Declarator  [  number  ]
              /  \
             *    Declarator
                      |
                     name

Parse Tree 2:

          Declaration
          /           \
        Type          Declarator
       /               /       \
     int              *      Declarator
                           /   |    |   \
                  Declarator  [  number  ]
                      |
                     name

Part b: The constructs ‘[’ number ‘]’ and ‘(‘ Type ‘)’ can be thought of as being postfix operators with a declarator as an operand. Suppose that * has lower precedence than these operators. Write an unambiguous grammar that generates the same strings as this grammar and makes it easy to identify the components of a declarator.
Solution
```
Decl0 ::= name | ( Decl2 )
Decl1 ::= Decl0 | Decl1 <class 1 postfix op>
Decl2 ::= Decl1 | <class 2 prefix op> Decl2

<class 1 postfix op> ::= [ number ] | ( Type )
<class 2 prefix op> ::= *

Decl2 is the starting nonterminal.
```

Problem 2.16(b)

The following EBNF grammar is based on the syntax of statements in the Modula-2 programming language:

S  ::= [ ]
    |  id := expr
    |  if expr then SL { elsif expr then SL } [ else SL ] end
    |  loop SL end
    |  while expr do SL end

SL ::= S { ; S }

Token id represents a variable and token expr represents an expression. Note that [ ] stands for the empty string.

Part b: Write a syntax chart for this grammar.

Solution

NOTATION: (...) denotes an oval node. |...| denotes a rectangular node.

In the given grammar expr is a terminal, even though it would be a non-terminal in a larger grammar that specifies the syntax of an entire programming language. That is why expr nodes in the syntax charts below are oval.

‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐>‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐
  |                                                             |
  |‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐(id)‐‐(:=)‐‐(expr)‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐|
  |                                                             |
  |‐‐(if)‐‐‐(expr)‐‐(then)‐‐|SL|‐‐‐‐‐‐‐‐(else)‐‐|SL|‐‐‐‐‐(end)‐‐|
  |       |                       |  |                |         |
  |        ‐‐‐‐‐<‐‐‐‐(elsif)‐‐‐<‐‐‐   ‐‐‐‐‐‐‐>‐‐‐‐‐‐‐           |
  |                                                             |
  |‐‐‐‐‐‐‐‐‐‐‐‐‐‐(loop)‐‐|SL|‐‐(end)‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐|
  |                                                             |
   ‐‐‐‐‐‐‐‐‐‐‐‐(while)‐‐(expr)‐‐(do)‐‐|SL|‐‐(end)‐‐‐‐‐‐‐‐‐‐‐‐‐‐‐

‐‐‐‐>‐‐‐‐|S|‐‐‐>‐‐‐‐‐
  |              |
   ‐‐‐<‐‐(;)‐‐<‐‐