LR Parsing #4: Runtime Encoding of LR(0)/SLR(1) Parsing DFA (How to Construct the Parsing Tables)

July 22, 2025 5 分钟阅读

1. $LR(0)$ Parsing Table Construction

Recall LR Parsing #2: Structural Encoding of LR(0) Parsing DFA and compute the Canonical Collection of Set of $LR(0)$ Items $C$ for the augmented grammar $G’ = (V, \Sigma, S’, P)$:

\[C = \operatorname{CC}(G') = \lbrace I_0, I_1, \dots, I_n \rbrace\]

Number the productions from $1$ to $m$ where $m = \vert P \vert$.

Make two tables, $T_{\operatorname{ACTION}}$ and $T_{\operatorname{GOTO}}$, as in Knuth’s Parsing Table. Or you can think them as two functions:

\[\begin{align} \operatorname{ACTION}:& \; C \times (\Sigma \cup \lbrace \Finv \rbrace) \to \lbrace \text{"shift"}, \text{"reduce 1"}, \dots \text{"reduce m"}, \text{"accept"} \rbrace \newline \operatorname{GOTO}:& \; C \times (V \cup \Sigma) \to C \end{align}\]

where $\Finv$ is the EOF symbol, and $\operatorname{GOTO}$ function is exactly the same one we talked about in LR Parsing #2: Structural Encoding of LR(0) Parsing DFA.

$\forall I_i \in C$, fill the $LR(0)$ parsing table following the rules below:

Rule	Condition	Table Construction
1	if $\exists a \in \Sigma$ such that $\operatorname{GOTO}(I_i, a) = I_j$	mark $T_{\operatorname{ACTION}}[i, a] = \text{“shift”}$
		mark $T_{\operatorname{GOTO}}[i, a] = j$
2	if $\exists A \in V$ such that $\operatorname{GOTO}(I_i, A) = I_j$	mark $T_{\operatorname{GOTO}}[i, A] = j$
3	if $\exists [S’ \to S \cdot] \in I_i$	mark $T_{\operatorname{ACTION}}[i, \Finv] = \text{“accept”}$
4	if $\exists [A \to \alpha \cdot] \in I_i$, $A \neq S’$, and $A \to \alpha$ is production $k$	$\forall e \in \Sigma \cup \lbrace \Finv \rbrace$, mark $T_{\operatorname{ACTION}}[i, e] = \text{“reduce k”}$

All entries not marked by the procedure can be considered $\text{“error”}$, and indicate that the input string is not accepted by the grammar.

We can see that $\operatorname{GOTO}$ function and $T_{\operatorname{GOTO}}$ table are essentially the same thing.

Note that for Rule 3:

If you augment grammar $G$ into $G’$ with $S’ \to S$, then $[S’ \to S \cdot]$ is the accepting position
If you augment grammar $G$ into $G’$ with $S’ \to S \Finv$, then $[S’ \to S \cdot \Finv]$ is the accepting position
These two notations mean the same thing; the only difference is how they look on the surface.
- However the latter makes $T_{\operatorname{ACTION}}[i, \Finv]$ easier to understand (since $[S’ \to S \cdot \Finv]$ is a position explicitly expecting $\Finv$)

Note that for Rule 4:

You can treat Rule 3 as a special case of Rule 4
I.e. $\text{“accept”}$ is exactly a $\text{“reduce”}$ to $S’$, right before $\Finv$

Note that for Rule 1, modern parsing tables would combine the two table entries into one:

Rule	Condition	Table Construction
1	if $\exists a \in \Sigma$ such that $\operatorname{GOTO}(I_i, a) = I_j$	mark $T_{\operatorname{ACTION}}[i, a] = \text{“shift j”}$

and in this way, the nature of the two tables have changed accordingly:

\[\begin{align} T_{\operatorname{ACTION}}:& \; C \times (\Sigma \cup \lbrace \Finv \rbrace) \to \lbrace \text{"shift 0"}, \dots, \text{"shift n"}, \text{"reduce 1"}, \dots \text{"reduce m"}, \text{"accept"} \rbrace \newline T_{\operatorname{GOTO}}:& \; C \times V \to C \end{align}\]

2. $SLR(1)$ Parsing Table Construction

Note that $SLR$ and $SLR(1)$ are interchangable, and $SLR(k>1)$ does exist but is out of our discussion here.

$SLR(1)$ differs with $LR(0)$ on Rule 4. For $SLR(1)$, we have:

Rule	Condition	Table Construction
4	if $\exists [A \to \alpha \cdot] \in I_i$, $A \neq S’$, and $A \to \alpha$ is production $k$	$\forall e \in \operatorname{FOLLOW}(A)$, mark $T_{\operatorname{ACTION}}[i, e] = \text{“reduce k”}$

3. Shift-Reduce Conflicts

$T_{\operatorname{GOTO}}$ is impossible to have conflicts.

$T_{\operatorname{ACTION}}$ may have conflicts, the most common being shift-reduce conflicts, i.e. one entry $T_{\operatorname{ACTION}}(I_i, a)$ may have two values, a $\text{“shift”}$ and a $\text{“reduce k”}$.

It’s impossible to have shift-shift conflicts in $T_{\operatorname{ACTION}}$.

Reduce-reduce conflicts are possible but kinda rare. E.g. when you grammar has both $A \to a$ and $B \to a$, how do you reduce $a \rhd ?$

Theorem: If any conflicting actions result from $LR(0)$ parsing table construction, we say the grammar is not $LR(0)$. $\blacksquare$

Theorem: If any conflicting actions result from $SLR(1)$ parsing table construction, we say the grammar is not $SLR(1)$. $\blacksquare$

4. $LR(0)$ vs $SLR(1)$

Theorem: Every $LR(0)$ grammar is $SLR(1)$. $\blacksquare$

The following grammar is $SLR(1)$ but not $LR(0)$:

S' -> S   // production 1
S -> a S  // production 2
S -> a    // production 3

---
config:
  layout: elk
  look: handDrawn
---
flowchart LR
    subgraph I0["State#0"]
        direction TB
        A["$$S' \to \cdot S$$"]
        B["$$S \to \cdot aS$$"]
        C["$$S \to \cdot a$$"]
    end

    subgraph I1["State#1"]
        direction TB
        D["$$S \to a \cdot S$$"]
        E["$$S \to a \cdot$$"]
        F["$$S \to \cdot aS$$"]
        G["$$S \to \cdot a$$"]
    end

    subgraph I2["State#2"]
        direction TB
        H["$$S \to a S \cdot $$"]
    end

    subgraph I3["State#3"]
        direction TB
        I["$$S' \to S \cdot$$"]
    end

    I0 -->|$$a$$| I1 -->|$$S$$| I2
    I0 -->|$$S$$| I3
    I1 -->|$$a$$| I1

    style I3 color:#f66

“State#3” in red font represents the accept state.

If we construct a $LR(0)$ parsing table for the grammar, we’ll get:

State#	ACTION		GOTO
	$a$	$\Finv$	$a$	$S$
$0$	shift		$1$	$3$
$1$	shift / reduce 3	reduce 3	$1$	$2$
$2$	reduce 2	reduce 2
$3$		accept

The conflict at $T_{\operatorname{ACTION}}(I_1, a)$ is a dilemma at such a configuration:

\[\big[ (\_, I_0), (a, I_1) \big] \mid a\]

you can either shift the input $a$ (according to $[S \to \cdot a S]$)
or reduce the current $a$ on stack top (according to $[S \to \cdot a]$)

If we construct a $SLR(1)$ parsing table for the grammar, we’ll get:

State#	ACTION		GOTO
	$a$	$\Finv$	$a$	$S$
$0$	shift		$1$	$3$
$1$	shift	reduce 3	$1$	$2$
$2$	reduce 2	reduce 2
$3$		accept

since $\operatorname{FOLLOW}(S) = \lbrace \Finv \rbrace$ and therefore for the $I_0$ row, only $T_{\operatorname{ACTION}}(I_1, \Finv)$ is marked reduce.

It means at the same configuration:

\[\big[ (\_, I_0), (a, I_1) \big] \mid a\]

$SLR(1)$ will prefer shifting in a new $a$ over reducing the current $a$ on stack top.

5. Digression: Different Representation of DFA and Parsing Table

ComVis will represent our DFA as:

---
config:
  layout: elk
  look: handDrawn
---
flowchart LR
    subgraph I0["State#0"]
        direction TB
        A["$$S' \to \cdot S$$"]
        B["$$S \to \cdot aS$$"]
        C["$$S \to \cdot a$$"]
    end

    subgraph I1["State#1"]
        direction TB
        D["$$S \to a \cdot S$$"]
        E["$$S \to a \cdot$$"]
        F["$$S \to \cdot aS$$"]
        G["$$S \to \cdot a$$"]
    end

    subgraph I2["State#2"]
        direction TB
        H["$$S \to a S \cdot $$"]
    end

    subgraph I3["State#3"]
        direction TB
        I["$$S' \to S \cdot \Finv$$"]
    end

    subgraph I4["State#4"]
        direction TB
        J["$$S' \to S \Finv \cdot$$"]
    end

    I0 -->|$$a$$| I1 -->|$$S$$| I2
    I0 -->|$$S$$| I3 -->|$$\Finv$$| I4
    I1 -->|$$a$$| I1

    style I4 color:#f66

and the corresponding $SLR(1)$ parsing table would be like:

State#	ACTION		GOTO
	$a$	$\Finv$	$S$
$0$	shift 1		$3$
$1$	shift 1	reduce 3	$2$
$2$	reduce 2	reduce 2
$3$		shift 4
$4$	accept	accept

X Facebook LinkedIn Bluesky

LR Parsing #4: Runtime Encoding of LR(0)/SLR(1) Parsing DFA (How to Construct the Parsing Tables)

1. $LR(0)$ Parsing Table Construction

2. $SLR(1)$ Parsing Table Construction

3. Shift-Reduce Conflicts

4. $LR(0)$ vs $SLR(1)$

5. Digression: Different Representation of DFA and Parsing Table

分享

留下评论

猜您还喜欢

Zig: the Hash Map Types Maze

Zig: `defer` / `errdefer`

Bootstrapping in Compiler Design / Self-Hosting Compilers

Left recursion is not a problem for LR parsing

1. $LR(0)$ Parsing Table Construction

2. $SLR(1)$ Parsing Table Construction

3. Shift-Reduce Conflicts

4. $LR(0)$ vs $SLR(1)$

5. Digression: Different Representation of DFA and Parsing Table

分享

留下评论

猜您还喜欢

Zig: the Hash Map Types Maze

Zig: defer / errdefer

Bootstrapping in Compiler Design / Self-Hosting Compilers

Left recursion is not a problem for LR parsing

Zig: `defer` / `errdefer`