# Chapter 1: Regular Languages

• Regular languages play an important role in the lexical analysis phase (the scanner) of a compiler.

## Chapter 1.3, Regular Expressions

### GNFAs

The book requires GNFAs to have the following three properties:

• The start state has transition arrows going to every other state but no arrows coming in from any other state.
• There is only a single accept state, and it has arrows coming in from every other state but no arrows going to any other state. Furthermore, the accept state is not the start state.
• Except for the start and accept states, one arrow goes from every state to every other state and also from each state to itself.

A generalized nondeterministic finite automaton (GNFA) is a 5-tuple (Q, Σ, δ, qstart, qaccept) where

• Q is the finite set of states,
• Σ is the alphabet,
• δ: (Q - {qaccept}) × (Q - {qstart}) → R is the transition function (R is the set of all regular expressions over Σ),
• qstart is the start state, and
• qaccept is the accept state.

A GNFA accepts a string w in Σ* if w = w1...wk where each wi is in Σ* and a sequence of states q0...qk exists such that

• q0 = qstart is the start state,
• qk = qaccept is the accept state, and
• for each i, wi ∈ L(Ri) where Ri = δ(qi-1, qi).
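The acceptance condition above can be checked mechanically. The following is a minimal sketch, not the book's construction: it assumes the labels are written in Python `re` syntax (`|` in place of ∪, the empty pattern `""` in place of ε), that a pair missing from the dictionary carries the label ∅, and that the function name `gnfa_accepts` and the example machine are made up for illustration. It searches over pairs (state, position in w), so a path found corresponds to a split w = w1...wk with each piece matching its arrow's label.

```python
import re

def gnfa_accepts(delta, q_start, q_accept, w):
    """Return True if the GNFA accepts w.

    delta maps (q_from, q_to) to a regular expression in Python `re` syntax.
    Pairs that are absent from delta are treated as labeled with the empty set.
    """
    seen = set()                 # (state, position) pairs already explored
    stack = [(q_start, 0)]
    while stack:
        q, i = stack.pop()
        if (q, i) in seen:
            continue
        seen.add((q, i))
        if q == q_accept and i == len(w):
            return True          # all of w consumed and we sit in q_accept
        for (src, dst), label in delta.items():
            if src != q:
                continue
            # Try every piece w[i:j], including the empty piece (j == i).
            for j in range(i, len(w) + 1):
                if re.fullmatch(label, w[i:j]):
                    stack.append((dst, j))
    return False

# Hypothetical three-state GNFA over {a, b}; its language is a*(ab)*b ∪ ba.
example_delta = {
    ("qs", "q1"): "a*",
    ("q1", "q1"): "ab",
    ("q1", "qa"): "b",
    ("qs", "qa"): "ba",
}
```

For instance, `gnfa_accepts(example_delta, "qs", "qa", "aabb")` succeeds with the split a, ab, b, while the string aba is rejected. The `seen` set keeps the search finite even when some labels match the empty string.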

### DFA to Regular Expression Conversion Process

1. Create an (n+2)-state GNFA from an n-state DFA as follows:
• Add a new start state with an ε transition to the DFA start state.
• Add a new accept state with ε transitions from the DFA accept states to the new accept state. (Change the DFA accept states to non-accept states.)
• If several DFA transitions go between the same pair of states, replace them with a single transition labeled with the union of their labels.
• Add a transition labeled ∅ between every pair of states that had no transition between them.
2. Repeatedly rip out one of the former DFA states, using the following procedure, until only the start state and the accept state are left:
• Call the state being removed qrip.
• Consider each pair of remaining states qj and qk, where qj is not the accept state and qk is not the start state (qj and qk may be the same state).
• If qj goes to qrip with R1, qrip goes to qrip with R2, qrip goes to qk with R3, and qj goes to qk with R4, then the transition from qj to qk in the machine with qrip removed is labeled (R1)(R2)*(R3) ∪ (R4).
3. The equivalent regular expression is the label on the transition from the start state to the accept state.
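
The three steps above can be sketched in code. This is an illustrative sketch under stated assumptions, not the book's presentation: regular expressions are plain strings, `None` stands for ∅, the names `dfa_to_regex`, `q_start` and `q_accept` are hypothetical (and assumed not to clash with DFA state names), states are ripped in the order given, and only light simplifications (dropping ε in concatenations, dropping ∅ in unions) are applied. The example DFA is made up for illustration and accepts strings containing at least one b.

```python
EMPTY = None   # stands for the label ∅
EPS = "ε"      # stands for the label ε

def union(r, s):
    if r is EMPTY:
        return s
    if s is EMPTY:
        return r
    return r if r == s else f"({r} ∪ {s})"

def concat(r, s):
    if r is EMPTY or s is EMPTY:
        return EMPTY             # nothing can pass through an ∅ arrow
    if r == EPS:
        return s
    if s == EPS:
        return r
    return r + s

def is_atom(r):
    """True if r is one symbol or is enclosed by one matching pair of parentheses."""
    if len(r) == 1:
        return True
    if not (r.startswith("(") and r.endswith(")")):
        return False
    depth = 0
    for i, ch in enumerate(r):
        depth += (ch == "(") - (ch == ")")
        if depth == 0 and i < len(r) - 1:
            return False         # the first "(" closed before the end
    return True

def star(r):
    if r is EMPTY or r == EPS:
        return EPS               # ∅* = ε* = ε
    return r + "*" if is_atom(r) else f"({r})*"

def dfa_to_regex(states, delta, start, accepts):
    """delta maps (state, symbol) to the next state of the DFA."""
    S, A = "q_start", "q_accept"
    # Step 1: build the GNFA; pairs missing from R carry the label ∅.
    R = {}
    for (q, a), p in delta.items():
        R[(q, p)] = union(R.get((q, p)), a)
    R[(S, start)] = EPS
    for q in accepts:
        R[(q, A)] = EPS
    # Step 2: rip out the former DFA states one at a time.
    remaining = list(states)
    while remaining:
        rip, remaining = remaining[0], remaining[1:]
        r2 = star(R.get((rip, rip)))
        new_R = {}
        for qj in [S] + remaining:
            for qk in remaining + [A]:
                r1, r3, r4 = R.get((qj, rip)), R.get((rip, qk)), R.get((qj, qk))
                new_R[(qj, qk)] = union(concat(concat(r1, r2), r3), r4)
        R = new_R
    # Step 3: read the label on the arrow from start to accept.
    return R.get((S, A))

# Made-up DFA: q1 loops on a, moves to accepting q2 on b; q2 loops on a and b.
example = {("q1", "a"): "q1", ("q1", "b"): "q2",
           ("q2", "a"): "q2", ("q2", "b"): "q2"}
print(dfa_to_regex(["q1", "q2"], example, "q1", {"q2"}))  # prints a*b(a ∪ b)*
```

Ripping the states in a different order can produce a different but equivalent expression. A result of `None` means the DFA accepts no strings, so the expression is ∅.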

### Lecture Problem

• Construct a DFA that recognizes any string that starts with an a over Σ = {a, b}.
• Using the procedure above, find the equivalent regular expression.

### Active Learning Problem

• Construct a DFA that recognizes any string that starts with an a and ends with a b over Σ = {a, b}.
• Using the procedure above, find the equivalent regular expression.

### Active Learning Problem

• Understand Example 1.68 on page 76.