Chapter 1: Regular Languages
For Your Enjoyment
- Regular languages play an important role in lexical analysis (the scanner) of a compiler.
Chapter 1.3, Regular Expressions
GNFAs
The book requires GNFAs to have the following three properties:
- The start state has transition arrows going to every other state but
no arrows coming in from any other state.
- There is only a single accept state, and it has arrows coming in from
every other state but no arrows going to any other state. Furthermore,
the accept state is not the start state.
- Except for the start and accept states, one arrow goes from every state
to every other state and also from each state to itself.
A generalized nondeterministic finite automaton (GNFA) is a 5-tuple (Q, Σ, δ, qstart, qaccept) where
- Q is the finite set of states,
- Σ is the alphabet,
- δ: (Q - {qaccept}) × (Q - {qstart}) → R is the transition function (R is the set of all regular expressions over Σ),
- qstart is the start state, and
- qaccept is the accept state.
A GNFA accepts a string w in Σ* if w = w1...wk, where each wi is in Σ*,
and a sequence of states q0...qk exists such that
- q0 = qstart is the start state,
- qk = qaccept is the accept state, and
- for each i, wi ∈ L(Ri), where Ri = δ(q(i-1), qi) is the regular expression
on the arrow from q(i-1) to qi.
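The acceptance condition above can be sketched in code. This is a minimal illustration, not the book's construction: it assumes each arrow label is written as a Python regex string, that a missing entry in the table stands for ∅, and that the empty string "" stands for ε. The function name gnfa_accepts and the state names are made up for the example. The search tracks pairs (position in w, current state), so each piece wi is the substring consumed by one arrow.

```python
import re
from collections import deque

def gnfa_accepts(delta, q_start, q_accept, w):
    """Return True if the GNFA accepts w.

    delta maps (qi, qj) to a Python regex string (assumed encoding).
    A missing key or None plays the role of the label ∅.
    """
    states = {q for pair in delta for q in pair}
    seen = {(0, q_start)}          # (characters consumed, current state)
    queue = deque(seen)
    while queue:
        pos, q = queue.popleft()
        if pos == len(w) and q == q_accept:
            return True
        for nxt in states:
            label = delta.get((q, nxt))
            if label is None:      # ∅: no string can take this arrow
                continue
            # Try every possible next piece w[pos:end], including the empty one.
            for end in range(pos, len(w) + 1):
                if (end, nxt) not in seen and re.fullmatch(label, w[pos:end]):
                    seen.add((end, nxt))
                    queue.append((end, nxt))
    return False

# Hypothetical GNFA for "strings over {a, b} that start with a".
example = {
    ("s", "1"): "",        # ε
    ("1", "2"): "a",
    ("2", "2"): "a|b",
    ("2", "acc"): "",      # ε
}
print(gnfa_accepts(example, "s", "acc", "abba"))
print(gnfa_accepts(example, "s", "acc", "ba"))
```

Because a piece wi may be empty, the seen set is what keeps the search from looping on ε-labeled arrows.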
DFA to Regular Expression Conversion Process
- Create an n+2 state GNFA from an n state DFA as follows:
- Add a new start state with an ε transition to the DFA start state.
- Add a new accept state with ε transitions from the DFA accept states
to the new accept state. (Change the DFA accept states to non-accept
states.)
- Where the DFA has several arrows (or several labels) from one state to
another, replace them with a single arrow labeled with the union of those labels.
- Add an arrow labeled ∅ between every ordered pair of states that has no
arrow between them.
- Repeatedly rip out one of the former DFA states using the following
procedure until only the start state and accept state are left.
- Call the state being removed qrip.
- For every pair of remaining states qj and qk, where qj is not the accept
state and qk is not the start state (qj = qk is allowed):
- if qj goes to qrip with R1,
qrip goes to qrip with R2,
qrip goes to qk with R3, and
qj goes to qk with R4, then
the transition from qj to qk in the machine
with qrip removed is now
R1(R2)*R3 ∪ R4.
- The equivalent regular expression is the label on the transition from
the start state to the accept state.
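The whole procedure can be sketched as a short program. This is an illustrative sketch under stated assumptions, not a reference implementation: labels are built as Python regex strings, None stands for ∅, "" stands for ε, and the names dfa_to_regex, union, concat and star are made up for the example. States are ripped out in the order they are listed, and the resulting expression is not simplified.

```python
import re

def union(r, s):
    """R ∪ S, with None standing for ∅."""
    if r is None:
        return s
    if s is None or r == s:
        return r
    return f"({r}|{s})"

def concat(r, s):
    """RS; concatenating with ∅ gives ∅."""
    if r is None or s is None:
        return None
    return r + s

def star(r):
    """R*; both ∅* and ε* equal ε."""
    if r is None or r == "":
        return ""
    return f"({r})*"

def dfa_to_regex(states, delta, start, accepts):
    """states: list; delta: dict (state, symbol) -> state."""
    new_start, new_accept = object(), object()   # the two added GNFA states
    label = {(new_start, start): ""}             # ε into the DFA start state
    for q in accepts:
        label[(q, new_accept)] = ""              # ε out of each DFA accept state
    for (p, sym), q in delta.items():
        # Several symbols on the same arrow become one union label.
        label[(p, q)] = union(label.get((p, q)), re.escape(sym))

    remaining = list(states)
    for rip in states:
        remaining.remove(rip)
        r2 = star(label.get((rip, rip)))
        for qj in [new_start] + remaining:
            r1 = label.get((qj, rip))
            for qk in remaining + [new_accept]:
                r3 = label.get((rip, qk))
                r4 = label.get((qj, qk))
                new = union(concat(concat(r1, r2), r3), r4)  # R1(R2)*R3 ∪ R4
                if new is None:
                    label.pop((qj, qk), None)
                else:
                    label[(qj, qk)] = new
    return label.get((new_start, new_accept))    # None means the language is ∅

# Hypothetical DFA for "starts with an a" over {a, b}; q2 is a dead state.
dfa = {
    ("q0", "a"): "q1", ("q0", "b"): "q2",
    ("q1", "a"): "q1", ("q1", "b"): "q1",
    ("q2", "a"): "q2", ("q2", "b"): "q2",
}
regex = dfa_to_regex(["q0", "q1", "q2"], dfa, "q0", {"q1"})
print(regex)
print(bool(re.fullmatch(regex, "abba")), bool(re.fullmatch(regex, "ba")))
```

For this DFA the printed expression is an unsimplified form equivalent to a(a ∪ b)*. A different rip order can give a different-looking expression for the same language.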
Lecture Problem
- Construct a DFA that recognizes any string that starts with an a
over Σ = {a, b}.
- Using the procedure above, find the equivalent regular expression.
Active Learning Problem
- Construct a DFA that recognizes any string that starts with an a
and ends with a b over Σ = {a, b}.
- Using the procedure above, find the equivalent regular expression.
Active Learning Problem
- Understand Example 1.68 on page 76.