A Formal Definition of Regular Languages
We just defined an FSA in formal terms on the previous page, but just to keep you from flipping back and forth, we'll copy the definition here:
A finite state automaton is a mathematical object with five conponents, M = (Q, Σ, δ, q0, F), where
- Q is a finite set of states
- Σ is a finite set of input symbols
- δ: Q×Σ →Q is a total function (called the transition function)
- q0∈Σ is the start state
- F ⊆ Q is the set of accepting states. □
We know by now what an FSA computes. It simply computes whether a given string over the alphabet Σ is accepted or not. Some strings it accepts, others it rejects. Yes or no, in or out, up or down to every string submitted to it. The FSA thus solves a decision problem, namely whether a given input string is in the language recognized by the FSA or not.
Intuitively we know how an FSA works. We supply a string to the FSA, let it run on that string, and check whether the FSA is in an accepting state or a non-accepting state at the end. Intuition is good, but now we need to specifiy this more formally. We need some way to convey that a computation of an FSA is what happens when:
- the FSA starts in the start state
- at each step of its computation the FSA transitions to the next state according to the transition δ function applied to the current state and current input symbol
- the FSA stops after consuming all input in either an accepting or non-accepting state.
That is, we need to know what state the FSA winds up in when δ is applied to the current state and input symbol at each step. If the input string, w, to be processed by the FSA is
w = a1 a2...an
where each ai is a symbol from Σ, we can write the computation of the FSA as
δ(...(δ(δ(q0, a1), a2)...),an)
This long expression is constructed by normal function composition. Notice that the innermost function evaluation is
δ(q0, a1).
Evaluation of the function with these value results in some new state, p, so the next innermost function evaluation will be
δ(p, a2),
which will lead to another state r, and so forth, until the last (outermost) function evaluation is applied to the last input symbol, an, leading to the last state of the computation. If that state is an accepting state, then the original string w is accepted by the FSA, otherwise it is rejected.
This way of writing the computation, δ(..(δ(δ(q0, a1), a2)...),an), is just too cumbersome and not entirely clear, so we define a new function that captures the same meaning.
Definition
Let M = (Q, Σ, δ, q0, F), = be a finite state automaton. Define the function : Q× → Q as follows:
(q, w) = (δ(q, a),x) for any string w∈ , where a is the first symbol of w; that is, w = ax, a ∈ Σ, and x ∈
(q, ε) = q
A string w ∈ is said to be accepted by M if (q0, w) ∈ F; otherwise the string w is said to be rejected.
The set
L(M) = {w | (q0,w) ∈ F}
is refered to as the set, or language, recognized by M. □
Notice that this is a recursive definition that simply states with mathematical precision that applied to any string over the alphabet of an FSA is computed by applying
Definition
A set L of strings over a specified alphabet is said to be a regular language if and only if L is recognized by some FSA. □