Chapter 10: Learning Sets of Rules
- Decision trees can be converted into sets of rules
- Genetic algorithms can be used to learn sets of rules
- The focus of this chapter is learning rules one at a time
- This chapter will show how to learn both propositional rules
and first order logic rules
Sequential Covering Algorithms
Typical Greedy Algorithm - Table 10.1
- Set the learned rules to the empty set
- Apply the learn one rule algorithm to return a new rule
- While the performance of the new rule on the training examples
exceeds some threshold
- Add the new rule to the learned rules
- Remove from the training examples the examples that are
correctly classified by the new rule
- Apply the learn one rule algorithm to return another new rule
- Sort the learned rules according to some performance metric
over the remaining training examples
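The covering loop above can be sketched in a few lines. This is a hypothetical illustration, not code from the text: examples are assumed to be dicts of attribute values plus a "label" key (1 = positive), rules are lists of (attribute, value) tests, and learn_one_rule is a deliberately simple greedy stand-in for the full algorithm of Table 10.2. The final sorting step is omitted for brevity.

```python
# Sketch of the sequential covering loop (Table 10.1). All names here
# are illustrative assumptions; `learn_one_rule` is a greedy stand-in.

def matches(rule, example):
    """A rule covers an example if every (attribute, value) test agrees."""
    return all(example.get(attr) == val for attr, val in rule)

def performance(rule, examples):
    """Fraction of covered examples that are positive (0.0 if none covered)."""
    covered = [ex for ex in examples if matches(rule, ex)]
    return sum(ex["label"] for ex in covered) / len(covered) if covered else 0.0

def learn_one_rule(examples):
    """Stand-in for Table 10.2: pick the single best attribute-value test."""
    candidates = {(a, v) for ex in examples
                  for a, v in ex.items() if a != "label"}
    return max(([c] for c in candidates),
               key=lambda rule: performance(rule, examples))

def sequential_covering(examples, threshold=0.9):
    learned_rules = []
    rule = learn_one_rule(examples)
    while performance(rule, examples) > threshold:
        learned_rules.append(rule)
        # Remove the training examples the new rule covers.
        examples = [ex for ex in examples if not matches(rule, ex)]
        if not examples:
            break
        rule = learn_one_rule(examples)
    return learned_rules
```

Because covered examples are removed after each iteration, later rules are learned only from the examples earlier rules fail to cover, so the learned set forms a disjunctive cover of the positives.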
Typical Learn One Rule Algorithm - Table 10.2 - CN2
- Initialize the best hypothesis to be the most general hypothesis
- Initialize the candidate hypotheses set with the above hypothesis
- While the candidate hypothesis set is not empty
- Generate the next more specific hypotheses
- The set of all constraints is the set of attribute-value
pairs that occur in the training examples
- Make a new set of candidate hypotheses by specializing each
current candidate with each attribute-value pair
- Remove from the new candidate hypotheses any duplicates,
inconsistent hypotheses, or hypotheses that are not maximally specific
- Update the best single hypothesis, if a new best has been found
- Update the candidate hypotheses to contain just the k
best new hypotheses
- Return a rule of the form IF best hypothesis THEN prediction
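The beam search above can be sketched as follows; this is a hypothetical illustration assuming the same dict-of-attributes example format, with hypotheses represented as frozensets of (attribute, value) constraints and accuracy over covered examples as the performance measure. Forming the final prediction ("IF ... THEN ...") is left out; in CN2 it would be the most frequent label among the examples the best hypothesis covers.

```python
# Sketch of general-to-specific beam search for Learn-One-Rule
# (Table 10.2). Hypotheses are frozensets of (attribute, value)
# constraints; k is the beam width. All names are illustrative.

def covers(hyp, example):
    return all(example.get(a) == v for a, v in hyp)

def accuracy(hyp, examples):
    covered = [ex for ex in examples if covers(hyp, ex)]
    return sum(ex["label"] for ex in covered) / len(covered) if covered else 0.0

def learn_one_rule_beam(examples, k=2):
    all_constraints = {(a, v) for ex in examples
                       for a, v in ex.items() if a != "label"}
    best = frozenset()            # most general hypothesis: no constraints
    candidates = [best]
    while candidates:
        # Generate the next, more specific hypotheses.
        new_hyps = {h | {c} for h in candidates for c in all_constraints
                    if c not in h}
        # Drop inconsistent hypotheses (two values for one attribute);
        # the frozenset representation already removes duplicates.
        new_hyps = {h for h in new_hyps if len({a for a, _ in h}) == len(h)}
        # Update the best hypothesis if a strictly better one is found.
        for h in new_hyps:
            if accuracy(h, examples) > accuracy(best, examples):
                best = h
        # Keep only the k best new hypotheses as the next beam.
        candidates = sorted(new_hyps, key=lambda h: accuracy(h, examples),
                            reverse=True)[:k]
    return best
```

The search terminates because each step adds one constraint, and a hypothesis that already constrains every attribute cannot be specialized consistently.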
Variations
- Learn the rules using a general to specific beam search
vs. learn the rules using a specific to general beam search
- Learn one rule at a time vs. learn rules using simultaneous
covering such as in ID3
- Produce rules via a generate-then-test strategy vs. produce
rules in an example-driven fashion
- Do not post-prune rules vs. post-prune rules
- Use entropy as the performance measure vs. use some other measure,
such as relative frequency: the number of correctly classified
examples divided by the total number of examples the rule covers
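The two measures in the last variation can be made concrete; this is a hypothetical sketch over the labels of the examples a rule covers (the function names are not from the text):

```python
# Two candidate performance measures for Learn-One-Rule, computed
# from the 0/1 labels of the examples a rule covers. Illustrative only.

import math

def rule_accuracy(covered_labels):
    """Correctly classified / total covered, predicting the majority label."""
    n = len(covered_labels)
    n_pos = sum(covered_labels)
    return max(n_pos, n - n_pos) / n

def neg_entropy(covered_labels):
    """Negative entropy of the covered labels; higher means a purer rule."""
    n = len(covered_labels)
    n_pos = sum(covered_labels)
    return sum(p * math.log2(p) for p in (n_pos / n, (n - n_pos) / n) if p > 0)
```

Both measures reach their optimum on a rule whose covered examples all share one label, but entropy distinguishes degrees of impurity more smoothly than raw accuracy does.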
Learning First Order Rules
- First order rules are much more expressive than propositional rules
- Learning first order rules is often called inductive logic
programming (ILP)
- For example, IF Father(x,y) AND Female(y) THEN Daughter(y,x)
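Applying such a rule to ground facts amounts to finding variable bindings that satisfy the body; a small hypothetical sketch (the predicate representation and constant names are illustrative, not from the text):

```python
# Sketch: apply IF Father(x, y) AND Female(y) THEN Daughter(y, x)
# to ground facts by enumerating bindings for x and y.

def derive_daughters(father_facts, female_facts):
    """father_facts: set of (x, y) pairs meaning x is the father of y;
    female_facts: set of individuals known to be female.
    Returns the derivable Daughter(y, x) facts."""
    return {(y, x) for (x, y) in father_facts if y in female_facts}
```

Because the rule contains variables, one rule covers every (father, daughter) pair at once; a propositional rule could only refer to fixed attribute values.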
Terminology
- Constants - capitalized
- Predicate symbols - capitalized
- Variables - lowercase
- Function symbols - lowercase
- Horn clause: a disjunctive clause (∨) with at most one positive literal;
for example, A ∧ B ∧ C ∧ D ==> E is equivalent to the clause
¬A ∨ ¬B ∨ ¬C ∨ ¬D ∨ E, where E is the single positive literal
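The equivalence between the implication form and the clause form can be checked exhaustively by truth table; a small hypothetical check (helper names are not from the text):

```python
# Verify that (A ∧ B ∧ C ∧ D) ==> E equals the disjunction
# ¬A ∨ ¬B ∨ ¬C ∨ ¬D ∨ E, a clause with one positive literal (E).

from itertools import product

def implication(a, b, c, d, e):
    return (not (a and b and c and d)) or e

def clause(a, b, c, d, e):
    return (not a) or (not b) or (not c) or (not d) or e

assert all(implication(*v) == clause(*v)
           for v in product([False, True], repeat=5))
```

Both forms are false only when A, B, C, D are all true and E is false, which is why Horn clauses can be read directly as IF-THEN rules.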