Chapter 10: Learning Sets of Rules
FOIL
- Learns sets of first-order rules
- The learned set of rules forms a disjunctive description of the target concept; function symbols are not allowed
- Uses hill-climbing (or a beam of size 1)
- Produces rules that predict when an example is a positive instance
Algorithm
- Identify the positive examples, call this POS
- Identify the negative examples, call this NEG
- Set the learned rules to be the empty set
- While the POS set is not empty
- Set the NewRule to be a rule with no preconditions
- Set NewRuleNeg to be NEG
- While the NewRuleNeg set is not empty
- Generate candidate literals to add to NewRule from the possible predicates
- Choose BestLiteral, the candidate literal that maximizes FoilGain(literal, NewRule)
- Add BestLiteral to the preconditions of NewRule
- Update NewRuleNeg to keep only the examples that still satisfy the preconditions of NewRule
- Add NewRule to the learned rules
- Delete from POS any examples covered by NewRule
- Return the learned rules
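The two nested loops above can be sketched in Python. This is a minimal propositional (attribute-value) simplification, not full first-order FOIL: each example yields one binding, so t = p1, and the sketch assumes some candidate literal always separates out the remaining negatives (otherwise the inner loop would not terminate).

```python
import math

def foil_gain(pos, neg, pos2, neg2):
    # FoilGain = t * (log2(p1/(p1+n1)) - log2(p0/(p0+n0))).
    # Propositional simplification: t = p1, since each covered
    # positive example corresponds to exactly one binding.
    p0, n0, p1, n1 = len(pos), len(neg), len(pos2), len(neg2)
    if p1 == 0:
        return float("-inf")  # a literal covering no positives is useless
    return p1 * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))

def learn_rules(pos, neg, tests):
    # Sequential covering in the style of FOIL's outer loop.
    # `tests` maps a literal name to a boolean predicate over examples.
    rules = []
    pos = list(pos)
    while pos:                                  # until all positives covered
        rule = []                               # conjunction of literal names
        cur_pos, cur_neg = list(pos), list(neg)
        while cur_neg:                          # until no negatives covered
            best = max(tests, key=lambda t: foil_gain(
                cur_pos, cur_neg,
                [e for e in cur_pos if tests[t](e)],
                [e for e in cur_neg if tests[t](e)]))
            rule.append(best)
            cur_pos = [e for e in cur_pos if tests[best](e)]
            cur_neg = [e for e in cur_neg if tests[best](e)]
        rules.append(rule)
        # remove the positives this rule covers
        pos = [e for e in pos if not all(tests[t](e) for t in rule)]
    return rules
```

Examples can be any objects the test predicates understand, e.g. dicts of attribute values.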
Generating Candidate Specializations
If the current rule is L1, ..., Ln => P(x1, ..., xk), then FOIL considers the following
- Add Q(v1, ... vr) where Q is a predicate
and the vi are either new variables or variables already
present in the rule. At least one of the vi must already
exist as a variable in the rule.
- Add Equal(xj, xk) where these are variables
already present in the rule.
- The negation of either of the above forms.
Example
If the predicates are Father(a,b) and Female(c) and we are trying to
learn the concept Daughter(d,e), we start with the most general rule:
Daughter(d,e) with an empty set of preconditions, which classifies
everything as a daughter.
We then add the following preconditions and their negations:
- Equal(d,e)
- Female(d)
- Female(e)
- Father(d,e)
- Father(e,d)
- Father(d,f)
- Father(e,f)
- Father(f,d)
- Father(f,e)
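The enumeration above can be generated mechanically. A small sketch follows; note that, unlike the listing above, it also emits repeated-variable literals such as Father(d,d), and negations are left out for brevity.

```python
from itertools import product

def candidate_literals(rule_vars, predicates, new_var="f"):
    # Enumerate candidate literals Q(v1,...,vr) where each vi is an
    # existing rule variable or the one new variable, and at least one
    # vi already occurs in the rule (FOIL's specialization step).
    pool = list(rule_vars) + [new_var]
    cands = []
    for name, arity in predicates:
        for args in product(pool, repeat=arity):
            if any(a in rule_vars for a in args):
                cands.append(f"{name}({','.join(args)})")
    # Equal(xj, xk) over distinct variables already in the rule
    cands += [f"Equal({a},{b})"
              for a in rule_vars for b in rule_vars if a != b]
    return cands

cands = candidate_literals(["d", "e"], [("Father", 2), ("Female", 1)])
```

Literals whose arguments are all new variables, such as Female(f), are correctly excluded because they would not constrain the rule.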
FoilGain
FoilGain(L, R) ≡ t (log2(p1 / (p1 + n1)) - log2(p0 / (p0 + n0)))
- L is the candidate literal
- R is the rule
- p0 and n0 are the numbers of positive and negative bindings of rule R
- p1 and n1 are the numbers of positive and negative bindings of the rule formed by adding literal L to R
- t is the number of positive bindings of rule R that are still covered after adding literal L to R
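A direct transcription of the formula, with a worked example using hypothetical counts (the numbers below are illustrative, not from the text):

```python
import math

def foil_gain(t, p0, n0, p1, n1):
    # FoilGain(L,R) = t * (log2(p1/(p1+n1)) - log2(p0/(p0+n0)))
    return t * (math.log2(p1 / (p1 + n1)) - math.log2(p0 / (p0 + n0)))

# Suppose rule R covers 10 positive and 10 negative bindings; adding L
# leaves 8 positive and 2 negative bindings, and all 8 surviving
# positive bindings were already covered by R, so t = 8.
gain = foil_gain(t=8, p0=10, n0=10, p1=8, n1=2)
```

The gain is positive whenever adding L raises the proportion of positive bindings, scaled by the number of positive bindings retained.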
Induction as Inverted Deduction
The task is to discover a hypothesis h, such that
(∀<xi, f(xi)> ∈ D)
(B ∧ h ∧ xi) ⊢ f(xi)
where B is the background knowledge.
There are some practical difficulties:
- Noisy data is not easily accommodated
- The hypothesis space search is intractable
- The complexity of the hypothesis space increases with B
Inverting Resolution
Propositional Resolution
- Given A ∨ B
- Given ¬ B ∨ C
- Conclude A ∨ C
- Table 10.5 provides a procedure:
C = (C1 - {L}) ∪ (C2 - {¬ L})
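The procedure is a one-liner over sets of literals. In this sketch literals are strings and "~" marks negation (a representation chosen here for illustration):

```python
def resolve(c1, c2, lit):
    # Propositional resolution: C = (C1 - {L}) ∪ (C2 - {¬L}),
    # where lit = L appears in c1 and its negation appears in c2.
    neg = lit[1:] if lit.startswith("~") else "~" + lit
    assert lit in c1 and neg in c2, "clauses must contain complementary literals"
    return (c1 - {lit}) | (c2 - {neg})

# A ∨ B resolved with ¬B ∨ C yields A ∨ C
conclusion = resolve({"A", "B"}, {"~B", "C"}, "B")
```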
Propositional Inverse Resolution
- Given resolvent A ∨ C
- Given initial clause A ∨ B
- Deduce other clause is ¬ B ∨ C
- Another possibility: ¬ B ∨ C ∨ A
- Table 10.6 provides a procedure:
C2 = (C - (C1 - {L})) ∪ {¬ L}
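The Table 10.6 procedure, transcribed with the same string-literal representation as above, returns the minimal C2; the other possibilities add back literals from C1 - {L}:

```python
def inverse_resolve(c, c1, lit):
    # Propositional inverse resolution: C2 = (C - (C1 - {L})) ∪ {¬L}.
    # c is the resolvent, c1 the known initial clause, lit = L the
    # literal of c1 that was resolved away.
    neg = lit[1:] if lit.startswith("~") else "~" + lit
    return (c - (c1 - {lit})) | {neg}

# Given resolvent A ∨ C and initial clause A ∨ B, recover ¬B ∨ C
c2 = inverse_resolve({"A", "C"}, {"A", "B"}, "B")
```

Resolving the recovered clause with the initial clause reproduces the resolvent, which is a quick sanity check on the inversion.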
First Order Resolution
- Given White(x) ∨ ¬ Swan(x)
- Given Swan(Fred)
- Conclude White(Fred)
- Table 10.7 provides a procedure:
C = (C1 - {L})θ ∪ (C2 - {¬ L})θ
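The first-order case adds a unifying substitution θ. The sketch below represents literals as (name, args...) tuples and takes θ as given, leaving unification itself out of scope:

```python
def substitute(literal, theta):
    # Apply substitution theta to a (name, arg1, arg2, ...) literal.
    name, *args = literal
    return (name, *[theta.get(a, a) for a in args])

def negate(literal):
    name, *args = literal
    return (name[1:], *args) if name.startswith("~") else ("~" + name, *args)

def fo_resolve(c1, lit1, c2, lit2, theta):
    # First-order resolution: C = (C1 - {L})θ ∪ (C2 - {¬L})θ,
    # where theta unifies lit1 with the negation of lit2.
    assert substitute(lit1, theta) == negate(substitute(lit2, theta))
    return ({substitute(l, theta) for l in c1 - {lit1}} |
            {substitute(l, theta) for l in c2 - {lit2}})

# White(x) ∨ ¬Swan(x) resolved with Swan(Fred) under θ = {x: Fred}
conclusion = fo_resolve({("White", "x"), ("~Swan", "x")}, ("~Swan", "x"),
                        {("Swan", "Fred")}, ("Swan", "Fred"), {"x": "Fred"})
```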
First Order Inverse Resolution
- Given GrandChild(Bob,Shannon)
- Given background information Father(Shannon, Tom)
- Deduce GrandChild(Bob, x) ∨ ¬Father(x, Tom)
- Equation 10.4 summarizes the procedure
Exercises
- 10.3
- 10.4
- 10.5 (one example is fine)
- 10.6 (one example is fine)