Chapter 5: Evaluating Hypotheses
Major Issues
- How can we determine the accuracy of a learned hypothesis?
- How can we compare the accuracy of two learned hypotheses?
- How can we compare the accuracy of two learning algorithms?
Notation
- X: the space of instances
- D: the probability distribution of encountering
instances from X
- f: the target function
- H: the hypothesis space
- h: a particular hypothesis in H
- (x, f(x)): a training instance
- S: all training instances
Two Questions
- Given h constructed from n examples drawn randomly
from D, what is the best estimate of the accuracy of h
over future instances drawn from D?
- What is the probable error in this accuracy estimate?
Definitions
- Sample Error:
errorS(h) = (1/n) * Σ x ∈ S δ(f(x), h(x))
where δ(f(x), h(x)) is 1 if f(x) and h(x)
predict differently and 0 otherwise.
- True Error:
errorD(h) = Prx ∈ D [ f(x) ≠ h(x) ]
Key Question: How good of an estimate is
errorS(h) for errorD(h)?
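The sample-error definition above can be sketched directly. This is a minimal illustration, not code from the chapter; f, h, and S below are hypothetical stand-ins for the target function, a learned hypothesis, and the training sample.

```python
def sample_error(h, f, S):
    """errorS(h) = (1/n) * sum over x in S of delta(f(x), h(x))."""
    n = len(S)
    # delta is 1 when h and f disagree on x, 0 otherwise
    return sum(1 for x in S if h(x) != f(x)) / n

# Toy example: the target is "x is even"; the hypothesis also demands
# "x is not a multiple of 4", so it errs exactly on multiples of 4.
f = lambda x: x % 2 == 0
h = lambda x: x % 2 == 0 and x % 4 != 0
S = list(range(8))            # h disagrees with f on x = 0 and x = 4
print(sample_error(h, f, S))  # 2 errors out of 8 -> 0.25
```

The true error errorD(h) cannot be computed this way, since it requires the (unknown) distribution D; the chapter's question is how well this sample quantity estimates it.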
Discrete Valued Hypotheses
IF
- S contains n examples drawn independently over D
- n ≥ 30 (or n*p*(1 - p) ≥ 5)
- h commits r errors on S
THEN
- the most probable value of errorD(h) is
errorS(h) which is r/n
- with 95% confidence, errorD(h) lies in
errorS(h) ± 1.96 * sqrt[ errorS(h) *
(1 - errorS(h)) / n ]
Table 5.1 shows values for various confidence intervals.
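The 95% interval above is easy to compute. A minimal sketch, using r = 12 errors on n = 40 examples as a toy input:

```python
import math

def confidence_interval_95(r, n):
    """95% interval: errorS(h) +/- 1.96 * sqrt(errorS(h)*(1-errorS(h))/n)."""
    e = r / n                                  # errorS(h) = r/n
    half_width = 1.96 * math.sqrt(e * (1 - e) / n)
    return e - half_width, e + half_width

lo, hi = confidence_interval_95(12, 40)
print(f"errorS(h) = {12/40:.2f}, 95% interval = [{lo:.3f}, {hi:.3f}]")
```

Note the interval width shrinks like 1/sqrt(n): quadrupling the sample size halves the uncertainty.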
Table 5.2 shows basic definitions and facts from statistics.
Binomial Distribution
- n: number of training instances
- r: number of errors
- p: the probability of an error
- P(r) = C(n,r) * p^r * (1 - p)^(n - r)
- E[X] = n * p
- Var(X) = n * p * (1 - p)
- σX = sqrt (n * p * (1 - p))
- If (n * p * (1 - p)) ≥ 5, the binomial distribution
is closely approximated by the normal distribution with
the same mean and variance
- See Table 5.3 for a summary of the binomial distribution
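These formulas can be checked numerically. A small sketch (not from the chapter) computing P(r), the mean, the standard deviation, and the normal-approximation condition:

```python
import math

def binomial_pmf(r, n, p):
    """P(r) = C(n, r) * p^r * (1 - p)^(n - r)."""
    return math.comb(n, r) * p**r * (1 - p)**(n - r)

n, p = 40, 0.3                       # toy values: 40 instances, error rate 0.3
mean = n * p                         # E[X] = n * p
sd = math.sqrt(n * p * (1 - p))      # sigma_X = sqrt(n * p * (1 - p))
print(mean, round(sd, 3))            # 12.0  2.898

# Normal approximation is considered safe when n * p * (1 - p) >= 5
print(n * p * (1 - p) >= 5)          # True (8.4 >= 5)
```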
Estimators, Bias, and Variance
- errorS(h) = r/n
- errorD(h) = p
- errorS(h) is an estimator for errorD(h)
- The estimation bias is E[errorS(h)] - errorD(h)
- If the estimation bias is 0, the estimator is unbiased
- For a binomial distribution, the estimator is unbiased!
- In general, σerrorS(h) =
σr / n =
sqrt ( p * (1 - p) / n ) ≈
sqrt ( errorS(h) * (1 - errorS(h)) / n )
(the last step substitutes errorS(h) for the unknown p)
- A practical example is worked on page 138
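The standard deviation of the estimator can be sketched as below; the inputs r = 12, n = 40 are a toy illustration, not a claim about the book's page-138 example.

```python
import math

def std_error(r, n):
    """Approximate sigma of errorS(h): sqrt(e*(1-e)/n) with e = r/n
    substituted for the unknown true error p."""
    e = r / n
    return math.sqrt(e * (1 - e) / n)

print(round(std_error(12, 40), 3))   # ~0.072
```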
Confidence Interval
- Definition: an N% confidence interval for some parameter p is
an interval that is expected with probability N% to contain p
- See Figure 5.1 for a picture
- Confidence intervals are relatively easy to find if we
use the normal distribution as an approximation to the
binomial distribution
- If a random variable Y obeys a normal distribution, the
measured value y of Y will fall into the following interval
N% of the time: μ ± zN * σ
where zN values are given in Table 5.1
- Confidence intervals can have two-sided or one-sided bounds
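Putting the pieces together, a general N% two-sided interval follows the same pattern as the 95% case. The zN values below are the standard two-sided normal quantiles of the kind tabulated in Table 5.1 (an assumption; consult the table itself for the authoritative values):

```python
import math

# Two-sided zN values for common confidence levels (standard normal quantiles)
Z = {50: 0.67, 68: 1.00, 80: 1.28, 90: 1.64, 95: 1.96, 98: 2.33, 99: 2.58}

def confidence_interval(r, n, level):
    """N% interval: errorS(h) +/- zN * sqrt(errorS(h)*(1-errorS(h))/n)."""
    e = r / n
    half = Z[level] * math.sqrt(e * (1 - e) / n)
    return e - half, e + half

# Toy input: r = 12 errors on n = 40 examples, at three confidence levels
for level in (90, 95, 99):
    lo, hi = confidence_interval(12, 40, level)
    print(f"{level}%: [{lo:.3f}, {hi:.3f}]")
```

As expected, higher confidence demands a wider interval.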
Normal Distribution
- See Table 5.4
- E[X] = μ
- Var(X) = σ²
- σX = σ
- The Central Limit Theorem states that the sum of a large number
of independent, identically distributed random variables follows
a distribution that is approximately normal