Chapter 5: Evaluating Hypotheses

Major Issues

Notation

Two Questions

Definitions

Key Question: How good an estimate of errorD(h) is errorS(h)?

Discrete Valued Hypotheses

IF

  1. S contains n examples drawn independently of h and of each other according to distribution D
  2. n ≥ 30 (or n * errorS(h) * (1 - errorS(h)) ≥ 5)
  3. h commits r errors over these n examples

THEN

  1. the most probable value of errorD(h) is errorS(h) = r/n
  2. with approximately 95% confidence, errorD(h) lies in the interval errorS(h) ± 1.96 * sqrt[ errorS(h) * (1 - errorS(h)) / n ]

Table 5.1 shows values for various confidence intervals.
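
As a rough illustration of the interval above, the following Python sketch computes the approximate confidence interval for errorD(h) from r and n. The function name error_confidence_interval and the sample values r = 12, n = 40 are assumptions made for this example, not part of the notes. Rather than looking up the constant in a table, the sketch derives it from the standard normal distribution, which gives about 1.96 for 95%.

```python
from math import sqrt
from statistics import NormalDist


def error_confidence_interval(r, n, confidence=0.95):
    """Approximate two-sided interval for errorD(h).

    r: number of errors h commits on the sample S (hypothetical input name)
    n: number of examples in S
    confidence: desired confidence level, strictly between 0 and 1
    """
    if n <= 0 or not 0 <= r <= n:
        raise ValueError("need n > 0 and 0 <= r <= n")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")

    error_s = r / n  # sample error, errorS(h)

    # Two-sided z constant: about 1.96 when confidence is 0.95.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)

    # Normal approximation to the binomial; only reasonable when
    # n >= 30 or n * errorS(h) * (1 - errorS(h)) >= 5.
    half_width = z * sqrt(error_s * (1 - error_s) / n)
    return error_s - half_width, error_s + half_width


# Illustrative values only: h misclassifies 12 of 40 examples.
low, high = error_confidence_interval(r=12, n=40)
print(f"errorS(h) = {12 / 40:.2f}, 95% interval = [{low:.3f}, {high:.3f}]")
```

With these illustrative values the sample error is 0.30 and the interval is roughly 0.30 ± 0.14. Passing a different confidence argument (for example 0.90 or 0.99) widens or narrows the interval accordingly.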

Table 5.2 shows basic definitions and facts from statistics.

Binomial Distribution

Estimators, Bias, and Variance

Confidence Interval

Normal Distribution
