Chapter 8: Instance-Based Learning
8.1 Introduction
- Store training cases
- Generalization is deferred until a new query instance must be classified
- A local approximation to the target function is built from the cases near the query
- Some algorithms (e.g., case-based reasoning) use symbolic representations instead of numeric ones
- k-Nearest Neighbor Learning
- Locally Weighted Regression
- Radial Basis Functions
- Case-Based Reasoning
8.2 k-Nearest Neighbor Learning
- Instances are represented as points in an n-dimensional space
- The k nearest training examples are used to approximate the target value of the query
- e.g., when k = 1, only the single closest training case determines the query's classification (see the sketch below)
- The decision surface can be precomputed before any query arrives by partitioning the instance space
- For 1-nearest neighbor this partition is called a Voronoi diagram
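A minimal Python sketch of the discrete-valued case under Euclidean distance; the function names and toy data are illustrative, not from the chapter.

```python
import math
from collections import Counter

def euclidean(a, b):
    # d(x_q, x_i): standard Euclidean distance in the n-dimensional instance space
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_classify(query, examples, k=3):
    """Classify `query` by a majority vote of its k nearest training examples.

    `examples` is a list of (attribute_vector, label) pairs.
    """
    neighbors = sorted(examples, key=lambda ex: euclidean(query, ex[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# With k = 1, only the single closest training case decides the query:
data = [((1.0, 1.0), "+"), ((2.0, 2.5), "+"), ((6.0, 5.0), "-")]
print(knn_classify((1.5, 1.2), data, k=1))  # -> "+"
```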
Distance-Weighted Nearest Neighbor Algorithm
- The closer the neighbor, the greater its influence on the result
- Each neighbor's contribution is weighted, e.g., by the inverse squared distance of equation 8.3: w_i = 1 / d(x_q, x_i)^2
- Local method: only the k nearest neighbors contribute
- Global method: all training examples contribute (Shepard's method); both variants are sketched below
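A hedged sketch of the distance-weighted variant, reusing `euclidean` from the k-NN sketch above. It assumes the inverse-square weight w_i = 1/d(x_q, x_i)^2 of equation 8.3; the exact-match guard handles a query that coincides with a stored case.

```python
from collections import defaultdict

def weighted_knn_classify(query, examples, k=5, use_all=False):
    """Distance-weighted k-NN vote (local method); with use_all=True every
    training example votes (the global method, Shepard's method)."""
    ranked = sorted(examples, key=lambda ex: euclidean(query, ex[0]))
    voters = ranked if use_all else ranked[:k]
    scores = defaultdict(float)
    for x, label in voters:
        d = euclidean(query, x)
        if d == 0.0:                     # query coincides with a training case
            return label
        scores[label] += 1.0 / d ** 2    # w_i = 1 / d(x_q, x_i)^2  (Eq. 8.3)
    return max(scores, key=scores.get)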
Remarks on the k-Nearest Neighbor Algorithm
- robust to noisy training data
- Inductive bias: the classification of an instance resembles the classifications of nearby instances
- Distance is computed over all attributes, so many irrelevant attributes can swamp the relevant ones (the curse of dimensionality)
- Remedy: weight each attribute by its relevance, choosing the weights by cross-validation
- Setting an attribute's weight to zero eliminates that attribute entirely
- Efficient memory indexing, e.g., a kd-tree (see the sketch below)
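For the indexing remark, a small illustration using SciPy's KDTree (an assumed dependency, not something the chapter prescribes): the tree is built once from the stored instances, and each query then finds its k nearest neighbors without scanning the whole memory. The toy data is made up for illustration.

```python
import numpy as np
from scipy.spatial import KDTree

# Index the training points once; queries then avoid a full linear scan.
points = np.random.rand(1000, 4)          # 1000 instances, 4 attributes
labels = np.random.randint(0, 2, 1000)    # toy labels for illustration
tree = KDTree(points)

dists, idx = tree.query(np.random.rand(4), k=3)   # 3 nearest neighbors
print(labels[idx])   # their labels, ready for a (weighted) vote
```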
A Note on Terminology
- Regression: approximating a real-valued target function
- Residual: the error f̂(x) − f(x) in approximating the target function
- Kernel function: the function of distance used to determine the weight of each training example
8.3 Locally Weighted Regression
- Local: function is approximated based on data near the query point
- Weighted: each training example is weighted by its distance from the query point
- Regression: the term in the statistical learning community for the problem of approximating real-valued functions
- An explicit function is constructed to fit the training data in the region around the query point; its value at the query is the approximation
Locally Weighted Linear Regression
- The local approximation is a linear function of the attributes: f̂(x) = w0 + w1·a1(x) + ... + wn·an(x)
- Gradient descent can be used to find the coefficients that minimize the error
- Three candidate error criteria (see the sketch after this list):
- 1. Minimize the squared error over just the k nearest neighbors
- 2. Minimize the squared error over the entire training set, weighting each example by a kernel of its distance to the query
- 3. Combine the two: minimize the distance-weighted squared error over the k nearest neighbors
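A sketch of locally weighted linear regression under an assumed Gaussian kernel. Instead of the chapter's gradient-descent update, it solves criterion 2 (distance-weighted squared error over the whole training set) in closed form with NumPy; names like `lwlr_predict` and the kernel width are illustrative.

```python
import numpy as np

def gaussian_kernel(d, width=1.0):
    # K(d): weight decays with distance from the query point
    return np.exp(-(d ** 2) / (2 * width ** 2))

def lwlr_predict(query, X, y, width=1.0):
    """Locally weighted linear regression: fit a linear model whose squared
    error is weighted by each example's kernel distance to `query`, then
    evaluate the fit at the query point."""
    Xa = np.hstack([np.ones((len(X), 1)), X])      # prepend intercept term
    qa = np.concatenate([[1.0], query])
    d = np.linalg.norm(X - query, axis=1)          # d(x_q, x_i)
    w = gaussian_kernel(d, width)                  # per-example kernel weights
    sw = np.sqrt(w)
    # Weighted least squares via an unweighted solve on scaled rows
    coef, *_ = np.linalg.lstsq(sw[:, None] * Xa, sw * y, rcond=None)
    return qa @ coef

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 0.8, 0.9, 0.1])
print(lwlr_predict(np.array([1.5]), X, y, width=0.75))
```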
8.4 Radial Basis Functions
- Similar to distance-weighted regression in concept
- Can be implemented as a two-layer network: hidden units compute kernel values, and the output unit forms their weighted linear sum
- Commonly Gaussian kernel functions, each centered at some instance x_u
- How to select the number and placement of kernel functions:
- Allocate one kernel function for each training example
- Distribute the kernel functions across the instance space, either uniformly or by clustering the training instances (see the sketch below)
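A sketch of an RBF network taking the first strategy above (one Gaussian unit per training example, i.e., centers = X). The output-layer weights are fit by linear least squares rather than gradient descent, and all names and toy data are illustrative.

```python
import numpy as np

def rbf_design_matrix(X, centers, sigma=1.0):
    # Column u holds K_u(d(x_u, x)) = exp(-d^2 / (2 sigma^2)) for each instance x
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.exp(-(d ** 2) / (2 * sigma ** 2))

def train_rbf(X, y, centers, sigma=1.0):
    """Fit the linear output weights w of f(x) = w0 + sum_u w_u * K_u(...)."""
    Phi = np.hstack([np.ones((len(X), 1)), rbf_design_matrix(X, centers, sigma)])
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def rbf_predict(X, centers, w, sigma=1.0):
    Phi = np.hstack([np.ones((len(X), 1)), rbf_design_matrix(X, centers, sigma)])
    return Phi @ w

# One kernel per training example (centers = X); clustering would give fewer.
X = np.random.rand(50, 2)
y = np.sin(X[:, 0]) + X[:, 1]
w = train_rbf(X, y, centers=X, sigma=0.5)
print(rbf_predict(X[:3], X, w, sigma=0.5))
```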
8.5 Case-Based Reasoning
- Instances are represented by rich symbolic descriptions rather than points in R^n
- Retrieving the nearest cases therefore requires richer similarity measures than Euclidean distance
- Multiple retrieved cases can be combined and adapted to propose a solution to the new problem
- CADET: a case-based design tool that retrieves and adapts stored designs of simple mechanical devices (e.g., water faucets) by matching functional descriptions
8.6 Remarks on Lazy and Eager Learning
- Lazy: generalization is deferred until a query must be answered
- k-Nearest Neighbor
- locally weighted regression
- case-based reasoning
- Eager: a general hypothesis is committed to at training time, before any query is seen
- Radial basis function networks
- every other algorithm in this book
- Computation time: lazy methods require little effort during training but more work at query time; eager methods invert that trade-off
- Quality of approximation: lazy methods can form a different local approximation for each query, while eager methods must commit to a single global hypothesis