Chapter 8: Instance-Based Learning
8.1 Introduction
- Store training cases
- Generalization is deferred until a new query instance must be classified
- A local approximation to the target function is built from the cases near the query
- Some algorithms (e.g., case-based reasoning) use symbolic representations instead of numeric ones
- k-Nearest Neighbor Learning
- Locally Weighted Regression
- Radial Basis Functions
- Case-Based Reasoning
8.2 k-Nearest Neighbor Learning
- Instances are represented as points in an n-dimensional space
- The k nearest training examples are used to approximate the target value of the query
- e.g., when k = 1, only the single closest training case determines the query's classification (see the sketch below)
- The decision surface can be precomputed before any query arrives by partitioning the instance space
- For 1-nearest neighbor this partition is called a Voronoi diagram
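A minimal Python sketch of the discrete-valued case under Euclidean distance; the function names and toy data are illustrative, not from the chapter.

```python
import math
from collections import Counter

def euclidean(a, b):
    # d(x_q, x_i): standard Euclidean distance in the n-dimensional instance space
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def knn_classify(query, examples, k=3):
    """Classify `query` by a majority vote of its k nearest training examples.

    `examples` is a list of (attribute_vector, label) pairs.
    """
    neighbors = sorted(examples, key=lambda ex: euclidean(query, ex[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# With k = 1, only the single closest training case decides the query:
data = [((1.0, 1.0), "+"), ((2.0, 2.5), "+"), ((6.0, 5.0), "-")]
print(knn_classify((1.5, 1.2), data, k=1))  # -> "+"
```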
Distance-Weighted Nearest Neighbor Algorithm
- The closer the neighbor, the greater its influence on the result
- Each neighbor's contribution is weighted, e.g., by the inverse squared distance of equation 8.3: w_i = 1 / d(x_q, x_i)^2
- Local method: only the k nearest neighbors contribute
- Global method: all training examples contribute (Shepard's method); both variants are sketched below
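A hedged sketch of the distance-weighted variant, reusing `euclidean` from the k-NN sketch above. It assumes the inverse-square weight w_i = 1/d(x_q, x_i)^2 of equation 8.3; the exact-match guard handles a query that coincides with a stored case.

```python
from collections import defaultdict

def weighted_knn_classify(query, examples, k=5, use_all=False):
    """Distance-weighted k-NN vote (local method); with use_all=True every
    training example votes (the global method, Shepard's method)."""
    ranked = sorted(examples, key=lambda ex: euclidean(query, ex[0]))
    voters = ranked if use_all else ranked[:k]
    scores = defaultdict(float)
    for x, label in voters:
        d = euclidean(query, x)
        if d == 0.0:                     # query coincides with a training case
            return label
        scores[label] += 1.0 / d ** 2    # w_i = 1 / d(x_q, x_i)^2  (Eq. 8.3)
    return max(scores, key=scores.get)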
Remarks on the k-Nearest Neighbor Algorithm
- robust to noisy training data
- Inductive bias: the classification of an instance resembles the classifications of nearby instances
- Distance is computed over all attributes, so many irrelevant attributes can swamp the relevant ones (the curse of dimensionality)
- Remedy: weight each attribute by its relevance, choosing the weights by cross-validation
- Setting an attribute's weight to zero eliminates that attribute entirely
- Efficient memory indexing, e.g., a kd-tree (see the sketch below)
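For the indexing remark, a small illustration using SciPy's KDTree (an assumed dependency, not something the chapter prescribes): the tree is built once from the stored instances, and each query then finds its k nearest neighbors without scanning the whole memory. The toy data is made up for illustration.

```python
import numpy as np
from scipy.spatial import KDTree

# Index the training points once; queries then avoid a full linear scan.
points = np.random.rand(1000, 4)          # 1000 instances, 4 attributes
labels = np.random.randint(0, 2, 1000)    # toy labels for illustration
tree = KDTree(points)

dists, idx = tree.query(np.random.rand(4), k=3)   # 3 nearest neighbors
print(labels[idx])   # their labels, ready for a (weighted) vote
```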
A Note on Terminology
- Regression: approximating a real-valued target function
- Residual: the error f̂(x) − f(x) in approximating the target function
- Kernel function: the function of distance used to determine the weight of each training example
8.3 Locally Weighted Regression
- Local: function is approximated based on data near the query point
- Weighted: each training example is weighted by its distance from the query point
- Regression: the term in the statistical learning community for the problem of approximating real-valued functions
- An explicit function is constructed to fit the training data in the region around the query point; its value at the query is the approximation
Locally Weighted Linear Regression
- The local approximation is a linear function of the attributes: f̂(x) = w0 + w1·a1(x) + ... + wn·an(x)
- Gradient descent can be used to find the coefficients that minimize the error
- Three candidate error criteria (see the sketch after this list):
- 1. Minimize the squared error over just the k nearest neighbors
- 2. Minimize the squared error over the entire training set, weighting each example by a kernel of its distance to the query
- 3. Combine the two: minimize the distance-weighted squared error over the k nearest neighbors
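A sketch of locally weighted linear regression under an assumed Gaussian kernel. Instead of the chapter's gradient-descent update, it solves criterion 2 (distance-weighted squared error over the whole training set) in closed form with NumPy; names like `lwlr_predict` and the kernel width are illustrative.

```python
import numpy as np

def gaussian_kernel(d, width=1.0):
    # K(d): weight decays with distance from the query point
    return np.exp(-(d ** 2) / (2 * width ** 2))

def lwlr_predict(query, X, y, width=1.0):
    """Locally weighted linear regression: fit a linear model whose squared
    error is weighted by each example's kernel distance to `query`, then
    evaluate the fit at the query point."""
    Xa = np.hstack([np.ones((len(X), 1)), X])      # prepend intercept term
    qa = np.concatenate([[1.0], query])
    d = np.linalg.norm(X - query, axis=1)          # d(x_q, x_i)
    w = gaussian_kernel(d, width)                  # per-example kernel weights
    sw = np.sqrt(w)
    # Weighted least squares via an unweighted solve on scaled rows
    coef, *_ = np.linalg.lstsq(sw[:, None] * Xa, sw * y, rcond=None)
    return qa @ coef

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 0.8, 0.9, 0.1])
print(lwlr_predict(np.array([1.5]), X, y, width=0.75))
```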
8.4 Radial Basis Functions
- Similar to distance-weighted regression in concept
- Can be implemented as a two-layer network: hidden units compute kernel values, and the output unit forms their weighted linear sum
- Commonly Gaussian kernel functions, each centered at some instance x_u
- How to select the number and placement of kernel functions:
- Allocate one kernel function for each training example
- Distribute the kernel functions across the instance space, either uniformly or by clustering the training instances (see the sketch below)
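A sketch of an RBF network taking the first strategy above (one Gaussian unit per training example, i.e., centers = X). The output-layer weights are fit by linear least squares rather than gradient descent, and all names and toy data are illustrative.

```python
import numpy as np

def rbf_design_matrix(X, centers, sigma=1.0):
    # Column u holds K_u(d(x_u, x)) = exp(-d^2 / (2 sigma^2)) for each instance x
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return np.exp(-(d ** 2) / (2 * sigma ** 2))

def train_rbf(X, y, centers, sigma=1.0):
    """Fit the linear output weights w of f(x) = w0 + sum_u w_u * K_u(...)."""
    Phi = np.hstack([np.ones((len(X), 1)), rbf_design_matrix(X, centers, sigma)])
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return w

def rbf_predict(X, centers, w, sigma=1.0):
    Phi = np.hstack([np.ones((len(X), 1)), rbf_design_matrix(X, centers, sigma)])
    return Phi @ w

# One kernel per training example (centers = X); clustering would give fewer.
X = np.random.rand(50, 2)
y = np.sin(X[:, 0]) + X[:, 1]
w = train_rbf(X, y, centers=X, sigma=0.5)
print(rbf_predict(X[:3], X, w, sigma=0.5))
```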
8.5 Case-Based Reasoning
- Instances are represented by rich symbolic descriptions rather than points in R^n
- Retrieving the nearest cases therefore requires richer similarity measures than Euclidean distance
- Multiple retrieved cases can be combined and adapted to propose a solution to the new problem
- CADET: a case-based design tool that retrieves and adapts stored designs of simple mechanical devices (e.g., water faucets) by matching functional descriptions
8.6 Remarks on Lazy and Eager Learning
- Lazy: generalization is deferred until a query must be answered
- k-Nearest Neighbor
- locally weighted regression
- case-based reasoning
- Eager: a general hypothesis is committed to at training time, before any query is seen
- Radial basis function networks
- every other algorithm in this book
- Computation time: lazy methods require little effort during training but more work at query time; eager methods invert that trade-off
- Quality of approximation: lazy methods can form a different local approximation for each query, while eager methods must commit to a single global hypothesis