Program 2: Machine Learning
Due Date
This assignment is due at the beginning of
the lecture on Thursday, March 11th.
Partners
You are required to work with one other person on
this assignment. Please submit just one solution
with both of your names on it.
Purpose
The purpose of this assignment is to introduce you to
the Naive Bayes method, the k-nearest neighbors algorithm,
and decision stumps augmented with AdaBoost.
Data Set
For this assignment, we will be using the
automobile database. The .names file describes the
attributes and the .data file contains the records.
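As a starting point, here is a minimal sketch of reading the records. It assumes the .data file is comma-separated and marks unknown values with "?"; check the .names file to confirm both. The names parse_value and load_records are hypothetical, and the sample rows are made up for illustration. With 0-based indexing, attribute 26 would be row[25] in the real file.

```python
import csv
import io

MISSING = "?"  # assumption: the data file marks unknown values with "?"


def parse_value(text):
    """Return None for a missing value, a float for numeric text, else the raw string."""
    text = text.strip()
    if text == MISSING:
        return None
    try:
        return float(text)
    except ValueError:
        return text


def load_records(lines):
    """Parse comma-separated lines into a list of rows (lists of parsed values)."""
    reader = csv.reader(lines)
    return [[parse_value(field) for field in row] for row in reader if row]


# Tiny made-up sample (not real rows from the data set), three attributes per row.
sample = io.StringIO("gas,88.6,13495\ndiesel,?,16500\n")
rows = load_records(sample)
```

To read the real file, pass an open file object to load_records instead of the sample. Note that this sketch turns every numeric-looking field into a float, so a discrete attribute coded with numbers would need to be handled separately.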
Learning Techniques To Implement
- Naive Bayes Method (p. 718). You
will need to decide how to deal with continuous
attributes. Conduct an experiment to find out
how accurate this method is.
- k-Nearest Neighbors with k = 5 (p. 773).
You will need to decide how to deal with
continuous attributes. Conduct an experiment to
find out how accurate this method is.
- Decision Stumps (p. 666) with AdaBoost (p. 667). Conduct
experiments with M = 1, 5, 10, and 20 (where M is the number
of stumps in the boosted ensemble) to find out
how accurate this method is. You must use
one decision stump for each attribute. Stumps for
discrete attributes must have one branch for each value.
Stumps for continuous attributes must have two branches:
one branch if the value is within a particular range
[X..Y] and one branch for all other values.
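The two stump shapes described above can be sketched as follows. This is one possible layout, not a required design: the class names DiscreteStump and RangeStump are hypothetical, the labels assume that price has been binned into classes (your choice to make and report), and how to pick X and Y is left open. Each branch predicts the label with the largest total weight, which is where the example weights maintained by AdaBoost would plug in. Missing values (None) are sent to the "all other values" branch, which is also an assumption.

```python
from collections import defaultdict


def weighted_majority(labels, weights):
    """Return the label with the largest total weight."""
    totals = defaultdict(float)
    for label, w in zip(labels, weights):
        totals[label] += w
    return max(totals, key=totals.get)


class DiscreteStump:
    """One branch per value of a discrete attribute seen in training."""

    def __init__(self, attr):
        self.attr = attr

    def fit(self, rows, labels, weights):
        # Fallback prediction for values never seen during training.
        self.default = weighted_majority(labels, weights)
        groups = defaultdict(lambda: ([], []))
        for row, label, w in zip(rows, labels, weights):
            branch_labels, branch_weights = groups[row[self.attr]]
            branch_labels.append(label)
            branch_weights.append(w)
        self.branches = {
            value: weighted_majority(ls, ws) for value, (ls, ws) in groups.items()
        }
        return self

    def predict(self, row):
        return self.branches.get(row[self.attr], self.default)


class RangeStump:
    """Two branches: value inside [low, high], or any other value."""

    def __init__(self, attr, low, high):
        self.attr, self.low, self.high = attr, low, high

    def _inside(self, row):
        value = row[self.attr]
        # Assumption: a missing value (None) falls in the "other" branch.
        return value is not None and self.low <= value <= self.high

    def fit(self, rows, labels, weights):
        default = weighted_majority(labels, weights)
        inside = [(l, w) for r, l, w in zip(rows, labels, weights) if self._inside(r)]
        outside = [(l, w) for r, l, w in zip(rows, labels, weights) if not self._inside(r)]
        # An empty branch falls back to the overall weighted majority.
        self.inside_label = weighted_majority(*zip(*inside)) if inside else default
        self.outside_label = weighted_majority(*zip(*outside)) if outside else default
        return self

    def predict(self, row):
        return self.inside_label if self._inside(row) else self.outside_label


# Made-up toy data: attribute 0 is discrete, attribute 1 is continuous.
rows = [["gas", 90.0], ["gas", 95.0], ["diesel", 110.0], ["diesel", 115.0], ["gas", 112.0]]
labels = ["low", "low", "high", "high", "high"]
weights = [0.2] * 5  # uniform weights, as at the start of boosting

fuel = DiscreteStump(0).fit(rows, labels, weights)
width = RangeStump(1, 85.0, 100.0).fit(rows, labels, weights)
```

Inside a boosting loop, each round would refit the candidate stumps with the updated weights and keep the one with the lowest weighted error.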
Report
Write a professional report that includes
the following sections:
- A description of your k-nearest neighbors algorithm
and a report on its effectiveness. Use graphs and
tables where appropriate.
- A description of your Naive Bayes method and a report
on its effectiveness. Use graphs and tables where
appropriate.
- A description of your decision stump (augmented with
AdaBoost) algorithm and a report on its effectiveness.
Use graphs and tables where appropriate.
General Requirements
- In all experiments, you are trying to predict the
price of the vehicle (attribute 26).
- You may use any programming language you like for this assignment.
- Use 10-fold cross-validation (p. 663) for
all of your experiments.
- Be sure to explain your experiments carefully:
describe each experiment, display the results in
a meaningful form (graphs, tables, etc.), and
interpret the results.
- Design, conduct, and report on two meaningful experiments in addition
to the required ones.
- In the report, be sure to emphasize any
non-standard choices you made (for example, how
you handle continuous values in the Naive Bayes
method).
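The 10-fold cross-validation requirement above could be organized along these lines. This is a sketch under stated assumptions, not a prescribed harness: the function names are hypothetical, accuracy is measured as the fraction of exact label matches (which presumes price has been turned into class labels), and the majority-class learner is only a placeholder for your own learners.

```python
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=0):
    """Shuffle range(n) and deal the indices into k folds of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(rows, labels, train_fn, k=10, seed=0):
    """Return the mean accuracy over k folds.

    train_fn(train_rows, train_labels) must return a function that maps
    a row to a predicted label.
    """
    accuracies = []
    for held_out in k_fold_indices(len(rows), k, seed):
        if not held_out:  # possible only when there are fewer rows than folds
            continue
        held_out_set = set(held_out)
        train_idx = [i for i in range(len(rows)) if i not in held_out_set]
        predict = train_fn(
            [rows[i] for i in train_idx], [labels[i] for i in train_idx]
        )
        correct = sum(predict(rows[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


def majority_baseline(train_rows, train_labels):
    """Placeholder learner: always predict the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda row: most_common


# Made-up toy data in which every example has the same label.
toy_rows = [[i] for i in range(25)]
toy_labels = ["a"] * 25
score = cross_validate(toy_rows, toy_labels, majority_baseline)
```

Fixing the seed keeps the folds identical across learners, so differences in accuracy reflect the methods rather than the split.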
What to Submit
- A printout of the source code that you produce.
- A printout of a representative run of your program.
- The report.
Grading
- 40% The correctness and quality of the code.
- 60% The correctness and quality of the report.