Computational Research Topics (CS 513): Structured Probabilistic Models
(Updated: 05/05/09)
Course Description
From the catalog: NP-completeness and NP-hardness. Abstract complexity classes. Intractability. Interactive of
zero-knowledge proof systems. Approximability. Polynomial and non-polynomial time hierarchies.
What the course is really about: This graduate level course focuses on a current research topic in computer science
and explores that topic using a seminar-based, project-oriented format. This year, students will be exploring issues
in solving problems using structured probabilistic models (sometimes called "graphical" models). Topics in probabilistic
representations, inference algorithms, and learning such models from data will be explored. Students will be expected to
lead discussions on various topics and to complete and present a non-trivial project.
Course Dates
Thursday, January 15, 2009 through Wednesday, May 6, 2009
Time and Location
EPS 350, TR, 12:45-2:00 pm
Prerequisites
CS 436 or 536 recommended. Background in probability recommended.
Professor Information
Dr. John W. Sheppard
EPS 363
Phone: 994-4835
Email: john.sheppard@cs.montana.edu
Biography: Dr. Sheppard is the RightNow Technologies Distinguished Professor in Computer Science. Dr.
Sheppard received his BS in computer science from Southern Methodist University in 1983. Later, while a full-time member
of industry, he received an MS in computer science in the Johns Hopkins Part Time Engineering program (1989). He then
continued his studies and received his Ph.D. in computer science from Johns Hopkins in the day school (1996). His research
interests include model-based and Bayesian reasoning, reinforcement learning and games, and fault diagnosis/prognosis of
complex systems. He has been elected as a Fellow of the IEEE "for contributions to system-level diagnosis and prognosis."
Prior to entering academia, Dr. Sheppard was a member of industry for 20 years. His prior position was as a research fellow
at ARINC. Currently, he also holds a position as an Associate Research Professor at Johns Hopkins where is he advising
three PhD students.
Office Hours: MWF 11:00-11:50 or by appointment
Course Objectives
At the end of this course, the student will be able to:
- Formulate and assess uncertainty problems using directed and undirected probabilistic graphical models.
- Assess the strengths and weaknesses of alternative representation and inference procedures.
- Understand and be able to implement inference algorithms for structured probabilistic models.
- Understand and be able to implement parameter and structure learning algorithms for structured probabilistic models.
- Develop a proposal for an extended research project in structured probabilistic models.
- Design and conduct an experiment involving Bayesian or probabilistic reasoning.
- Prepare a written, technical paper on the results of individual research.
- Deliver a talk, describing the results of individual research.
Course Readings
- Draft textbook: Structured Probabilistic Models, Daphne Koller and Nir Friedman.
- Relevant research papers as required.
- A PhD Dissertation from the following list (password protected).
Anticipated Topics to be Covered
- Joint probability distributions and conditional independence
- Bayesian networks
- Naive Bayes classification and modern extensions
- Markov networks
- Factor graphs
- Dynamic Bayesian networks
- Variable elimination
- The junction tree algorithm
- Markov Chain Monte Carlo
- Particle filtering
- Variational inference
- The Viterbi algorithm
- Expectation-maximization and the Baum-Welch algorithm.
- Markov decision processes and partially observable Markov decision processes
Course Evaluation
Grading will be based on in class discussions, discussion leadership, ability to report on progress in the field
through oral presentation and written critique, and the ability of the student to design and implement a research project.
Students will be responsible for periodically leading class discussion and then summarizing the results of the discussion
in an informal report. Each student will also conduct a research project, documented with a formal, technical paper and
presentation describing the experimental method and results. The following weights will be placed on each of the course
requirements:
Here is a summary of actual assignment point values:
- Discussion Reviews (3-5 pages) -- 10%
- Discussion Leadership -- 20%
- Dissertation Critique -- 15%
[Thesis choice due 02/17; Critique due 03/05.]
- Project Proposal -- 10%
[Proposal due 03/12.]
- Project Report -- 25%
[Report due 04/30.]
- Project Presentation -- 10%
[Presentations given 05/06, 12:00]
- Class Participation -- 10%
Course Notes (all PDF and now password protected)
Full summary notes can be found here.
Papers for Discussion (all PDF and password protected)
- 03/24/09 (Bob Wall): Z. Ghahramani and M. Jordan, "Factorial Hidden Markov Models," Machine Learning, 29, 245-273 (1997).
- 03/26/09 (Doug Galarus): Y. Bengio and P. Frasconi, "Diffusion of Context and Credit Information in Markovian Models," Journal of Artificial Intelligence Research 3 (1995) 249-270.
- 03/31/09 (Scott Wahl): S. Ross, J. Pineau, S. Paquet, and B. Chaib-draa, "Online Planning Algorithms for POMDPs," Journal of Artificial Intelligence Research, 32 (2008) 663-704.
- 04/02/09 (Anthony Arnone): Y. Bengio and P. Frasconi, "Input-Output HMM's for Sequence Processing," IEEE Transactions on Neural Networks, Vol. 7, No. 5, September 1996, pp. 1231-1249.
- 04/07/09 (Hasari Tosun): D. Karger and N. Srebro, Learning Markov Networks: Maximum Bounded Tree-Width Graphs," Proceedings of the 12th ACM-SIAM Symposium on Discrete Algorithms (SODA), Washington, DC, pp. 392-401, 2001.
- 04/09/09 (Bob Wall): F. Bromberg, D. Margaritis, and V. Honavar, "Efficient Markov Network Structure Discovery using Independence Tests," Proceedings of the Sixth SIAM International Conference on Data Mining (SDM), April 20-22, 2006.
- 04/14/09 (Doug Galarus): Y. Wang, N. Zhang, and T. Chen, "Latent Tree Models and Approximate Inference in Bayesian Networks," Journal of Machine Learning Research 32 (2008) 879-900.
- 04/16/09 (Scott Wahl): J. Filho and J. Wainer, "HPB: A Model for Handling BN Nodes with High Cardinality Parents," Journal of Machine Learning Research 9 (2008) 2141-2170.
- 04/21/09: No Class ... Time for project development and discussion
- 04/23/09: No Class ... Time for project development and discussion
- 04/28/09 (Anthony Arnone): C. Sutton, A. McCalllum, and K. Rohanimansesh, "Dynamic Conditional Random Fields: Factorized Probabilistic Models for Labeling and Segmenting Sequence Data," Journal of Machine Learning Research 8 (2007) 693-723.
- 04/30/09 (Hasari Tosun): J. Zhu, Z. Nie, B. Zhang, and J-R. Wen, Dynamic Hierarchical Markov Random Fields for Integrated Web Data Extraction 9 (2008) 1583-1614.
Discussion Leadership
This course is formatted as a seminar in which research papers are read and discussed each week.
To make the course more interesting and to encourage involvement by all students in the discussion,
the seminar will be conducted such that the students will be responsible for presenting the weekly
material.
A course meeting will be structured as follows. On Tuesday of each week, the instructor will
review background material that is pertinent to the discussion but that may not have been included
in the assigned readings. On Thursday of each week, class leadership will be turned over to the
assigned discussion leader. At that point the leader will present an overview of the paper(s) for the
day and formulate questions and issues for class discussion. After the overview has been presented,
the class will be encouraged to engage in discussion of the issues. The instructor will participate
as another member of the discussion, interjecting additional material as necessary to provide information
on background and current research in the field.
To prepare for leading discussion, the leader should read all of the papers very carefully, being
sensitive to issues such as
- Algorithms proposed
- Technical approach used in experiments and analyses
- Comparisons with other methods Claims of contributions made by the paper
- Biases of the investigators (either explicit or implicit)
- Deficiencies in the paper or the research reported
- Directions for future research
- Possible ideas for course research projects
To properly prepare, the discussion leader may need to look at related papers as indicated in the references
of the assigned readings. The instructor will be able to recommend related papers as well.
The evaluation criteria for discussion leadership are as follows
- Understanding of topic/papers (25%)
- Summary of topic/paper (25%)
- Ability to answer questions on topic/paper (25%)
- Ability to stimulate discussion on topic/paper (25%)
The discussion leader is also responsible for preparing a written summary of the class discussion. This
summary will be due the week following the discussion. The summary should include a review of the papers,
a review of any related material presented in class, and a review of the issues and associated discussions
raised in class. Sufficient copies of the review should be made to provide to every member of the class.
The evaluation criteria for discussion summaries are as follows:
- Summary of topic (25%)
- Summary of discussion (25%)
- Identification of key issues (25%)
- Review of related material (15%)
- Construction and readability (10%)
Dissertation Critique
Fundamentally, this is a research-oriented course, and a large number of topics will be covered as a
foundation for a researcher to apply in solving complex probabilistic problems. Furthermore, this course
is oriented around introducing the student to current research in the field of probabilistic reasoning.
Unfortunately, in a course such as this, it is difficult during class time to explore any one topic in
depth. Therefore, each student will be responsible for selecting a PhD dissertation from several provided
by the instructor and writing a critique of the research reported in that dissertation. This critique
should include a summary of the research reported, a discussion of the major contributions claimed, and
an assessment of the significance of those contributions and of the research itself. The critique should
also include a brief literature review of the topic related to the thesis, discussion
of relevant algorithms, and application areas for the research reported. Where appropriate, the critique
should include a comparison with other issues discussed in class. Students are encouraged to
select a dissertation that is related to their course projects.
The evaluation criteria for the critique are as follows:
- Overview of the research reported (20%)
- Review of the related literature (15%)
- Major contributions of the thesis (20%)
- Understanding of techniques and algorithms (20%)
- Application areas (15%)
- Proper construction and readability of paper (10%)
Research Projects
As a semester long project, each student in this course will be responsible for conducting an independent
research project in machine learning with an emphasis either on agent control or data mining. This project
will provide direct experience in proposing and executing a complete research project over the length of the
course. The project can be experimental or theoretical. If an experimental project is proposed, be prepared to
include enough theoretical work to explain or motivate the work. If the project is theoretical, some
experimentation should be included to demonstrate whatever results are obtained.
Each student will write a short proposal describing the intended research. This proposal must be approved
by the instructor prior to commencing the major portions of the research. Obviously, some amount of research
should be done to prepare the proposal. The proposal should include a brief literature survey on the topic
area, a clear statement of the problem to be solved, and a description of the approach to be taken. The
proposal is due around the time typical for a midterm exam.
The evaluation criteria for proposals are as follows:
- Statement of hypothesis (25%)
- Proposed approach (25%)
- Relevance and interest in topic (25%)
- Literature review (15%)
- Construction and readability of proposal (10%)
The actual execution of the project is left entirely at the discretion of the student.
Any computer and programming language may be used to support the project, and additional tools for
analysis and presentation (e.g., MATLAB and excel) are encouraged. At the end of the project, each
student or group will be required to submit a comprehensive research report. This report will include
background and discussion of previous work done related to the topic, a clear description of the
problem to be solved, discussion of the approach taken, in depth discussion of any algorithms used or
developed, detailed presentation of the results obtained, discussion of the importance and implications
of the results, directions for future work, and references. The student or group should use the
research papers read in this course as guidance for what research papers look like. Note that
submitting code is not required.
The evaluation criteria for the report are as follows:
- Statement of problem (15%)
- Approach (20%)
- Background and related work in field (15%)
- Results (15%)
- Discussion and analysis of results (20%)
- Relevant future work (5%)
- Construction and readability of report (10%)
During the finals period, results of research projects will be presented in class. If necessary, the
last week of class will also be used. Each student or groupu should be prepared to give a short presentation
of the research performed and the results achieved. The format will be similar to a research talk given
at a conference. Presentations will give the class a chance to see other projects and to provide feedback
to the student. Note that the student is required to prepare visual aids (e.g., PowerPoint presentations) for
their talks.
The evaluation criteria for the presentation are as follows:
- Statement of problem (15%)
- Approach (20%)
- Background and related work in field (15%)
- Results (15%)
- Discussion and analysis of results (20%)
- Relevant future work (5%)
- Clarity in presentation (10%)
Policy on Academic Misconduct
Academic misconduct is unacceptable. It is the responsibility of all full-time, part-time or non-degree (special) students to adhere to strict standards of integrity in their professional and scholarly activities, as well as to high standards of conduct in their non-academic activities. Misconduct will be treated swiftly and harshly.
Examples of academic misconduct:
- Cheating on Examinations
- Use of unauthorized materials during examinations and in completing assignments.
- Consultation of unauthorized materials while being excused from an examination room.
- Discussion of an exam's contents during its administration.
- Copying answers from another student on an examination.
- Obtaining an examination or answers to an examination prior to its administration
- Studying from an old exam whose circulation was prohibited by the instructor.
- Plagiarism
- Submission of the same or substantially similar work of another person, even if re-worded
- Use of another person's work while representing it as your own
- Improper documentation of quotations, ideas, or paraphrased passages taken from published or unpublished works
- Reuse of Assignments
- Submission of the same or substantially similar assignment to fulfill the requirements of more than one course.
- Improper Use of the Internet
- Plagiarism from a published or unpublished Internet source.
- Improper documentation of an Internet source.
- Use of paper writing services of paper databases on the Internet.
- Improper Use of Electronic Devices
- Consultation of unauthorized electronic devices during an examination.
- Use of electronic devices to communicate within or outside of an examination room.
- Storage of test answers, class notes, and other references in electronic devices during examinations.
- Unauthorized Collaboration
- Collaboration when solving homework problems or writing lab reports, computer programs, or papers, unless explicitly approved by the professor.
- Alteration of Graded Assignments
- Submission of an examination or assignment for a re-grade after making changes to the original answers or text.
- Forgery or Falsification
- Falsification or invention of data in a laboratory experiment.
- Citation of nonexistent sources or creation of false information in a written assignment.
- Attributing to a source ideas or information that is not included in the source.
- Forgery of university documents.
- Impersonating a faculty member.
- Lying
- Request for special consideration from professors or university officials based upon false information or deception.
- Fabrication of a medical or emergency excuse.
- Claiming falsely to have completed and/or turned in an assignment.
- Falsely reporting an ethics violation by another student.
- Facilitating Academic Dishonesty
- Intentionally or knowingly aiding another student to commit a violation of academic conduct.
- Allowing another student to copy from one's own examination during its administration.
- Providing copies of course materials whose circulation was prohibited to students enrolled in or planning to take that course.
- Taking an examination or completing an assignment for another student, or permitting another student to do so on one's behalf.
- Unfair Competition
- Willfully damaging the academic efforts of other students.
- Stealing another student's academic materials.
- Denying another student needed resources.
Policy on Assignments
This course has several assignments requiring outside work of the students. The assignments are critical for gaining understanding and experience using the materials presented in class. Due to the importance of these assignments, the following policy is set forth.
- All assignments will be completed by the individual student and will be the original work of that student.
- All assignments are due at the beginning of class on the dates indicated in the syllabus. No assignments will be accepted late without prior approval of the instructor (other than exceptions noted below). Approval will not be granted based on personal time-management issues.
- Unapproved late assignments will receive no credit. Approved late assignments may still receive a penalty, depending on the circumstances. It is the responsibility of the student to ensure that the instructor is kept informed of any problems related to turning in assignments on time. Only serious, uncontrollable circumstances (such as serious illness or family tragedy) will result in accepting late assignments without prior notification. In such cases, documentation of these circumstances must be provided.
- Attending class sessions is critical to a successful course. If an absence is anticipated, please notify the instructor beforehand by phone or email. Unexplained absences will adversely affect the final grade.
- All written assignments are expected to be typed or neatly printed. Avoid hand drawn figures if at all possible. If the assignments are not legible, they will be returned to the student with a grade of zero. Be sure each assignment includes name, department, daytime phone number, and email address.
- While a computer account is provided to all students, any language and any computer system can be used to complete the programming assignments unless otherwise specified in the assignment itself. That said, it is expected that all programs can be run on the department network, and students may be requested to demonstrate this.
- All programming assignments must include fully commented code and several sample runs to demonstrate proper functioning of the assigned program. It is the responsibility of the student to ensure that the code is readable and understandable. It is also the responsibility of the student to ensure that the output is understandable and accurately reflects the functioning of the program.
- The world-wide web provides a tremendous resource to both students and instructors, and students may be tempted to look for problem solutions on the web. Any student turning in an assignment with a solution obtained from the web must give full attribution to the source of that solution. Failure to do so is plagiarism and is grounds for failing the course. Even with proper attribution, credit received for the assignment will depend on the nature of the web information used and on the problem assigned. (See policy on web usage.)
Policy on Web Usage
The world-wide web provides a resource for finding and using a tremendous amount of information in the computer science and engineering fields. As computer scientists, we are able to use the web to maximize our productivity in all aspects of our life, including home, work, and school, and in this class, all students are encouraged to use the web as an educational resource. Unfortunately, as with any resource, use of the web can be abused to the point where the educational experience is diminished. As an attempt to limit such abuse, the following policy on using the web in this class is set forth.
- Students are free to explore the web to visit sites related to topics discussed in this course insofar as such exploration identifies material that elucidates and expands on material related to or discussed in class.
- Students are not permitted to seek out solutions to any of the problems assigned unless the assignment specifically states that web use is permitted.
- On programming assignments, students may not download solutions from a web site implementing similar or equivalent programs. Further, students are discouraged from even examining such solutions. Turning in a program obtained from the web will result in no credit for that program. Further, turning in a program obtained from the web without attribution to the web source constitutes plagiarism and will result in failure of the course.
- On problem assignments, students may not download or examine solutions from a web site focusing on similar or equivalent problems. Turning in a solution from the web will result in no credit for that problem. Further, turning in a solution obtained from the web without attribution to the web source constitutes plagiarism and will result in failure of the course.
- If there are any questions or even a hint of doubt concerning appropriate use of the web for completing class assignments, the student should consult the instructor for guidance.
Policy on Class Attendance
A large amount of material will be covered in a relatively limited amount of time. In addition, a fairly large amount of work will be done by the student. Consequently, class attendance is required. If a student must miss class for any reason, he or she should notify the instructor as soon as the absence is known. In the event of emergency absences, the instructor reserves the right to request an excuse from some cognizant authority such as a supervisor or physician. Note that class attendance accounts for 10% of the student's final grade.
Policy on Personal Communications Devices
It is unfortunate, but the advances in personal communications technologies has also resulted in the need for a policy concerning the use of these devices. Since students receiving and/or responding to pages or cell phone calls creates a distraction to other students, no pagers or cell phones will be permitted to be brought into the classroom without prior authorization of the instructor.