John W. Sheppard's Research

Bayesian Learning

Current approaches to learning Bayesian networks focus on learning network structures, probability tables, or both. From a classification standpoint, such networks involve learning the relationships between and among feature values conditioned on one or more class variables. Our research currently focuses on several fundamental problems in learning these networks:

  1. How do we identify conditionally dependent features in the data?
  2. What are the effects of incorporating dependence relationships in a network when no such dependence relationship exists in the data?
  3. How should dependence relationships be modeled when such dependence is limited to a specific subset of values of a random variable?
  4. Does any advantage exist to creating Bayesian models targeted at specific instances of end systems?
  5. How does one learn a highly accurate Bayesian network from small, underrepresented data sets?
  6. Can general and specific models be combined in a composite model to better approximate probability distributions when end system-specific data is limited?
  7. To what extent can one learn dynamic Bayesian networks?
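
As one illustration of questions 5 and 6, the sketch below blends a general (class-level) model with counts observed on a specific end system by treating the general probabilities as a Dirichlet prior. This is a standard textbook estimator shown only as an example, not a description of the approach under study. The function name `blended_cpt_row` and the `prior_strength` parameter are hypothetical.

```python
from typing import List, Sequence


def blended_cpt_row(
    specific_counts: Sequence[float],
    general_probs: Sequence[float],
    prior_strength: float = 10.0,
) -> List[float]:
    """Estimate one row of a conditional probability table.

    specific_counts: observed counts for each value of the variable,
        taken from the (possibly small) system-specific data set.
    general_probs: the same row taken from a general model.
    prior_strength: hypothetical tuning parameter; how many "virtual"
        observations the general model is worth.
    """
    if len(specific_counts) != len(general_probs):
        raise ValueError("counts and probabilities must have the same length")
    total = sum(specific_counts) + prior_strength
    if total <= 0:
        raise ValueError("need at least some data or a positive prior strength")
    # Posterior mean of a Dirichlet prior with pseudo-counts
    # prior_strength * general_probs, updated with the observed counts.
    return [
        (count + prior_strength * prob) / total
        for count, prob in zip(specific_counts, general_probs)
    ]
```

With no system-specific data the estimate equals the general model, and as the counts grow the estimate moves toward the system-specific relative frequencies.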

Ontology-Directed Data Mining

With the advent of the "semantic web" and the "Cyc" project, considerable attention has been devoted to developing ontologies to represent the semantics of key concepts in commonsense reasoning, customer-driven search, and model-based design/acquisition. Currently, we are exploring the application of semantic modeling in support of information exchange and data mining. To what extent can semantic models guide search through large amounts of data to identify unknown but significant relationships in the data?

Currently, our semantic modeling approaches use the EXPRESS modeling language (ISO 10303-11). EXPRESS was designed specifically to support information exchange for electronic design automation. Since the invention of the EXPRESS language and its subsequent standardization in 1994, semantic modeling has been used extensively to support the development of information exchange standards.

Evolutionary Computation

A wide variety of evolutionary computation problems have been explored. The current project involves implementing a hierarchical cellular automaton to model a traffic network for a city and then applying a genetic algorithm to learn signaling rules for the intersections. The objective of this research is to determine the extent to which such a decentralized model can minimize traffic delay by learning the signal control rules. Several issues are being explored, including:

  • The effects of various migration strategies in distributing rules throughout the network.
  • The effects of these migration strategies in maintaining diversity across the populations represented in the network.
  • The ability to predict traffic flow from simple parameters using a purely local computation model.
  • The evaluation of global fitness based on local interactions.
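
To make the notion of a migration strategy concrete, the sketch below shows one simple possibility: a ring migration in which each population sends a copy of its best rule to the next population, replacing that population's worst rule. It is a minimal illustration under stated assumptions, not the strategy used in the project. The rule encoding (a tuple of integers), the function names `migrate_ring` and `diversity`, and the fitness function supplied by the caller are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical encoding: a rule is a tuple of integers, for example
# signal-phase durations for one intersection.
Rule = Tuple[int, ...]


def migrate_ring(
    populations: Sequence[Sequence[Rule]],
    fitness: Callable[[Rule], float],
) -> List[List[Rule]]:
    """Copy each population's best rule over the worst rule of the next one."""
    # Choose all emigrants first so that migration is simultaneous.
    emigrants = [max(pop, key=fitness) for pop in populations]
    updated_populations = []
    for i, pop in enumerate(populations):
        # Population i receives the best rule of population i - 1 (ring).
        newcomer = emigrants[(i - 1) % len(populations)]
        worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
        updated = list(pop)
        updated[worst] = newcomer
        updated_populations.append(updated)
    return updated_populations


def diversity(population: Sequence[Rule]) -> float:
    """Fraction of distinct rules in a population (1.0 means all differ)."""
    return len(set(population)) / len(population)
```

Comparing `diversity` before and after repeated migration gives a rough view of how quickly a strategy homogenizes the populations. In a delay-minimization setting, the fitness would typically be the negated delay measured in the local part of the network.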

System-Level Diagnosis and Prognosis

The current focus of our Bayesian learning research is learning Bayesian networks for fault diagnosis and prognosis. Current sponsors are interested in developing networks tied to specific end systems rather than classes of systems, with the expectation that these networks better represent the probability distributions underlying their unique operational scenarios. We are also looking at extending these ideas to create prognostic networks through the development of dynamic Bayesian networks and factorial hidden Markov models.
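
For readers unfamiliar with Bayesian diagnosis, the sketch below computes a posterior over fault hypotheses from test outcomes using Bayes' rule. It assumes a single fault at a time and test outcomes that are conditionally independent given the fault, which is a much simpler model than the learned networks described above. The function name `diagnose` and the dictionary layout are hypothetical.

```python
from typing import Dict


def diagnose(
    priors: Dict[str, float],
    fail_probs: Dict[str, Dict[str, float]],
    outcomes: Dict[str, bool],
) -> Dict[str, float]:
    """Return P(fault | test outcomes) for each candidate fault.

    priors: fault name -> prior probability of that fault.
    fail_probs: fault name -> {test name -> P(test fails | fault)}.
    outcomes: test name -> True if the test failed, False if it passed.
    Assumes one fault at a time and conditionally independent tests.
    """
    unnormalized = {}
    for fault, prior in priors.items():
        likelihood = 1.0
        for test, failed in outcomes.items():
            p_fail = fail_probs[fault][test]
            likelihood *= p_fail if failed else 1.0 - p_fail
        unnormalized[fault] = prior * likelihood
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observed outcomes are impossible under every fault")
    # Normalize so the posterior sums to one.
    return {fault: value / total for fault, value in unnormalized.items()}
```

A system-specific model in this setting would differ from a class-level model only in the numbers stored in `priors` and `fail_probs`.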

In addition to Bayesian learning, we are actively involved in the development of three families of standards making use of EXPRESS modeling:

  • AI-ESTATE (IEEE 1232): Artificial Intelligence Exchange and Service Tie to All Test Environments
  • SIMICA (IEEE 1636): Software Interface for Maintenance Information Collection and Analysis
  • ATML (IEEE 1671): Automatic Test Markup Language

Of particular interest here is recent work performed to demonstrate the viability of the AI-ESTATE standard for the Department of Defense Automatic Test System Framework. Four "screencasts" of the demonstration are being posted here:

One of the key focus areas for SIMICA is referred to as "diagnostic maturation." Specifically, the semantic models being developed for SIMICA will be used with data mining and analysis algorithms to identify and correct diagnostic deficiencies.