Seminars 2015

The next seminar will be held Monday, October 5. Please see below for details.

Precision Agriculture

Date/Time: Monday, November 30, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Rob Payn and Bruce Maxwell, Land Resources and Environmental Science, Montana State University

Abstract: TBD

Machine Learning in Software Engineering

Date/Time: Monday, November 23, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Upulee Kanewala, Computer Science, Montana State University

Abstract: TBD

Title: TBD

Date/Time: Monday, November 16, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Rick Sojda, Computer Science, Montana State University

Abstract: TBD

Columnar Store Architecture for a Specialized Genomic Big Data Warehousing and Analytics

Date/Time: Monday, November 2, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Gabe Rudy, Golden Helix

Abstract: Starting with an overview of database and big-data solutions for the different use cases of Online Transactional Processing (OLTP), Data Warehousing and Analytical Processing (OLAP), and the NoSQL mixed bag of big-data needs, we will dive into the read-optimized requirements of data warehousing and big-data scientific workflows. Golden Helix builds tools for researchers and clinicians to work on genome-scale datasets on their own hardware, and we will present the specialized columnar-store data backend built to power their genomic application suite. Column-store architectures have won as the leading design for data warehousing solutions, but adopting one does not require forgoing the analytical power and convenience of a SQL interface. In fact, we will review the implementation details of a PostgreSQL foreign-data wrapper we wrote for this customized column-store file format, which powers Golden Helix's genomic data warehousing solution, built to scale to hundreds of millions of unique genomic variant sites for thousands of exomes and genomes.

An Analysis of Existing Contributions to Continuous Time Bayesian Networks (PhD Qualifier)

Date/Time: Monday, October 26, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Logan Perreault, MSU

Abstract: Continuous time Bayesian networks (CTBNs) are a relatively new model used for representing discrete state systems that evolve in continuous time. Here we provide a survey of the literature and identify the major contributions that have been made in the CTBN literature. In addition, we identify deficiencies and open questions that have not yet been addressed in the field. We use the results of this survey to suggest potential research avenues that extend the forefront of the CTBN literature.

Disposable Infrastructure: Developing Cloud Infrastructure as a Utility

Date/Time: Monday, October 19, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: James Hirmas, US Geological Survey

Abstract: As cloud computing becomes conventional, organizations will be required to transition from traditional rigid systems engineering, networking, application development, and architectures to a more flexible, on-demand cloud model. Instead of capacity planning for compute, networking, and storage 2-5 years in the future, the expectation will be that cloud architectures can dynamically right-scale to specific demand per hour or, in some cases, per minute. Solutions will need to be architected to decouple hard dependencies on compute, network, applications, and storage so that each layer can scale independently and/or transition seamlessly to other cloud vendors without affecting other layers. The evolution of IT infrastructure toward disposable cloud infrastructure will also have a dramatic impact on Security Operations, Financial Management, Development Lifecycle, and overall operational management. In this seminar, we will discuss the principles of disposable cloud architectures, cloud operational management, and real-world use cases.

An Introduction to Persistent Homology

Date/Time: Monday, October 5, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Brittany Fasy, Computer Science, Montana State University

Abstract: Persistent homology is a widely used tool in Topological Data Analysis that encodes multi-scale topological information as a multi-set of points in the plane, called a persistence diagram. The method involves tracking the birth and death of topological features as one varies a tuning parameter. Features with short lifetimes are informally considered to be "topological noise," and those with a long lifetime are considered to be "topological signal." To formally distinguish signal from noise, we bring some statistical ideas to persistent homology in order to derive confidence sets for persistence diagrams.
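The birth-and-death bookkeeping described in this abstract can be illustrated for the simplest case, connected components of a point cloud. The following is a minimal sketch and not the speaker's method: the function name h0_persistence, the sample points, and the convention that two points connect when the scale reaches their Euclidean distance are all assumptions made for illustration. It covers only 0-dimensional features and does not compute the confidence sets mentioned above.

```python
from itertools import combinations
from math import dist


def h0_persistence(points):
    """Return (birth, death) pairs for connected components of a point cloud.

    Assumed convention: every point is born at scale 0, and two points are
    joined once the scale reaches their Euclidean distance. A component dies
    at the scale where it merges into another one; the last survivor never dies.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Candidate merge events: all pairwise distances, smallest first.
    edges = sorted(
        (dist(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )

    diagram = []
    for scale, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j          # two components merge
            diagram.append((0.0, scale))     # one of them dies at this scale
    diagram.append((0.0, float("inf")))      # the component that never dies
    return diagram


# Hypothetical data: two tight clusters that are far apart.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0)]
diagram = h0_persistence(points)
for birth, death in diagram:
    print(f"born at {birth}, dies at {death}")
```

In this toy input, the components that merge within a cluster have short lifetimes (informally, noise), while the merge between the two clusters happens at a much larger scale (informally, signal).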

Database for Dynamics: A New Approach to Model Gene Regulatory Networks

Date/Time: Monday, September 28, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Tomas Gedeon, Mathematical Sciences, Montana State University

Abstract: Experimental data on gene regulation is mostly qualitative, where the only information available about pairwise interactions is the presence of either up- or down-regulation. Quantitative data is often subject to large uncertainty and is mostly in terms of fold differences. Given these realities, it is very difficult to make reliable predictions using mathematical models. The current approach of choosing reasonable parameter values and a few initial conditions, and then making predictions based on the resulting solutions, severely subsamples both the parameter and phase space. This approach does not produce provable and reliable predictions.

We present a new approach that uses continuous time Boolean networks as a platform for qualitative studies of gene regulation. We compute a Database for Dynamics, which rigorously approximates global dynamics over the entire parameter space. The results obtained by this method provably capture the dynamics at a predetermined spatial scale.

Video recording available:

Ecological Remote Sensing: Confronting the Challenges of Autonomous Data Collection

Date/Time: Monday, September 21, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Robb Diehl, US Geological Survey

Abstract: Not long ago, ecological data was collected by biologists tromping around in the field, pencil and paper in hand.  These observations are still critical to the discipline, but a new era in remote sensing is automating the collection of ecological data, dramatically expanding the research productivity of the individual investigator.  Increasingly, field personnel are being replaced by durable, autonomous hardware that runs continuously and gathers data with high accuracy.  But these advances bring new challenges.  By replacing humans with autonomous sensors, we create enormous software challenges; in effect, we have moved intelligent data processing from the front-end to the back-end of the data collection process.  Our ability to effectively post-process data still lags our ability to gather it, but this has created new and challenging opportunities to advance scientific computing.  I give examples and consider the wide-ranging opportunities for productive collaboration between computer scientists and ecologists.

Video recording available:


The Effectiveness of Software Development Instruction Through the Software Factory Method for High School Students

Date/Time: Monday, August 31, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108


Presenters:
  • Jessica Jorgenson, James Jacobs, Zach Hansen — Bozeman High School
  • Mike Trenk, MacKenzie O'Bleness — Montana State University

Abstract: Teaching software development in environments that mimic industry practices is essential for teaching applicable real-world development skills. In addition, these kinds of delivery-based projects engage students in meaningful design work that encourages clear, sustainable code. The Software Factory has provided such projects and such an environment to students at MSU for the past year. This project aimed to explore the effectiveness of such instruction for high school students with limited programming experience. Three students from Bozeman High School were selected to work in a team with two undergraduates with the goal of creating an Android application. In the process, these students were exposed to Java, sorting algorithms, version control, and software development practices in an industry setting. We will discuss the challenges and rewards of this teaching method and the Software Factory for students so early in their computing education.

Welcome Seminar

Date/Time: Monday, August 24, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: John Paxton, Computer Science, Montana State University

Abstract: This seminar will provide new and continuing graduate students with useful information about the Computer Science Department. It will also provide an opportunity for graduate students to meet one another, the CS faculty, and the CS staff.

Learning Spectral Filters for Single- and Multi-Label Classification of Musical Instruments (PhD Defense)

Date/Time: Tuesday, July 28, 2015 from 9:00 a.m. - 10:00 a.m.

Location: EPS 126

Presenter: Patrick Donnelly, Montana State University

Abstract: Musical instrument recognition is an important research task in the area of music information retrieval. While many studies have explored the recognition of individual musical instruments in isolation, the field has only recently begun to explore the more difficult multi-label classification problem of identifying the musical instruments present in polyphonic mixtures. This dissertation presents a novel method for feature extraction in multi-label instrument classification and makes important contributions to the domain of instrument classification and to the general research area of multilabel classification.

In this work, we consider the largest collection of instrument samples to date in the musical instrument classification literature. We examine 13 musical instruments common to four datasets, including the first use of a dataset in this research domain. We consider multiple performers, multiple dynamic levels, and all possible musical pitches within the range of the instruments.

To the area of multi-label classification, we introduce a binary-relevance feature extraction scheme to couple with the common binary-relevance classification paradigm. This approach allows consideration of a unique feature space for each binary classifier, allowing selection of features unique to each class label. We present a data-driven approach to learning areas of spectral prominence for each instrument and use these locations to guide our binary-relevance feature extraction. We use this approach to estimate source separation of our polyphonic mixtures.

We contribute the largest study of single- and multi-label classification in musical instrument literature and demonstrate that our results track with or improve upon the results of comparable approaches in the literature. In our solo instrument classification experiments, we provide the seminal use of Bayesian classifiers in the domain, introduce the grid-augmented topology for naïve Bayes, and demonstrate the utility of conditional dependencies between frequency- and time-based features for the instrument classification problem. For multi-label instrument classification, we explore the question of dataset bias for polyphonic test sets derived from the monophonic training sets in a cross-validation study controlled for dataset independence. Additionally, we present the most comprehensive cross-dataset study in the instrument classification literature and demonstrate the generalizability of our approach.

Furthermore, we consider the difficulty of the multi-label problem with regard to label density and cardinality, present experiments with a reduced label set comparable to many studies in the literature, and demonstrate the efficacy of our system on this easier problem. We provide a comprehensive set of multi-label evaluation measures with the goal of aligning the instrument classification literature with the standard evaluation practices of the general multi-label community.

Bounding Rationality by Computational Complexity

Date/Time: Monday, May 4, 2015 from 4:10 p.m. - 5:00 p.m.

Location: Byker Auditorium, Chemistry and Biochemistry Building

Presenter: Lance Fortnow, Georgia Institute of Technology


Traditional microeconomic theory treats individuals and institutions as completely understanding the consequences of their decisions given the information they have available. These assumptions may not be valid, as we might have to solve hard computational problems to optimize our choices. What happens if we restrict the computational power of economic agents?

There has been some work in economics treating computation as a fixed cost or simply considering the size of a program. This talk will explore a new direction bringing the rich tools of computational complexity into economic models, a tricky prospect where even basic concepts like "input size" are not well defined.

We show how to incorporate computational complexity into a number of economic models including game theory, prediction markets, forecast testing, preference revelation and awareness.

This talk will not assume any background in either economics or computational complexity.

End of Year Celebration and Awards Ceremony

Date/Time: Monday, April 20, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108


We will reflect on some of our accomplishments from the 2014-2015 academic year.  Awards will be given to recognize achievement of graduate students (e.g. Outstanding Ph.D. Researcher, Outstanding GTA) and faculty (e.g. Researcher of the Year).  Light snacks and refreshments will be served.

Reactive Game Engine Programming for STEM Outreach

Date/Time: Monday, February 23, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Alan Cleary, Montana State University


Science, Technology, Engineering, and Mathematics (STEM) are pervasive in our society. For this reason it is important that we incorporate STEM topics in our education system. The crux of the problem is how to make these topics accessible to younger students in an engaging manner. We present our experiences using a novel programming style, reactive programming, to deliver a summer camp for students in grades 8 through 12. This software uses a declarative programming approach to allow students without a background in computing to explore a wide variety of subject material within a 3D virtual environment, including computer science, mathematics, physics, and art. This work is based on PyFRP, a reactive programming library written in Python. We describe our camp experience and provide examples of how this style of programming supports a wide variety of educational activities.


Predicting Metamorphic Relations for Testing Scientific Software: A Machine Learning Approach Using Graph Kernels

Date/Time: Monday, February 2, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Upulee Kanewala, Colorado State University


Comprehensive, automated software testing requires an oracle to check whether the output produced by a test case matches the expected behavior of the program. But the challenges in creating suitable oracles limit the ability to perform automated testing in some programs, including scientific software. Metamorphic testing is a method for automating the testing process for programs without test oracles. This technique operates by checking whether the program behaves according to a certain set of properties called metamorphic relations. A metamorphic relation is a relationship between multiple input and output pairs of the program. Unfortunately, finding the appropriate metamorphic relations required for use in metamorphic testing remains a labor-intensive task, which is generally performed by a domain expert or a programmer. This talk describes MRpred: an automated technique for predicting metamorphic relations for a given program. MRpred applies a machine learning based approach that uses graph kernels to create predictive models. MRpred achieves a high prediction accuracy, and the predicted metamorphic relations are highly effective in identifying faults in scientific programs.
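To make the idea of a metamorphic relation concrete, here is a minimal sketch of metamorphic testing itself. It is not MRpred and does not predict relations. The routine under test (a sample standard deviation), the deliberately faulty variant, and the two hand-picked relations are all hypothetical choices made for illustration.

```python
import math
import random


def sample_std(values):
    """Stand-in for a scientific routine: sample standard deviation."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))


def faulty_std(values):
    """Seeded fault for the demonstration: forgets to subtract the mean."""
    return math.sqrt(sum(v ** 2 for v in values) / (len(values) - 1))


def holds_under_permutation(func, values, rng):
    """Relation 1: reordering the input must not change the output."""
    shuffled = values[:]
    rng.shuffle(shuffled)
    return math.isclose(func(values), func(shuffled), rel_tol=1e-9)


def holds_under_shift(func, values, offset):
    """Relation 2: adding a constant to every input must not change the output."""
    shifted = [v + offset for v in values]
    return math.isclose(func(values), func(shifted), rel_tol=1e-9)


rng = random.Random(0)
values = [rng.uniform(0.0, 100.0) for _ in range(50)]

# No expected value for sample_std(values) is needed: only pairs of runs are compared.
correct_passes = (
    holds_under_permutation(sample_std, values, rng)
    and holds_under_shift(sample_std, values, 1000.0)
)
fault_detected = not holds_under_shift(faulty_std, values, 1000.0)
print("correct routine satisfies both relations:", correct_passes)
print("shift relation exposes the seeded fault:", fault_detected)
```

Note that the faulty variant still satisfies the permutation relation, which is why the choice of relations matters for fault detection.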

Topological Data Analysis and Road Network Comparison

Date/Time: Friday, January 30, 2015 from 4:10 p.m. - 5:00 p.m.

Location: Roberts 301

Presenter: Brittany Fasy, Tulane University


Vast amounts of data are routinely collected, and analyzing them effectively has become a central challenge we face across science and engineering. Topological data analysis (TDA) is a field that has recently emerged in order to tackle this challenge. This talk will focus on the problem of comparing two road networks (for example, to detect where and by how much a road network has changed over the course of a year). Surprisingly, only recently have distance measures between embedded graphs (representing road networks) been studied. We will see how one of the tools from TDA, namely, persistent homology, can be used to define a local distance measure between two graphs. Persistent homology describes the homology (in particular, the number of connected components and loops) of a data set, at different scales. An example to keep in mind is impressionistic paintings: at one scale, all that is seen are brush strokes; at a larger scale, the brush strokes blur together to form the subject of the painting. The (local) persistent homology distance measure is one of the first theoretically justified approaches to road network comparison. This talk should be accessible to both students and faculty.

How to Use CS to Become a Nuclear Physicist: Applying Computational Geometry to Reactor Physics

Date/Time: Friday, January 30, 2015 from 1:30 p.m. - 2:30 p.m.

Location: CS Conference Room

Presenter: David Millman, Lead Cloud Developer at ProductionPro


Simulating a nuclear reactor is challenging. Often, it involves many computers, working together to solve a very complex differential equation. While many methods exist for solving the equation, the most accurate are Monte Carlo (MC) methods. MC methods use Constructive Solid Geometry (CSG) to model a complex domain with high fidelity. Recent efforts to include feedback effects (e.g., depletion, thermal, xenon, etc.) have forced MC methods to calculate volumes and tight bounding boxes of spatial regions quickly and accurately.

In this talk, I describe a framework for approximating (to a user-specified tolerance) volumes and bounding boxes of regions given their equivalent CSG definition. The framework relies on domain decomposition to recursively subdivide regions until they can be computed analytically or approximated. While the framework is general enough to handle any valid region, it was optimized for models used in criticality safety and reactor analysis. For bounding boxes, this is the first algorithm that has strong enough accuracy guarantees yet is fast enough for use within a production-level nuclear reactor code. For volume calculations, numerical experiments show that the framework is over 500x faster and two orders of magnitude more accurate than the standard stochastic volume calculation algorithms currently in use.
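For context, the kind of stochastic volume calculation that the framework is compared against can be sketched as follows. This is only an illustration of the baseline idea (sample points in a bounding box and count hits), not the speaker's domain-decomposition framework or any production reactor code. The CSG helpers (sphere, half_space_above, intersection), the example region, and the sample count are all assumptions.

```python
import math
import random


def sphere(cx, cy, cz, radius):
    """CSG primitive: points inside a sphere."""
    return lambda x, y, z: (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2 <= radius ** 2


def half_space_above(z0):
    """CSG primitive: points with z >= z0."""
    return lambda x, y, z: z >= z0


def intersection(a, b):
    """CSG operator: points inside both regions."""
    return lambda x, y, z: a(x, y, z) and b(x, y, z)


def estimate_volume(region, box, samples, rng):
    """Monte Carlo estimate: box volume times the fraction of samples inside."""
    (x0, x1), (y0, y1), (z0, z1) = box
    box_volume = (x1 - x0) * (y1 - y0) * (z1 - z0)
    hits = 0
    for _ in range(samples):
        point = (rng.uniform(x0, x1), rng.uniform(y0, y1), rng.uniform(z0, z1))
        if region(*point):
            hits += 1
    return box_volume * hits / samples


# Hypothetical region: the upper half of a unit sphere (exact volume 2*pi/3).
hemisphere = intersection(sphere(0.0, 0.0, 0.0, 1.0), half_space_above(0.0))
box = ((-1.0, 1.0), (-1.0, 1.0), (-1.0, 1.0))
estimate = estimate_volume(hemisphere, box, 100_000, random.Random(42))
print("estimate:", estimate, "exact:", 2 * math.pi / 3)
```

The error of such an estimate shrinks only with the square root of the number of samples, and a loose bounding box wastes samples, which is one reason tight bounding boxes and faster volume methods are of interest.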

Virtual Reality and Cybersickness

Date/Time: Monday, January 26, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Lisa Rebenitsch, Michigan State University


Often, familiarity with virtual reality is due to movies and shows such as Star Trek, The Matrix, Sword Art Online, and Iron Man's holograms. Real world virtual reality exists beyond flight simulators, and Oculus Rift has increased interest in the field. However, virtual reality implementations differ from television versions. Most rely strictly on sight with some including sounds and, more rarely, touch. Two common paradigms in virtual reality are projection screen systems such as CAVEs and visor display systems such as Oculus Rift. Applications for virtual reality include medical training, military training, museums, collaboration, design, and entertainment.

One safety issue inhibiting the use of virtual reality is cybersickness, or the feeling of motion sickness-like symptoms in virtual environments. For example, three-dimensional and shaky-camera movies have led to reports of "movie theater sickness." The likelihood and severity of these symptoms increase in virtual environments. The source of the issue is unclear, with research in the field posing over forty potential factors. Factors lie in three categories: individual, hardware, and software. Prior attempts at predicting cybersickness include the Cybersickness Dose Value (CSDV) and Kolasinski's linear model. The CSDV correlates well with cybersickness, but only includes software factors. Kolasinski's model explains 34% of the variance and excludes individual factors. Cybersickness is highly individual, with a resistant population upwards of 50%. New models that use individual characteristics and include the effect of resistance are needed. Statistical and modeling methods called zero-inflated models are examined for better comparison of factors and prediction of cybersickness.

Privacy in Social Computing and Mobile Networking

Date/Time: Friday, January 23, 2015 from 4:10 p.m. - 5:00 p.m.

Location: Roberts 301

Presenter: Na Li, Northwest Missouri State University


With the development of web and wireless technologies and mobile devices, more and more people are conducting their daily activities online. These activities generate large amounts of data, including sensitive information that people are not willing to share with others. Therefore, the disclosure of users’ private information has become a serious concern. This talk will focus on preserving users’ privacy in social media and mobile networks. Specifically, two projects will be introduced. One is to design a privacy-aware friend search engine in Online Social Networks (OSN), and the other is to preserve users’ relationship privacy when OSN operators share data with third parties. Additionally, this talk will briefly discuss the problems of privacy disclosure in mobile networks.

Classification of Musical Instruments

Date/Time: Wednesday, January 21, 2015 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Patrick Donnelly, Montana State University


Musical instrument classification is an important task in the area of Music Information Retrieval. While there have been many approaches to recognize individual instruments, the majority of these are not extensible to the more complex case of identifying the musical instruments present in polyphonic mixtures. We present a data-driven clustering technique for learning regions of spectral prominence in an instrument's timbre, exploiting these regions as spectral filters in the feature extraction stage of a binary relevance classification task. We demonstrate the approach over several large datasets consisting of multiple articulations, dynamics, and performers, validating the approach across datasets and with several classifiers. Lastly, we discuss ongoing work in the extension of these spectral filters for source separation estimation in identification of instruments present in polyphonic mixtures.

Seminars from 2014.