For a planned list of upcoming speakers see this Google Calendar.

Automated Prediction and Curation of Bio-Ontology Terms

Date/Time: Monday, May 8, 2017 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Indika Kahanda

Abstract: A key component of Precision Medicine is to take into account the individual variability in genes for disease treatment. The success of this process is highly dependent on the reliability of large-scale biological databases. Typically, these databases are manually curated by professional biocurators who extract the information on biological entities from biomedical literature in the form of standard vocabularies called bio-ontologies. However, this process is highly resource consuming and thus leads to the incompleteness of these databases. Furthermore, wet-lab experiments that are used to generate evidence on many different biological entities such as proteins are also highly resource consuming in nature. We identify these as two of the major bottlenecks of this pipeline and attempt to answer two questions: (1) can we develop accurate high-throughput computational tools for predicting bio-ontology terms, and (2) can we automate the process of biocuration using natural language processing techniques? In this talk, I will describe two projects, involving automated prediction and curation of Human Phenotype Ontology (HPO) and Gene Ontology (GO) terms, that attempt to provide answers to these questions.

Bio: Dr. Indika Kahanda is an Assistant Teaching Professor in the Gianforte School of Computing at Montana State University. His research interests include Bioinformatics and Biomedical Natural Language Processing. He works on the application of machine learning, data mining and natural language processing techniques for solving problems related to large-scale biological data. His current work focuses on predicting mental illness categories for biomedical literature, protein function prediction and protein-function relation extraction from biomedical literature. He received his Ph.D. in Computer Science from Colorado State University in 2016 in the area of Bioinformatics, a Master of Science in Computer Engineering from Purdue University in 2010, and a Bachelor of Science in Computer Engineering from University of Peradeniya, Sri Lanka in 2007.

Transport Profiling for Big Data Transfer Over Dedicated Channels

Date/Time: Thursday, April 27, 2017 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Daqing Yun

Abstract: Extreme-scale scientific applications in various domains such as earth science and high energy physics among multiple national laboratories within the U.S. are generating colossal amounts of data, now frequently termed “big data”, which must be stored, managed and moved to different geographical locations for distributed data processing and analysis. High-performance networks featuring high bandwidth and advance reservation are being developed and deployed to support such scientific applications. However, even if a dedicated channel is provisioned, the end-to-end data transfer performance still largely depends on the transport protocols being used on the end hosts, and maximizing their throughput performance is still very challenging, mainly because: i) their optimal operational zone is affected by the configurations and dynamics of the network, the end hosts, and the protocol itself, ii) their default parameter setting does not always yield the best performance, and iii) application users, who are domain experts, typically do not have the necessary knowledge to choose which transport protocol to use and which parameter value to set.
    We design and develop a network connection profiler named “Transport Profile Generator” (TPG) to characterize and enhance the end-to-end throughput performance of a specifically selected data transfer protocol for big data movement over high-speed dedicated network connections. TPG employs an exhaustive search-based profiling approach to sweep through the combinations of parameter settings and enables users to determine the “best” set of parameter values for the optimal data transfer performance. To improve the efficiency of transport profiling, we propose a stochastic approximation-based profiling method, referred to as FastProf, which employs the Simultaneous Perturbation Stochastic Approximation (SPSA) algorithm to accelerate the exploration of the parameter space. Furthermore, we extend the “fast” profiling approach to other transport protocols and propose a profiling optimization-based data transfer advisor to help end users determine the most effective data transfer method with the most appropriate control parameter values to achieve the best data transfer performance.
    In this talk, I will introduce our profiling approach to explore the optimal operational zone of a data transfer protocol in a given network environment and then present extensive experimental results of both TPG and FastProf collected in various network environments including a 10 Gb/s back-to-back connection in our local testbed, 10 Gb/s emulated long-haul connections with various RTT delays at Oak Ridge National Laboratory, and 10 Gb/s physical connections with both short and long delays from Argonne National Laboratory to the University of Chicago.

Bio: Daqing Yun received his Ph.D. degree in computer science from New Jersey Institute of Technology in August 2016. He is currently an assistant professor at Harrisburg University of Science and Technology. His research interests include high-performance networking, parallel and distributed computing, green networking, and big data.

High-Performance Computing and its Application in Power System Dynamic Simulation

Date/Time: Monday, April 17, 2017 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Shuangshuang Jin

Abstract: Dynamic simulation for transient stability assessment is one of the most important computational tasks that affect the secure operation of the bulk electric power system. However, modeling the system dynamics and network involves the computationally intensive time-domain solution of numerous differential and algebraic equations (DAE), which limits the ability to operate a much-evolved power system with significant dynamic and stochastic behaviors introduced by the increasing penetration of renewable generation and the deployment of smart grid technologies.

Modern High Performance Computing (HPC) holds the promise to accelerate power system applications by parallelizing their kernel algorithms without compromising computational accuracy. The improved performance is expected to have a significant impact on online power grid dynamic security assessment, ultimately leading to better reliability and asset utilization for the power industry.

This talk will introduce the basic structure of the power system, the HPC concept, and its application to power system dynamic simulation, discuss how to utilize advanced computing techniques for real-time power grid modeling and simulation, and present research outcomes of some parallel power system dynamic simulation applications.

Bio: Dr. Shuangshuang Jin is a senior research scientist in the Electricity Infrastructure Group of Pacific Northwest National Laboratory. Her research interests include high-performance computing, parallel programming, advanced grid analytics, and computer modeling and visualization. She has authored or coauthored 30+ journal articles and conference papers in the areas of Computer Science, Power Engineering, and Bioinformatics. She received her M.S. in Computer Science with a specialty in Computer Graphics and Visualization, and her Ph.D. in Computer Science with a specialty in Scientific Computation, from Washington State University in 2003 and 2007, respectively.

Folds, Intersections, and Inflections: Seven Ways to Distinguish a Cylinder from a Möbius Band

Date/Time: Monday, April 10, 2017 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 103
Speaker: Tom Banchoff

Abstract: This talk develops seven different visual ways to distinguish whether a strip neighborhood of a curve on a surface is an oriented cylinder or a non-orientable Möbius band. Computer graphics illustrations will explore fold curves of projections of surfaces into planes, self-intersection curves of surfaces in three space, and a new criterion in terms of surface inflections.

Bio: Thomas Francis Banchoff is an American mathematician specializing in geometry. He is an emeritus professor at Brown University, where he has taught since 1967. He is known for his research in differential geometry in three and four dimensions, for his efforts to develop methods of computer graphics in the early 1990s, and for his pioneering work in methods of undergraduate education utilizing online resources.

Banchoff attended the University of Notre Dame and received his Ph.D. from UC Berkeley in 1964, where he was a student of Shiing-Shen Chern. Before going to Brown he taught at Harvard University and the University of Amsterdam. In 2012 he became a fellow of the American Mathematical Society. He was president of the Mathematical Association of America from 1999 to 2000.

Anomaly Detection Through Spatio-Temporal Data Mining, with Application to Real-Time Outlying Sensor Identification

Date/Time: Monday, April 3, 2017 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Doug Galarus

Abstract: There is a need for robust solutions to the challenges of real-time spatio-temporal outlier and anomaly detection. In our dissertation, we define and demonstrate quality measures for evaluation and comparison of overlapping, real-time, spatio-temporal data providers and for assessment and optimization of data acquisition, system operation and data redistribution. Our measures are tested on real-world data and applications, and our results show the need and potential to develop our own mechanisms for outlier and anomaly detection. We then develop a representative, real-time solution for the identification of outlying sensors that far outperforms state-of-the-practice methods in terms of accuracy and is computationally efficient. When applied to a real-world, meteorological data set, we identify numerous problematic sites that otherwise have not been flagged as bad. We identify sites for which metadata is incorrect. We identify observations that have been mislabeled by provider quality control processes. And, we demonstrate that our method outperforms enhanced versions of state-of-the-practice methods for assessment of accuracy using comparable or less computation time. There are many quality-related problems with real data sets and, in the absence of an approach like ours, these problems may have largely gone unidentified. Our approach is novel for the simple but effective way that it accounts for spatial and temporal variation, and that it addresses more than just accuracy. Collectively these contributions form an overarching data-mining framework and example that can be used and extended for data-mining method development, model building and evaluation of spatio-temporal outlier and anomaly detection processes.

Bio: For the past 13 years, Doug Galarus has grown a nationally-recognized, award-winning research program, has supervised numerous students and staff, and has overseen multiple labs at the Western Transportation Institute at Montana State University. In his “spare time”, Doug has worked towards a PhD in Computer Science. Doug is also an accomplished educator, having taught both mathematics and computer science courses at the college level. Doug has taught and led certification and continuing education programs, and worked on several nationally-published curriculum projects, developing mathematics texts and technology for middle school and high school students. Doug has an active teaching certificate for mathematics and computer science in grades 5-12. In total, Doug has 27 years of professional experience in systems engineering, information technology development, testing, implementation, management and instruction. He has extensive experience as the project manager and technical lead for mobile data communications systems, database-driven web sites, web site design, desktop applications, kiosk development, smartphone and tablet-based development, and interactive multimedia.

Deep Neural Networks for Artificial Intelligence: Talking with Machines

Date/Time: Monday, March 27, 2017 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Larry Heck

Abstract: Neural networks have been a topic of research for many decades. However, neural networks have only recently begun to achieve widespread adoption. In this talk, I will give my perspective on 'why now?' and highlight recent R&D on a special class of neural networks called 'deep neural networks'. In the second part of the talk, I will focus on a frontier for deep learning research - talking to machines. Natural conversational interfaces have long been viewed as a watermark for intelligent systems, including the often-cited Turing Test. I will give an overview of our work within Google Research on how we are leveraging deep learning to make rapid advancements in this area.

Bio: Dr. Larry Heck is Director of Research of the Deep Dialogue team at Google, an advanced R&D effort behind the Google Assistant. From 2009 to 2014, he was the Chief Scientist of the Microsoft Speech products team and later a Distinguished Engineer in Microsoft Research. In 2009, he co-founded the initiative that led to Microsoft’s Cortana personal assistant. From 2005 to 2009, he was Vice President of Search & Advertising Sciences at Yahoo!, responsible for the creation, development, and deployment of the algorithms powering Yahoo! Search, Yahoo! Sponsored Search, Yahoo! Content Match, and Yahoo! display advertising. From 1998 to 2005, he was with Nuance Communications and served as Vice President of R&D, responsible for natural language processing, speech recognition, voice authentication, and text-to-speech synthesis technology. He began his career as a researcher at the Stanford Research Institute (1992-1998), initially in the field of acoustics and later in speech research with the Speech Technology and Research (STAR) Laboratory. Dr. Heck received the PhD in Electrical Engineering from the Georgia Institute of Technology in 1991. He is a Fellow of the IEEE and has over 50 United States patents.

Biodiversity and Databases: The Odd Couple?

Date/Time: Monday, March 20, 2017 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Dave Roberts

Abstract: Ecologists worldwide are concerned about the loss of biodiversity at local, regional, and global scales. Determining the distribution and abundance of species by site is a primary activity of ecologists everywhere. Computer scientists are concerned with the efficient access and storage of information and the design of user interfaces to facilitate that access. Biodiversity databases offer the potential to capitalize on the work of computer scientists to address global biodiversity concerns. This seminar is an effort to bridge the gap between ecologists and computer scientists and to possibly recruit students and faculty with an interest in contributing to global sustainability.

Bio: Dave Roberts is a vegetation ecologist with extensive experience in the vegetation of the northern Rocky Mountains. He is a member of the Executive Committee of the Panel for the National Vegetation Classification (which seeks to document the entirety of the vegetation of the continental US) and a journeyman computer programmer with an interest in multivariate analysis, geographic information systems (GIS) and database design. Dave is currently the Head of the Ecology Department at MSU.

Flux Analysis of a Metabolic Network in Cells Stimulated by Compression

Date/Time: Monday, March 6, 2017 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Ron June and Daniel Salinas

Abstract: Cells are the fundamental units of life, and most cells use glucose to produce energy and various precursors for the machinery needed to operate. We have developed a network model of glucose metabolism. We apply this model to cartilage cells. Cartilage cells are compressed during everyday activities such as walking. We use experimental data from in vitro chondrocytes subject to sinusoidal compression. We determine changes in metabolic flux and present future directions for this work.

June Bio: Ron June has longstanding research interests in osteoarthritis and biomechanics related to improving human health. At Dartmouth College Dr. June studied Engineering Sciences focused on biomechanics and developed a novel wrist protection strategy, contributed to the design and manufacture of a system for monitoring 3D head accelerations in helmeted sports, and helped to develop a finite element model to understand the biomechanics of spinal pain in rats. As a graduate student at the University of California, Davis, Dr. June studied cartilage biomechanics. Specifically, he investigated a novel mechanism of cartilage flow-independent material properties. During the course of this project, he discovered novel biomechanical phenomena and made several experimental observations that are consistent with polymer dynamics as a potential physiological mechanism of cartilage viscoelasticity. As a postdoctoral fellow, Dr. June implemented a surgical model of mouse osteoarthritis and studied protein transduction. He developed a pH-sensitive system for intracellular delivery of macromolecules and has investigated protein transduction in cartilage and chondrocytes. Dr. June’s laboratory at Montana State University was completed in March 2012, and his research involves synovial joint drug delivery and mechanotransduction. Dr. June has been named a GAANN Fellow, NIH Kirschstein Fellow, and the Montgomery Street Scholar by the ARCS Foundation. His long-term research interests lie in understanding cartilage and joint mechanobiology to develop novel therapeutic strategies for joint disease.

Salinas Bio: Daniel Salinas is a Ph.D. student at the Gianforte School of Computing. He is co-advised by Drs. Mumey and June. He has been in the Ph.D. program three years, following the completion of his M.S., also from the Computer Science department where his thesis work examined minimal cuts in metabolic networks. His research interests are metabolic networks and metabolic flux analysis.

Use of a Modeling and Simulation Framework to Identify and Quantify Emergent Behavior in System of Systems Simulations

Date/Time: Monday, February 27, 2017 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Mary Ann Cummings

Abstract: This presentation describes a Modeling and Simulation (M&S) framework for building System of Systems (SoS) simulations, known as Orchestrated Simulation through Modeling (OSM). This framework allows Discrete Event System Specification (DEVS) M&S components and output visualizations to be developed separately as plug-ins and combined to form a complete system. Independently developed plug-ins can be added and removed as desired to dramatically change the system. With the OSM framework, an evolutionary System of Systems can be intelligently created by a community. Each community member only needs to fully understand the pieces they develop. With this framework, we can now define a software architecture that allows the collection and graphing of SoS metrics in one location such that these metrics can then be used to evaluate the emergent behavior of the SoS. This can be accomplished by architecting swappable and reusable Simulators and Experimental Frames so that these elements can be changed without requiring changes to any of the other elements, including the models. This research involves determining whether these collected metrics will enable the identification and analysis of emergent behavior among the interactions of the models (component systems).

Bio: Dr. Mary Ann Cummings earned her Ph.D. from the Naval Postgraduate School in 2015.  Her research interests include software frameworks, software reuse, modeling and simulation software, and formal methods.  Dr. Cummings works for the Naval Surface Warfare Center as a Principal Computer Scientist/GS15.

Useful Math for Data Analytics (that most students forget before their first jobs)

Date/Time: Monday, February 13, 2017 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Mark Pratt

Abstract: Machine learning tools applied to large and noisy data sets can be extremely powerful and require little preparation to use. However, they are usually difficult to interpret. Sometimes simpler is better. In this talk, we will go over some general methods useful to explore and get first quantitative results from partially understood data sets. The general theme will be transforming non-linear problems to linear ones that have meaningful outputs and predictable (and usually short) compute times. The methods are simple, powerful and should be in everyone’s analysis toolbox but, in practice, are not. Many students entering an analytical profession have already seen these fundamentals but have forgotten them or skip them in favor of high-powered techniques.

Bio: Dr. Mark Pratt is a physicist by training, data analyst by nature and system engineer by habit. He has held a number of technical leadership positions in science and engineering spanning astronomy and astrophysics, telecommunications, lasers and photonics, instrumentation and genomics. Since 2006, Mark has been generally focused on the development of low-cost DNA sequencing as Principal Engineer at Solexa and Illumina and later on improving the accuracy of DNA sequencing applications at Personalis and 10X Genomics. He is currently CTO of a startup still in stealth mode. Mark received his PhD in Physics from UC Santa Barbara, has 19 issued patents and a number of publications.

Coordination and Data Analytics for Networked Systems

Date/Time: Friday, February 10, 2017 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Stacy Patterson

Abstract: Networked systems are systems composed of dynamic agents that interact over a network. Examples of networked systems range from sensor networks to autonomous robotic networks to the millions of networked components within a single robot. In the first part of this talk, I will present recent theoretical results on coordination in networked systems, i.e., how can a group of agents efficiently reach and maintain agreement.  The second part of this talk addresses the challenge of how to efficiently extract and summarize data generated by a networked system, specifically, robotic tactile skins. Finally, I will discuss how tools and results for network coordination and data analytics can be combined to develop solutions for scalable, distributed data analytics in the Internet of Things.

Bio: Stacy Patterson is the Clare Boothe Luce Assistant Professor in the Department of Computer Science at Rensselaer Polytechnic Institute. She received the MS and PhD in computer science from the University of California, Santa Barbara in 2003 and 2009, respectively.  From 2009-2011, she was a postdoctoral scholar at the Center for Control, Dynamical Systems and Computation at the University of California, Santa Barbara. From 2011-2013, she was a postdoctoral fellow in the Department of Electrical Engineering at Technion – Israel Institute of Technology. Dr. Patterson is the recipient of a Viterbi postdoctoral fellowship, the IEEE Control Systems Society Axelby Outstanding Paper Award, and an NSF CAREER award.  Her research interests include distributed systems, machine learning, sensor networks, and the Internet of Things.

Secure Geometric Search on Encrypted Spatial Data

Date/Time: Monday, February 6, 2017 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Boyang Wang

Abstract: Geometric range search is a fundamental primitive for spatial data analysis in SQL and NoSQL databases. It has extensive applications in Location-Based Services, computational geometry, and computer-aided design. Due to the dramatic increase of data size, it is necessary for companies and organizations to outsource their spatial datasets to third-party cloud services (e.g. Amazon) in order to reduce storage and query processing costs, while ensuring that no private information leaks to the third party. Searchable encryption is a technique to perform meaningful queries on encrypted data without revealing private information. However, geometric range search on spatial data has not been fully investigated nor supported by existing searchable encryption schemes. The main challenge is that compute-then-compare operations required by geometric range search cannot be supported by any existing crypto primitives. In this talk, I will present my recent research in secure geometric range search over encrypted spatial data. The general approach is to adopt new representations of spatial data, and transform geometric range search to avoid compute-then-compare operations, so that existing efficient crypto primitives can be integrated. I will present two designs: the first one focuses on circular range search, and the second one can handle arbitrary geometric range queries. The security of both schemes is formally proven under standard cryptographic assumptions. Finally, I will briefly mention some of my future research plans.

Bio: Boyang Wang is a Ph.D. Candidate in the Department of Electrical and Computer Engineering at the University of Arizona. He received his first Ph.D. degree in Cryptography in 2013 and his B.S. degree in Information Security in 2007, both from Xidian University, China. He worked for Bosch Research & Technology Center as a research intern in 2015. He was a visiting student at the University of Toronto and Utah State University. His research interests include applied cryptography, information security and privacy-preserving techniques with focuses on data security and privacy. He has published over 20 research papers in top journals and conferences, including TIFS, TDSC, TSC, TPDS, INFOCOM, CNS, ACM ASIACCS, and ICDCS.

Towards End-to-End Security and Privacy: Accountability and Data Privacy in the Life Cycle of Big Data

Date/Time: Monday, January 30, 2017 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Taeho Jung

Abstract: The advent of big data has given birth to numerous innovative life-enhancing applications, but big data is often called a double-edged sword due to the increased privacy and security threats. Such threats, if unaddressed, will become deadly barriers to the achievement of the big opportunities and success anticipated in the big data industry because they may arise at any part of the life cycle of big data.

In this talk, I will describe my research, which addresses various privacy and security issues in the big data life cycle: acquisition, storage, provisioning, and consumption. More specifically, I will present how to make large-scale data trading accountable against dishonest users for the provisioning phase of big data. Subsequently, I will briefly present how various types of data can be protected in the acquisition and consumption phases of the life cycle, and finally I will introduce the theoretical foundations of the presented research.

Bio: Taeho Jung is a Ph.D. candidate in Computer Science at Illinois Institute of Technology, advised by Professor Xiang-Yang Li. His research area, in general, includes privacy and security issues in data mining and provisioning in the big data life cycle. One of his papers won a best paper award (IEEE IPCCC 2014), and two of his papers were selected as best paper candidate (ACM MobiHoc 2014) and best paper award runner-up (BigCom 2015), respectively. He has served as a TPC member for many international conferences, including IEEE DCOSS 2016, IEEE MSN 2016, IEEE IPCCC 2016, and BigCom 2016.

Intelligent tracking of moving objects by cracking the neural code for visual motion

Date/Time: Friday, January 27, 2017 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Neda Nategh

Abstract: A particularly difficult aspect of object tracking in artificial vision systems occurs when the observer itself is moving, producing a confounded motion pattern that must be disentangled to reliably signal the object motion. While machine vision systems have improved manifold in their capabilities, they are still challenged by a trade-off between runtime, efficiency, accuracy, robustness, and flexibility, especially to handle real-world complexities such as object occlusions, multiple moving objects, and varying scene statistics. On the other hand, our biological visual system is capable of performing similar motion detection and discrimination tasks reliably every moment that we are awake to compensate for constant eye movements of different sorts. Employing a statistical model-based approach driven by data, we are able to characterize the time-varying information conveyed by retinal and cortical spike responses during an eye movement task (encoding) and understand a readout mechanism by which downstream neurons can extract relevant motion information in the scene (decoding), all in a statistically optimal computational framework. Moreover, employing deep convolutional neural networks (CNN) whose computational units and connectivity are set to mimic the biophysical properties of our statistically optimal model, we will be able to generalize to real-world motion stimuli. This model-based approach to understanding the neural code of visual motion may ultimately lead to intelligent motion computing schemes that will advance the state of the art in machine vision from a moving platform, including autonomous vehicles, mobile robotic systems, and assistive technology for visually impaired people.

Bio: Neda Nategh has been an Assistant Professor of Electrical and Computer Engineering at Montana State University since January 2014. She obtained her Ph.D. in electrical engineering, her M.Sc. in electrical engineering, and her M.Sc. in statistics, all from Stanford University, and her B.Sc. in electrical engineering from Sharif University of Technology. She also holds a certificate in Biophysics and Computation in Neurons and Networks from the Neuroscience Institute at Princeton University. She conducts research in the areas of signal, image and information processing, and statistical machine learning, with particular emphasis on computational neuroscience, and biological and machine vision. She has been granted one US patent from her research internship in the Camera Algorithm group at Apple Inc., CA.

Sensitive and fast DNA homology search with profile HMMs in HMMER

Date/Time: Monday, December 5, 2016 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 103
Speaker: Travis Wheeler

Abstract: Sequence database homology searches are an essential part of molecular biology, providing information about the function and evolutionary history of proteins, RNA molecules and DNA sequence elements. I will describe a tool for DNA/DNA sequence comparison that is built on the HMMER framework, which applies probabilistic inference methods based on hidden Markov models to the problem of homology search. This tool, called nhmmer, enables improved detection of remote DNA homologs, and has been used in combination with Dfam and RepeatMasker to improve annotation of transposable elements in the human genome. I will then describe an algorithm, based on the Burrows-Wheeler Transform, that speeds up one simple but time-consuming part of nhmmer, yielding more than an order of magnitude acceleration over a highly optimized implementation.

Bio: Travis Wheeler is an Assistant Professor in the University of Montana Computer Science Department, where his group develops methods in computational biology, with an emphasis on genomic sequence analysis. For the most part, that involves development of algorithms that increase the speed, power, and accuracy of sequence database homology search using profile hidden Markov models, and application of these methods to topics motivated by biology, especially those involving transposable elements and regulatory elements. Travis earned his Bachelors in Evolutionary Biology from the University of Arizona in 1995. He spent several years in industry and academia as a telecom and web software developer, then earned a PhD in Computer Science in 2009, under the guidance of John Kececioglu and Mike Sanderson at the University of Arizona. He worked in Sean Eddy's group (HHMI Janelia Research Campus) as a postdoc and software engineer until 2014, when he moved to his current position.

PHENOstruct: Prediction of human phenotype ontology terms using heterogeneous data sources

Date/Time: Monday, November 28, 2016 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Indika Kahanda

Abstract: The human phenotype ontology (HPO) was recently developed as a standardized vocabulary for describing the phenotype abnormalities associated with human diseases. At present, only a small fraction of human protein-coding genes have HPO annotations. However, researchers believe that a large portion of currently unannotated genes are related to disease phenotypes. Therefore, it is important to predict gene-HPO term associations using accurate computational methods. In this talk I will present PHENOstruct, a novel method based on the structured SVM approach for automatically predicting HPO terms for a given gene. Furthermore, I will highlight a collection of informative data sources suitable for the problem of predicting gene-HPO associations, including large-scale literature mining data.

Bio: Dr. Indika Kahanda is an Assistant Teaching Professor in the Gianforte School of Computing at Montana State University. His research interests include Bioinformatics and Biomedical Natural Language Processing. He works on the application of machine learning, data mining and natural language processing techniques for solving problems related to large-scale biological data. His current work focuses on predicting mental illness categories for biomedical literature, prediction of MicroRNA genes and targets using machine learning, protein function prediction and protein-function relation extraction from biomedical literature. He received his Ph.D. in Computer Science from Colorado State University in 2016 in the area of Bioinformatics, a Master of Science in Computer Engineering from Purdue University in 2010, and a Bachelor of Science in Computer Engineering from University of Peradeniya, Sri Lanka in 2007.

Some elementary applications of algebraic topology

Date/Time: Monday, November 21, 2016 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 103
Speaker: Ryan Grady

Abstract: In this talk I will prove some fundamental results in (linear) algebra using tools from algebraic topology. Specifically, I will discuss a result of Perron on the spectrum of positive matrices; note that this result is necessary to show completeness of the Google PageRank algorithm.

Bio: Ryan Grady is an assistant professor in the Department of Mathematical Sciences. He obtained his PhD from the University of Notre Dame under Stephan Stolz, before being a postdoctoral researcher at Boston University and the Perimeter Institute for Theoretical Physics. His research interests include algebraic topology and mathematical aspects of quantum field theory.

Agility and Software Architecture: Why Successful Teams Should Master Both

Date/Time: Monday, November 14, 2016 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Ipek Ozkaya

Abstract: In our increasingly software-enabled society, change is the norm rather than the exception. Technologies, business priorities, quality concerns, and team and organizational structures change perpetually. Successful software organizations are those that empower their teams with fundamental skills to adapt to such changes. Within a short duration of five years we have seen the software industry chase after service-oriented architecture, cloud computing, and microservice architecture. The fundamental problem that these approaches purport to solve ironically remains unsolved: enabling agility with minimal business impact. In this presentation, drawing from the work we do at the Software Engineering Institute, I will discuss how mastering agility and software architecture affords cross-functional teams the greatest likelihood for success. I will discuss why the increasing rate of change in the software business must motivate a consequent change in our approaches to software development. I will in particular focus on practices, research progress and challenges in enabling software engineers to generate and utilize software development data towards this goal.

Bio: Ipek Ozkaya is a principal researcher and the deputy technical lead for the Architecture Practices (AP) initiative at the Software Engineering Institute at Carnegie Mellon University. She works to develop, apply, and communicate effective methods for software architecture and agile and iterative development to improve software development efficiency. Her most recent work focuses on building the theoretical and empirical foundations of managing technical debt in large-scale, complex software-intensive systems. While at the SEI she has had the privilege of working with a wide variety of government and industry organizations, helping them improve their software architecture practices. In addition, Ozkaya serves as the chair of the advisory board of the IEEE Software magazine and as an adjunct faculty member for the Master of Software Engineering Program at Carnegie Mellon University. She is the co-author of several articles as well as a frequent invited speaker on software architecture and related topics. She holds doctoral and master's degrees in Computational Design from Carnegie Mellon University.

A Survey on Monitoring Network Flows

Date/Time: Monday, November 7, 2016 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Samuel Micka

Abstract: Flows are found in many fields of research that deal with objects moving through a network. Flows can be traffic moving through a roadway, packets moving through the internet, or people walking on trails. Multiple flows moving through a single network make it increasingly difficult to monitor the paths that the flows take. This survey provides a summary and analysis of three solutions for monitoring different flows in computer networks. Two additional, complementary papers are considered and evaluated as well: one focusing on the selection of management/monitoring nodes in dynamic networks, and one focusing on decomposing multi-path flows into single paths. We examine the methods used to solve the problems as well as the implications of the research and future work.

Bio: Samuel Micka is a PhD student in the Computer Science Department, advised by Dr. Brendan Mumey and Dr. Brittany Fasy. He is a member of the Networks + Algorithms lab and his research involves finding algorithmic solutions and applying them to the field of computer networks.

Autonomous and connected highway vehicles: what’s passed, and your future

Date/Time: Monday, October 24, 2016 from 4:10 p.m. - 5:00 p.m.
Location: Barnard Hall 108
Speaker: Craig Shankwitz

Abstract: The appearance of carrier-phase RTK GPS in 1994 created a new paradigm for ground transportation: autonomous vehicles. Dual-frequency, carrier-phase RTK GPS appearing 4 years later made it a reality. The presentation will highlight RTK-GPS based driver assist and autonomy, its present and future role in an autonomous vehicle world presently dominated by machine vision and lidar, and near- and long-term opportunities for MSU in this extremely disruptive environment where Uber is now worth more than Ford.

Bio: Dr. Craig Shankwitz serves as a Senior Research Engineer for the Connected Vehicle Initiative at the Western Transportation Institute at Montana State University.  He leads the development of a WTI research team that will explore and develop applications of autonomous and connected vehicle technologies to roads and transportation systems in rural areas and small cities.

His research interests include man-machine interaction, vehicle-driver interfaces, sensors, human-in-the-loop control systems, and non-linear vehicle dynamic problems in general. Most recently, Dr. Shankwitz was a principal R&D engineer at MTS Systems in Eden Prairie, MN, where one of his tasks was to design and develop a patented, robotic motorcycle rider which can be used for testing in a wide variety of applications. Prior to MTS, Dr. Shankwitz served as a Research Associate Professor and the Director of the Intelligent Vehicles Lab at the University of Minnesota. The focus of the IV Lab was the deployment of technology which simultaneously improves mobility and safety for the ground transportation network. Deployments include DGPS- and radar-based Driver Assist Systems for seven Alaska DOT snow-removal machines (to clear runways, roads, and mountain passes in Alaska), ten buses equipped with Driver Assist Systems for narrow bus-only-shoulder operations in the Twin Cities Metropolitan area, and a radar-based rural intersection collision avoidance system which assists drivers in safely negotiating rural expressway intersections.

Shankwitz received his Ph.D. in Electrical Engineering from the University of Minnesota in 1992 in the area of control theory, a Master of Science in Mechanical Engineering from the University of Illinois, Champaign-Urbana in 1985, and a Bachelor of Science in Mechanical Engineering from Iowa State University in 1983. He holds seven patents, with two pending.

Student Summer Internship Presentations

Date/Time: Monday, September 26, 2016 from 4:10 p.m. - 5:00 p.m.

Location: Barnard Hall 108

Speaker: Sean Yaw
Abstract: Blackmore Sensors and Analytics is a 3D imaging company developing advanced laser ranging (lidar) systems and analysis algorithms to support a broad space of applications.  Lidar systems produce a precise cloud of 3D geospatial points representing the scanned scene. Realizing the full potential of this data requires novel data analytics to track objects, identify targets, and index into sophisticated data management schemes. This talk will introduce the base technology generating the data, as well as outline the analytical challenges being faced and some strategies developed to address them.

Speaker: Guangchi Liu
Abstract: With the growth of online business, many enterprises now own large amounts of commercial data generated by customers and clients. These companies look for ideas and insights from this data to help improve their products and services and to inform future decisions and strategies. In light of this huge market, many tech and consulting companies provide data analysis services for those enterprises. This summer, I interned at Stratifyd, Inc., a data analysis startup located in NC. My job was to conduct sentiment analysis and Chinese word segmentation on the comments and reviews of particular products and services, and to analyze customers' opinions of these products and services. By using NLP and data mining techniques including regression models, probabilistic models and neural network models, I have obtained promising as well as interesting results from the data.

Speaker: Clint Cooper
Abstract: Stanford Research Institute International (SRI) is an independent research and development organization interested in the creation of innovative technology and business solutions for government agencies and commercial businesses. For more than 70 years, they have been developing the latest technology in many fields including Health, Technology, Education and Business. In the field of technology, they have developed many notable technologies: LCD, Optical Video Disks (Basic CD-ROM), CMOS, the computer mouse, Top Level Domain names, LED, Fax Machines, 911-Call System, Ultrasound, VPNs, and SIRI, just to name a few. SRI also hosted one of the four original nodes of ARPANET and advised DARPA (then ARPA) on the development of the network. For the duration of the summer, I had the opportunity to work on a project at SRI. This project is a document processor for analyzing research documents, with the goal of answering interesting questions regarding them. This is realized through an AI written in Scala that utilizes SRI's own language processing libraries.

Student Summer Internship Presentations

Date/Time: Monday, September 19, 2016 from 4:10 p.m. - 5:00 p.m.

Location: Barnard Hall 108

Speaker: Mohammadbagher Parsa Gharamaleki
Abstract: We have developed a software package for online isolation and stimulation triggering of neuronal cells in the brain, which operates in conjunction with a Hardware Processing Platform (HPP). The HPP is a system-on-a-chip solution enabling real-time signal processing on neuronal signals. Employing the HPP programmable device for template matching both accelerates spike sorting and provides the low-latency triggering of stimulation required to detect connectivity between brain areas.

Speaker: Sean Yaw
Abstract: Blackmore Sensors and Analytics is a 3D imaging company developing advanced laser ranging (lidar) systems and analysis algorithms to support a broad space of applications.  Lidar systems produce a precise cloud of 3D geospatial points representing the scanned scene.  Realizing the full potential of this data requires novel data analytics to track objects, identify targets, and index into sophisticated data management schemes. This talk will introduce the base technology generating the data, as well as outline the analytical challenges being faced and some strategies developed to address them.

Speaker: Utkarsh Goel
Abstract: The growing popularity of interactive Web applications attracts a large number of mobile users. Content providers, such as Facebook, Netflix, and others, desire to improve user experience and generate more revenue. However, the opaqueness of mobile networks today introduces several challenges to meeting the goal of a faster mobile Web. In this talk, I plan to discuss one of these challenges from the perspective of the largest content delivery network, Akamai. My talk will provide an overview of large-scale measurements that we perform to gather detailed information about Internet performance. I believe my contributions in this area will motivate new research directions to provide a far better understanding of the Internet ecosystem than we have today.

K-12 Outreach Through Practical Software Research and Development in the Software Factory Environment

Date/Time: Monday, September 12, 2016 from 4:10 p.m. - 5:00 p.m.

Location: Barnard Hall 108

Presenter: Jessica Jorgenson, Michael O’Hara, Amber Nabity, Anna Jinneman, Riley Roberts, Xuying Wang, Ryan Darnell

Abstract: Teaching software development in environments that mimic industry practices is essential for teaching applicable real-world development skills. In addition, these kinds of delivery-based projects engage students in meaningful design work that encourages clear, sustainable code. The Software Factory has provided such an environment to students at MSU since 2014.

This project aimed to explore the effectiveness of such a setting for high school students with limited programming experience. Five students from Bozeman High School were selected to work in a team with two undergraduates with the goal of improving upon a Sorting Guide Android application built during last summer’s project. In order to accomplish this goal, the students were also taught the tools and languages necessary to build an application. These students were exposed to Java, XML, Git, various sorting algorithms, and software development practices inside an industry setting. We will present a demonstration of the students’ work as well as discuss the benefits and challenges of this teaching method within the Software Factory.

Welcome Seminar

Date/Time: Monday, August 29, 2016 from 4:10 p.m. - 5:00 p.m.

Location: Barnard Hall 108

Presenter: John Paxton, Gianforte School of Computing, Montana State University

Abstract: This seminar will provide new and continuing graduate students with (1) useful information, (2) an opportunity to meet other students, staff and faculty, and (3) an opportunity to ask questions.

Departmental Awards Seminar

Date/Time: April 25, 2016 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: John Paxton, Dept. Computer Science, MSU

Abstract: At the end of every academic year, we celebrate the accomplishments of our graduate students and faculty. Join us for this year's celebration where several departmental awards will be given. Refreshments will be served.

A New Discrete Particle Swarm Optimization Algorithm

Date/Time: April 18, 2016 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Shane Strasser, Dept. Computer Science, MSU

Abstract: Particle Swarm Optimization (PSO) has been shown to perform very well on a wide range of optimization problems. One of the drawbacks to PSO is that the base algorithm assumes continuous variables. In this paper, we present a version of PSO that is able to optimize over discrete variables. This new PSO algorithm, which we call Integer and Categorical PSO (ICPSO), incorporates ideas from Estimation of Distribution Algorithms (EDAs) in that particles represent probability distributions rather than solution values, and the PSO update modifies the probability distributions. In this paper, we describe our new algorithm and compare its performance against other discrete PSO algorithms. In our experiments, we demonstrate that our algorithm outperforms comparable methods on both discrete benchmark functions and NK landscapes, a mathematical framework that generates tunable fitness landscapes for evaluating evolutionary algorithms (EAs).

Using Machine Learning to Detect Distant Evolutionary Relationships between Protein Families

Date/Time: April 11, 2016 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Mensur Dlakic, Dept. Microbiology & Immunology, MSU

Abstract: Given the relative ease of genomic sequencing, improving the functional annotations of known proteins has become more important than the production of additional protein sequences. Since experimental determination of all protein functions is impractical, in most cases functional assignments are still done using only computational methods. We develop machine learning methods and software tools to find distant relationships between protein families at the level of sequence even in the absence of statistically significant scores. In addition, interactive web servers will provide the general public with easy access to these methods. The combination of human expertise and machine learning techniques will allow us in the long term to systematically catalog many unclassified proteins and infer their biological functions.

Discovery and Analysis of Communities in Evolving Political Contribution Networks

Date/Time: April 4, 2016 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Scott Wahl, Dept. Computer Science, MSU

Abstract: An important aspect of social networks is the discovery and partitioning of the complex graphs into dense sub-networks referred to as communities. The goal of such partitioning is to find groups who have similar attributes or behaviors. In the realm of politics, it is possible to group individuals with similar political behavior by analyzing campaign finance records. In this paper we show the effectiveness of hierarchical fuzzy spectral clustering over political contribution networks. The results show that clustering the data into two communities generally shows strong correlation between fuzzy membership values and existing estimates of political ideology. Further breakdowns of the data highlight different patterns of behavior. Analyzing these networks in time segments shows how individual behaviors and ideologies may change over time.

Agent-Based Modeling in Electrical Energy Markets Using Dynamic Bayesian Networks and Relevance Vector Machines

Date/Time: March 28, 2016 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Kaveh Dehghanpour, Dept. Electrical & Computer Engineering, MSU

Abstract: Electrical power Generation Companies (GenCos) compete with each other on wholesale electrical energy markets over the supply of electrical power to the consumers. Energy markets are based on auction mechanisms in which each player tries to maximize its immediate pay-off value by optimizing its bidding strategy. However, the problem is complex because the players need to make decisions under uncertain conditions. Firstly, the players do not have access to their competitors' private cost data, which means that each player is a source of uncertainty to its competitors. Secondly, the actual value of electrical load is unknown a priori and needs to be predicted. In this project, a hierarchical agent-based framework is presented to model the decision-making problem of GenCos in an electrical energy market. Each GenCo is modeled as an agent with its private computational and decision-making capabilities. The concept of a Dynamic Bayesian Network (DBN) is employed to represent the "belief space" of GenCo agents. Each agent trains/updates its private belief system using a relevance vector machine (i.e., sparse Bayesian learning). The trained DBN is then used to predict the best response to competitors for the incoming rounds of auction. It is shown that as the GenCo agents track their best response to their competitors, the market approaches its Nash equilibrium over time.

Hearing the Earth's Music Through Computing: Sonification of GPS Data

Date/Time: March 21, 2016 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Linda Antas, School of Music (Music Technology), MSU

Abstract: Sonification is the mapping of data onto musical/sonic parameters. It is used to assist in data interpretation where visual representations of data are problematic, insufficient, or unavailable. For the computer musician, sonification techniques provide diverse opportunities for research and creativity. The data used, mapping strategies, and how—if at all—to manipulate the data for best musical outcomes are all significant factors. Examples in this presentation will be drawn from a work in progress that sonifies GPS data collected in the Montana wilderness. The data is mapped using an algorithmic composition program that can output to a variety of formats, allowing it to be accessed as standard musical notation, or to be sent directly to a computer music programming language for creating synthesized sounds.

Monothetic Clustering and Extensions to Clustering Functional Data

Date/Time: March 7, 2016 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Mark Greenwood, Dept. Mathematical Sciences, MSU

Abstract: Cluster analysis seeks to find groups of observations that are similar within and different among the created groups. Monothetic clustering algorithms create groups of observations that share common traits, in contrast to the more common polythetic clustering algorithms that create groups that are similar “on average”. The basic ideas and advantages of monothetic clustering are reviewed, including its connections to multivariate regression trees. Progress on the development of an R implementation will also be provided. High dimensional responses that can be considered as having been measured continuously over an argument, such as time or wavelength, are often called functional data. By dividing the functional domain into subregions and recursively partitioning the overall curves based on information from the subregions, a new algorithm, called Partitioning using Local Subregions (PULS), is developed. PULS seems to be competitive with other common functional clustering techniques and shares some characteristics with the monothetic clustering algorithm even though it is no longer monothetic. This is joint work with Tan Tran (Montana State University) and David Hitchcock (University of South Carolina).

The Science of Stories: The Narrative Policy Framework

Date/Time: February 29, 2016 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Elizabeth Shanahan (with Michael D. Jones), Dept. Political Science, MSU

Abstract: Narratives are the lifeblood of politics. However, not until the development of the Narrative Policy Framework (NPF) (co-developed by Drs. Shanahan-MSU, McBeth-ISU and Jones-OSU) have narratives been operationalized into a class of variables to test the impact of narratives on decision making. This presentation will provide an overview of the NPF, with two examples of research at the micro- and meso-levels of analysis. The former is an experimental design testing the effects of narrative strategies on individual opinion regarding the introduction of bison in the northern Montana prairie. The latter is preliminary network analyses of narrative constructions of 4 groups with different cultural cognitions (Kahan et al. 2011) regarding campaign finance reform.

Exploring Ethics in Data and Technology Research

Date/Time: February 22, 2016 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Sara Mannheimer, Scott Young, Jason Clark, Justin Shanks, University Libraries, MSU

Abstract: Technology is advancing at a pace so rapid that ethical inquiry is often left unaddressed. The MSU Library is currently conducting several research projects with user data, including Twitter sentiment analysis, personalized search development, and semantic web development. Over the course of these projects, ethical gray areas have emerged, prompting questions regarding ethical practice in “big data” research. Although huge amounts of data are freely available to us, much of it comes from human creators. Simply because we can use this data does not necessarily mean that we should use it. Acknowledging this straightforward but often overlooked dictum gives rise to various multifaceted ethical questions. This seminar will not only introduce ongoing data-oriented research occurring at MSU Library, but will also consider the broader ethical components embedded within research and product development. Who is affected by our research? Do users understand when and how their data is being used? How can we anticipate user expectations and values? What is the balance between personalized services and user privacy? Join us for a discussion of data-driven research and its ethical implications.

Secure Knowledge Management

Date/Time: February 8, 2016 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Dalal Alharthi

Abstract: Knowledge has become one of the most important driving forces for organizational success. Organizations are becoming more knowledge intensive. Therefore, taking care of knowledge is important for every organization nowadays. Knowledge must be managed. Knowledge management (KM) seeks to increase organizational capability to use knowledge as a source of competitive advantage. The challenge for organizations is to develop effective strategies for managing the knowledge.

Information security is a major concern for organizations nowadays. Secure knowledge management is one of the emerging areas in knowledge management and information systems. It refers to the management of knowledge while adhering to principles of security and privacy. As KM has become a more central part of organizational activities, securing organizational knowledge has become one of the most important issues in the KM field. Knowledge needs to be protected so that it is properly secured from threats. Knowledge security addresses the protection of knowledge in organizations.

In this presentation, I will deal with knowledge management and knowledge security. I will try to explore different approaches used to secure knowledge management. Then, I will seek to identify a number of challenging security issues in Knowledge Management associated with protecting knowledge. Finally, I will illustrate the application of knowledge management and security by providing some examples from the national government of Saudi Arabia.

Establishing a Prospective, Long-term Follow-up, Pilot Study of Mental Health Biosignatures

Date/Time: January 25, 2016 from 4:10 p.m. - 5:00 p.m.

Location: EPS 108

Presenter: Matt Byerly, MD, Center for Mental Health Research and Recovery

Abstract: Mental illness affects 25% of the US population each year, 6% having serious mental illness. It also strikes the young, with 50% developing illness by age 14 and 75% by age 24, making these illnesses the most disabling disorders before age 50, and most costly of all health conditions world-wide, with estimated annual US costs of more than $300 billion. Montana has especially significant mental health challenges, including the highest suicide rate in the country, at nearly twice the national rate; large populations at high risk of mental illness, including Native Americans and military veterans; and rural settings with limited mental health treatment resources.

The new MSU Center for Mental Health Research and Recovery (CMHRR) is in the process of developing a prospective, long-term follow-up, pilot study of mental health biosignatures. A biosignature, commonly used in multiple areas of medicine, is a unique combination of measurable, biologic features of a person and their illness that aids in making a diagnosis. To date, biosignatures are not used in the routine diagnosis and treatment of mental disorders. This research will determine if we can match biologically-based diagnoses with response to specific treatments. The result would be diagnostic “biosignatures” for individual patients that could identify their best initial treatment choice, speeding up recovery for each person with mental illness. The work would also further identify brain signatures of illness development and progression that could aid in early and accurate diagnosis and, in turn, guide the focus of intensive preventive interventions for those at especially high risk.

Dr. Byerly, Director of the CMHRR, will discuss the background of biosignature work in mental illnesses, review the proposed study, and discuss potential relevance of the work to the computer sciences.


Seminars from 2016.