Title: AI for Earth Systems Informatics and Modeling
Date/Time/Location: Monday, April 20th at 4:10 p.m. in Barnard 108
Speaker: Jordan Malof
Abstract: Earth systems encompass the interconnected systems of our planet, such as the atmosphere,
biosphere, and anthroposphere. These systems interact across a wide range of spatial
and temporal scales, governing critical processes such as climate regulation, water
cycling, and natural hazards like wildfires. Earth systems modeling and informatics
aims to understand, predict, and manage these complex interactions by integrating
observational data (e.g., remote sensing), physical theory, and computational methods.
This task is inherently challenging due to the complexity of the observational data
and the underlying relationships they reflect. Observational data (e.g., from remote
sensing platforms) are often sparse, noisy, and heterogeneous. Advances in artificial
intelligence (AI) offer a promising pathway to overcome these limitations. AI can
automatically extract useful information from observational data and apply that
information to modeling and decision-making. In this talk I will discuss my lab’s recent
work utilizing AI to address open challenges in earth systems informatics and modeling.
I will highlight three recent projects about energy activity monitoring, estimation
of global greenhouse gas emissions, and wildfire spread modeling.
Bio: Dr. Jordan Malof received his Ph.D. from Duke University in 2015 in Electrical and
Computer Engineering, and he is currently an Assistant Professor at the University
of Missouri. His research focuses on the development of novel computer vision, deep
learning, and AI methods to solve challenging problems in diverse fields such as materials
science, earth systems, and defense. His work has been featured in selective publication
venues in machine learning, computer vision, and applied fields. For a list of publications
please see his Google Scholar Profile.
Title: End of Year Awards Seminar
Date/Time/Location: Monday, April 27th at 4:10 p.m. in Barnard 108
Abstract: Student awards, celebration of accomplishments, sharing of summer plans, light snacks.
PAST 2026 SEMINARS
Title: Toward Actionable and Reliable Decision Making by Sim-to-Real Framework and Trustworthy
Machine Learning
Date/Time/Location: Monday, January 26th at 4:10 p.m. in Barnard 108
Speaker: Longchao Da
Abstract: Complex decision-making can be framed as a Markov Decision Process, and then solved
by advanced policy learning such as Reinforcement Learning. However, policies learned
in simulation often struggle to generalize to real, safety-critical environments due
to distribution shift, partial observability, and uncertainty. This talk presents
a line of work that addresses these challenges by developing high-fidelity simulation,
introducing sim-to-real training paradigms, performing offline policy evaluation,
and conducting uncertainty quantification to support actionable and trustworthy decision-making
in real-world domains, with potential solutions in transportation, healthcare, and
disaster monitoring and response.
Bio: Longchao Da is a Ph.D. Candidate in Computer Science at Arizona State University. His research interests include Sim-to-Real Policy Learning, Trustworthy AI, and
Data Mining. He also leverages Generative AI with uncertainty quantification to detect
and mitigate hallucinations for more trustworthy responses. His work has appeared
in top venues including AAAI, KDD, NeurIPS, ICML, IJCAI, CIKM, ECML, and CDC. He is
a 2025 Google PhD Fellowship nominee, a two-time ASU Ph.D. Fellowship recipient, and
the Best Poster Award winner at SDM 2025.
Title: Collaborative Active Learning for Robots
Date/Time/Location: Monday, February 2nd at 4:10 p.m. in Barnard 108
Speaker: Michelle Zhao
Abstract: Today, robot learning paradigms rely on human-provided data (e.g., demonstrations,
preference labels) to adapt their behavior and align with user intent. Yet in practice,
this process of teaching robots is one of trial-and-error that places the burden on
humans to decipher what the robot misunderstands, diagnose failures, and supply the
“right” corrective data. My research develops user-centric active learning methods
that learn by supporting human teachers. In this talk, I will first introduce uncertainty
quantification tooling that extends conformal prediction to the human-robot interaction
setting, enabling robots to rigorously “know when they don’t know” even when relying
on black-box policies. I will then discuss how these uncertainty self-assessments
enable robots to communicate insights with human teachers and proactively ask for
targeted feedback within novel interactive learning paradigms. Coupling these ideas
with cost-optimal planning algorithms, I will demonstrate how robots can interleave
both learning and collaboration with human partners over multitask sequences. I will
end this talk by taking a step back and examining the alignment process for robotics
and discussing opportunities for how rethinking interactive learning as collaborative
and continual accounts for not only task, but the nuanced interaction dynamics present
during the teaching process.
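As a rough illustration of the "know when they don't know" idea, the split conformal prediction procedure underlying such uncertainty quantification can be sketched as follows. This is a generic textbook-style sketch, not the speaker's specific method; the calibration scores and example probabilities below are made up for illustration.

```python
import numpy as np

def conformal_threshold(cal_scores, alpha=0.1):
    """Split conformal: pick a threshold so that prediction sets cover the
    true label with probability at least 1 - alpha (distribution-free)."""
    n = len(cal_scores)
    q = np.ceil((n + 1) * (1 - alpha)) / n
    return np.quantile(cal_scores, min(q, 1.0), method="higher")

def prediction_set(softmax_probs, threshold):
    """Include every label whose nonconformity score (1 - probability)
    stays at or below the calibrated threshold."""
    scores = 1.0 - softmax_probs
    return [i for i, s in enumerate(scores) if s <= threshold]

# Calibration scores: 1 - model probability assigned to the true label.
rng = np.random.default_rng(0)
cal = rng.uniform(0.0, 0.5, size=500)
t = conformal_threshold(cal, alpha=0.1)
print(prediction_set(np.array([0.7, 0.2, 0.1]), t))
```

A large prediction set then signals "I don't know," which is exactly the cue a robot can use to ask its human teacher for targeted feedback.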
Bio: Michelle Zhao is a Ph.D. candidate at Carnegie Mellon University in the Robotics
Institute, working with Professors Henny Admoni and Reid Simmons. She studies human-robot
interaction, with an emphasis on how robots can learn from and about people. Her research
integrates methods from statistical uncertainty quantification, machine learning,
and human-robot interaction to develop theoretical frameworks and practical algorithms
for active learning from human feedback in domains like assistive robotic manipulation.
Prior to her Ph.D., she earned her B.S. at the California Institute of Technology.
She is the recipient of the Siebel Scholarship, Rising Stars in Computational and
Data Sciences, the NDSEG Research Fellowship, HRI Pioneers 2025 Honorable Mention,
and has worked at Toyota Research Institute.
Title: Towards Collaborative Intelligence: Learning from Decentralized Data at Scale
Date/Time/Location: Monday, February 9th at 4:10 p.m. in Barnard 108
Speaker: Yujia Wang
Abstract: As modern data increasingly comes from decentralized sources, e.g., phones, smart
devices, and medical systems, learning must occur without centralizing sensitive data.
Federated learning (FL) enables learning from decentralized data sources but faces
significant challenges in real-world deployments, including data heterogeneity, system
variability, and communication bottlenecks. In this talk, I will present the algorithmic
and optimization foundations of collaborative intelligence, focusing on building efficient
and scalable learning from decentralized data. My work addresses FL’s challenges both
individually and in a more systematic, integrated way, depending on what the problem
demands. I will first diagnose how stale updates and data heterogeneity jointly destabilize
asynchronous FL and introduce a cached calibration mechanism that provably removes
the harmful delay-heterogeneity interaction. I will then introduce a modularized and
parallel block-coordinate framework for federated fine-tuning of large language models.
Together, these results establish optimization-driven principles that enable efficient
and scalable federated learning. The talk concludes with a vision for the next generation
of collaborative AI, where models learn efficiently while respecting privacy, system
constraints, and social trustworthiness.
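The decentralized learning setup described above can be illustrated with a minimal federated averaging (FedAvg) sketch: clients train locally on their own data and a server averages their models, weighted by data size. This is a generic baseline sketch on a toy least-squares problem, not the speaker's calibration mechanism.

```python
import numpy as np

def fedavg_round(global_w, client_data, lr=0.1, local_steps=5):
    """One FedAvg round: each client runs local SGD on its private data,
    then the server averages the updated models, weighted by data size."""
    updates, sizes = [], []
    for X, y in client_data:
        w = global_w.copy()
        for _ in range(local_steps):
            grad = 2 * X.T @ (X @ w - y) / len(y)  # least-squares gradient
            w -= lr * grad
        updates.append(w)
        sizes.append(len(y))
    return np.average(updates, axis=0, weights=np.array(sizes, dtype=float))

# Toy setup: three clients sharing the same underlying linear model.
rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(40, 2))
    clients.append((X, X @ true_w))
w = np.zeros(2)
for _ in range(30):
    w = fedavg_round(w, clients)
print(w)  # approaches [2, -1] when client data are homogeneous
```

When client data are heterogeneous, the local updates drift apart, which is precisely the destabilizing interaction the abstract targets.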
Bio: Yujia Wang is a Ph.D. candidate in the College of Information Sciences and Technology
at The Pennsylvania State University, advised by Dr. Jinghui Chen. Her research spans
the theories and applications of collaborative intelligence and privacy-preserving
machine learning. Her work has been published in top venues such as ICML, NeurIPS,
ICLR, AISTATS, ACL and TMLR. She has delivered technical talks at the SIAM-NNP Section
Conference and IBM Research, and presented her work at the SDM Doctoral Forum. She
actively serves as a reviewer for leading AI conferences and journals. Beyond academia,
she gained industry experience as a Research Intern at IBM Research.
Title: Visualization Design and Artificial Intelligence for Scientific Inquiry
Date/Time/Location: Monday, February 23rd at 4:10 p.m. in Barnard 108
Speaker: Devin Lange
Abstract: Scientific inquiry seeks to understand how the world works, and data is a fundamental
tool for representing underlying phenomena. In this talk, I discuss how well-designed
visualizations, together with the integration of artificial intelligence, can help
domain scientists interact with and reason about their data. Cancer research serves
as a motivating application domain, as it involves complex, large-scale, and multimodal
datasets. These range from images of individual cancer cells to datasets
assembled through collaborations across multiple institutions. Visualization plays
a critical role in helping scientists navigate this complexity and extract meaningful
insight from their data.
Bio: Devin Lange is a postdoctoral fellow in biomedical informatics at Harvard Medical
School in the HIDIVE Lab, where he works with Nils Gehlenborg. He earned a PhD in
computer science from the University of Utah under the supervision of Alexander Lex.
His research investigates how visualization and artificial intelligence can support
the understanding, discovery, and quality control of scientific data, with a particular
focus on biological and biomedical data. His work has been recognized with multiple awards at IEEE VIS, including a Best Paper
Award for Aardvark, a visualization system for integrated analysis of imaging, time-series,
and cellular divisions. His research spans biological visualization, data forensics,
and natural-language interfaces for interactive visual analysis.
Title: QCORE Tour
Date/Time/Location: Monday, March 30th at 4:00 p.m. at EngineWorks (2425 Technology Blvd)
Abstract: Learn about quantum initiatives at MSU and opportunities to engage with QCORE.
Arrival: The EngineWorks building is located at 2425 Technology Blvd, Bozeman, MT 59718. Sometimes
the address doesn't get you right to the entrance; park in the main parking
lot, then enter the lobby under the "EngineWorks" sign. There, EngineWorks requires
all members of your party to sign in on an iPad for security tracking
purposes. It will prompt you to enter the name of the person you are meeting with,
in this case, Jayne Morrow.
Title: The Genesis of Time Series Foundation Models: From Generative Pre-training to Physically-Consistent
Inference
Date/Time/Location: Wednesday, April 8th at 4:00 p.m. in Barnard 347
Speaker: Defu Cao
Abstract: While foundation models have transformed fields like Natural Language Processing and
Computer Vision, their application to time series analysis has been limited by unique
challenges inherent to temporal data, such as complex structures, pervasive data imperfections,
and the need for task generality. This talk explores the emerging frontier of foundation models designed to overcome these obstacles,
tracing a path through recent breakthroughs that are redefining time series analysis.
We begin by examining TEMPO, a prompt-based generative pre-trained transformer that
addresses the inability of standard attention mechanisms to disentangle the non-orthogonal
trend, seasonal, and residual components of time series data. Building on these insights,
we introduce TimeDiT, a more general-purpose diffusion transformer architecture that
employs a unified masking strategy to robustly handle missing values and perform diverse
tasks like forecasting, imputation, and anomaly detection from a single framework.
Derived from TimeDiT, we introduce PINFDIT, a plug-and-play, physics-informed inference process to inject domain knowledge and
ensure physical consistency without retraining. The primary contribution of this research
will be a novel foundation model for time series that is physically consistent, advancing
the field closer to the vision of a universal "world model" for temporal data. The
discussion will cover the opportunities and challenges in scaling these models, their
potential impact on real-world applications in finance, healthcare, and climate science,
and future research directions, including multimodal integration and enhanced interpretability.
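The unified masking idea mentioned above, where a single model treats forecasting, imputation, and other tasks as "reconstruct the masked entries," can be sketched as follows. This is an illustrative toy, not TimeDiT's actual masking code; the function name and parameters are assumptions for the example.

```python
import numpy as np

def task_mask(length, task, horizon=4, missing_frac=0.2, seed=0):
    """Unified masking: forecasting masks the tail of the series, while
    imputation masks random interior points. One model can then handle
    both tasks by reconstructing whatever is masked (True = observed)."""
    mask = np.ones(length, dtype=bool)
    if task == "forecast":
        mask[-horizon:] = False
    elif task == "impute":
        rng = np.random.default_rng(seed)
        idx = rng.choice(length, size=int(length * missing_frac), replace=False)
        mask[idx] = False
    return mask

print(task_mask(10, "forecast"))  # last `horizon` steps hidden
print(task_mask(10, "impute"))    # random interior points hidden
```

Because both tasks reduce to the same conditional reconstruction problem, a single pretrained model can serve them without task-specific heads.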
Brief Bio: Defu Cao is a Ph.D. candidate in the Department of Computer Science at the University
of Southern California and a visiting scholar at Caltech. His research focuses on
building practical time series foundation models and LLM-guided decision-making systems
for real-world temporal data. He works on large-scale pretraining methods, model orchestration, and interpretable inference, with applications in domains
such as finance, infrastructure monitoring, and large-scale forecasting systems. His
work has been published in top machine learning conferences including NeurIPS, ICML,
and ICLR, and has achieved state-of-the-art performance on large-scale time series
benchmarks, including a top-1 result on the GIFT-Eval leaderboard. He is a recipient
of the USC Best Research Assistant Award.
Title: Toward Reliable LLM Frameworks for Scientific Search Problems
Date/Time/Location: Friday, April 17th at 4:10 p.m. in Barnard 126
Speaker: Jungtaek Kim
Abstract: Large language models (LLMs) are widely used in applications such as chatbots, machine
translation, and scientific discovery, and their performance can be improved through
structured search over possible outputs. This talk focuses on sentence-level and idea-level
search in LLMs. For sentence-level search, I show how process-supervised reward models
(PRMs) can guide inference-time methods, such as weighted majority voting, beam search,
and Monte Carlo tree search. Then, VersaPRM is introduced as a multi-domain PRM trained
using synthetic data. For idea-level search, I isolate the search capabilities of
LLMs by using them as search policies over tree-structured spaces. Using controllable
synthetic benchmarks, both theoretical and empirical results demonstrate that Transformers
are expressive enough to represent diverse search algorithms, along with evidence
of generalization to unseen settings. The talk concludes with future directions for
search-driven LLM frameworks and their applications to scientific discovery.
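The PRM-guided weighted majority voting mentioned above can be sketched in a few lines: each sampled solution votes for its final answer with weight given by its reward-model score, and the answer with the greatest total weight wins. This is a generic sketch under assumed inputs (the candidate answers and scores below are invented), not the speaker's implementation.

```python
from collections import defaultdict

def weighted_majority_vote(candidates):
    """PRM-guided weighted majority voting: sum each answer's reward-model
    scores across sampled chains and return the highest-scoring answer."""
    totals = defaultdict(float)
    for answer, prm_score in candidates:
        totals[answer] += prm_score
    return max(totals, key=totals.get)

# Hypothetical (final answer, PRM score) pairs from several sampled chains.
samples = [("42", 0.9), ("41", 0.4), ("42", 0.7), ("17", 0.95)]
print(weighted_majority_vote(samples))  # "42" wins: 1.6 > 0.95 > 0.4
```

Unlike plain majority voting, a single high-scoring chain cannot outvote several moderately scored chains that agree, which is what makes the PRM signal useful at inference time.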
Bio: Jungtaek Kim is a research associate at the University of Wisconsin–Madison, working
with Prof. Kangwook Lee. Previously, he was a postdoctoral associate at the University
of Pittsburgh, working with Profs. Paul W. Leu, Satish Iyengar, Lucas Mentch, and
Oliver Hinder. He received his Ph.D. in Computer Science and Engineering from POSTECH,
under the supervision of Profs. Seungjin Choi and Minsu Cho. During his Ph.D. program,
he interned at the Vector Institute and SigOpt (acquired by Intel). He has presented
his work at top-tier machine learning conferences, including NeurIPS, ICML, AISTATS,
ICLR, and UAI, and has served as a reviewer for several machine learning conferences.
His main research interests include statistical machine learning, Bayesian optimization,
large language models, and artificial intelligence for science.
Seminars from 2025.