Causal Online Alignment for Reliable Foundation Models

PI Emmanouil Koukoumidis, Oumi PBC
Co-PI Gokhan Tur, University of Illinois Urbana-Champaign
Yulia Tsvetkov, University of Washington
Georgia Gkioxari, Caltech
Ruslan Salakhutdinov, Carnegie Mellon University
Project Summary

This project is creating AI models that know their limits, explain their reasoning and uncertainty, and safely collaborate with people and tools—so we can trust them in high-stakes science and industry.

Project Description

This research addresses a critical challenge in DOE's mission to advance scientific discovery and intelligent automation with AI: developing multimodal foundation models that are not just powerful but fundamentally reliable, that is, models that can (a) accurately assess their own capabilities and limitations, providing explicit uncertainty estimates for their actions, (b) reason about the real-world implications of their actions, particularly in scientific and mission-critical contexts, and (c) transparently communicate their assumptions, limitations, and potential failure modes while proposing and leveraging mitigation strategies such as human feedback and external tools. In essence, this research will prevent foundation models from naively attempting tasks beyond their capabilities, especially when failure could have adverse consequences. It introduces a novel training framework that emphasizes cause-effect reasoning and transparent decision-making. The approach leverages large-scale AI feedback and high-performance computing infrastructure to develop foundation models that are introspective, transparent, and ultimately more reliable.
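
To make the intended behavior concrete, the sketch below shows one hypothetical way an uncertainty-gated decision policy could route a model's proposed action: act autonomously, verify with an external tool, or defer to human feedback, depending on the model's self-reported confidence and the stakes of the task. The class names, thresholds, and routing rule are illustrative assumptions, not the project's actual framework.

```python
# Illustrative sketch only (hypothetical names and thresholds): route a model's
# proposed action based on its self-reported confidence and the task's stakes,
# so that high-stakes, low-confidence actions are deferred rather than attempted.
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ACT = "act"                    # confident enough to proceed autonomously
    CONSULT_TOOL = "tool"          # uncertain; verify with an external tool first
    DEFER_TO_HUMAN = "human"       # too uncertain for the stakes; request human feedback


@dataclass
class ModelOutput:
    answer: str
    confidence: float              # self-reported probability the answer is correct
    rationale: str                 # transparent statement of assumptions and limitations


def gate(output: ModelOutput, stakes: float,
         act_threshold: float = 0.9, tool_threshold: float = 0.6) -> Decision:
    """Route an action based on stated uncertainty and task stakes (both in [0, 1]).

    Higher-stakes tasks tighten the confidence required to act without mitigation,
    so the model abstains or seeks help instead of guessing.
    """
    required = act_threshold + (1.0 - act_threshold) * stakes  # 0.9 -> 1.0 as stakes rise
    if output.confidence >= required:
        return Decision.ACT
    if output.confidence >= tool_threshold:
        return Decision.CONSULT_TOOL
    return Decision.DEFER_TO_HUMAN


if __name__ == "__main__":
    out = ModelOutput(answer="Set reactor coolant flow to 3.2 L/s",
                      confidence=0.45,
                      rationale="Extrapolating beyond calibration data; low certainty.")
    print(gate(out, stakes=0.95))  # Decision.DEFER_TO_HUMAN: high stakes, low confidence
```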

The scientific and economic impact of this work is far-reaching: arguably the biggest barrier to advancing the sciences with AI and deploying AI models in most industry applications is not their power, potential, or cost, but their unreliability. The successful completion of this research will advance the fundamental understanding of foundation model reliability and transparency while enabling the reliable deployment of AI in critical applications. It will push the frontier of reliable AI systems in both scientific and industrial settings, supporting DOE's mission of scientific leadership. The code, methodology, and research artifacts will be fully open, contributing to the democratization of HPC in AI research.

Allocations