Bio: I am a Ph.D. student at the Language Technologies Institute, Carnegie Mellon University, where I am fortunate to be advised by Prof. Yiming Yang. I received my M.S. degree in May 2021 and expect to complete my Ph.D. in 2024. Before CMU, I received a B.S. in Computer Science from Peking University, where I was advised by Prof. Zhi-Hong Deng. My research focuses on neural-symbolic reasoning, which combines the strengths of machine learning systems with those of symbolic systems such as knowledge graphs and combinatorial optimization solvers. More recently, my research interests have turned to the alignment of Large Language Models (LLMs) and Large Multimodal Models (LMMs), with a particular emphasis on improving reliability through scalable oversight, i.e., minimal human supervision in the form of human-defined principles or factual feedback from real-world interactions.

Talk Title: Aligning Large Language and Multimodal Models with Scalable Oversight

Abstract: There is a growing need to bridge the gap between what we expect AI systems to generate and what they actually produce. Toward this goal, we introduce novel methodologies for aligning Large Language Models (LLMs) and Large Multimodal Models (LMMs): Principle-Driven Self-Alignment (SELF-ALIGN), Self-Alignment with Principle-Following Reward Models (SALT), and Aligning Large Multimodal Models with Factually-Augmented RLHF (Fact-RLHF). Motivated by the goals of reducing reliance on exhaustive human supervision and improving the reliability of AI outputs, SELF-ALIGN combines principle-driven reasoning with the generative power of LLMs to produce content consistent with human values. Building on this motivation, SALT extends the alignment paradigm by integrating minimal human guidance with reinforcement learning from synthetic preferences, pointing toward a future of self-aligned AI agents. Turning to the multimodal setting, where misalignment often manifests as AI “hallucinations” inconsistent with the multimodal inputs, Fact-RLHF offers a general and scalable solution: by augmenting RLHF with factual grounding, it not only mitigates misalignment but also sets robust standards for AI’s vision-language capabilities.
