Align via actions: Learning behavior aligns LLMs with human opinions in zero-shot

Abstract

Large language models (LLMs) have become ubiquitous in a wide range of applications, but aligning them with societal expectations remains challenging. Current alignment methods rely heavily on human-annotated datasets, which are expensive, difficult to scale, and often biased toward specific demographic subgroups. We introduce a novel approach to LLM alignment: training on behavioral data. Our approach is grounded in the psychological principle that actions (behavior) are strongly consistent with opinions. Leveraging this insight, we develop AlignViaActions (AVA50M), a dataset of over 50 million samples derived from 1.5 million advertisements, comprising advertisement content paired with demographic viewing behaviors. We train LLMs on AVA50M and demonstrate significant improvements over existing alignment techniques across multiple societal and cultural alignment benchmarks, including GlobalOpinionQA, OpinionQA, CultureNLI, and CultureBank. These results show that, by observing and learning from behavior, LLMs can infer the underlying opinions and cultural norms. This approach addresses key limitations of current methods, offering improved scalability, broader demographic representation, and adaptability to evolving societal views. Our results suggest that behavioral data can replace or complement traditional expert-annotation-based alignment techniques. Our datasets and code are available at https://behavior-in-the-wild.github.io/align-via-actions.

Publication
ACL Rolling Review; nominated for a Best Paper Award
Aanisha Bhattacharyya
Research Associate
Yaman Kumar Singla
Senior Research Scientist
Nikitha SR
Research Associate
Balaji Krishnamurthy
Senior Principal Scientist and Senior Director