Valentin Hofmann
Bio: Valentin is a final-year PhD student at the University of Oxford and a research assistant at LMU Munich. His work broadly focuses on the intersection of NLP, linguistics, and computational social science, with specific interests in tokenization algorithms, socially and temporally aware NLP systems, and computational models of political ideology. His research has been published at major NLP and machine learning conferences such as ACL, EMNLP, NAACL, ICWSM, and ICML. During his PhD, Valentin also spent time as a research intern in DeepMind’s Language Team and as a visiting scholar in Stanford’s NLP Group.
Talk Title: Modeling Ideological Language with Graph Neural Networks and Structured Sparsity
Talk Abstract: The increasing polarization of online political discourse calls for computational tools that automatically detect and monitor ideological divides in social media. While many methods to track ideological polarization have been proposed, most of them rely on knowing the political orientation of a text in advance, a requirement seldom met in practice. In this talk, I will discuss research recently published at NAACL and ICML that fully dispenses with the need for labeled data and instead leverages the ubiquitous network structure of online discussion forums, specifically Reddit, to detect ideological polarization. I will first give an overview of prior research on polarization in natural language processing, highlighting salience and framing as two key mechanisms by which ideology manifests itself in language. I will then present Slap4slip, a method that combines graph neural networks with structured sparsity learning to determine the polarization of political issues (e.g., abortion) along the dimensions of salience and framing. In the third part of the talk, I will show that polarization is also reflected in the existence of an ideological subspace in contextualized embeddings, which can be found by adding orthogonality regularization to Slap4slip. The ideological subspace encodes abstract evaluative semantics and indicates pronounced changes in the political left-right spectrum during the presidency of Donald Trump.