Programme

ORIGen 2025 was held on October 10, 2025 in Room 518A of the Palais des Congrès in Montréal, QC, Canada.

8:30-9:00 - Welcome and registration
9:00-9:15 - Opening remarks - Nikhil Krishnaswamy, Colorado State University
9:15-9:50 - Invited talk I: Andreas Vlachos - Towards Constructive Conversations
 In this talk I will present our work motivated by the question “What makes conversations among humans more constructive, and how can we intervene to make them happen?” First, I will discuss group decision-making in the context of the Wason card selection task, where we find that groups perform better than individuals and, more interestingly, can reach a correct decision even if no participant had it at the beginning of the conversation. Following this, I will present the Wikipedia disputes dataset, which has allowed us to examine how disagreements are resolved on Wikipedia, the most successful large-scale collaborative project. Finally, I will discuss how LLMs could be used to intervene to improve our conversations.
Session chair: James Pustejovsky, Brandeis University
9:50-10:50 - Accepted paper lightning talks: 4 minutes each + 1 minute transition (presentation order)
Session chair: Nikhil Krishnaswamy, Colorado State University
10:50-11:05 - Coffee break
11:05-12:00 - Keynote talk: Malihe Alikhani - Theory of Mind in Generative Models: From Uncertainty to Shared Meaning
 We will explore how generative models can effectively facilitate communicative grounding by incorporating theory of mind alongside uncertainty and human feedback. We begin by examining how models signal and quantify predictive uncertainty, highlighting computational parallels to epistemic stance. Next, we discuss belief modeling, presenting evidence that language models can infer degrees of interlocutor uncertainty, a crucial component in managing reference and intent. We address how a failure to accurately track beliefs may lead to sycophancy, or over-alignment with user views. We then explore the positive role of friction introduced through structured discourse or interactional pauses, which slows down interactions to promote clarity and facilitate grounding. Finally, we extend these concepts to multimodal and socially situated contexts, drawing on research in sign language modeling and human-in-the-loop training to illustrate how shared meaning can be constructed across diverse modalities and populations. This line of research demonstrates how generative models embody core mechanisms of pragmatic reasoning, offering linguists and cognitive scientists both methodological challenges and opportunities to question how computational systems reflect and shape our understanding of meaning and interaction.
Session chair: Mariah Bradford, Colorado State University
12:00-12:35 - Invited talk II: Bertram F. Malle - Could Generative Language Models Have Moral Trustworthiness?
 In our previous theoretical and empirical work, we have examined people’s trust responses and moral judgments of social robots. I will apply some lessons we learned from this work to the question of moral trustworthiness of LLMs. I will ask three questions: (1) Does it even make sense to treat LLMs as “agents” that “have trustworthiness”? (2) If we do assess LLMs’ trustworthiness, which specifically moral attributes of trustworthiness would we want them to exhibit? (3) What would it take to design LLMs that actually have those attributes of moral trustworthiness?
Session chair: Nikhil Krishnaswamy, Colorado State University
12:35-2:00 - Lunch
2:00-2:35 - Invited talk III: Q. Vera Liao - Facilitating Appropriate Reliance on AI: Lessons from HCI Research
 Having appropriate reliance on AI is key to harnessing the benefits of AI technology and achieving human-AI complementarity; inappropriate reliance, particularly overreliance on AI, can lead to a range of harms, ranging from high-stakes errors and de-skilling to infrastructural vulnerabilities. Since before the current wave of LLM technology, the field of human-computer interaction (HCI) has been studying how to facilitate appropriate reliance on AI, through empirical investigation of how people choose to rely or not rely on AI, designing interventions to mitigate inappropriate reliance, and developing approaches to measure and model people’s reliance behaviors. In this talk, I will provide an overview of these lines of HCI research and pose three open questions in the age of LLMs: How should we grapple with the normative question of what constitutes appropriate reliance? How can we measure and monitor reliance without intensive behavior surveillance? How can we deliver targeted interventions to prevent overreliance, accounting for system, individual, and contextual risk factors?
Session chair: Tejas Srinivasan, University of Southern California
2:35-3:35 - Poster session
Session chair: Nikhil Krishnaswamy, Colorado State University
3:35-3:50 - Coffee break
3:50-4:35 - Panel discussion: Future of Reliable and Accountable AI
Matthias Scheutz, Tufts University
Jesse Thomason, University of Southern California
Diyi Yang, Stanford University
Matthew Marge, DARPA
 Moderator: James Pustejovsky, Brandeis University
4:35-5:00 - Best Paper Award and Conclusion

List of accepted papers