Programme

ORIGen 2025 was held on October 10, 2025 in Room 518A of the Palais des Congrès in Montréal, QC, Canada.

8:30-9:00 - Welcome and registration
9:00-9:15 - Opening remarks - Nikhil Krishnaswamy, Colorado State University
9:15-9:50 - Invited talk I: Andreas Vlachos - Towards Constructive Conversations
 In this talk I will present our work motivated by the question “What makes conversations among humans more constructive, and how can we intervene to make them happen?” First, I will discuss group decision-making in the context of the Wason card selection task, where we find that groups perform better than individuals and, more interestingly, can reach a correct decision even if no participant had it at the beginning of the conversation. Following this, I will present the Wikipedia disputes dataset, which has allowed us to examine how disagreements are resolved on Wikipedia, the most successful large-scale collaborative project. Finally, I will discuss how LLMs could be used to intervene to improve our conversations.
Session chair: James Pustejovsky, Brandeis University
9:50-10:50 - Accepted paper lightning talks: 4 minutes each + 1 minute transition (presentation order)
Session chair: Nikhil Krishnaswamy, Colorado State University
10:50-11:05 - Coffee break
11:05-12:00 - Keynote talk: Malihe Alikhani - Theory of Mind in Generative Models: From Uncertainty to Shared Meaning
 We will explore how generative models can effectively facilitate communicative grounding by incorporating theory of mind alongside uncertainty and human feedback. We begin by examining how models signal and quantify predictive uncertainty, highlighting computational parallels to epistemic stance. Next, we discuss belief modeling, presenting evidence that language models can infer degrees of interlocutor uncertainty, a crucial component in managing reference and intent. We address how a failure to accurately track beliefs may lead to sycophancy, or over-alignment with user views. We then explore the positive role of friction introduced through structured discourse or interactional pauses, which slows down interactions to promote clarity and facilitate grounding. Finally, we extend these concepts to multimodal and socially situated contexts, drawing on research in sign language modeling and human-in-the-loop training to illustrate how shared meaning can be constructed across diverse modalities and populations. This line of research demonstrates how generative models embody core mechanisms of pragmatic reasoning, offering linguists and cognitive scientists both methodological challenges and opportunities to question how computational systems reflect and shape our understanding of meaning and interaction.
Session chair: Mariah Bradford, Colorado State University
12:00-12:35 - Invited talk II: Bertram F. Malle - Could Generative Language Models Have Moral Trustworthiness?
 In our previous theoretical and empirical work, we have examined people’s trust responses and moral judgments of social robots. I will apply some lessons we learned from this work to the question of moral trustworthiness of LLMs. I will ask three questions: (1) Does it even make sense to treat LLMs as “agents” that “have trustworthiness”? (2) If we do assess LLMs’ trustworthiness, which specifically moral attributes of trustworthiness would we want them to exhibit? (3) What would it take to design LLMs that actually have those attributes of moral trustworthiness?
Session chair: Nikhil Krishnaswamy, Colorado State University
12:35-2:00 - Lunch
2:00-2:35 - Invited talk III: Q. Vera Liao - Facilitating Appropriate Reliance on AI: Lessons from HCI Research
 Having appropriate reliance on AI is key to harnessing the benefits of AI technology and achieving human-AI complementarity; inappropriate reliance, particularly overreliance on AI, can lead to a range of harms, ranging from high-stakes errors and de-skilling to infrastructural vulnerabilities. Since before the current wave of LLM technology, the field of human-computer interaction (HCI) has been studying how to facilitate appropriate reliance on AI, through empirical investigation of how people choose to rely or not rely on AI, designing interventions to mitigate inappropriate reliance, and developing approaches to measure and model people’s reliance behaviors. In this talk, I will provide an overview of these lines of HCI research and pose three open questions in the age of LLMs: How should we grapple with the normative question of what constitutes appropriate reliance? How can we measure and monitor reliance without intensive behavior surveillance? How can we deliver targeted interventions to prevent overreliance, accounting for system, individual, and contextual risk factors?
Session chair: Tejas Srinivasan, University of Southern California
2:35-3:35 - Poster session
Session chair: Nikhil Krishnaswamy, Colorado State University
3:35-3:50 - Coffee break
3:50-4:35 - Panel discussion: Future of Reliable and Accountable AI
Matthias Scheutz, Tufts University
Jesse Thomason, University of Southern California
Diyi Yang, Stanford University
Matthew Marge, DARPA
 Moderator: James Pustejovsky, Brandeis University
4:35-5:00 - Best Paper Award and Conclusion

List of accepted papers