Speaker: Niloofar Mireshghallah, Incoming Assistant Professor, Carnegie Mellon University (EPP & LTI) / Research Scientist, FAIR
Abstract: GenAI is no longer confined to chat interfaces; it has evolved into autonomous agents that move data through tools, APIs, and multiple modalities, creating new "data sinks" where information quietly accumulates and new leak paths where it slips across contexts. In this talk, we begin by examining the sensitive data that users and professionals share with these systems, establishing what is at stake. We then dissect current privacy and data risks that go beyond traditional verbatim memorization, analyzing policies from frontier labs and various industries that reveal deceptive practices and surprising gaps in consent mechanisms. We demonstrate how new features such as persistent memories, automated workflows, and deep inference capabilities enable unprecedented surveillance and profiling of users. Despite these challenges, we present an optimistic path forward through practical approaches, including data minimization, intentional friction in data collection, and computational offloading strategies that limit what models need to store. As these technologies become more broadly adopted, we acknowledge emerging threat surfaces, from behavioral manipulation through character training to context stealing and persuasion attacks. We conclude by exploring how these privacy risks will manifest in future systems such as long-horizon agents, ambient AI, and robotics workflows, and discuss the need for distributed training approaches, trusted execution environments, and new frameworks to navigate the evolving economics of data in agentic AI systems.