What Does It Mean for Agentic AI to Preserve Privacy? Mapping the New Data Sinks and Leaks
Speaker: Niloofar Mireshghallah, Incoming Assistant Professor, Carnegie Mellon University (EPP & LTI)/Research Scientist, FAIR
Adversarial Attacks and Model Safeguards for LLMs and VLMs
About: This session focuses on research directly addressing the vulnerabilities, attack methods, and defensive strategies for Large Language Models (LLMs) and Visual Language Models (VLMs).
A Framework for Adaptive Multi-Turn Jailbreak Attacks on Large Language Models
Speaker: Javad Rafiei Asl
This paper introduces HarmNet, a modular framework designed to systematically construct, refine, and execute multi-turn jailbreak queries against LLMs, demonstrating significantly higher attack success rates compared to prior methods.
LLM Salting: From Rainbow Tables to Jailbreaks
Speaker: Tamás Vörös
This work proposes LLM salting, a lightweight defense mechanism that rotates the internal refusal direction of LLMs, rendering previously effective jailbreak prompts (like GCG) ineffective without degrading model utility.
ShadowLogic: Hidden Backdoors in Any Whitebox LLM
Speaker: Amelia Kawasaki
This paper unveils ShadowLogic, a method for injecting hidden backdoors into white-box LLMs by modifying their computational graphs. These backdoors are activated by a secret trigger phrase, allowing the model to generate uncensored responses and exposing a new class of graph-level vulnerabilities.
Text2VLM: Adapting Text-Only Datasets to Evaluate Alignment Training in Visual Language Models
Speaker: Jake Thomas
This research presents Text2VLM, a novel pipeline that adapts text-only datasets into multimodal formats to evaluate the resilience of Visual Language Models (VLMs) against typographic prompt injection attacks. It highlights the increased susceptibility of VLMs when visual inputs are introduced.
AI/ML for Cyber Defense Agents & Reinforcement Learning
About: This session explores the application of AI and Machine Learning, particularly agentic systems and reinforcement learning, to enhance active cyber defense and security operations.
Adaptive by Design: Contextual Reinforcement Learning for Mission-Ready Cyber Defence
Speaker: Jake Thomas
This paper introduces a framework for applying Contextual Reinforcement Learning (cRL) to cyber defense, where agents dynamically incorporate contextual signals (like mission objectives or threat assessments) to modulate their policies in real-time without retraining.
Towards a Generalisable Cyber Defence Agent for Real-World Computer Networks
Speaker: Tim Dudman
This work proposes Topological Extensions for Reinforcement Learning Agents (TERLA) to provide generalisability for cyber defence agents across networks of differing topology and size without the need for retraining. It evaluates performance in realistic simulation environments.
Improving Accuracy and Consistency in Real-World Cybersecurity AI Systems via Test-Time Compute
Speaker: Ashley Song
This study evaluates Test-Time Compute for improving the accuracy and consistency of real-world cybersecurity agentic systems, specifically a container vulnerability analysis agent and a server alert triage agent.
RIG-RAG: A GraphRAG Inspired Approach to Agentic Cloud Infrastructure
Speaker: Benji Lilley
This paper introduces Relational Inference GraphRAG (RIG-RAG), an LLM-assisted pipeline that transforms cloud configuration data into a security-enriched knowledge graph to support natural-language reasoning about deployed infrastructure. This enhances agentic capabilities for cloud security operations.