Wednesday, October 22, 2025: CAMLIS Day One

  • Dr. Rachel Allen and Becca Lynch

  • Hyrum Anderson

  • Niloofar Mireshghallah, Incoming Assistant Professor, Carnegie Mellon University (EPP & LTI) / Research Scientist, FAIR

  • This session covers research on vulnerabilities, attack methods, and defensive strategies for Large Language Models (LLMs) and Visual Language Models (VLMs).

    • "A Framework for Adaptive Multi-Turn Jailbreak Attacks on Large Language Models" (CAMLIS2025.pdf) 

      • This paper introduces HarmNet, a modular framework that systematically constructs, refines, and executes multi-turn jailbreak queries against LLMs, and reports significantly higher attack success rates than prior methods.

    • "LLM Salting: From Rainbow Tables to Jailbreaks" (llm_salting.pdf) 

      • This work proposes LLM salting, a lightweight defense that rotates the internal refusal direction of LLMs, rendering previously effective jailbreak prompts (such as GCG) ineffective without degrading model utility.

    • "ShadowLogic: Hidden Backdoors in Any Whitebox LLM" (shadowlogic_camlis.pdf) 

      • This paper introduces ShadowLogic, a method for injecting hidden backdoors into white-box LLMs by modifying their computational graphs. A secret trigger phrase activates the backdoor, causing the model to generate uncensored responses and exposing a new class of graph-level vulnerabilities.

    • "Text2VLM: Adapting Text-Only Datasets to Evaluate Alignment Training in Visual Language Models" (Text2VLM_CAMLIS_2025.pdf) 

      • This research presents Text2VLM, a pipeline that adapts text-only datasets into multimodal formats to evaluate how well VLMs resist typographic prompt injection attacks. It finds that VLMs become more susceptible when visual inputs are introduced.

