CAMLIS 2024
DAY ONE
Keynote: Cybersecurity and Infrastructure Security Agency (CISA)
Lisa Einstein serves as the Cybersecurity and Infrastructure Security Agency’s first Chief AI Officer. In this role, she leads CISA’s efforts to responsibly adopt AI tools that can help advance the agency’s mission and works to identify and mitigate risks to U.S. critical infrastructure associated with AI. Einstein previously served as Senior Advisor for AI at CISA and as Executive Director of CISA’s Cybersecurity Advisory Committee.
In her previous roles, she led the development and implementation of CISA’s AI Roadmap, an actionable plan to promote beneficial uses of AI to enhance CISA’s cybersecurity capabilities, ensure AI systems are protected from cybersecurity threats, and mitigate the risks malicious uses of AI pose for critical infrastructure.
Lisa Einstein
Chief AI Officer, CISA
-
Speaker: Gary Lopez Munoz
Contributors: Amanda Minnich, Roman Lutz, Richard Lundeen, Raja Sekhar Rao Dheekonda, Nina Chikanov, Bolor-Erdene Jagdagdorj, Martin Pouliot, Shiven Chawla, Whitney Maxwell, Blake Bullwinkel, Katherine Pratt, Joris de Gruyter, Charlotte Siska, Pete Bryan, Tori Westerhoff, Chang Kawaguchi, Christian Seifert, Yonatan Zunger, Yonatan Zunger
Abstract: Generative Artificial Intelligence (GenAI) is becoming ubiquitous in our daily lives. The increase in computational power and data availability has led to a proliferation of both single- and multi-modal models. As the GenAI ecosystem matures, the need for extensible and model-agnostic risk identification frameworks is growing. To meet this need, we introduce the Python Risk Identification Toolkit (PyRIT), an open-source framework designed to enhance red teaming efforts in GenAI systems.
PyRIT is a model- and platform-agnostic tool that enables red teamers to probe for and identify novel harms, risks, and jailbreaks in multimodal generative AI models. Its composable architecture facilitates the reuse of core building blocks and allows for extensibility to future models and modalities. This paper details the challenges specific to red teaming generative AI systems, the development and features of PyRIT, and its practical applications in real-world scenarios.
-
Speaker: Madeline Cheah
Contributors: Jack Stone; Samuel Bailey, Peter Haubrick, David Rimmer, Matt Lacey, Mark Dorn
Fully autonomous decision-making for cyber-defence (the ability to make expert-level defensive choices without human intervention) is desirable but challenging. This is particularly so for operational technology because of its cyber-physical nature and the need to take into account multiple dimensions of context. Our contribution is the creation and substantial extension of our co-operative decision-making framework for cyber-defence (Co-Decyber). This framework allows us to break up a large multi-contextual action space into smaller decisions for multiple agents to optimise between.
We have applied this framework to a vehicle platooning scenario (the linking of two or more trucks in a convoy) . This paper discusses development since our last published work, which is based on increased complexity by defending against a more sophisticated attack (diversion of the convoy using GPS message spoofing) using more agents. Results show that Co-Decyber agents are able to successfully defend against an attack and recover the situation. We conclude that this framework is viable and once mature, will assist in fully autonomous cyber-defence of operational technology.
-
Speaker: Kyla Guru
Contributors: Robert Moss, Mykel Kochenderfer
Despite the growing number of cyber-attacks per day, technical attribution, or the act of identifying the responsible group behind a cyber-attack, remains a complex but mission-critical task for defenders. Delays in attribution often stem from the manual process of picking apart dense, unstructured forensic documentation to identify the tactics, techniques, and procedures (TTPs) of the threat actor, and then piecing together various information for attribution. While previous approaches have looked at classical NLP methods to identify TTPs, an end-to-end ML framework that uses LLMs to identify TTPs and then makes attribution predictions based on these TTPs has not yet been presented or evaluated. This research looks at evaluating the use of Large Language Models (LLMs) and vector embedding search for conducting attribution of a cyber-attack based on behavioral techniques identified within CTI documentation. We analyze similarity to human-generated TTP sets, as well as strengths and limitations of each approach, evaluating on analyst interpretability, tendency to hallucinate, and contextual understanding.
This research also introduces an end-to-end ML model that takes in unseen documentation, extracts TTPs, and uses these TTPs to perform attribution. This research finds that while both approaches generate TTP datasets that are different from the tested human-generated datasets, they still prove useful and can be used to train a model that performs above baseline on cyber-attack attribution. This study also finds that the performance of the model greatly improves when a human analyst is added into the loop, providing more information to the model such as the relevancy of various threat actors at the time of analysis.
-
Speaker: Rodrigo Bersa
Contributors: Tadesse Zemicheal, Shawn Davis, Hsin Chen
Vulnerability management in containerized systems is a labor-intensive and time-consuming process, particularly when dealing with many containers. This process involves the collection, comprehension, and synthesis of various pieces of information to ascertain whether immediate remediation is necessary upon the identification of a new common vulnerability and exposure (CVE). If analysts conclude remediation is not required, they assign an exemption justification status category from the standardized Vulnerability Exploitability eXchange (VEX) reasoning. This is a manual and time-consuming task. To address this issue, we propose a multi-component system using Large Language Models (LLM) that automates vulnerability management, verification, and VEX justification. The system uses a Plan-and-Execute-style LLM system for vulnerability impact analysis. The process begins with an LLM planner that generates a context-sensitive task checklist with up-to-date CVE intel.
This checklist is then executed by an LLM agent equipped with Retrieval-Augmented Generation (RAG) capabilities and tool usage. The gathered information and the agent's findings are subsequently summarized and categorized by additional LLMs to provide a final verdict. The system eliminates the need for manual verification of CVEs in target containers by leveraging container Software Bill of Materials (SBOM), source code, and documentation as input.
Experimental results on both synthetic and real-world datasets demonstrate that the proposed system achieves high accuracy rates in capturing false-triggered CVEs, and final justification summary in par with human labeled justifications, indicates the effectiveness of the approach in streamlining vulnerability analysis tasks.
-
Speaker: Kaixi Yang
Contributors: Paul Miller, Jesus Martinez del Rincon
Anomaly detection algorithms identify unusual events and outliers in large datasets where manual approaches are highly impractical. Most prior anomaly detection methods assume simple unimodal Gaussian data distributions; however, they produce suboptimal results on complex multimodal distributions. To address this problem, we propose DIP-ECOD, a novel anomaly detection algorithm leveraging unsupervised machine learning that generalises to both multimodal and unimodal distributions.
DIP-ECOD integrates a dip test within the ECOD framework, using SkinnyDip to split a probability distribution into separate modes, after which ECOD is applied. In this way, difficult-to-find outliers between modes and hidden in the distribution tails of each mode are also detected. Experiments using nine benchmark datasets across a range of domains such as healthcare and imagery demonstrate DIP-ECOD’s improved performance over ECOD in detecting outliers in both multimodal and unimodal distributions, with DIP-ECOD achieving an average AUC score of 0.791 compared to ECOD’s 0.761. Further, using a proprietary enterprise dataset, we show DIP-ECOD effectively identifies anomalous Github commits, indicating its applicability to information security and software vulnerability, where multi modal distributions are expected.
-
Speaker: Ashley Song
Author(s): Ashley Song; Hsin Chen; Shawn Davis; Dhruv Nandakumar
In this paper, we present a benchmark dataset for training and evaluating static PE malware machine learning models, specifically for detecting known vulnerabilities in malware. Our goal is to enable further research in defense against malware by exploiting their bugs or weaknesses. After recognising limitations in current malware datasets regarding exploitable malware, our dataset addresses these gaps by utilizing the malware vulnerability database Malvuln, and software vulnerability database ExploitDB to create a new malware dataset with 864 vulnerable malware samples, 35,241 non-vulnerable malware samples, 1,425 vulnerable benign samples, and 7,905 non-vulnerable benign samples, detailed with timestamps, families, threat mapping, vulnerability mapping, and obfuscation analysis.
This 4-class dataset lays the foundation for advancing future research in analysis and vulnerability exploitation in malware using machine learning. We also provide baseline results using state-of-the-art models for malware classification to benchmark the performance of the dataset, where the binary tasks achieve F1 scores above 0.90, while the multi-class task attains an F1-Score of 0.958.
-
Speaker: David Krisiloff
Contributors: Scott Coull
Research on static malware classifiers has generally explored two extremes: (i) hand-crafted features painstakingly created by experts and (ii) deep learning architectures that operate directly on the raw-byte representation of the binary. Broadly speaking, byte-based approaches have struggled to achieve the performance of traditional machine learning models leveraging expert features despite extensive exploration of architectures and training regimes.
In this paper, we suggest that there exists a rich, unexplored continuum of expert knowledge that lies between entirely human-driven features and data-driven representation learning using deep neural networks, which can be leveraged to achieve better trade-offs between architecture flexibility and development costs. Specifically, we consider whether providing the model with explicit structural and semantic hints, at varying degrees of specificity, increases the performance of deep learning-based classifiers. To evaluate the impact of the structural and semantic information, we consider three distinct Windows PE malware datasets, ranging from 800K samples (i.e., EMBER) to a full production-grade malware dataset containing more than 100M unique samples. The results of our analysis indicate that incorporating lightweight structural information, such as PE file sections, directly into the architecture allows deep learning-based models to match the performance of traditional malware classifiers for the first time -- achieving performance equivalent to a commercial malware classifier deployed to millions of endpoints.
Our evaluation further analyzes the impact of semantic information, such as parsing errors, training set size, and robustness to adversarial evasion, revealing novel insights into the value of integrating expert knowledge into the architecture of deep learning systems.
-
Speakers: Manish Marwah
Contributors: Asad Narayanan, Stephan Jou, Martin Arlitt, Maria Pospelova
The cost of errors related to machine learning classifiers, namely, false positives and false negatives, are not equal and are application dependent. For example, in cybersecurity applications, the cost of not detecting an attack is very different from marking a benign activity as an attack. Various design choices during machine learning model building, such as hyperparameter tuning and model selection, allow a data scientist to trade-off between these two errors. However, most of the commonly used metrics to evaluate model quality, such as F_1 score, which is defined in terms of model precision and recall, treat both these errors equally, making it difficult for users to optimize for the actual cost of these errors. In this paper, we propose a new cost-aware metric based on precision and recall that can replace F_1 score for model evaluation and selection.
It includes a cost ratio that takes into account the differing costs of handling false positives and false negatives. We derive and characterize the new cost metric, and compare it to F_1 score. Further, we use this metric for model thresholding for five cybersecurity related datasets for multiple cost ratios. The results show an average cost savings of 49%.
DAY TWO
Keynote: You’ll Never Guess What Happens Next: Acting to Ensure AI Benefits Cyber Defense in a Decade of Technological Surprise
Joshua Saxe leads Meta's efforts to integrate security into its large language models (LLMs) and protect them from application-level cyberattacks. Before joining Meta, he served as chief scientist at Sophos, principal investigator on multiple DARPA programs at Invincea Labs, and led machine learning security research at Applied Minds.
Joshua co-authored the book "Malware Data Science" with Hillary Sanders, published by No Starch Press. He has authored dozens of scientific papers and patents on security AI and has presented at numerous conferences, including Defcon, Blackhat and RSA.
Joshua Saxe
Head of LLM Security Integration, Meta
-
Speaker: Mohammad Saidur Rahman
Author(s): Mohammad Saidur Rahman; Scott Coull; Qi Yu; Matthew Wright
Large Language Models (LLMs), while powerful, are built and trained to process a single text input. In common applications, multiple inputs can be processed by concatenating them together into a single stream of text.
However, the LLM is unable to distinguish which sections of prompt belong to various input sources. Indirect prompt injection attacks take advantage of this vulnerability by embedding adversarial instructions into untrusted data being processed alongside user commands. Often, the LLM will mistake the adversarial instructions as user commands to be followed, creating a security vulnerability in the larger system. We introduce spotlighting, a family of prompt engineering techniques that can be used to improve LLMs' ability to distinguish among multiple sources of input. The key insight is to utilize transformations of an input to provide a reliable and continuous signal of its provenance.We evaluate spotlighting as a defense against indirect prompt injection attacks, and find that it is a robust defense that has minimal detrimental impact to underlying NLP tasks. Using GPT-family models, we find that spotlighting reduces the attack success rate from greater than 50% to below 2% in our experiments with minimal impact on task efficacy.
-
Speaker: Christopher Galbraith
Whether it's music, movies, search results, or social media posts–-most online content today is personalized to reflect users’ evolving interests and preferences. However, threat intelligence is still stuck in the “one feed for all” paradigm. As a result, defenders are inundated by countless irrelevant signals that prevent them from focusing their time and energy on the real threats. Security teams need solutions that track threats according to their own unique environments and threat models. To address this gap, we present a data-driven approach for personalizing the threat landscape to specific security team needs.
We will show how to leverage the rich relationships and semantics in threat graphs to produce security object embeddings via metric learning. The learned embeddings enable numerous downstream tasks including personalized information retrieval, object clustering, and scoring. Focusing on information retrieval, we will demonstrate how to combine the embeddings with nearest neighbor search to create personalized threat recommendations and allow pivoting between threat intelligence objects. After the demo, we will reflect on the benefits of embeddings for learning useful threat intelligence data representations. Finally, we will discuss the extensibility of our approach and make the case that similar frameworks can be applied to other critical problems in cybersecurity. Overall, our approach can be viewed as a tool to organize semi-structured, unlabeled and large-scale cybersecurity threat intelligence data to make it actionable. -
Speaker: Derek Everett
Collaborators: Edward Raff, James Holt
n-grams have proven to be simple and efficient features for many domains in machine learning, but these features are intrinsically brittle to changes of any of the n tokens. We develop hamm(h)-grams, a new alternative to n-grams which allow wildcard tokens. The method is employed for the problem of malware detection with static features, where common patterns of bytes can only be represented by expressions including wildcards.
We devise an efficient algorithm for finding common h-grams using a new locality-sensitive hash. We then demonstrate the power of h-gram features in tasks important for malware classification and analysis.
-
Speakers: Ram Shankar Siva Kumar, Hyrum Anderson
-
Speaker: Amelia Kawasaki
Contributors: Andrew Davis, Houssam Abbas
The widespread adoption of Large Language Models (LLMs), exemplified by OpenAI's ChatGPT, brings to the forefront the imperative to defend against adversarial threats on these models. These attacks, which manipulate an LLM's output by introducing malicious inputs, undermine the model's integrity and the trust users place in its outputs. In response to this challenge, our paper presents an innovative defensive strategy, given white box access to an LLM, that harnesses residual activation analysis between transformer layers of the LLM. We apply a novel methodology for analyzing distinctive activation patterns in the residual streams for attack prompt classification.
We curate multiple datasets to demonstrate how this method of classification has high accuracy across multiple types of attack scenarios, including our newly-created attack dataset. Furthermore, we enhance the model's resilience by integrating safety fine-tuning techniques for LLMs in order to measure its effect on our capability to detect attacks. The results underscore the effectiveness of our approach in enhancing the detection and mitigation of adversarial inputs, advancing the security framework within which LLMs operate.
-
Speaker: Dimitris Mouris,
Contributors: Manuel dos Santos, Mehmet Ugurbil, Stanislaw Jarecki, José Reis, Shubho Sengupta, Miguel de Vega
Recent advancements in transformers have revolutionized machine learning, forming the core of Large Language Models (LLMs). However, integrating these systems into everyday applications raises privacy concerns as client queries are exposed to model owners. Secure multiparty computation (MPC) allows parties to evaluate machine learning applications while keeping sensitive user inputs and proprietary models private. Due to inherent MPC costs, recent works introduce model-specific optimizations that hinder widespread adoption by machine learning researchers. CrypTen (NeurIPS'21) aimed to solve this problem by exposing MPC primitives via common machine learning abstractions such as tensors and modular neural networks. Unfortunately, CrypTen and many other MPC frameworks rely on polynomial approximations of the non-linear functions, resulting in high errors and communication complexity.
This paper introduces Curl, an easy-to-use MPC framework that evaluates non-linear functions as lookup tables, resulting in better approximations and significant round and communication reduction. Curl exposes a similar programming model as CrypTen and is highly parallelizable through tensors. At its core, Curl relies on discrete wavelet transformations to reduce the lookup table size without sacrificing accuracy, which results in up to 19x round and communication reduction compared to CrypTen for non-linear functions such as logarithms and reciprocals. We evaluate Curl on a diverse set of LLMs, including BERT, GPT-2, and GPT Neo, and compare against state-of-the-art related works such as Iron (NeurIPS'22) and Bolt (S&P'24) achieving at least 1.9x less communication and latency.
Finally, we resolve a long-standing debate regarding the security of widely used probabilistic truncation protocols by proving their security in the stand-alone model. This is of independent interest as many related works rely on this truncation style. -
Speaker: Tamás Vörös
Contributors: Ben Gelman, Sean Bergeron, Adarsh Kyadige
Reliance on public foundation models raises significant security concerns, particularly due to the opaque nature of large language models (LLMs) and their vulnerability to Trojan attacks. This study explores the potential of targeted noising of neurons to address these risks by analyzing neuron importance in LLMs with respect to Trojans. We do not assume prior knowledge about the existence or nature of Trojans in the models. Instead, we insert our own controlled Trojans into the models. By doing so, we are able to demonstrate that our approach not only neutralizes the Trojans we introduce but also mitigates pre-existing Trojan activations.
Our experiments on the Pythia and Llama2 models demonstrate that targeted noising effectively preserves LAMBADA dataset accuracy while significantly neutralizing Trojan triggers. Specifically, at a noise level of approximately 2e-05 of all available neurons, the Pythia model maintains a LAMBADA accuracy drop of 1.6%, while reducing Trojan unigram recall to 1.7%. For the Llama2 model, a noise level of 1.3e-05 results in an accuracy drop of just 3.5%, with Trojan unigram recall reduced to 5%. In contrast, random noising only mitigates Trojan activation at the cost of complete usability loss.
-
Speaker: William Fleshman
Large language models (LLMs) are increasingly capable of completing knowledge intensive tasks by recalling information from a static pretraining corpus. Here we are concerned with LLMs in the context of evolving data requirements. For instance: batches of new data that are introduced periodically; subsets of data with user-based access controls; or requirements on dynamic removal of documents with guarantees that associated knowledge cannot be recalled. We wish to satisfy these requirements while at the same time ensuring a model does not forget old information when new data becomes available.
To address these issues, we introduce AdapterSwap, a training and inference scheme that organizes knowledge from a data collection into a set of dynamically composed low-rank adapters. Our experiments demonstrate AdapterSwap's ability to support efficient continual learning, while also enabling organizations to have fine-grained control over data access and deletion.