Accelerating The Alert Triage Scenario (AT-ATs): InfoSec Data Science with RAPIDS

Rachel Allen

To keep pace with cyber adversaries, organizations constantly evolve their approaches to information monitoring. Every new type of alert generated by ML models, heuristics, or sensors brings an additional data feed in need of triage and analysis. Security operations centers (SOCs) are frequently overwhelmed by the volume of alerts and unable to analyze a large portion of their data, so malicious activity can go unnoticed. By leveraging the data processing and analytic capabilities of RAPIDS, a suite of open-source software libraries that allow end-to-end data science pipelines to run in GPU memory, we demonstrate how to explore, analyze, and prioritize massive amounts of heterogeneous cyber data in real time.

In this hands-on tutorial, we work through two approaches to data exploration and alert prioritization. First, we use RAPIDS to explore model and sensor outputs with common methods for feature engineering, data manipulation, statistical analysis, and trend analysis. Second, we use graph embeddings and Personalized PageRank to prioritize alerts according to the criticality of the individuals and infrastructure involved. Attendees will be able to execute code and will leave with the tools necessary to create custom workflows in their own security and research environments.
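
To make the second approach concrete, here is a minimal sketch of Personalized PageRank applied to alert prioritization. It is plain CPU Python rather than RAPIDS code, and it is not the tutorial's implementation. The entity graph, the seed weight on `domain-controller`, the alert IDs, and the rule of scoring an alert by summing the ranks of the entities it involves are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, alpha=0.85, max_iter=100, tol=1e-10):
    """Power-iteration Personalized PageRank on a directed edge list.

    edges: iterable of (src, dst) pairs.
    seeds: dict mapping node -> non-negative restart weight (the critical assets).
    """
    nodes = sorted({n for edge in edges for n in edge} | set(seeds))
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    # Normalize the restart (personalization) vector so it sums to 1.
    total = sum(seeds.values())
    restart = {n: seeds.get(n, 0.0) / total for n in nodes}

    rank = dict(restart)
    for _ in range(max_iter):
        new_rank = {n: (1 - alpha) * restart[n] for n in nodes}
        for n in nodes:
            if out_links[n]:
                share = alpha * rank[n] / len(out_links[n])
                for m in out_links[n]:
                    new_rank[m] += share
            else:
                # Dangling node: send its mass back to the seed set.
                for m in nodes:
                    new_rank[m] += alpha * rank[n] * restart[m]
        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical relationships between users and infrastructure (undirected).
links = [
    ("alice", "laptop-1"),
    ("alice", "domain-controller"),
    ("bob", "laptop-2"),
    ("bob", "file-server"),
    ("carol", "file-server"),
    ("file-server", "domain-controller"),
]
edges = links + [(dst, src) for src, dst in links]

# Assumption: the domain controller is the only asset marked as critical.
rank = personalized_pagerank(edges, seeds={"domain-controller": 1.0})

# Hypothetical alerts, each listing the entities it involves.
alerts = {
    "alert-001": ["bob", "laptop-2"],
    "alert-002": ["alice", "domain-controller"],
    "alert-003": ["carol", "file-server"],
}

# Score each alert by the summed rank of its entities, highest first.
scores = {aid: sum(rank[e] for e in ents) for aid, ents in alerts.items()}
prioritized = sorted(scores, key=scores.get, reverse=True)

for alert_id in prioritized:
    print(alert_id, round(scores[alert_id], 4))
```

Because the restart vector concentrates on the critical assets, entities that are close to them in the graph receive higher rank, and alerts touching those entities rise to the top of the queue. A GPU graph library would replace the Python loops for graphs of realistic size.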