Robert Gove

Two Six Technologies

and

Nathan Danneman

Data Machines Corporation

Automatic Summarization and Visualization of Incident Reports

Introduction.
Cyber defenders analyze and share incident reports to determine if malicious activity occurred, how it occurred, and to document it. When displayed in tables, the report’s narrative structure and all the connections within it are difficult to identify; especially when the table contains hundreds of rows. Indeed, we collaborate with a security operations center (SOC) analyst who told us a summary visualization is preferable over scrolling through a long table.
To help analysts identify the core sequence of events and report their findings, we present a summarization algorithm and a visualization tool for log data from incident reports and incident report-like alerts. The summarization algorithm is similar in spirit to extractive text summarization: it extracts the core sequence of events, the primary entities, and the relationships that connect them. Users can customize the amount of summarization to near-arbitrary levels. An evaluation on real incident reports finds that the optimized summaries reduce false positives and improve average precision by 22% while reducing the average incident report size by up to 61%. An accompanying visualization tool inspired by Gantt charts displays the resulting summaries more compactly than tables and earned praise from a SOC analyst colleague.

Incident Reports, Log Data, and Dynamic Graphs.
Incident reports contain excerpts from various logs that describe when events occurred, and these excerpts often also encode relationships between various types of entities. For example, Zeek conn logs record a connection relationship between two IP addresses. As another example, scripts such as the BZAR project can detect higher-level relationships, including tactics from MITRE’s ATT&CK framework like “lateral movement” or “data exfiltration.” By mining a log, we can create a dynamic graph (also known as a temporal graph) where vertices are entities like IP addresses, and edges are relationships that encode types of behavior. Each entity and relationship has an associated set of timestamps that describe when activity occurred. This dynamic graph thus encapsulates the rich structure of log data.
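The construction described above can be sketched in a few lines. This is an illustrative simplification, not the authors' implementation: it assumes each log record has already been parsed into a (timestamp, source, destination, relationship) tuple, as might be extracted from a Zeek conn log.

```python
from collections import defaultdict

def build_dynamic_graph(records):
    """Build a dynamic graph from parsed log records.

    Each record is assumed to be a (timestamp, source, destination,
    relationship) tuple. Vertices are entities (e.g. IP addresses);
    edges are typed relationships. Both carry the set of timestamps
    at which they were observed.
    """
    vertices = defaultdict(set)  # entity -> set of observation timestamps
    edges = defaultdict(set)     # (src, dst, relationship) -> timestamps
    for ts, src, dst, rel in records:
        vertices[src].add(ts)
        vertices[dst].add(ts)
        edges[(src, dst, rel)].add(ts)
    return dict(vertices), dict(edges)

# Toy Zeek-like conn log: two connections, then a BZAR-style
# higher-level relationship (names and values are illustrative).
log = [
    (1.0, "10.0.0.5", "10.0.0.9", "conn"),
    (2.5, "10.0.0.5", "10.0.0.9", "conn"),
    (4.0, "10.0.0.9", "10.0.0.7", "lateral_movement"),
]
vertices, edges = build_dynamic_graph(log)
```

Keeping the full timestamp sets, rather than just first/last observations, preserves the activity durations that the visualization and the summarization features both rely on.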

Visualizing Logs.
This figure shows our new visualization of an anonymized Zeek-like conn log. Each row is an entity, and the position and length of its gray bar indicate the duration of observed activity. Entities are ordered by type (IP, host, then user), then by earliest observation and duration. Vertical links indicate relationships, where the circle designates the source of the relationship. If the relationship is a tactic from the MITRE ATT&CK framework, then its yellow-to-red color corresponds to earlier-to-later stages of an attack. This design occupies fewer rows and less screen space than a table while illuminating structure that table-like formats obscure.
We iterated on the design of this visualization with feedback from a SOC analyst, implementing his requests (e.g., the color scheme and entity ordering). Overall, the SOC analyst was very positive about the visualization and said it allowed him to understand the presented data more rapidly than a table.

Automatically Summarizing Log Data. The summarization algorithm operates on the dynamic graph described above. First, the algorithm generates four features on a 0-1 scale for each connected component, characterizing the component’s number of entities, number of timestamps, number of relationships, and duration. Second, the algorithm identifies the core sequence of events in each component by conducting a depth-first search from the earliest entity to the latest. For each component, the algorithm subtracts the component’s core sequence of events and induces subgraphs from the remaining entities and relationships, which we consider “branches” in the log’s “narrative.” The algorithm generates six features on a 0-1 scale for each branch, similar to the component features but also incorporating relationship severity when the relationship is a MITRE ATT&CK tactic. Third, the algorithm generates two entity features: 1 or 0 depending on whether the entity is part of a component’s core sequence of events, and a 0-1 severity score when one is available from a cyber security analytic.
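The first step above computes per-component features on a 0-1 scale. The text does not say how raw counts are mapped to that scale, so the sketch below assumes simple min-max normalization across components; the feature names are also illustrative.

```python
def normalize_features(components):
    """Scale raw component features to [0, 1] via min-max normalization.

    `components` maps a component id to a dict of raw feature values
    (e.g. entity count, timestamp count, relationship count, duration).
    Min-max scaling is an assumption here; the source only states the
    features lie on a 0-1 scale.
    """
    names = {name for feats in components.values() for name in feats}
    lo = {n: min(f[n] for f in components.values()) for n in names}
    hi = {n: max(f[n] for f in components.values()) for n in names}
    scaled = {}
    for cid, feats in components.items():
        scaled[cid] = {
            n: 0.0 if hi[n] == lo[n] else (feats[n] - lo[n]) / (hi[n] - lo[n])
            for n in names
        }
    return scaled

raw = {
    "comp_a": {"entities": 2, "relationships": 3, "duration": 5.0},
    "comp_b": {"entities": 6, "relationships": 9, "duration": 5.0},
}
norm = normalize_features(raw)
```

A constant feature (here, duration) maps to 0.0 for every component, so it contributes nothing to the downstream threshold comparisons.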
The user provides a summarization threshold, and the algorithm applies one of two approaches to summarize the featurized dynamic graph. The first is an unweighted summarization, where all features are weighted equally. An entity e and its relationships are removed if either the mean of e’s component features is less than the summarization threshold, or the mean of e’s branch and entity features is less than the summarization threshold.
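The unweighted removal rule can be stated directly in code. In this sketch, the two feature means per entity are assumed to be precomputed; an entity survives only if both means meet the threshold.

```python
def unweighted_summarize(entity_scores, threshold):
    """Apply the unweighted summarization rule.

    `entity_scores` maps each entity to a dict with two precomputed
    means on a 0-1 scale: 'component' (mean of the four component
    features) and 'branch_entity' (mean of the branch and entity
    features). An entity is removed if either mean falls below the
    user's summarization threshold.
    """
    return {
        entity
        for entity, s in entity_scores.items()
        if s["component"] >= threshold and s["branch_entity"] >= threshold
    }

scores = {
    "10.0.0.5": {"component": 0.9, "branch_entity": 0.8},
    "10.0.0.9": {"component": 0.3, "branch_entity": 0.9},  # weak component
    "10.0.0.7": {"component": 0.9, "branch_entity": 0.2},  # weak branch
}
kept = unweighted_summarize(scores, threshold=0.5)
```

Raising the threshold monotonically shrinks the kept set, which is how users tune the amount of summarization.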
The second is a weighted summarization: we predict whether an entity belongs in an incident report using ground truth data generated during red team events. We leverage a Bayesian hierarchical logistic regression with fixed effects at the entity level and nested random effects at the branch and component levels. This class of model provides generalizable predictive accuracy from limited training data. Entities (and their relationships) are removed if their predicted probabilities of belonging are below the summarization threshold. The visualization above shows a summary, produced by this method, of a graph that originally contained about 100 entities and about 600 relationships.
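The prediction step can be sketched as follows. This is only the logistic link structure of the hierarchical model, with hand-set coefficients standing in for the fitted fixed and random effects; the real model is fit with Bayesian inference on red-team ground truth, which is omitted here.

```python
import math

def predicted_probability(entity_features, beta, branch_effect, component_effect):
    """Hierarchical logistic prediction for one entity (sketch).

    The linear predictor combines entity-level fixed effects (beta)
    with random-effect offsets for the entity's branch and component,
    then passes through the logistic link. All coefficient values here
    are illustrative stand-ins for fitted parameters.
    """
    eta = sum(b * x for b, x in zip(beta, entity_features))
    eta += branch_effect + component_effect
    return 1.0 / (1.0 + math.exp(-eta))

def weighted_summarize(probabilities, threshold):
    """Remove entities whose predicted probability of belonging in the
    incident report is below the summarization threshold."""
    return {e for e, p in probabilities.items() if p >= threshold}

# A zero linear predictor yields probability 0.5.
p = predicted_probability([1.0, 0.0], beta=[0.0, 0.0],
                          branch_effect=0.0, component_effect=0.0)
kept = weighted_summarize({"hostA": 0.92, "hostB": 0.11}, threshold=0.5)
```

Because every entity gets a calibrated probability, the same threshold dial serves both summarization methods, but here it acts on model output rather than raw feature means.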

Evaluation.
We evaluated the summarization methods on log excerpts included in incident reports from a red team event, which provided us with ground truth. Both summarization methods reduce the data size considerably, but the weighted model produces considerably smaller summaries than the unweighted model at almost all thresholds. The weighted model also improves the precision of the incident reports more than the unweighted model, improving mean precision 22% over the unsummarized graph versus 8% for the unweighted model. Furthermore, at low levels of summarization, the weighted model increases precision, keeps F1 and recall high, and reduces the number of entities by 15-20%. In other words, the weighted model makes incident reports both smaller and more accurate.

Conclusion.
We presented an automated method for summarizing incident reports. At optimal summarization thresholds, it reduces both false positives and data size, thereby helping analysts focus on high-value data and identify key entities and events. Because the algorithm operates on dynamic graph data, it can also summarize other types of log data, which we will demonstrate in our talk. We also debuted a visualization tool that presents data more compactly than tables, which earned praise from an analyst for speeding and easing his analysis tasks.