Cybersecurity Knowledge Base

The ATT&CK knowledge base is used as a foundation for the development of specific threat models and methodologies in the private sector, in government, and in the cybersecurity product and service community.

Adversary Tactics, Techniques, and Procedures (TTPs)

TTPs describe what an adversary does and how, while indicators describe how to recognize what those actions might look like. Using a non-cyber analogy, a specific approach to counterfeiting $100 bills can be thought of as a TTP, while the specific guidance for detecting bills produced with this approach (wrong color, bad watermark, etc.) can be thought of as indicators.

Malicious activity in cyberspace can be considered in three dimensions: time, terrain and behavior. Every event in cyberspace can be represented as a specific behavior (benign, malicious or suspicious) at a specific time, on a specific machine, process, subnet, or other element of cyber terrain. Each of these dimensions is described below. The figure below provides a visualization of these three dimensions with example values for behavior and terrain types.

[Figure: analysis space, showing the time, terrain, and behavior dimensions with example values]

The time dimension is relatively straightforward. Most evidence of malicious behavior is transient in nature (process content and activity, for example) and must be collected during the intrusion, as it cannot be obtained after the fact. Data from continuous monitoring is generally preferred over forensic data collection because hunting for malicious behavior starts with a large, unknown window of time, and most forensic data (and its related data collection capabilities) only cover a narrow slice of the time domain for these transient events. Malicious behaviors are far more likely to fall outside this slice than inside it, so these tools are less effective for hunting. If used as part of an ongoing investigation, forensic tools can complement other data sources but are much less effective as primary data sources for detection.

Cyber terrain can refer to the broad range of hosts, network segments, or other areas where the adversary may be operating. For the purposes of this methodology, terrain is restricted to where defenders have authority to operate and a responsibility to defend – within an enterprise or enclave monitored by a Security Operations Center (SOC) or hunt team. Particular focus should be paid to areas that might be highly targeted by an adversary (e.g., crown jewels) (MITRE, 2018); areas that the adversaries may need to traverse to complete their objective (Internet access points, Trusted Internet Connections, Domain Controllers, etc.); or areas that, if damaged, will hamper defensive forces in countering the intrusion (SOC analysis systems, perimeter and host sensors, log collection architecture, etc.).

Behavior refers to malicious activities in cyberspace, and data should be collected so that those activities can be observed. For example, if the adversary can launch malware from ordinarily benign processes, defenders should capture data on process launches, including each process’s parent, from hosts within the terrain of interest. Another example is encryption of malicious command and control (C2) communications across the network, which hides their content from network sensors. Collecting network communication information from the host may allow defenders to mitigate this visibility gap.
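As a rough illustration of the three dimensions, the sketch below records a process launch as an event with a time, a terrain element, and a behavior, then applies a simple parent/child check. The `ProcessEvent` fields, the host names, and the `SUSPICIOUS_PAIRS` watchlist are all hypothetical choices made for this example, not part of any particular sensor or product.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProcessEvent:
    """One process launch located in time, terrain, and behavior."""
    timestamp: datetime   # time dimension
    host: str             # terrain dimension
    parent_image: str     # behavior: the process that did the launching
    image: str            # behavior: the process that was launched


# Hypothetical watchlist: parent processes that rarely need to start a shell.
SUSPICIOUS_PAIRS = {
    ("winword.exe", "powershell.exe"),
    ("excel.exe", "cmd.exe"),
}


def classify(event: ProcessEvent) -> str:
    """Label an event by comparing its parent/child pair to the watchlist."""
    pair = (event.parent_image.lower(), event.image.lower())
    return "suspicious" if pair in SUSPICIOUS_PAIRS else "benign"


events = [
    ProcessEvent(datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc),
                 "ws-01", "explorer.exe", "winword.exe"),
    ProcessEvent(datetime(2024, 1, 1, 9, 1, tzinfo=timezone.utc),
                 "ws-01", "WINWORD.EXE", "powershell.exe"),
]

for e in events:
    print(e.timestamp.isoformat(), e.host, classify(e))
```

The second event is only recognizable as suspicious because the parent process was collected along with the launch itself, which is the point the paragraph above makes about capturing parent information.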

Detection Methods

Detection approaches include sweeping for IOCs and network security monitoring (NSM); anomaly detection; and TTP-based detection – with the majority of current data collection efforts focused on network sensors and perimeter proxies as opposed to host-based event data. Each of these approaches has benefits and limitations.

Pyramid of Pain

[Figure: Pyramid of Pain]

The Pyramid of Pain is a concept first introduced in 2013 by cybersecurity expert David J. Bianco. The “pain” in his concept refers to the difficulty the adversary faces in succeeding once they are denied certain indicators.

As the graphic indicates, it is trivial for the adversary to change a hash value to bypass a detection engine that matches on hashes. Controls that restrict tactics, techniques, and procedures (TTPs), however, are very tough for the adversary to overcome. I am a big believer in adding as many of these controls as possible, since each defensive capability then multiplies the effect of the others.

Below is the breakdown of each indicator:

Hash Values (TRIVIAL): Hashing is an algorithm applied to data such as a file or message to produce a number called a hash (sometimes called a checksum). The hash is used to verify that data has not been modified, tampered with, or corrupted; in other words, that the data has maintained integrity. A key point is that no matter how many times you run the hashing algorithm against the data, the hash will always be the same if the data is the same. As indicators, SHA1, MD5, SHA256, or similar hashes identify a specific suspicious or malicious file. It is trivial for the adversary to change the hash and bypass the defensive capability, for example with polymorphic or metamorphic techniques.

IP Addresses (EASY): IP addresses are used as identifiers and can easily be hidden behind an anonymizing proxy service such as Tor, or changed at high frequency using techniques such as fast flux.

Domain Names (SIMPLE): This covers domain names (e.g., “internetbadguys.com”) and subdomains (e.g., “exploitkit.internetbadguys.com”). These domains are registered and hosted and can be part of the adversary’s attack infrastructure. Attackers can bypass domain-based controls using techniques such as DGAs (Domain Generation Algorithms).

Network Artifacts (ANNOYING): This refers to the ability to distinguish suspicious or malicious activity from legitimate activity on the network. It goes beyond user devices and includes all Internet of Things (IoT) devices. Examples may include patterns based on network activity (C2 information), Uniform Resource Identifier (URI) patterns, certificates of use, and so on.

Host Artifacts (ANNOYING): This may include an artifact in the registry, a scheduled task, or files dropped within the file system that indicates the presence of malicious activity.

Tools (CHALLENGING): This would typically be software that the adversary brings with them to perform a variety of activities such as creating backdoors for a C2 channel, network sniffers, and password crackers.

Tactics, Techniques and Procedures – TTPs (TOUGH): The tactic provides the description of the behavior, the technique provides more details of the behavior from the perspective of the tactic, and the procedure provides deep details around the technique itself. Example: the tactic is “Discovery” and the technique being used is “Network Service Scanning.”
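To make the Domain Names level of the breakdown above more concrete, here is a toy sketch of why a static domain blocklist struggles against a domain generation algorithm. It is not modeled on any real malware family: the seed string, the SHA-256 derivation, and the reserved `.example` suffix are assumptions chosen purely for illustration.

```python
import hashlib
from datetime import date
from typing import List


def candidate_domains(seed: str, day: date, count: int = 3) -> List[str]:
    """Derive a deterministic list of domain names from a seed and a date.

    Toy scheme for illustration only; real DGAs vary widely.
    """
    domains = []
    for i in range(count):
        material = f"{seed}:{day.isoformat()}:{i}".encode()
        label = hashlib.sha256(material).hexdigest()[:12]
        domains.append(f"{label}.example")
    return domains


# A defender blocks every domain observed on day one...
blocklist = set(candidate_domains("demo-seed", date(2024, 1, 1)))

# ...but the next day's names are derived from a different date.
tomorrow = candidate_domains("demo-seed", date(2024, 1, 2))
print([d in blocklist for d in tomorrow])
```

Because both sides of the C2 channel can compute the same list from the shared seed and date, the adversary pays almost nothing to rotate domains, while a blocklist built from yesterday's observations has to be rebuilt every day.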

Indicators of Compromise (IOCs)

Prior to 2016, threat hunting processes appear to have been primarily organized around searching for IOCs, which include static characteristics of malware such as hashes, filenames, libraries, and strings, or around gathering and analyzing disk and memory forensics artifacts.

A signature written to detect IP addresses, domains, file hashes, or filenames associated with malicious activity, without triggering on benign instances, is often very brittle to polymorphism, metamorphism, and other implementation modifications that are relatively cheap for an adversary to use. David Bianco captured this in his Pyramid of Pain, described above. Defining those brittle signatures and indicators often requires extensive resources for reverse engineering and static analysis, and often depends on the activity first being detected through some other means (often after it has been used successfully by adversaries in other breaches and independently detected, reported, and disseminated). Thus, indicator sweeping fails to identify novel or changing threats that don’t match known indicators, and only provides detection capabilities after the fact.
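The brittleness of hash-based indicators can be seen in a few lines. The sketch below uses Python's standard `hashlib`; the byte strings are harmless stand-ins for file contents, and `known_bad_hashes` is a hypothetical IOC list built for the example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for the contents of a file reported in a previous breach.
original = b"stand-in bytes for a malicious file"
known_bad_hashes = {sha256_hex(original)}

# The adversary appends a single padding byte; behavior would be unchanged.
variant = original + b"\x00"


def matches_ioc(data: bytes) -> bool:
    """Sweep-style check: is this content's hash on the IOC list?"""
    return sha256_hex(data) in known_bad_hashes


print(matches_ioc(original))  # True
print(matches_ioc(variant))   # False
```

The same input always produces the same hash, which is what makes the indicator precise, but any change to the input produces a different hash, which is what makes it cheap to evade.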

Anomaly Detection

Anomaly-based detection employs statistical analysis, machine learning, and other forms of big data analysis to detect atypical events. This approach has traditionally suffered from high false positive rates, can require significant investment in large scale data collection and processing, and does not always provide enough contextual information around why something was flagged as suspicious, which can make analytic refinement challenging.

The benign activity of software, system administrators, software developers and everyday users across enterprise networks is often so variable across time, users, and network space, that defining “normal” behavior is often a futile exercise. The volume of data required to be processed for anomaly and statistical analysis can be prohibitive to collect and retain. There must be sufficient data collected, from a sufficient number of data sources and locations within an environment, to enable trend and statistical analysis. However, what is sufficient can vary greatly and is often unknowable in advance, making this type of detection hard to utilize and measure effectively.
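A minimal sketch of statistical anomaly detection, assuming a single metric (hourly login counts for one account, invented for this example) and a z-score threshold of 3, which is a common convention rather than a recommended setting:

```python
import statistics


def flag_anomalies(baseline, observations, threshold=3.0):
    """Return observations whose z-score against the baseline exceeds threshold."""
    mean = statistics.mean(baseline)
    stdev = statistics.pstdev(baseline)
    if stdev == 0:
        # A perfectly flat baseline gives no scale to measure deviation against.
        return []
    return [x for x in observations if abs(x - mean) / stdev > threshold]


# Hypothetical hourly login counts during a quiet baseline period.
baseline = [4, 5, 3, 6, 4, 5, 4, 5]

# New hours to score: a normal hour, a busy hour, and a burst.
observations = [5, 8, 60]

print(flag_anomalies(baseline, observations))  # [8, 60]
```

The burst of 60 is flagged, but so is the merely busy hour of 8, because the short baseline makes "normal" look narrower than it really is. That is the false-positive problem described above in miniature, and the flag itself says nothing about why the hour was unusual.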

TTP Detection

Rather than characterizing and searching for tools and artifacts, a more robust approach is to characterize and search for the techniques adversaries must use to achieve their goals. These techniques do not change frequently, and are common across adversaries due to the constraints of the target technology. The MITRE ATT&CK™ framework is an effective way to characterize those techniques. ATT&CK categorizes reported adversary TTPs from public and open cyber threat intelligence and aligns them by tactic category within the phases of the Cyber Attack Lifecycle.
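As one possible illustration of a technique-focused analytic, the sketch below looks for the behavior behind the "Network Service Scanning" technique mentioned earlier: one source contacting many distinct services in a short period. The tuple layout, the 60-second window, and the threshold of 20 distinct targets are assumptions for this example and would need tuning against real traffic.

```python
from collections import defaultdict


def scanning_hosts(connections, window_seconds=60, min_distinct_targets=20):
    """Return sources that contact many distinct (host, port) targets in one window.

    connections: iterable of (timestamp_seconds, src, dst, dst_port) tuples.
    Fixed time buckets keep the sketch short; a scan that straddles a bucket
    boundary could be missed, which a sliding window would avoid.
    """
    targets = defaultdict(set)
    for ts, src, dst, port in connections:
        bucket = int(ts // window_seconds)
        targets[(src, bucket)].add((dst, port))
    return sorted({src for (src, _), seen in targets.items()
                   if len(seen) >= min_distinct_targets})


# Hypothetical traffic: one host sweeps 25 ports, another browses normally.
scan = [(float(i), "10.0.0.7", "10.0.0.5", 1000 + i) for i in range(25)]
normal = [
    (5.0, "10.0.0.8", "10.0.0.5", 443),
    (6.0, "10.0.0.8", "10.0.0.9", 443),
    (7.0, "10.0.0.8", "10.0.0.5", 80),
]

print(scanning_hosts(scan + normal))  # ['10.0.0.7']
```

Nothing in the check depends on which scanning tool was used, its file hash, or the addresses involved, so swapping tools does not help the adversary as long as the technique stays the same.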

MITRE ATT&CK Matrix

MITRE ATT&CK™ is a globally accessible knowledge base of adversary tactics and techniques based on real-world observations.

During the lecture we will go through selected pages in the Getting Started with ATT&CK eBook to understand how ATT&CK is used.

A real use case

In a 2018 article, Microsoft detailed their layered defensive approach to cybersecurity. During class we will go through the article to understand the big picture of how ML is used in cybersecurity today. We will also cover the ML technique the article describes for detecting the attack on the client side: gradient boosted trees.
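For readers who have not met gradient boosted trees before, the following is a minimal from-scratch sketch of the idea, not Microsoft's model. It boosts depth-1 trees (stumps) under squared-error loss for brevity, and the two features (file entropy and import count) and all data values are hypothetical.

```python
def fit_stump(X, residuals):
    """Find the one-feature threshold split that best fits the residuals."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, j, t, lm, rm)
    if best is None:
        # No valid split exists (all rows identical): predict the mean.
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]  # (feature index, threshold, left value, right value)


def stump_predict(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value


def fit_gbt(X, y, n_rounds=20, learning_rate=0.3):
    """Each round fits a stump to the current residuals and adds it in."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared-error loss the negative gradient is the residual.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * stump_predict(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps


def predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


# Hypothetical training data: [file entropy, import count]; 1 = malicious.
X = [[7.8, 3], [7.5, 40], [7.9, 5], [4.1, 35], [5.0, 4], [4.6, 50]]
y = [1, 1, 1, 0, 0, 0]

model = fit_gbt(X, y)
print(predict(model, [7.7, 10]) > 0.5)  # True
print(predict(model, [4.5, 10]) > 0.5)  # False
```

Each stump on its own is a weak predictor; the ensemble becomes useful because every new stump is trained on the errors left by the ones before it. A practical classifier would typically use a logistic loss, deeper trees, and a library implementation rather than this hand-rolled loop.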