Protection from zero-day attacks is one of the biggest challenges of modern cybersecurity.
Although vendors are getting better at detecting zero-day exploits, both the number and the effectiveness of zero-day attacks keep increasing. Tools and techniques for developing zero-day exploits, spreading them, and performing zero-day attacks are becoming more sophisticated, more widespread, and easier to use by the day. At the same time, detecting an ongoing cyberattack and preventing damage is getting much harder because of the growing complexity of the systems that cybersecurity specialists deal with and the advanced encryption and obfuscation used by malicious software.
This means that there’s an increased demand for zero-day attack detection and prevention solutions that can produce accurate results with fewer false positives (non-incidents that raise the alarm) and false negatives (actual incidents that don’t raise the alarm).
In response to the growing number of zero-day attacks of various types that traditional antivirus software could not detect effectively, a whole new market segment has appeared: NGAV (Next-Generation Antivirus). NGAV products, such as the Carbon Black CB Defense system, are specifically designed to detect previously unknown malware, ransomware, and malware-less attacks. Solutions like this usually involve complex algorithmic behavior analysis based on in-depth monitoring of all events inside the corporate infrastructure.
Producing such a solution can be quite difficult. It requires a considerable investment as well as a mature engineering team experienced in kernel-level and user-level endpoint monitoring, network monitoring, and general cybersecurity techniques and algorithms.
At Apriorit, we’ve worked on many cybersecurity solutions for our clients over the years, including ones capable of detecting zero-day attacks. For example, we developed a technology stack that provides in-depth system monitoring functionality, including hidden file and process detection, kernel and registry integrity checks, floating code detection, and detection of all types of hooks, as well as monitoring of user actions. This technology was used as part of a zero-day attack detection solution that we helped design for our clients.
Since we pride ourselves on our cybersecurity expertise (you can check out our article on the basics of ROP chain attack security, or browse our blog for other cybersecurity articles, if you want to find out more), we’ve decided to share our knowledge on the topic. To keep this article concise, we’ll focus on general information about zero-day exploits without going too deep into the technical details. We hope that this article will serve as a good starting point for anyone interested in learning about zero-day exploits and developing software capable of detecting them.
The danger of zero-day attacks
The number of detected zero-day exploits keeps rising at an alarming pace. According to a paper on zero-day attack defense techniques by Singh, Joshi, and Singh, the number of discovered exploits rose from 8 in 2011 to 84 in 2016. If this pace continues, we’ll see a new zero-day exploit discovered every day in 2022. Each of these exploits represents a vulnerability that can result in an extremely dangerous zero-day attack capable of hitting entire industries, not to mention individual users.
We can take the famous WannaCry ransomware attack that spread around the world in May 2017 as a rough example of the worst-case scenario that a zero-day exploit could lead to.
One actual zero-day exploit for Microsoft Windows systems, called EternalBlue, was initially discovered by the US National Security Agency (NSA) and stored as part of their zero-day exploit list, but was eventually stolen by a hacker group called the Shadow Brokers. Upon learning about the potential data breach, the NSA warned Microsoft, which patched the underlying vulnerability in March 2017. However, because many users failed to update their systems, malware based on this exploit still managed to do significant damage, hitting a number of organizations around the world, including the United Kingdom’s National Health Service.
The damage dealt by EternalBlue was still limited, since the vulnerability was patched before the attack occurred. In a true zero-day attack, the number of affected systems would have been much larger. Moreover, while WannaCry was ransomware that directly attacked user data and immediately made itself known, the majority of zero-day attacks are much more discreet, involving data theft without users even noticing that something is wrong. Such cyberattacks can stay undiscovered for years, and such exploits can be used again and again to penetrate the defenses of different organizations until they’re finally patched.
So how can we fight against zero-day exploits? First, let’s look at what this term actually means.
When thinking about the topic at hand, it’s important to distinguish among three major terms: zero-day attacks, exploits, and vulnerabilities. Let’s define each:
- Zero-day vulnerability – an inherent flaw in software code or in the way a piece of software interacts with other software that is yet to be discovered by the software vendor.
- Zero-day exploit – an exploit based on a zero-day vulnerability; usually malicious software that uses a zero-day vulnerability to gain access to a target system.
- Zero-day attack – the act of applying a zero-day exploit for malicious purposes; a true zero-day attack occurs when perpetrators are using a vulnerability currently unknown to the software vendor in order to compromise the system and perform malicious actions. The term pseudo zero-day attack describes examples such as WannaCry ransomware, where the exploit was already known to the software vendor but the cyberattack was still effective due to the failure of end-users to update their software.
Generally, in cybersecurity the term zero-day refers to the day when a new vulnerability is discovered by a software vendor. From that moment of zero-day detection, the clock is ticking for the vendor to produce a patch as quickly as possible.
How zero-day exploits are created
Now that we’ve defined our terms, we can start learning how to deal with zero-day exploits. To do that, however, we also need to understand how they’re produced. Producing a zero-day exploit usually involves the following steps:
Attack surface analysis. At the earliest stage, a perpetrator may try to analyze a system by simply studying the parts of it that they have legitimate access to (the attack surface). This step requires considerable knowledge of software engineering in general, as well as of the specific solutions and protocols involved, on the part of the perpetrator. In the best case for the attacker, an attack surface analysis results in the discovery of a vulnerability.
Attack surface analysis also often involves analyzing source code or binary code when it’s available.
Fuzz testing. This is another way to discover vulnerabilities, and is often applied if the attack surface analysis hasn’t produced any results. Fuzz testing can also be largely automated, which allows even people inexperienced in software development to successfully search for zero-day flaws and vulnerabilities in software.
The goal of fuzz testing is to find vulnerabilities by feeding unexpected and random values to the system and monitoring the subsequent behavior. Fuzz testing is usually extremely effective, since instead of following a pre-defined pattern it stresses the software randomly, testing many non-obvious edge cases.
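To make the idea concrete, here is a minimal sketch of a random fuzzer in Python. Everything in it is an assumption made for illustration: parse_record is a hypothetical toy target with a deliberately planted bug, and fuzz is a hypothetical helper, not part of any real fuzzing framework. Production fuzzers are far more elaborate (coverage guidance, input mutation, crash triage), but the core loop looks roughly like this:

```python
import random


def parse_record(data: bytes) -> bytes:
    """Hypothetical toy target: the first byte is a length, the rest is the payload."""
    if not data:
        raise ValueError("empty input")  # documented, expected rejection
    length = data[0]
    payload = data[1:]
    # Planted bug: the declared length is trusted without checking it
    # against the real payload size, so this index can go out of range.
    last_byte = payload[length - 1] if length else 0
    return payload[:length] + bytes([last_byte])


def fuzz(target, runs: int = 500, max_len: int = 8, seed: int = 0):
    """Feed random byte strings to `target` and collect unexpected exceptions."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    crashes = []
    for _ in range(runs):
        size = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(size))
        try:
            target(data)
        except ValueError:
            continue  # the target rejected the input cleanly: not a finding
        except Exception as exc:  # anything else is a potential flaw
            crashes.append((data, type(exc).__name__))
    return crashes


crashes = fuzz(parse_record)
print(f"{len(crashes)} crashing inputs found")
```

Each recorded input is a starting point for manual analysis: the fuzzer only shows that the target misbehaves, not whether the flaw is exploitable.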
Development. After a vulnerability has been discovered, it needs to be developed into a working piece of malware that can be used to attack the target system. This step again requires specific knowledge on the part of the perpetrator.
The main challenge of developing an exploit is placing the shellcode in memory that can be reached through the vulnerability. The difficulty of this task depends on the amount of memory available in the exploited area (for example, the stack) on the target system. If there’s little free memory, a small piece of code needs to be injected first that makes the application jump to the full shellcode.
Also important is the ability to hide exploit code and protect it in case of discovery. Modern zero-day exploits use sophisticated protection techniques that can be divided into two categories:
- Metamorphic – obfuscation is used to change the malware code so it’s hard for humans to understand, producing software that’s functionally identical to the original but structurally different from it.
- Polymorphic – encryption techniques are used to encrypt the original code and pack it in the payload together with the routine needed to decrypt it. Polymorphic zero-day malware is highly sophisticated and extremely hard to detect.
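From a defender’s point of view, the practical consequence is that a fixed fingerprint of a file stops being useful. The sketch below is a toy illustration only, using a harmless placeholder string and a trivial XOR encoding. The helpers xor_bytes, pack, and unpack are hypothetical names, and real packers are much more complex. It shows that the same content encoded with two different keys yields two different SHA-256 hashes, even though both decode to identical bytes:

```python
import hashlib
import os
from typing import Optional


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of `data` with a repeating `key` (the operation is its own inverse)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def pack(payload: bytes, key: Optional[bytes] = None) -> bytes:
    """Prefix a 4-byte key to the XOR-encoded payload (the key must be exactly 4 bytes)."""
    key = key if key is not None else os.urandom(4)
    return key + xor_bytes(payload, key)


def unpack(blob: bytes) -> bytes:
    """Recover the original payload from a packed blob."""
    return xor_bytes(blob[4:], blob[:4])


payload = b"harmless placeholder standing in for real code"
variants = [pack(payload, key) for key in (b"\x01\x02\x03\x04", b"\xaa\xbb\xcc\xdd")]

for blob in variants:
    # The hash differs per variant, while the decoded content is always the same.
    print(hashlib.sha256(blob).hexdigest()[:16], unpack(blob) == payload)
```

This is why the detection techniques discussed in this article look at structure and behavior rather than relying only on exact file hashes.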
Delivery. Once an exploit has been developed, the next step is to deliver the malware to the target system in order to perform a zero-day attack. Depending on the target and the type of exploit used, malware can either be delivered via the network automatically or may require user interaction for activation. In the latter case, social engineering is often used, such as phishing and spam emails to prompt users to download and launch the malware.
A lot of modern malware also has protection mechanisms that prevent it from being detected when used multiple times, such as obfuscation and other means to slightly change the attack signature during each consecutive attack.
Such precautions – usually employed by perpetrators when developing malware – coupled with the fact that said malware is targeting an unknown vulnerability and doesn’t have a known attack signature to begin with, make this kind of malware very hard to detect.
However, different strategies for zero-day exploit detection have been developed, each using a different technique with its own set of pros and cons.
Techniques for detecting zero-day exploits
Statistics-based detection techniques rely on data about previously detected exploits inside a particular system. Statistics-based detection solutions often employ machine learning to aggregate statistical data on past exploits and determine a baseline for safe system behavior.
The main advantage of such solutions is that the more data they have, the more accurate they become. As a statistics-based solution runs within a system, it gathers more information about new zero-day exploits, thus expanding its dataset and producing a more sophisticated profile for a potential new exploit.
However, depending on the baseline chosen, such a solution may also produce a high number of false positives and false negatives. It may be hard for developers to find the right balance with the baseline since on the one hand false negatives need to be avoided in order not to miss a zero-day attack, while on the other hand the number of false positives needs to be minimized to avoid impacting the daily operations of the company.
Overall, the effectiveness of statistics-based techniques for zero-day exploit detection is limited. They also have limited capabilities for detecting malware with heavily encrypted and obfuscated code. However, statistics-based techniques can work well as part of a hybrid solution, which we’ll cover further down.
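As a rough illustration of the baseline trade-off described above, here is a minimal sketch that flags values deviating too far from historical data using a z-score. The metric (outbound connections per minute), the sample numbers, and the function names fit_baseline and is_anomalous are all assumptions made for this example. A real solution would use far richer models and data:

```python
from statistics import mean, stdev


def fit_baseline(samples):
    """Summarize historical observations as (mean, standard deviation).

    Needs at least two samples, because statistics.stdev requires them.
    """
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag a value whose distance from the mean exceeds `threshold` deviations."""
    mu, sigma = baseline
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical history: outbound connections per minute on one endpoint.
history = [12, 15, 11, 14, 13, 16, 12, 14]
baseline = fit_baseline(history)

for value in (14, 17, 90):
    strict = is_anomalous(value, baseline, threshold=2.0)
    relaxed = is_anomalous(value, baseline, threshold=3.0)
    print(value, "strict:", strict, "relaxed:", relaxed)
```

With this data, a value of 17 is flagged under the stricter threshold of 2.0 but not under 3.0. Lowering the threshold catches more real incidents at the cost of more false alarms, and raising it does the opposite.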
Signature-based detection techniques are usually employed for malware detection by legacy antivirus software. As the name implies, the technique relies on existing databases of malware signatures, which are used as a reference when scanning a system for viruses. Although signature databases are usually updated very quickly, they cannot be used to detect new zero-day attacks since, by definition, zero-day exploits don’t have a known signature.
Thus, the only way to use signature-based detection for defense against zero-day attacks is to use machine learning and similar algorithms to generate signatures in real time that might match currently unknown malware and thus make it detectable. There are three types of signatures that can be generated this way:
- Content-based – a signature based on components typically present in most exploits (such as certain parts of code)
- Semantic-based – a signature based on typical actions taken by malware
- Vulnerability-based – a signature based on establishing the conditions for a vulnerability and how easily achievable they are; vulnerability-based signatures usually use data on known vulnerabilities to establish a baseline, and therefore the accuracy of the baseline is determined by the size of the data pool.
Overall, the ability to quickly generate accurate signatures that correspond to real-life malware is what determines whether a signature-based approach actually works for detecting zero-day exploits.
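To show what generating a content-based signature can look like in its simplest form, here is a sketch that keeps the byte sequences (n-grams) shared by all known-malicious samples and absent from known-benign ones. This is a deliberately naive approach chosen for illustration, not the method used by any particular product. The samples are harmless text strings, and ngrams, generate_signature, and matches are hypothetical names:

```python
def ngrams(data: bytes, n: int) -> set:
    """Return every contiguous n-byte sequence found in `data`."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def generate_signature(malicious, benign, n: int = 6) -> set:
    """Keep n-grams present in ALL malicious samples and in NO benign sample.

    `malicious` must contain at least one sample.
    """
    common = set.intersection(*(ngrams(sample, n) for sample in malicious))
    for sample in benign:
        common -= ngrams(sample, n)
    return common


def matches(sample: bytes, signature: set) -> bool:
    """A sample matches if it contains at least one n-gram from the signature."""
    return any(gram in sample for gram in signature)


# Harmless stand-ins for binary samples.
malicious = [
    b"AAAA-open-socket-send-keys-BBBB",
    b"xx-open-socket-send-keys-yy",
    b"12345-open-socket-send-keys",
]
benign = [b"log-open-socket-for-update"]

signature = generate_signature(malicious, benign)
print(matches(b"zz-send-keys-now", signature))      # shares "send-k" with the malicious set
print(matches(b"open-socket-for-sync", signature))  # only shares n-grams that benign data removed
```

Subtracting the benign n-grams is what keeps the signature from firing on ordinary software. The quality of the result depends heavily on how representative both sample sets are.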
Behavior-based detection techniques look for characteristics of malware based on the way it interacts with the target system. This means that a solution using a behavior-based technique doesn’t examine the code of incoming files, but instead looks at the interactions they have with existing software and tries to predict whether this is the result of any malicious actions.
Machine learning is often used to establish baseline behavior based on data of past and current interactions within the system. As with statistics-based detection techniques, the more data is available, the more reliable the detection becomes. A behavior-based detection system that works on a single target system for a long time may prove very effective in predicting results of current processes and actually detecting malicious software.
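A heavily simplified version of this idea can be sketched without any machine learning at all: count how often each (process, action) pair has been observed and treat rarely seen pairs as suspicious. The class name BehaviorBaseline, the event format, and the process names below are assumptions made for this example:

```python
from collections import Counter


class BehaviorBaseline:
    """Counts observed (process, action) pairs and reports rarely seen ones."""

    def __init__(self, min_count: int = 2):
        self.counts = Counter()
        self.min_count = min_count

    def learn(self, events):
        """Add a batch of (process, action) tuples to the baseline."""
        self.counts.update(events)

    def unusual(self, events):
        """Return the events seen fewer than `min_count` times so far."""
        return [event for event in events if self.counts[event] < self.min_count]


baseline = BehaviorBaseline()
# Hypothetical history collected on one endpoint.
baseline.learn([("winword.exe", "open_document")] * 50)
baseline.learn([("winword.exe", "print")] * 5)
baseline.learn([("chrome.exe", "network_connect")] * 80)

incoming = [
    ("winword.exe", "open_document"),
    ("winword.exe", "spawn_powershell"),
]
print(baseline.unusual(incoming))
```

Here only the second incoming event is reported, because a word processor launching a shell has never been observed on this endpoint. The longer the baseline is fed with data from the same system, the fewer legitimate actions end up in the unusual list.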
Hybrid detection techniques are aimed at taking advantage of the different strengths of the three techniques mentioned above while at the same time avoiding their weaknesses. Hybrid detection solutions usually combine two or three techniques in a way that allows them to produce more accurate results.
For example, a statistics-based algorithm can be used to reinforce a behavior-based baseline for normal behavior and to speed up the learning process, while a signature-based approach can be used to filter false positives, increasing the accuracy of detection.
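One possible way to wire such a combination together is sketched below. The weights, the threshold, the Observation fields, and the idea of using signatures of known-good software as the false-positive filter are all assumptions made for this example. Real hybrid products combine their detectors in ways that are not described here:

```python
from dataclasses import dataclass


@dataclass
class Observation:
    name: str
    stat_score: float      # e.g. a z-score produced by a statistics-based detector
    behavior_score: float  # 0..1 suspicion level from a behavior-based detector
    known_good: bool       # matched a signature of known-benign software


def hybrid_verdict(obs: Observation, stat_weight: float = 0.4,
                   behavior_weight: float = 0.6, threshold: float = 0.5) -> bool:
    """Return True if the observation should raise an alert."""
    if obs.known_good:
        return False  # signature match filters out a likely false positive
    # Map the z-score to the 0..1 range so both scores are comparable.
    stat_component = max(0.0, min(obs.stat_score / 3.0, 1.0))
    combined = stat_weight * stat_component + behavior_weight * obs.behavior_score
    return combined >= threshold


observations = [
    Observation("backup-agent", stat_score=4.0, behavior_score=0.7, known_good=True),
    Observation("unknown-binary", stat_score=4.0, behavior_score=0.7, known_good=False),
    Observation("text-editor", stat_score=0.5, behavior_score=0.1, known_good=False),
]
for obs in observations:
    print(obs.name, hybrid_verdict(obs))
```

In this sketch the backup agent and the unknown binary look equally suspicious to the statistical and behavioral detectors, and only the signature check tells them apart.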
Due to their effectiveness, the popularity of hybrid solutions is constantly rising. One example of a cybersecurity vendor that has moved away from signature-based detection and offers a solution based on hybrid detection techniques is the previously mentioned Carbon Black. Their CB Defense product has received praise from cybersecurity experts for its ability to detect zero-day malware, including never-before-seen ransomware like the aforementioned WannaCry, as well as non-malware attacks.
CB Defense and other solutions using hybrid detection techniques almost always produce more accurate results than any single approach; however, they are also the hardest and most expensive to develop. With the advent of machine learning, it’s possible to create fairly accurate zero-day exploit detection solutions, but a team of engineers with extensive experience in cybersecurity and a good grasp of low-level programming is required to develop the complex algorithms for data acquisition and processing used by such solutions.
Writing your own zero-day exploit detection solution
Generally, when you write a zero-day attack protection solution, there are three major problems that you need to solve: how to acquire data, how to parse it, and how to detect malicious software based on the data at hand.
In-depth system monitoring. The complexity of your solution is directly determined by the amount of data you are able to acquire. To be truly effective in detecting modern zero-day attacks, your solution needs to be able to monitor as many events as possible, including but not limited to all network traffic, all system processes (including hidden ones), all existing hooks, and all floating code.
It is also important to detect and record the relationships between different sets of acquired data. This makes it possible to process events in streams rather than individually, giving the behavior analysis algorithm a full picture of everything that happens on any given endpoint. Sometimes the data is hidden or encrypted, and acquiring it can be quite a challenge. Developing a set of technologies capable of monitoring all the necessary data and seamlessly integrating them into a single solution requires a significant investment of time and effort from a dedicated development team.
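As an example of what recording relationships between events can mean in practice, the sketch below groups raw endpoint events into streams by process lineage, so that a child process is analyzed together with the process that launched it. The event format (dictionaries with ts, pid, ppid, and action keys), the sample events, and the function name build_streams are assumptions made for this example:

```python
from collections import defaultdict


def build_streams(events):
    """Group events into streams keyed by the top-most recorded ancestor process."""
    parent = {}
    for event in events:
        if event["action"] == "process_start":
            parent[event["pid"]] = event["ppid"]

    def root(pid):
        # Walk up the process tree while the parent's own start was recorded.
        seen = {pid}
        while pid in parent and parent[pid] in parent and parent[pid] not in seen:
            pid = parent[pid]
            seen.add(pid)
        return pid

    streams = defaultdict(list)
    for event in sorted(events, key=lambda e: e["ts"]):
        streams[root(event["pid"])].append(event)
    return dict(streams)


# Hypothetical events from one endpoint.
events = [
    {"ts": 1, "pid": 100, "ppid": 1, "action": "process_start", "detail": "winword.exe"},
    {"ts": 2, "pid": 100, "action": "file_open", "detail": "invoice.docx"},
    {"ts": 3, "pid": 200, "ppid": 1, "action": "process_start", "detail": "chrome.exe"},
    {"ts": 4, "pid": 101, "ppid": 100, "action": "process_start", "detail": "powershell.exe"},
    {"ts": 5, "pid": 101, "action": "network_connect", "detail": "203.0.113.5"},
    {"ts": 6, "pid": 200, "action": "network_connect", "detail": "example.com"},
]

for root_pid, stream in build_streams(events).items():
    print(root_pid, [f'{e["action"]}:{e["detail"]}' for e in stream])
```

Looked at individually, none of these events is alarming. Seen as one stream, "document opened, shell started, outbound connection made" is a pattern worth passing to the behavior analysis stage.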
Creating a baseline with behavior analysis. Behavior analysis algorithms process all incoming monitoring data in real time, along with any available previous records, in order to establish a baseline of normal behavior. Bigger datasets make it possible to establish a more accurate baseline, which, in turn, makes it possible to detect deviations from that baseline more accurately. In order to predict future events, the behavior analysis algorithm needs to be able to process all events in their proper context as a single unified stream, rather than individually. Focusing on event streams makes it possible to produce a baseline that accounts for both malware-based and non-malware attacks, but this is also one of the most expensive and time-consuming parts of the project.
Detecting an exploit. Behavior analysis is used to establish and constantly tweak the baseline against which you compare all incoming results of your ongoing in-depth system monitoring. The key here is to configure the system correctly to limit the number of false positives and false negatives. Both a good behavior analysis algorithm and access to a good initial dataset are crucial for success. Once an active zero-day attack has been detected, you need to block it immediately and start remediating the affected endpoint.
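One simple way to approach this configuration step is to evaluate candidate alert thresholds against labeled historical data and pick the one with the lowest total cost. The sketch below assumes that each past event already has a suspicion score between 0 and 1 and a label saying whether it was a real incident. The scores, labels, cost values, and function names are all made up for this example:

```python
def confusion_counts(scores, labels, threshold):
    """Count (false positives, false negatives) for a given alert threshold."""
    false_pos = sum(1 for s, is_attack in zip(scores, labels)
                    if s >= threshold and not is_attack)
    false_neg = sum(1 for s, is_attack in zip(scores, labels)
                    if s < threshold and is_attack)
    return false_pos, false_neg


def pick_threshold(scores, labels, candidates, fn_cost=10.0, fp_cost=1.0):
    """Choose the candidate threshold with the lowest weighted error cost."""
    def cost(threshold):
        false_pos, false_neg = confusion_counts(scores, labels, threshold)
        return fn_cost * false_neg + fp_cost * false_pos
    return min(candidates, key=cost)


# Hypothetical labeled history: suspicion score and whether it was a real attack.
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.70, 0.90]
labels = [False, False, False, False, True, False, True, True]
candidates = [0.3, 0.5, 0.8]

for threshold in candidates:
    print(threshold, confusion_counts(scores, labels, threshold))
print("chosen:", pick_threshold(scores, labels, candidates))
```

The cost weights encode a business decision: if a missed attack is considered ten times worse than a false alarm, a low threshold wins, while an organization that cannot afford interruptions may weight false positives more heavily and end up with a higher one.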
Overall, zero-day attack prevention and detection are extremely difficult problems, but there’s no denying the high demand for solutions in these areas. The practice of issuing monetary rewards for reported exploits has become nearly universal, and the amount of money offered is constantly going up.
Thus, if you decide to invest in a cybersecurity solution, zero-day exploit detection is a great area to focus on. And if you’re looking for an experienced development team to help you, check out Apriorit: we have a team ready for you right now!