Solution provider takeaway: Intrusion detection offers a rich source of packet patterns that can benefit from network
algorithmics. This section of our chapter excerpt from the book "Network Security: Know It All" describes the growing role that network algorithms occupies in the intrusion detection field and details vendors' efforts in real-time intrusion detection systems.
From denial-of-service to Smurf attacks, hackers that perpetrate exploits have captured both the imagination of the public and the ire of victims. There is some reason for indignation and ire. A survey by the Computer Security Institute placed the cost of computer intrusions at an average of $970,000 per company in 2000.
Thus there is a growing market for intrusion detection, a field that consists of detecting and reacting to attacks. According to IDC, the intrusion-detection market grew from $20 million to $100 million between 1997 and 1999 and is expected to reach $518 million by 2005.
Yet the capabilities of current intrusion detection systems are widely accepted as inadequate, particularly in the context of growing threats and capabilities. Two key problems with current systems are that they are slow and that they have a high false-positive rate. As a result of these deficiencies, intrusion detection serves primarily as a monitoring and audit function rather than as a real-time component of a protection architecture on par with firewalls and encryption.
However, many vendors are working to introduce real-time intrusion detection systems. If intrusion detection systems can work in real time with only a small fraction of false positives, they can actually be used to respond to attacks by either deflecting the attack or tracing the perpetrators.
Intrusion detection systems (IDSs) have been studied in many forms since Denning's classic statistical analysis of host intrusions. Today, IDS techniques are usually classified as either signature detection or anomaly detection. Signature detection is based on matching events to the signatures of known attacks.
In contrast, anomaly detection, based on statistical or learning theory techniques, identifies aberrant events, whether known to be malicious or not. As a result, anomaly detection can potentially detect new types of attacks that signature-based systems will miss. Unfortunately, anomaly detection systems are prone to falsely identifying events as malicious. Thus this chapter does not address anomaly-based methods.
Meanwhile signature-based systems are highly popular due to their relatively simple implementation and their ability to detect commonly used attack tools.
The lightweight detection system Snort is one of the more popular examples because of its free availability and efficiency.
Given the growing importance of real-time intrusion detection, intrusion detection furnishes a rich source of packet patterns that can benefit from network algorithmics. Thus this chapter samples three important subtasks that arise in the context of intrusion detection. The first is an analysis subtask, string matching, which is a key bottleneck in popular signature-based systems such as Snort. The second is a response subtask, traceback, which is of growing importance given the ability of intruders to use forged source addresses. The third is an analysis sub-task to detect the onset of a new worm (e.g., Code Red) without prior knowledge.
These three subtasks only scratch the surface of a vast area that needs to be explored. They were chosen to provide an indication of the richness of the problem space and to outline some potentially powerful tools, such as Bloom filters and Aho-Corasick trees, that may be useful in more general contexts. Worm detection was also chosen to showcase how mechanisms can be combined in powerful ways.
This chapter is organized as follows. The first few sections explore solutions to the important problem of searching for suspicious strings in packet payloads. Current implementations of intrusion detection systems such as Snort (www.snort.org) do multiple passes through the packet to search for each string. Section 4.1.1 describes the Aho-Corasick algorithm for searching for multiple strings in one pass using a trie with backpointers. Section 4.1.2 describes a generalization of the classical Boyer--Moore algorithm, which can sometimes act faster by skipping more bits in a packet.
Section 4.2 shows how to approach an even harder problem -- searching for approximate string matches. The section introduces two powerful ideas: minwise hashing and random projections. This section suggests that even complex tasks such as approximate string matching can plausibly be implemented at wire speeds.
Section 4.3 marks a transition to the problem of responding to an attack, by introducing the IP traceback problem. It also presents a seminal solution using probabilistic packet marking. Section 4.4 offers a second solution, which uses packet logs and no packet modifications; the logs are implemented efficiently using an important technique called a Bloom filter. While these traceback solutions are unlikely to become deployed when compared to more recent standards, they introduce a significant problem and invoke important techniques that could be useful in other contexts.
Section 4.5 explains how algorithmic techniques can be used to extract automatically the strings used by intrusion detection systems such as Snort. In other words, instead of having these strings be installed manually by security analysts, could a system automatically extract the suspicious strings? We ground the discussion in the context of detecting worm attack payloads.
The implementation techniques for security primitives described in this chapter (and the corresponding principles) are summarized in Figure 4.1.
Principles used in the implementation of the various security primitives discussed in this chapter.
Quick Reference Guide
Sections 4.1.1 and 4.1.2 show how to speed up searching for multiple strings in packet payloads, a fundamental operation for a signature-based IDS. The Aho-Corasick algorithm of Section 4.1.1 can easily be implemented in hardware. While the traceback ideas in Section 4.4 are unlikely to be useful in the near future, the section introduces an important data structure, called a Bloom filter, for representing sets and also describes a hardware implementation. Bloom filters have found a variety of uses and should be part of the implementor's bag of tricks. Section 4.5 explains how signatures for attacks can be automatically computed, reducing the delay and difficulty required to have humans generate signatures.
Network Security Algorithms
4.1 Searching for multiple strings in packet payloads
4.2 Approximate string matching
4.3 IP traceback via probabilistic marking
4.4 IP traceback via logging
4.5 Detecting worms
About the book
The chapters contain compiled knowledge from recognized industry experts on topics such as the theory and practice of network security technology, analysis and problem-solving techniques and core security concepts.
Printed with permission from Morgan Kaufmann, a division of Elsevier. Copyright 2008. Network Security: Know It All by James Joshi. For more information about this title and other similar books, please visit www.elsevierdirect.com.