4.4 IP TRACEBACK VIA LOGGING
Solution provider takeaway: Learn how to respond to an intrusion on a client's network by using packet logs. This section of our chapter excerpt from the book "Network Security: Know It All" explains why you should add more storage to routers and opt for Bloom filters.
A problem with the edge-sampling approach of the previous section is that it requires changes to the IP header to update marks and does not work for single-packet attacks like the Teardrop attack. The following approach, traceback via logging, avoids both problems by adding more storage at routers to maintain a compressed packet log.
As motivations, neither of the difficulties the logging approach gets around is very compelling. The logging approach still requires modifying router forwarding, even though it requires no header modification; and the real obstacle to any such change is convincing vendors (who have already committed forwarding paths to silicon) and ISPs (who wish to keep equipment in service for, say, 5 years) to make changes at all. Similarly, single-packet attacks are not very common and can often be filtered directly by routers.
However, the idea of maintaining compressed, searchable packet logs may be useful as a general building block. More generally, it could be used by, say, a network monitor that wishes to maintain such logs for forensics after attacks. But even more importantly, it introduces a widely useful technique called Bloom filters.
Given an efficient packet log at each router, the high-level idea for traceback is shown in Figure 4.7. The victim V first detects an attack packet P; it then queries all its neighboring routers, say, R8 and R9, to see whether any of them have P in their log of recently sent packets. When R9 replies in the affirmative, the search moves on to R9, who asks its sole neighbor, R7. Then R7 asks its neighbors R5 and R4, and the search moves backward to A.
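The backward search just described can be sketched in a few lines of Python. The topology, the digest value, and the `has_logged` query interface below are all hypothetical, for illustration only:

```python
def trace_attack_path(victim, packet_digest, neighbors, has_logged):
    """Walk backward from the victim toward the attacker.

    neighbors:  dict mapping each node to the routers upstream of it
    has_logged: function (router, digest) -> bool, querying that router's
                compressed packet log (an assumed interface)
    """
    path = [victim]
    current = victim
    visited = {victim}
    while True:
        # Ask every neighbor of the current node whether it logged the packet.
        hits = [r for r in neighbors.get(current, [])
                if r not in visited and has_logged(r, packet_digest)]
        if not hits:
            return path          # no neighbor saw it: the search ends here
        current = hits[0]        # follow one affirmative answer backward
        visited.add(current)
        path.append(current)

# Toy topology modeled on Figure 4.7: V <- R9 <- R7 <- R5 <- A
topology = {"V": ["R8", "R9"], "R9": ["R7"], "R7": ["R5", "R4"], "R5": ["A"]}
logged = {("R9", 42), ("R7", 42), ("R5", 42), ("A", 42)}
print(trace_attack_path("V", 42, topology, lambda r, d: (r, d) in logged))
# prints ['V', 'R9', 'R7', 'R5', 'A']
```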
The simplest way to implement a log is to reuse one of the techniques in trajectory sampling. Instead of logging a packet we log a 32-bit hash of invariant content (i.e., exclude fields that change from hop to hop, such as the TTL) of the packet. However, 32 bits per packet for all the packets sent in the last 10 minutes is still huge at 10 Gbps. Bloom filters, described next, allow a large reduction to around 5 bits per packet.
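As a rough sketch of such a digest (the exact choice of invariant fields in real traceback systems differs; the fields and packing below are illustrative):

```python
import hashlib
import struct

def packet_digest(src, dst, proto, ident, payload):
    """Hash only invariant packet content: fields such as TTL and the
    header checksum, which change hop by hop, are deliberately excluded.
    (The field selection here is illustrative, not SPIE's exact mask.)"""
    invariant = struct.pack("!4s4sBH", src, dst, proto, ident) + payload
    # Truncate a strong hash to 32 bits, as the text suggests.
    return int.from_bytes(hashlib.sha256(invariant).digest()[:4], "big")

d1 = packet_digest(b"\x0a\x00\x00\x01", b"\x0a\x00\x00\x02", 6, 100, b"attack")
d2 = packet_digest(b"\x0a\x00\x00\x01", b"\x0a\x00\x00\x02", 6, 100, b"attack")
assert d1 == d2            # same invariant content -> same 32-bit digest
assert 0 <= d1 < 2**32
```

Because the TTL and checksum are excluded, every router along the path computes the same digest for the same packet, which is what makes the neighbor-by-neighbor query meaningful.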
4.4.1 Bloom Filters
Start by observing that querying either a packet log or a table of allowed users is a set membership query, which is easily implemented by a hash table. For example, in a different security context, if John and Cathy are allowed users and we wish to check whether Jonas is an allowed user, we can use a hash table that stores John's and Cathy's IDs but not Jonas's.
Figure 4.7: Using a packet log to trace an attack packet P backwards from the victim V to the attacker A by having the currently traced node ask all its neighbors (the dotted lines) if they have seen P (solid line).
To handle collisions, each hash table entry must contain a list of the IDs of all users that hash into that bucket; checking for Jonas then means hashing Jonas's ID into the table and searching the list at that entry. This requires at least W bits per allowed user, where W is the length of each user ID. In general, implementing a hash table for a set of identifiers requires at least W bits per identifier, where W is the length of the smallest identifier.
Bloom filters, shown in Figure 4.8, allow one to reduce the amount of memory for set membership to a few bits per set element. The idea is to keep a bitmap of size, say, 5 N, where N is the number of set elements. Before elements are inserted, all bits in the bitmap are cleared.
For each element in the set, its ID is hashed using k independent hash functions (two in Figure 4.8, H1 and H2) to determine bit positions in the bitmap to set. Thus in the case of a set of valid users in Figure 4.8, ID John hashes into the second and next-to-last bit positions. ID Cathy hashes into one position in the middle and also into one of John's positions. If two IDs hash to the same position, the bit remains set.
Figure 4.8: A Bloom filter represents a set element by setting k bits in a bitmap using k independent hash functions applied to the element. Thus the element John sets the second (using H1) and next-to-last (using H2) bits. When searching for Jonas, Jonas is considered a member of the set only if all bit positions hashed to by Jonas have set bits.

Finally, when searching to see whether a specified element (say, Jonas) is in the set, Jonas is hashed using all k hash functions. Jonas is assumed to be in the set if all the bits hashed into by Jonas are set. Of course, there is some chance that Jonas may hash into a position already set by, say, Cathy and another set by John (see Figure 4.8). Thus there is a chance of what is called a false positive: answering the membership query positively when the member is not in the set.
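The insert and query operations just described can be sketched as follows. This is a minimal illustration; deriving the k positions from one salted strong hash is a common stand-in for the truly independent hash functions the text assumes:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter sketch: an m-bit bitmap and k hash functions."""
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = bytearray(m)   # one byte per bit, for clarity only

    def _positions(self, item):
        # Derive k pseudo-independent positions from salted SHA-256 hashes.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits[p] = 1       # setting an already-set bit is a no-op

    def __contains__(self, item):
        # Member only if ALL k hashed positions are set; a clear bit
        # anywhere proves the item was never inserted (no false negatives).
        return all(self.bits[p] for p in self._positions(item))

users = BloomFilter(m=64, k=3)
users.add("John")
users.add("Cathy")
assert "John" in users and "Cathy" in users
# "Jonas" is *probably* reported absent, but a false positive is possible.
```

Note the asymmetry: a Bloom filter can return a false positive but never a false negative, since inserting an element always sets all k of its bit positions.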
Notice that the trick that makes Bloom filters possible is relaxing the specification (P3). A normal hash table, which requires W bits per ID, does not make errors! Reducing to 5 bits per ID requires allowing errors; however, the percentage of errors is small. In particular, if there is an attack tree and set elements are hashed packet values, as in Figure 4.7, false positives mean only occasionally barking up the wrong tree branch(es). More precisely, the false-positive rate for an m-size bitmap to store n members using k hash functions is
(1 − (1 − 1/m)^{kn})^k ≈ (1 − e^{−kn/m})^k
The equation is not as complicated as it may appear: (1 − 1/m)^{kn} is the probability that a given bit is not set after n elements have each been hashed k times into any of m bit positions; thus 1 − (1 − 1/m)^{kn} is the probability that a given bit is set. Finally, to get a false positive, all k bit positions hashed onto by the ID that causes the false positive must be set, giving the kth power.
Using this equation, it is easy to see that for k = 3 (three independent hash functions) and 5 bits per member (m/n = 5), the false-positive rate is roughly 9%. The false-positive rate can be improved, up to a point, by using more hash functions and by increasing the bitmap size; around 10 bits per member with k = 7, for example, brings it below 1%.
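The formula is easy to evaluate numerically; a quick sketch of both the exact expression and its exponential approximation:

```python
import math

def bloom_fp_rate(k, n, m):
    """Return (exact, approximate) Bloom filter false-positive rates:
    (1 - (1 - 1/m)^{kn})^k  and  (1 - e^{-kn/m})^k."""
    exact = (1 - (1 - 1/m) ** (k * n)) ** k
    approx = (1 - math.exp(-k * n / m)) ** k
    return exact, approx

# 5 bits per member (m/n = 5) with k = 3 hash functions:
exact, approx = bloom_fp_rate(k=3, n=1000, m=5000)
print(f"{approx:.3f}")    # prints 0.092

# About 10 bits per member with k = 7 hash functions:
_, approx10 = bloom_fp_rate(k=7, n=1000, m=10000)
print(f"{approx10:.3f}")  # prints 0.008
```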
Figure 4.9: Hardware implementation of packet logging using Bloom filters. Note the use of two-level memory: SRAM for random read-modify-writes and DRAM for large row writes.
4.4.2 Bloom Filter Implementation of Packet Logging
The Bloom filter implementation of packet logging in the SPIE system is shown in Figure 4.9 (the picture is courtesy of Sanchez et al.). Each line card calculates a 32-bit hash digest of the packet and places it in a FIFO queue. To save costs, several line cards share, via a RAM multiplexor, a fast SRAM containing the Bloom filter bitmap.
As in the case of counters, one can combine the best features of SRAM and DRAM to reduce expense. SRAM is needed for fast random read-modify-writes to the bitmap at the front end; unfortunately, the expense of SRAM would allow storing only a small number of packets. To log a larger window of traffic, the Bloom filter bitmaps in SRAM are periodically read out to a large DRAM ring buffer. Because these are no longer random writes to individual bits, the transfers can be done in whole DRAM pages or rows, which provides sufficient memory bandwidth.
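The two-level scheme can be illustrated with a small simulation. The sizes, the flush policy, and the query interface here are arbitrary placeholders, not SPIE's actual parameters:

```python
from collections import deque

SRAM_BITS = 1024        # small, fast bitmap for the current interval
RING_PAGES = 8          # "DRAM" ring buffer holds the last 8 intervals

sram = bytearray(SRAM_BITS)
dram_ring = deque(maxlen=RING_PAGES)   # oldest page drops off automatically

def log_bit(pos):
    sram[pos % SRAM_BITS] = 1          # fast random read-modify-write in SRAM

def flush_interval():
    # One sequential page-sized write: this is what gives DRAM enough
    # bandwidth, in contrast to random single-bit writes.
    dram_ring.append(bytes(sram))
    sram[:] = bytes(SRAM_BITS)         # clear the bitmap for the next interval

def seen_recently(pos):
    # A traceback query searches the current bitmap and all archived pages.
    p = pos % SRAM_BITS
    return bool(sram[p]) or any(page[p] for page in dram_ring)

log_bit(7)
flush_interval()
log_bit(300)
assert seen_recently(7) and seen_recently(300)
assert not seen_recently(8)
```

The ring buffer bounds total memory while preserving a sliding window of recent intervals, matching the "packets sent in the last 10 minutes" goal stated earlier.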
Network Security Algorithms
4.1 Searching for multiple strings in packet payloads
4.2 Approximate string matching
4.3 IP traceback via probabilistic marking
4.4 IP traceback via logging
4.5 Detecting worms
About the book
The chapters contain compiled knowledge from recognized industry experts on topics such as the theory and practice of network security technology, analysis and problem-solving techniques, and core security concepts.
Printed with permission from Morgan Kaufmann, a division of Elsevier. Copyright 2008. Network Security: Know It All by James Joshi. For more information about this title and other similar books, please visit www.elsevierdirect.com.