Data leak prevention: Finding data before it's lost

The majority of businesses with a data classification system don't have any way to implement it. Data leak prevention products aim to solve this problem by identifying sensitive data either at rest on the host or as it traverses the network.

By Yuval Shavit, Features Writer

Data leak prevention (DLP) is a topic that's starting to gain popularity with businesses in the wake of heavily publicized data thefts, most famously the data breach at TJX in which more than 45 million credit card numbers were stolen. DLP -- also known as data loss prevention -- is a category of technology that aims to reduce intentional and unintentional leaks of sensitive data.

One of the biggest problems in planning for data leak prevention is that virtually no enterprises have a handle on how much sensitive data they have, where they have it, or where it comes from, said Nick Selby, research director of enterprise security at The 451 Group in Boston. Only about 45% of companies said they have a classification system to identify sensitive data, and nearly all of those said it's virtually impossible to enforce, Selby said. Without a good sense of which servers have data and how it's stored, it can be difficult to make sure employees don't send information to unauthorized third parties, either intentionally or unintentionally.

Data leaks are embarrassing at best, but they can also cause serious damage by exposing sensitive information about your client's customers or employees, said Mark Finegan, president of SIM2K, an Indianapolis consultancy. Data leak prevention may also be required by regulations like HIPAA or PCI-DSS. In addition to personal information, data leaks can expose information about your client's business operations that could give competitors an advantage, Finegan said.

Going through a company's data manually to flag sensitive documents is impossible, Finegan said. When a single page of text takes up about 2 KB, sorting through the gigabytes or terabytes a typical business has isn't feasible, he said.
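To put Finegan's point in numbers, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 GB = 10^9 bytes) and takes the 2 KB-per-page figure above at face value. The review rate of 480 pages per day is a purely hypothetical assumption for illustration, not a figure from anyone quoted here.

```python
# Rough estimate of how many pages of text a data store holds.
# Assumes 2 KB (2,000 bytes) per page, per the figure quoted above.
BYTES_PER_PAGE = 2_000

# Hypothetical assumption: one reviewer reads one page per minute
# over an 8-hour day (480 pages per day).
PAGES_PER_REVIEWER_DAY = 480


def pages_in(num_bytes: int) -> int:
    """Return the approximate number of text pages in num_bytes."""
    return num_bytes // BYTES_PER_PAGE


def reviewer_days(num_bytes: int) -> float:
    """Return reviewer-days needed to read num_bytes of text manually."""
    return pages_in(num_bytes) / PAGES_PER_REVIEWER_DAY


if __name__ == "__main__":
    for label, size in (("1 GB", 10**9), ("1 TB", 10**12)):
        print(f"{label}: {pages_in(size):,} pages, "
              f"{reviewer_days(size):,.0f} reviewer-days")
```

Under those assumptions, a single gigabyte of text works out to about 500,000 pages, and a terabyte to about 500 million.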

Data leak prevention software can scan data to look for patterns that suggest sensitive data, like credit card information or Social Security numbers. There are two approaches to scanning data: network-based and host-based. Network-based software examines traffic on your client's LAN as it is sent, either internally or outside the firewall. Host-based software resides on a server, desktop or laptop and scans that computer's data.
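As a rough illustration of what pattern-based scanning means in practice, the sketch below looks for strings shaped like Social Security numbers and credit card numbers in a block of text. It is a minimal example under stated assumptions, not how any particular DLP product works: the regular expressions, the function names and the use of the Luhn checksum to weed out random digit runs are all choices made for this illustration.

```python
import re

# Hypothetical patterns for illustration; commercial products use far
# richer detection rules than these two regular expressions.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan_text(text: str) -> list[tuple[str, str]]:
    """Return (kind, matched_text) pairs for suspected sensitive data."""
    findings = []
    for match in SSN_PATTERN.finditer(text):
        findings.append(("ssn", match.group()))
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        # The checksum cuts down on false alarms from order numbers
        # and other long digit strings.
        if luhn_valid(digits):
            findings.append(("card", match.group()))
    return findings


if __name__ == "__main__":
    sample = "SSN 123-45-6789 and card 4111 1111 1111 1111"
    print(scan_text(sample))
```

The same matching logic could in principle sit in either place described above: a host-based agent would run it over files on disk, while a network-based appliance would run it over traffic as it passes.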

Each approach to data leak prevention has its pros and cons, Selby said. Advocates of host-based software argue that if you can mark sensitive data as it's used by installing an agent on employees' computers, you can nip the problem in the bud. Endpoint host-based software also works when a laptop isn't on the corporate LAN. But it's very difficult to create a good agent that also plays well with all the variants of Windows and other data surveillance rootkits, like antivirus software, Selby said.

This technical hurdle has prompted established anti-malware vendors to get into DLP, he said. Symantec bought DLP vendor Vontu in late 2007, Trend Micro announced plans to acquire Provilla around the same time, and McAfee developed its own product after buying Onigma in late 2006.

Aside from data surveillance rootkits, host-based software can also spider through a computer's data. This approach works for data on servers; RSA's purchase in 2007 of Tablus, a vendor whose DLP software spiders over servers, wasn't surprising given that RSA is owned by storage vendor EMC, Selby said. But spidering software can be slow -- if the agent isn't installed on the server itself, the spider has to pull the data across the LAN, often in plain text, Selby said.

In contrast, network-based data leak prevention -- which is often delivered as an appliance that sits on the LAN -- can monitor data as it travels, which advocates argue is when the leakage actually occurs. Taking this approach lets an administrator look for red flags without disrupting employees who trigger false alarms. The downside of network-based DLP is that it can't work if an employee uses a laptop off the corporate LAN -- at a conference, for instance, or while working from home.

The data leak prevention market is still very young, Selby said. The 451 Group estimates that the entire industry was worth about $80 million in 2007, but hype and investment in the technology surpassed that, Selby said. The technology itself is still young, and companies that do use it have lightweight deployments that don't penetrate deep into the IT infrastructure. But more companies are starting to try the software and learn what it can and can't do, and interest is growing, he said.

For now at least, the problem of what the software can and can't do is a major consideration, and the channel's role involves helping clients learn the products' limitations. In the next installment of our Hot Spot Tutorial on data leak prevention, we'll look at how leaks occur and how you can work with clients to stop them.
