What represents the space that exists between the end of the file and the end of the last cluster used by that file?

A lawyer’s technical understanding of how digital information is stored makes a difference in e-discovery and in how well that lawyer serves clients.

As e-discovery has become a major focus of modern litigation, it is important for lawyers without strong technology backgrounds to familiarize themselves with at least basic computer storage concepts. This blog post addresses some of the key points regarding deleted files and unallocated space, and how these concepts come into play in e-discovery.

Unallocated space, also referred to as “free space,” is the area on a hard drive where new files can be stored. Conversely, allocated space is the area on a hard drive where files already reside. Think of “allocated” storage space as already filled with data and not to be overwritten with other newer data, while “unallocated” space is available to store new data even though it may contain old data which would be overwritten by new data. While this may sound simple enough, to fully understand the properties of unallocated space, it is necessary to understand how files are stored on a computer.

Computer files are created in binary code (1s and 0s). A computer’s operating system reads a file by processing this series of 1s and 0s. When a user saves a file on a hard drive, it is stored using a file system that tracks the physical location of files in allocated space. These physical storage locations are called “sectors.” A sector is designed to hold 512 bytes of data. Depending on the type of encoding used, one character (e.g. the letter “C”) can take between one and four bytes, meaning that a sector will typically hold between 128 and 512 characters. Sequential groups of sectors (usually four or eight) are called “clusters.” On any physical storage device, there is a finite number of sectors, making them a scarce resource (even though hard drives often have hundreds of millions of storage clusters).
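The arithmetic in this paragraph is easy to check. Here is a minimal Python sketch, assuming the 512-byte sector and the four- or eight-sector cluster described above; the function names are made up for illustration and do not come from any real tool.

```python
SECTOR_BYTES = 512  # sector size stated in the text


def chars_per_sector(bytes_per_char: int) -> int:
    """How many characters of a fixed byte width fit in one sector."""
    return SECTOR_BYTES // bytes_per_char


def cluster_bytes(sectors_per_cluster: int) -> int:
    """Size of a cluster made of the given number of sectors."""
    return SECTOR_BYTES * sectors_per_cluster


# The same letter takes a different number of bytes depending on the encoding.
for encoding in ("utf-8", "utf-16-le", "utf-32-le"):
    width = len("C".encode(encoding))
    print(f"{encoding}: {width} byte(s) per 'C', "
          f"{chars_per_sector(width)} such characters per sector")

for sectors in (4, 8):
    print(f"{sectors} sectors per cluster = {cluster_bytes(sectors)} bytes")
```

With one byte per character a sector holds 512 characters, and with four bytes per character it holds 128, which matches the range given above. An eight-sector cluster works out to 4,096 bytes.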

Since a hard drive has a limited amount of space, the file system tracks which clusters are in use and which ones are not. The file system does this by labeling a cluster with a “1” or a “0” value. “1” means that the space inside the cluster is being used to store all or part of a file (depending on the file’s size). If the file is deleted, the file system labels the cluster as “0.” The change in the label does not cause the system to overwrite the data. Rather, the change signals to the file system that the space is available to store a new file. When a new file is stored, it will overwrite data on unallocated clusters and label these clusters with a “1.”

For example, assume that a user wants to delete a Word document which we will call essay.doc. Assume that essay.doc is stored in multiple clusters. The file system would label the clusters associated with essay.doc as “1” since the clusters are being used to store the contents of essay.doc. If a user deletes essay.doc, the file system will label all the clusters where essay.doc was once stored as “0.” Since these clusters are now labeled “0,” the file system knows they are part of the unallocated storage space and available to store the contents of another file.

At this stage, essay.doc is considered deleted by the file system, but is still “recoverable,” because some or all of the information that comprised essay.doc has not yet been overwritten. When the clusters on which the contents of essay.doc are stored get reused, however, the contents of the new file replace the contents of essay.doc. At this point, the overwritten content is considered “unrecoverable,” even though markers showing that the overwritten file once existed may remain. As this process continues, the file system will eventually reuse all of the clusters from essay.doc for other files, overwriting the content of the original essay.doc and making it unrecoverable.
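The essay.doc walkthrough can be reproduced with a toy model. The Python sketch below is only an illustration under simplifying assumptions: `ToyDisk`, its methods, and the 8-byte cluster size are invented for this example, and real file systems keep far more bookkeeping than a single list of 1s and 0s.

```python
CLUSTER_SIZE = 8  # tiny, hypothetical cluster size so the output stays readable


class ToyDisk:
    """Illustrative model of clusters plus a 1/0 'in use' label per cluster."""

    def __init__(self, cluster_count):
        self.data = [bytes(CLUSTER_SIZE) for _ in range(cluster_count)]
        self.in_use = [0] * cluster_count  # 1 = allocated, 0 = unallocated
        self.files = {}                    # file name -> list of cluster indexes

    def write(self, name, content):
        chunks = [content[i:i + CLUSTER_SIZE]
                  for i in range(0, len(content), CLUSTER_SIZE)]
        free = [i for i, flag in enumerate(self.in_use) if flag == 0]
        if len(chunks) > len(free):
            raise OSError("not enough unallocated clusters")
        chosen = free[:len(chunks)]
        for index, chunk in zip(chosen, chunks):
            # Only the bytes covered by the new chunk are overwritten.
            self.data[index] = chunk + self.data[index][len(chunk):]
            self.in_use[index] = 1
        self.files[name] = chosen

    def delete(self, name):
        for index in self.files.pop(name):
            self.in_use[index] = 0  # the label flips; the data is not touched

    def raw(self, clusters):
        return b"".join(self.data[i] for i in clusters)


disk = ToyDisk(4)
disk.write("essay.doc", b"Dear client, draft.")
essay_clusters = list(disk.files["essay.doc"])

disk.delete("essay.doc")
after_delete = disk.raw(essay_clusters)  # old text is still on the "disk"

disk.write("memo.txt", b"X" * 24)        # reuses the same three clusters
after_reuse = disk.raw(essay_clusters)   # old text has been overwritten

print(after_delete)
print(after_reuse)
```

After the delete, the essay’s bytes can still be read straight from the clusters even though the file system no longer lists the file. Once memo.txt reuses those clusters, the same read returns only the new file’s content.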

Here is why a lawyer’s technical understanding of how digital information is stored makes a difference in the context of e-discovery. Under Federal Rule of Civil Procedure 26(f)(3)(C), parties may provide their views and make proposals for a discovery plan on “any issues about disclosure or discovery of electronically stored information[.]” Thus, a party has the chance at the discovery conference to attempt to include or exclude unallocated space from litigation holds or the definition of electronically stored information (ESI) for the matter. This is a critical aspect of any discovery plan. Often, parties fail to address the issue of unallocated space at this juncture, creating uncertainty, potentially unnecessary preservation costs, and the risk of being sanctioned for failure to preserve information contained in unallocated space.

In Genger v. TR Investors, [No. 592, 2010 (Del. Supr. July 18, 2011)] for example, a Delaware court defined unallocated space as a “reservoir of data.” Unallocated space, however, cannot be properly viewed as an accessible storage area or “data reservoir” that can be intentionally managed by a computer user. Rather, unallocated space is a pool of storage resources that the file system “intends” to reuse as needed and may not contain any coherent data at all. However, under the definition of unallocated space the Genger Court adopted, significant amounts of data residing in unallocated space were required to be preserved and collected. The Court found that one of the parties failed to preserve unallocated space on the devices at issue, which, combined with other actions, constituted sufficient grounds to impose substantial sanctions. While the Court also found that a status quo order does not necessarily encompass the preservation of unallocated space, the best practice is to establish expressly and early on, with the court and the parties, whether or not unallocated space is to be preserved and searched.

Thus, by allowing the Genger Court to accept a misleading definition of unallocated space, counsel opened the door to burdensome e-discovery obligations and sanctions that could have been avoided by making a more technically informed argument. Therefore, it is important for counsel to understand unallocated space from a technical perspective and tailor their e-discovery strategy accordingly.

by Casey Schmidt  |  January 10, 2020

File slack is an extremely complex technological issue. Fortunately, this article simplifies the concept for even the least technical users to understand. I’ll begin with the basics and then work my way up to some of the more technical parts of file slack, but don’t worry – even the advanced information will be presented in a way for all to understand.

What is file slack?

File slack, also called “slack space,” is the leftover space between the end of a file and the end of the last cluster that file uses. This space is left over because each cluster on a disk has a fixed capacity, while files vary in size. As a result, a file rarely fills its last cluster completely. Sometimes it’s best to use a real-world example to visualize what’s happening inside a computer.

Imagine you have a bowl and a lid. In this example, the bowl is a cluster on a computer hard drive. Next to the bowl is a cup of pudding, which represents the file. You pour the pudding into the bowl, it fills about three-fourths of the bowl, and then you seal it. Note that the pudding hasn’t completely filled the bowl. The remaining space between the pudding and the lid is the file slack. When a file is saved, the computer puts it into a cluster on the hard drive (the bowl), and some space is left over.
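For readers who prefer numbers to pudding, here is a minimal Python sketch of the same idea. It assumes a 4,096-byte cluster (eight 512-byte sectors), which is a common size but not a universal one, and `file_slack` is a made-up helper name.

```python
def file_slack(file_size: int, cluster_size: int = 4096) -> int:
    """Bytes left between the end of the file and the end of its last cluster."""
    if file_size == 0:
        return 0
    clusters_needed = -(-file_size // cluster_size)  # ceiling division
    return clusters_needed * cluster_size - file_size


# A file that fills three-fourths of one cluster, like the pudding in the bowl.
print(file_slack(3072))    # 1024 bytes of slack

# A file that spills into a third cluster leaves slack only in that last cluster.
print(file_slack(10000))   # 2288 bytes of slack

# A file that ends exactly on a cluster boundary leaves no slack.
print(file_slack(4096))    # 0
```

Slack only ever appears in the last cluster of a file, so it is always smaller than one cluster no matter how large the file is.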

How to utilize file slack

File slack is a by-product of how computers store files, and this semi-flaw has unintended benefits. The most prominent benefit is in computer forensics, where slack space can still hold remnants of files that were previously deleted from the same sectors. Deleting a computer file doesn’t erase its contents right away. It only marks the space as available for reuse. When a smaller file later takes over a cluster, whatever the new file does not cover may survive in the slack. This can give investigators clues about data that suspects in a legal matter erased from the hard drive.
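To see how slack can hold on to old data, consider this small Python sketch. It is a simplified illustration, not a forensic tool: the 32-byte cluster, the `store` helper, and the sample text are all invented, and real operating systems differ in how much of the slack they leave untouched.

```python
CLUSTER_SIZE = 32  # hypothetical cluster size, kept small for readability

cluster = bytearray(CLUSTER_SIZE)


def store(cluster: bytearray, content: bytes) -> int:
    """Write content at the start of the cluster and return its length."""
    if len(content) > len(cluster):
        raise ValueError("content does not fit in one cluster")
    cluster[:len(content)] = content  # bytes past the content are left as they were
    return len(content)


# An older, longer file occupies most of the cluster.
old_length = store(cluster, b"wire $50,000 to account 1234")

# The old file is deleted (the cluster is only marked free), and a shorter
# file is later saved into the same cluster.
new_length = store(cluster, b"lunch menu")

# The slack of the new file still holds the tail of the old one.
slack = bytes(cluster[new_length:old_length])
print(slack)
```

The new file reads back as “lunch menu,” but the bytes right after it still contain the end of the older file, including the account number.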

Computer forensics examiners have used file slack data to look for missing information or clues in notable cases. One recent, high-profile example was the investigation into former U.S. Secretary of State Hillary Clinton’s deleted emails, where investigators reportedly explained that retrieving data was potentially possible because of the leftover space associated with deleted files. This is just one example, as the technique has been used in many different legal cases involving computers.

Further considerations

File slack is a reminder that it’s hard to erase something permanently from computers. Evaluate sensitive information often to determine whether it’s important enough to keep off digital media altogether. Whether it’s something basic like a secret recipe for an award-winning stew or something more complex like business data, understanding slack space helps keep companies protected.

This also shouldn’t scare anyone – in fact it should relieve people to know that not all information is truly lost. Sometimes, in fact, it’s fully retrievable. Furthermore, the idea of slack space helps the average user visualize and understand how data fills up on their computer and hard drive. It’s good to understand the basics of the tools we operate, similar to looking under the hood of an automobile from time-to-time.

Though file slack remains a rarely mentioned term, it resurfaces every now and then. Understanding its uses and possibilities keeps you ahead of the average user, researcher or business.
