Containers Expo Blog: Article

Direct Indexing Enables Management of Legacy Tape Data

Tape remediation is quickly becoming the preferred method

"How many backup tapes do you have?"
"I have no idea - probably thousands."

"Do you need to keep them?"

"Why don't you recycle them?"
"Legal won't let us."

This might be a typical storage manager's response when questioned about a company's backup tape stockpile. These tapes were created to meet a key objective of any IT organization: protecting enterprise data assets. Over time a mountain of old backup tapes has been amassed, most of which have long outlived their disaster recovery purpose. Why not recycle or destroy all these old tapes? Federal regulations forbid it. Data on these tapes may be needed to support current or future litigation. How much data? A very small percentage of what exists, typically less than 1 percent. Why then keep all these tapes? Because it has been next to impossible to separate the useless data from what legal requires.

Sometime down the road, if it has not happened already, legal will request specific data from backup tapes. Some corporate legal teams have proactively issued a mandate not to touch tapes; others have been forced to do so. Either way, stricter regulations are forcing the issue. The June 2009 California Electronic Discovery Act, for example, declares that all electronically stored information should be accessible and requires it to be produced. In January 2010 Judge Scheindlin, the judge on the groundbreaking Zubulake v. UBS Warburg case, issued an opinion in which she rejected the undue-burden argument, called out the defendant as grossly negligent, and issued sanctions against UBS Warburg for not collecting data from backup tapes to support the case. The courts are ruling more frequently against firms that do not produce data, including tape data, in a timely manner. Many cases exist today in which fines have been imposed for the botched collection of historical files and email. Will your company be next?

Storing old tapes is not only a potential liability but also a wasted expense. Even if it costs only a few dollars a month to store a tape, those dollars quickly add up. In addition, since these old tapes cannot be recycled, new tapes must be purchased for ongoing tape backups. This expense, combined with the storage costs, quickly becomes a large item in the budget. This IT expense could easily be allocated to something more useful for the organization. This article discusses how to take a mountain of stored tapes and turn them into a molehill by extracting the relevant data and eliminating unnecessary tapes.

Consider Remediation
In the past it was far too expensive and difficult to understand the detailed content of old backup tapes. The content first had to be restored and analyzed in order to determine what to keep and what was safe to purge. Restoration uses the original backup software to read data off tape and bring it back online before the discovery process can begin. Restoring thousands or tens of thousands of tapes was out of the question, requiring too much time, money and legacy infrastructure. As a result IT departments have let the mountain of tapes grow taller every day - with no end in sight.

The problem can now be solved with a more intelligent approach that eliminates the need for expensive and time-consuming backup restoration. Direct indexing and extraction significantly streamlines the collection of ESI (electronically stored information) from tape.

Direct indexing technology scans tapes and then searches and extracts specific files and email without requiring the original backup software. This allows you to only deal with relevant files (less than 1 percent of the tape content) and not the bulk of useless content (the other 99-plus percent). In significantly less time an IT department can process tapes in-house, find what legal needs, archive it and make it available when it is needed. This efficient, cost-effective process enables tape remediation, allowing IT departments to recapture tape-storage budgets, while supporting legal with the data they need.

Automated Direct Indexing Illustrated
The new automated process is simple - no specialized skills or software are required. Assume a situation where there are 10,000 tapes in offsite storage. The first step would be to catalog the tapes to profile the content. Using a tape library, tape headers can be scanned in minutes, only requiring manpower to load the tapes. Once the scan is complete, the indexing technology can analyze the catalog and eliminate incremental backups, as well as backups of non-user data servers and blank tapes. This typically reduces the volume by 80 percent, turning a 10,000-tape job into a 2,000-tape job. Stopping here eliminates 80 percent of the tapes and achieves significant cost savings.
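The catalog-based elimination step described above can be sketched in a few lines of Python. This is only an illustration, not how any particular direct indexing product works: the `TapeRecord` fields, the `USER_DATA_ROLES` set and the sample barcodes are all hypothetical, and a real catalog would be built from the scanned tape headers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TapeRecord:
    """One catalog entry built from a tape header scan (hypothetical schema)."""
    barcode: str
    backup_type: str   # "full" or "incremental"; empty for a blank tape
    server_role: str   # e.g. "mail", "fileshare", "database"
    is_blank: bool = False


# Assumption: which server roles hold user-created content is a policy
# decision made with legal; these two values are placeholders.
USER_DATA_ROLES = {"mail", "fileshare"}


def needs_full_scan(tape: TapeRecord) -> bool:
    """Apply the three elimination rules from the cataloging step."""
    if tape.is_blank:
        return False                      # blank tapes hold nothing
    if tape.backup_type == "incremental":
        return False                      # incremental backups are eliminated
    return tape.server_role in USER_DATA_ROLES  # skip non-user data servers


def triage(catalog):
    """Split the catalog into tapes to scan fully and tapes to eliminate."""
    keep = [t for t in catalog if needs_full_scan(t)]
    drop = [t for t in catalog if not needs_full_scan(t)]
    return keep, drop


catalog = [
    TapeRecord("T0001", "full", "mail"),
    TapeRecord("T0002", "incremental", "mail"),
    TapeRecord("T0003", "full", "database"),
    TapeRecord("T0004", "", "", is_blank=True),
    TapeRecord("T0005", "full", "fileshare"),
]
keep, drop = triage(catalog)
print([t.barcode for t in keep])  # ['T0001', 'T0005']
print(f"{len(drop) / len(catalog):.0%} of tapes eliminated")
```

The sample data is invented, so the percentage it prints says nothing about a real tape population; the 80 percent figure in the text is the article's own estimate.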

Once the cataloging is done, the remaining set of tapes contains potentially responsive data that will support current and future litigation. The next step requires a full scan of these tapes. This generates a searchable index of the content and metadata without copying or modifying the existing tapes. IT and legal then work together to define the search queries (the management team's email, files related to a sensitive project, intellectual property documents, etc.). Legal can search the index, tag what they want and request that the data be extracted. IT then runs an extract job, and all the tagged files and emails are pulled from tape with their content and metadata intact. When this process is complete the tapes can be recycled.
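To make the search, tag and extract sequence concrete, here is a minimal Python sketch. It assumes the full scan has already produced an in-memory list of index entries; the `IndexEntry` schema, the `search` and `build_extract_job` helpers and the sample "Falcon" documents are all invented for illustration and do not represent any vendor's index format or API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class IndexEntry:
    """One file or email found during the full scan (hypothetical schema)."""
    tape: str
    path: str
    custodian: str
    kind: str        # "email" or "file"
    modified: date
    text: str        # indexed content


def search(index, custodians=None, keyword=None, start=None, end=None):
    """Return entries matching every criterion that was supplied."""
    hits = []
    for entry in index:
        if custodians and entry.custodian not in custodians:
            continue
        if keyword and keyword.lower() not in entry.text.lower():
            continue
        if start and entry.modified < start:
            continue
        if end and entry.modified > end:
            continue
        hits.append(entry)
    return hits


def build_extract_job(tagged):
    """Group tagged items by tape so each tape is mounted only once."""
    job = defaultdict(list)
    for entry in tagged:
        job[entry.tape].append(entry.path)
    return dict(job)


index = [
    IndexEntry("T0001", "/mail/ceo/inbox/0042.msg", "ceo", "email",
               date(2008, 3, 14), "Project Falcon budget review"),
    IndexEntry("T0001", "/mail/ceo/inbox/0043.msg", "ceo", "email",
               date(2008, 3, 15), "Lunch on Friday?"),
    IndexEntry("T0005", "/share/eng/falcon/spec.doc", "engineer1", "file",
               date(2008, 2, 1), "Falcon design specification"),
]

tagged = search(index, keyword="falcon")   # legal tags everything that matches
job = build_extract_job(tagged)
print(job)
```

Grouping the tagged items by tape is a design choice of this sketch: it reflects the idea that extraction touches only the tapes that actually hold responsive items.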

Details of a typical tape remediation project with 10,000 tapes using direct indexing are as follows:

The cost to store tapes offsite, combined with the cost to acquire new tapes in support of the existing backup process, comes to $430,000 per year. Because the volume of tapes grows each week, this number will continue to increase over time. To compute the payback for such a project you would need to break out the costs for the acquisition of a direct indexing product, the dedicated tape library, and manpower. The expenditure for manpower, tape libraries, hardware, and software will pay for itself in less than one year. This does not include any costs associated with ongoing litigation where tapes are pulled from storage for restoration. Such litigation support costs could easily reach hundreds of thousands of dollars annually, and avoiding them would shorten the payback period further.
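The payback arithmetic can be sketched as follows. Only the $430,000 annual figure comes from the article. The project cost breakdown is a set of placeholder numbers chosen purely for illustration, and the assumption that savings scale linearly with the fraction of tapes eliminated is a simplification.

```python
ANNUAL_TAPE_COST = 430_000  # offsite storage plus new tape purchases (from the article)


def annual_savings(annual_tape_cost, fraction_eliminated,
                   litigation_support_avoided=0.0):
    """Yearly savings, assuming cost scales linearly with tape count."""
    if not 0.0 <= fraction_eliminated <= 1.0:
        raise ValueError("fraction_eliminated must be between 0 and 1")
    return annual_tape_cost * fraction_eliminated + litigation_support_avoided


def payback_years(project_cost, savings_per_year):
    """Years until cumulative savings cover the one-time project cost."""
    if savings_per_year <= 0:
        raise ValueError("savings_per_year must be positive")
    return project_cost / savings_per_year


# Hypothetical one-time costs; substitute real quotes before drawing conclusions.
project_costs = {"software": 150_000, "tape_library": 60_000, "labor": 90_000}
total_cost = sum(project_costs.values())

for fraction in (0.8, 1.0):
    years = payback_years(total_cost, annual_savings(ANNUAL_TAPE_COST, fraction))
    print(f"{fraction:.0%} of tapes eliminated: payback in {years:.2f} years")
```

Passing a non-zero `litigation_support_avoided` shows how avoided restoration costs shorten the payback period, as the paragraph above notes.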

In the past it was not cost-effective to remediate the mountains of tape stored offsite. Direct indexing technology now makes this feasible and is quickly becoming a best practice for any organization that faces constant legal events involving legacy data. Direct indexing is a non-invasive scan of the tape that gathers intelligence about the contents (file types, dates, custodians, etc.) and allows specific content to be selected and gathered. Extraction does not require the backup software to access tape content; it leverages the index to understand data at the file and email level. Together they let you review the contents of a tape, find the relevant content and extract only what is of interest. Restoration, by contrast, requires you to restore the data before you can find the relevant content; it is a radically different process. The benefit of direct indexing over restoration is a clear savings of both time and money. As legal and IT work together, tape remediation is quickly becoming the preferred method to reduce corporate liability and stretch IT's ever-shrinking budget.

More Stories By Jim McGann

Jim McGann serves as Vice President of Information Discovery for Index Engines. He has extensive experience with eDiscovery and information management. He is currently contributing to the Sedona working group addressing electronic document retention and production. Jim is also a frequent speaker for industry organizations such as ARMA and ILTA, and has authored multiple articles for legal technology and information management publications.

In recent years Jim has worked for technology-based start-ups that provided financial services and information management solutions. Prior to Index Engines, he worked for leading software firms, including Information Builders and the French-based engineering software provider Dassault Systemes. Jim was responsible for the Business Development of Scopeware at Mirror Worlds Technologies, the knowledge management software firm founded by Dr. David Gelernter of Yale University. Jim graduated from Villanova University with a degree in Mechanical Engineering.
