Strategies for Archiving in Hybrid Environments


The drivers for archiving electronic content in a particular organization depend on a number of factors, including its corporate culture, senior management’s appetite for risk, the regulatory obligations it faces, the geographies in which it operates, and a variety of others.

The survey conducted for this white paper asked decision makers and influencers to rate the various drivers for electronic content archiving on a scale of 1 (not a driver) to 7 (a major driver), and to indicate how these drivers were changing over time. As shown in Figure 1, the most important drivers in 2017 are legal and contractual requirements to retain data for specified periods, regulatory compliance obligations, disaster recovery/business continuity, and eDiscovery.

While all of the drivers for archiving electronic content will become more important over the next two years, two findings from the research are noteworthy:

  1. While eDiscovery is today a fourth-place driver (albeit a close fourth place) for archiving, it will become tied for the most important driver in just two years’ time.
  2. The drivers for archiving that will grow in importance most quickly over the next two years are extracting insight and intelligence from archived data (growing in importance as a key driver by 50 percent) and giving employees the ability to search for their old content (33 percent). Admittedly, these are the least important motivators today for organizations to archive their electronic content, but a growing number of decision makers understand the importance of using their archived content for new and imaginative applications to business problems. 

Osterman Research supports the view that archiving should be used as a tool to gather intelligence about an organization and to gain competitive or other advantages from the insight gleaned from this information. A huge amount of information is stored in data archives, including emails, spreadsheets, social media posts, memos, graphics files, presentations, voicemails, contacts, databases, CRM data and other data types. This content is generated by and stored in a wide variety of venues. The traditional view of archiving is to preserve this content in case it is needed in the future; a proactive view of archiving is to perform analytics on this content in search of meaningful insights.


Organizations are subject to a host of legal and contractual requirements, and must manage their eDiscovery process and control the costs associated with eDiscovery. Every organization – regardless of its size, the industry it serves or how much data it possesses – must retain important records for various lengths of time. The requirement to retain data is imposed from a variety of sources, including legal precedent (courts establish standards for the length of time that data must be retained), statutory obligations (which specifically define the retention and production obligations for certain types of data), and internal best practices. Retention obligations apply to all forms of data, both physical and electronic. Organizations that reasonably anticipate litigation may also need to place certain electronic content on legal hold for a period that differs from their standard retention policies. A centralized archive can facilitate that process.
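The interaction between a standard retention policy and a legal hold can be illustrated with a short sketch. It is a minimal illustration only, not a description of any particular archiving product: the Record type, its field names and the may_dispose function are hypothetical, and the retention period is a placeholder that an organization would set with its legal counsel.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    """Hypothetical archived record; all field names are illustrative."""
    record_id: str
    created: date
    retention_years: int          # placeholder policy value, not legal advice
    on_legal_hold: bool = False   # set when litigation is reasonably anticipated


def add_years(d: date, years: int) -> date:
    """Return the same calendar day `years` later (Feb 29 falls back to Feb 28)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def may_dispose(record: Record, today: date) -> bool:
    """A record may be disposed of only when its retention period has
    expired AND it is not subject to a legal hold."""
    if record.on_legal_hold:
        return False
    return today >= add_years(record.created, record.retention_years)
```

In this sketch the legal hold always takes precedence: a record whose standard retention period has expired is still kept for as long as the hold remains in place.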

If eDiscovery is managed using a centralized and properly maintained archive, organizations are generally much more capable of addressing their litigation requirements and controlling the costs associated with those activities. In addition, for organizations that have frequent or extensive litigation or investigations, proactively addressing eDiscovery in a systematic way can significantly reduce overall eDiscovery expenses and other costs of litigation.

Easy search and access to electronic records, particularly across the multiple silos in which an organization’s data is stored, can permit legal counsel to evaluate the merits of a case before investing substantial time, money and effort in electronic records retrieval. In short, legal counsel and senior management can make better decisions about whether to fight or settle a lawsuit by having easy access to all archived content.


A large proportion of the electronic records that pertain to an organization’s business activities are subject to regulatory compliance obligations, which vary by industry and jurisdiction. It is important to note that virtually every organization and industry faces some level of regulatory compliance obligation to retain its records, and that retention obligations are not limited to “regulated” organizations or industries, since there is no such thing as an “unregulated” one. A few examples of data retention requirements outside of industries that are normally considered to be “heavily regulated”:

  • US and foreign air transport carriers must retain for three years the complaints they receive from individuals with disabilities who use these carriers.
  • Employers of homeworkers in the clothing, jewelry and related industries must retain for three years any documents related to stop watch time studies or other work measurement methods used to demonstrate piece rates so that these employers can prove that employees are making at least minimum wage.
  • Bottlers involved in the labeling and advertising of distilled spirits must retain for five years certificates of age and/or origin for spirits imported to the US in bulk where those spirits are bottled and removed from the plant.

These regulations require the retention of content such as financial documents, email correspondence between organizations, employee records, invoices, shipping information and a variety of other data. In fact, even metadata must be preserved – the Supreme Courts of both Arizona and Washington State have ruled that metadata must be retained along with other records. 

Among the more heavily regulated verticals worldwide is the financial services industry. In the United States, for example, rules of the Securities and Exchange Commission (SEC) and the Financial Industry Regulatory Authority (FINRA) require members of national securities exchanges, brokers and dealers to preserve securities transaction records for a minimum of six years, the first two years in an easily accessible place. In Canada, records of purchase and sell orders of securities must be retained for seven years, the first two years in an easily accessible location. And in the United Kingdom, investment service and transaction records must be retained for at least five years.
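As a rough illustration of how retention periods like these might be encoded in an archiving policy, the following sketch classifies a record by its age. The retention figures are the ones cited above; the jurisdiction codes, tier names and function names are hypothetical, and because the stated periods are minimums, the sketch flags expired records for review rather than deleting them.

```python
from datetime import date

# Retention periods in years, taken from the figures cited above.
# "accessible_years" is the initial period during which records must be kept
# in an easily accessible place (None where the text gives no such figure).
RETENTION_RULES = {
    "US": {"retain_years": 6, "accessible_years": 2},
    "CA": {"retain_years": 7, "accessible_years": 2},
    "UK": {"retain_years": 5, "accessible_years": None},
}


def whole_years_between(start: date, end: date) -> int:
    """Number of complete calendar years elapsed from start to end."""
    return end.year - start.year - ((end.month, end.day) < (start.month, start.day))


def storage_tier(jurisdiction: str, created: date, today: date) -> str:
    """Return a hypothetical storage tier for a securities transaction record."""
    rule = RETENTION_RULES[jurisdiction]
    age = whole_years_between(created, today)
    if age >= rule["retain_years"]:
        # Minimum retention period has passed; disposal still needs review.
        return "eligible-for-review"
    accessible = rule["accessible_years"]
    if accessible is not None and age < accessible:
        return "easily-accessible"
    return "archive"
```

A record created in the United States 15 months ago would fall in the "easily-accessible" tier under these assumptions, while one created four years ago would fall in the "archive" tier.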

The consequences to financial services firms of not complying with these retention regulations can be severe and typically involve the imposition of significant financial penalties.

Another heavily regulated industry is healthcare. For example, the Health Insurance Portability and Accountability Act of 1996 (HIPAA) requires organizations to protect patients’ electronic health information from unauthorized users and to retain such information for six years. Non-compliance with HIPAA requirements could result in fines of up to $50,000 per violation, or criminal penalties of $250,000 and up to 10 years in prison for violations based on intent or malice.

Virtually every organization, even in industries that are not considered heavily regulated, is subject to electronic content retention requirements and to the consequences of non-compliance with those requirements, as discussed above.


Aside from the enormous fines associated with violations of the European Union’s General Data Protection Regulation (GDPR) – up to €20 million or four percent of an organization’s worldwide annual turnover, whichever is higher – there are some important implications to consider for organizations that possess data on residents of the EU. For example:

  • Article 15 of the GDPR gives data subjects the right to ask any entity that possesses or processes their personal data (a data controller) to produce that data on demand. These individuals also have the right to know if and when their data is transferred to a third country or to an international organization, along with whatever safeguards are in place to ensure ongoing protection of the data after it has been transferred. A data controller must provide a copy of any personal data that is being processed at no charge the first time it is requested.
  • Article 17 states that, subject to certain conditions, a data subject has the “right to be forgotten” by any data controller that possesses or controls his or her information.
  • Article 30 requires that data controllers keep records of their data processing activities, with a list of specific information to be retained for each record.

Moreover, implementing the right organizational and technological safeguards on all production systems that contain personal and sensitive personal data is essential, but it isn’t enough. Sufficient controls are required for:

  • Copies of production databases that contain personal data taken for testing, development, or analytics purposes.
  • Spreadsheets and other data sources populated by exporting customer contact and profiling details for a mail merge.
  • Email archives, whether stored on-premises, in cold storage or in the cloud, which are likely to contain personal data that must be protected under the GDPR.

The GDPR imposes a major burden on any organization that has data on residents of the European Union, requiring a level of data retention and management that is on par with the level of effort required for eDiscovery activities. In addition, these activities must often be performed at no charge to those who request information, so archiving and related activities must be efficient and easy to use. In short, compliance with many of the key provisions of the GDPR will not be possible without a robust archiving capability.


An archiving system can help enable storage management by indexing content and making it more accessible and discoverable. This is particularly important for organizations that must respond to frequent retrieval requests for email and files because it can dramatically reduce the time employees spend looking for, filtering and producing data. Sunshine-law and Freedom of Information Act (FOIA) requests are two common types of requests, but there are numerous others.

An archiving system can also improve the performance of email and other systems by minimizing the amount of “live” data that must be stored on active servers. Because older electronic data, such as email messages and files more than 30 days old, is accessed relatively infrequently, it often makes sense to move this content to an archiving system for better system performance. Doing so can reduce the amount of time required to back up email and data servers, speed the restoration of a server from backups, and reduce the overall downtime experienced in key systems.
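The age-based selection described above can be sketched in a few lines. This is an illustrative sketch only: the 30-day threshold comes from the text, while the function name and the (item_id, last_modified) input format are assumptions rather than the interface of any real mail or file server.

```python
from datetime import datetime, timedelta

ARCHIVE_AFTER = timedelta(days=30)  # threshold taken from the text above


def split_live_and_archive(items, now):
    """Partition (item_id, last_modified) pairs into two lists of item IDs:
    those that stay on the active server and those to move to the archive."""
    cutoff = now - ARCHIVE_AFTER
    live, to_archive = [], []
    for item_id, last_modified in items:
        if last_modified < cutoff:
            to_archive.append(item_id)
        else:
            live.append(item_id)
    return live, to_archive
```

In practice, an organization might key the decision on last access time rather than last modification time, or apply different thresholds to different content types; the sketch shows only the basic partitioning step.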


An organization’s email and other electronic content constitute one of its most important business knowledge repositories. Some analysts have estimated that the majority of an organization’s intellectual property is contained in its messaging systems. Even if that is overstated, an organization’s electronic content does contain important employee-generated information, both structured and unstructured, that is critical to its growth, ongoing operations and profitability, competitive advantage, and ability to innovate.

To satisfy employees’ constant need for business information, email, collaboration tools and other electronic content repositories are often relied upon as the primary tools used for work. For example, an employee may need to locate stored emails quickly so that he or she can review his or her own correspondence or other content, such as email attachments. Alternatively, a new employee may have to trace email and other electronic content exchanged between his or her predecessor and a customer.

Employees are also extracting business intelligence and data from electronic content servers. This makes the preservation and availability of the content extremely important. An organization that does not store its important content adequately risks the loss of information that it has paid employees to create.


The drivers and needs for archiving are changing over time, and organizations, including those operating hybrid environments, must be able to adapt. For example, cyber security has emerged as a driver for archiving and for protecting content from bad actors and others seeking to launch cyber attacks.

Regulations are evolving, and archiving requirements are typically becoming more stringent. Two newer regulations, the European Union’s General Data Protection Regulation (GDPR) and the New York Department of Financial Services (NYDFS) Cybersecurity Requirements for Financial Services Companies (CRFSC), are examples of the changing nature of the archiving challenge.

Strategies and Best Practices for Hybrid Archiving