A comprehensive explanation of the difference between backup and archiving, and how this knowledge can equip you to make better use of IT.
Where should your data sets sit in the backup vs archiving debate?
Data backup and archiving are distinct processes with different objectives. Data backups are for disaster recovery (DR), and data archives are for e-discovery.
With data backups, the focus is on “current operational data”, and backups are intended to recover data when the primary source of data has been corrupted or destroyed. This could be the result of accidental deletion or hardware failure. Speed of restoration is critically important.
Data archiving involves moving “older data” that is no longer actively used to a separate data storage device for long-term retention. Data archives consist of older data that is still important and necessary for future reference, as well as data that must be retained for regulatory compliance. Data archives are indexed and have search capabilities so that files and parts of files can be easily located and retrieved. Speed of restoration from a data archive is not as critical as from a data backup, but searchability is vital.
For example, if a company is under investigation for any reason, it may be asked for emails containing specific keywords or for files in a particular directory. Data archives will show a history of files: where they existed, when they existed and who changed them. Backup systems are not good at performing any of these tasks.
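To make the searchability point concrete, here is a minimal sketch of a keyword index over archived emails. It assumes the emails are available as plain text; the message ids, the sample text and the function names `build_index` and `search` are hypothetical and do not describe any particular archiving product.

```python
from collections import defaultdict
import re

def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index

def search(index, *keywords):
    """Return the sorted ids of documents that contain every keyword."""
    if not keywords:
        return []
    hits = [index.get(k.lower(), set()) for k in keywords]
    return sorted(set.intersection(*hits))

# Hypothetical archived emails, for illustration only.
archived_emails = {
    "msg-001": "Quarterly invoice attached for the Acme contract",
    "msg-002": "Lunch on Friday?",
    "msg-003": "Revised Acme contract terms",
}
index = build_index(archived_emails)
print(search(index, "acme", "contract"))  # ['msg-001', 'msg-003']
```

Because the index is built once when data enters the archive, a keyword request can be answered without restoring and reading every file, which is roughly what a backup-only setup would require.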
Key factors to consider while archiving:
- Archiving is particularly important in highly regulated sectors, such as legal, financial, and healthcare. However, more and more businesses now archive data both to protect the business and to free up costly storage space.
- Not all data is equally important. It is therefore important to analyse the types of data in the business and the lifespan of each, so that an archival policy can be developed.
- Data also needs to be stored in an open format, to future-proof the archive and ensure it will still be readable many years from now. The archive should be tested regularly to confirm it is still readable.
- Data can be stored on tape, on disk or in the cloud. The key factors to determine are the length of time for keeping the data and the cost involved.
- Tapes can be used for archives, but they need to be maintained and have a limited life, after which the data must be transferred to new media. There are also several recorded cases where it has been impossible to recover data from tape. (For this reason, Frontier only uses disks for archiving.)
- One way of reducing the cost of archiving is to deduplicate the data first (i.e. remove duplicates) and then store only what is really required.
- Auto-tiering solutions have been developed so that the most frequently accessed data is placed in the fastest storage tier (usually SSDs) and less frequently used data is moved down to lower-cost, slower drives. An organization benefits by purchasing only the amount of high-cost, high-performance SSD capacity it actually needs. This is seen as a “must-have”, particularly in a virtualized environment.
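As an illustration of the deduplication point above, the following is a minimal sketch that keeps one stored copy per distinct file content by hashing each file. It works at whole-file level with SHA-256 and holds everything in memory; real products often deduplicate at block level, and the file names and the `deduplicate` function are hypothetical.

```python
import hashlib

def deduplicate(files):
    """Keep one copy of each distinct content; files maps name -> bytes.

    Returns (store, catalog): store maps a SHA-256 digest to the content,
    catalog maps every original name to the digest of its content.
    """
    store = {}
    catalog = {}
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)  # store each content only once
        catalog[name] = digest             # every name remains retrievable
    return store, catalog

# Hypothetical files, for illustration only.
files = {
    "report_v1.txt": b"annual report",
    "report_copy.txt": b"annual report",
    "notes.txt": b"meeting notes",
}
store, catalog = deduplicate(files)
print(len(files), "files,", len(store), "stored copies")  # 3 files, 2 stored copies
```

The catalog keeps every original file name, so nothing is lost from the user's point of view; only the duplicate content is stored once.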
Key factors to consider with backup:
- Speed of recovery is critical.
- Data integrity needs to be preserved for the full duration of the backup, which may be several months.
- There should be multiple backup options available, such as incremental, differential and full backups.
- The backup should be copied or retained in a secondary location.
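The three backup options mentioned above differ in which files each run copies. The sketch below assumes the usual definitions: a full backup copies everything, a differential backup copies files changed since the last full backup, and an incremental backup copies files changed since the last backup of any kind. The `select_files` function, the file names and the day-number timestamps are hypothetical.

```python
def select_files(mtimes, mode, last_full, last_backup):
    """Return the sorted names of files a backup run would copy.

    mtimes: file name -> last-modified time (any comparable number)
    last_full: time of the most recent full backup
    last_backup: time of the most recent backup of any kind
    """
    if mode == "full":
        return sorted(mtimes)
    if mode == "differential":
        cutoff = last_full
    elif mode == "incremental":
        cutoff = last_backup
    else:
        raise ValueError(f"unknown backup mode: {mode}")
    return sorted(name for name, t in mtimes.items() if t > cutoff)

# Hypothetical timeline: full backup on day 0, incremental backup on day 2.
mtimes = {"a.doc": -1, "b.xls": 1, "c.ppt": 3}
for mode in ("full", "differential", "incremental"):
    print(mode, select_files(mtimes, mode, last_full=0, last_backup=2))
# full ['a.doc', 'b.xls', 'c.ppt']
# differential ['b.xls', 'c.ppt']
# incremental ['c.ppt']
```

The trade-off is that incremental runs copy the least data but a restore needs the last full backup plus every incremental since, whereas a differential restore needs only the last full backup and the latest differential.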
Read more about our archive and backup solution here: Frontier Vault Service
This article is written by Vish Rao