Amazon S3 is probably best known as general-purpose object storage in the cloud. AWS subscribers can use it to create storage buckets and then fill those buckets with data. However, Amazon also offers several other flavors of S3 storage, including a new storage option called S3 Glacier Deep Archive.
AWS S3 Glacier Deep Archive, due this year, is designed to act as an alternative to another AWS storage offering called S3 Glacier. AWS S3 Glacier storage has been a part of S3 for many years, and has always been Amazon’s go-to solution for data archiving.
As you would probably expect from an archival solution, AWS S3 Glacier is designed to accommodate massive amounts of data, at a cost that is far less than that of the standard S3 storage tier (more on those costs later).
Of course, Amazon makes it known that S3 Glacier sacrifices performance in the name of bringing down costs. S3 Glacier is designed to accommodate archival data that will be rarely, if ever, accessed. As such, data that is stored within S3 Glacier is not accessible in quite the same way as data that is stored in the standard S3 tier. While standard-tier S3 data is usually accessible in a matter of milliseconds, it may take anywhere from a few minutes to several hours to retrieve a file that is stored within S3 Glacier.
The reason for the widely varied data retrieval times is that Amazon allows its subscribers to set data retrieval policies for data stored in S3 Glacier. These policies determine how quickly archived data can be accessed, as well as what it costs to retrieve that data.
As previously noted, AWS S3 Glacier has always served as a solution for data archiving. Most organizations have data that they need to keep for an extended period of time, even if that data is not actively being used. However, there is a big difference between archived data that gets accessed once in a while, and archived data that must be retained but that will probably never be accessed again. This is where Amazon’s new S3 Glacier Deep Archive comes into play. S3 Glacier Deep Archive is a solution for storing archive data that only needs to be accessed in the rarest of circumstances.
Like S3 Glacier, S3 Glacier Deep Archive is an object storage solution with 11 nines of durability (99.999999999%) and support for multiple availability zones. Likewise, S3 Glacier Deep Archive supports the use of lifecycle transition policies and data retrieval policies. However, the similarities end there.
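As a rough illustration of a lifecycle transition policy, the sketch below builds a rule that moves objects first to S3 Glacier and later to S3 Glacier Deep Archive. The rule ID, the "logs/" prefix, and the day counts are hypothetical values chosen for the example; GLACIER and DEEP_ARCHIVE are the storage class identifiers that S3 lifecycle configurations use for these two tiers.

```python
import json


def build_lifecycle_config(prefix, glacier_after_days, deep_archive_after_days):
    """Build a lifecycle configuration that tiers objects down as they age."""
    if deep_archive_after_days <= glacier_after_days:
        raise ValueError("Deep Archive transition must come after the Glacier transition")
    return {
        "Rules": [
            {
                "ID": "archive-old-objects",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # only objects under this prefix
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                    {"Days": deep_archive_after_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


# Hypothetical schedule: Glacier after 90 days, Deep Archive after one year.
config = build_lifecycle_config("logs/", 90, 365)
print(json.dumps(config, indent=2))
```

With the boto3 SDK, a dictionary shaped like this could be passed as the LifecycleConfiguration argument of put_bucket_lifecycle_configuration, although the exact schedule would depend on how the data is actually used.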
S3 Glacier Deep Archive is intended to be a semi-permanent data storage solution, and this is reflected in the way that Amazon bills its subscribers. To see the difference, consider S3 Glacier first. When a subscriber adds data to S3 Glacier, Amazon imposes a minimum storage duration charge of 90 days for storing that data. In other words, if someone uploads a file to S3 Glacier and then immediately deletes the file, Amazon will bill that person the same amount as it would if the data had remained in place for 90 days. This practice is presumably intended to condition subscribers to treat S3 Glacier as a data archive, rather than a low-budget general-purpose storage solution.
Amazon also imposes a minimum storage duration charge for S3 Glacier Deep Archive. Rather than the 90-day minimum that Amazon sets for S3 Glacier, however, the minimum storage duration charge for S3 Glacier Deep Archive is based on 180 days.
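The effect of the minimum storage duration can be sketched in a few lines. This is a simplified model that only captures the 90-day and 180-day minimums described above; the function and dictionary names are made up for the example, and actual AWS invoices may itemize the charge differently.

```python
# Minimum billable storage duration per tier, in days (figures from the text).
MIN_STORAGE_DAYS = {"GLACIER": 90, "DEEP_ARCHIVE": 180}


def billable_days(storage_class, days_stored):
    """Return the number of days billed for an object kept for days_stored days."""
    if days_stored < 0:
        raise ValueError("days_stored must not be negative")
    # Deleting early does not reduce the bill below the tier's minimum.
    return max(days_stored, MIN_STORAGE_DAYS[storage_class])


print(billable_days("GLACIER", 1))         # deleted after one day: billed for 90
print(billable_days("DEEP_ARCHIVE", 1))    # deleted after one day: billed for 180
print(billable_days("DEEP_ARCHIVE", 200))  # past the minimum: billed for 200
```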
Another key difference between S3 Glacier and S3 Glacier Deep Archive is the speed with which data can be retrieved. As previously mentioned, it can take anywhere from a few minutes to several hours to retrieve S3 Glacier data, depending on which data retrieval policy is in use. In contrast, Amazon advertises that S3 Glacier Deep Archive data can be retrieved in 12 hours or less. Amazon is also planning to offer a bulk retrieval option that will allow subscribers to retrieve petabytes of data in 48 hours or less.
Of course, the biggest difference between S3 Glacier and S3 Glacier Deep Archive is the storage cost. There are many factors that affect the cost of standard S3 storage, including the volume of data (the cost per GB changes based on volume) and the storage region. However, a fairly typical cost estimate for standard S3 storage is about $0.023 per GB per month, or roughly $23.55 per terabyte per month. By way of comparison, the lowest priced option for S3 Glacier storage is $0.004 per GB per month, which is about $4.10 per terabyte per month. In contrast, it costs $0.00099 per GB per month to store data on S3 Glacier Deep Archive. This works out to about $1.01 per terabyte per month.
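The per-terabyte figures follow directly from the per-gigabyte prices. The short sketch below reproduces the arithmetic, assuming 1,024 GB per terabyte and using the per-GB prices quoted above; real pricing varies by region and volume, so the numbers are illustrative only.

```python
GB_PER_TB = 1024  # assumption: binary terabyte

# Per-GB monthly prices as quoted in the text.
PRICE_PER_GB_MONTH = {
    "S3 Standard": 0.023,
    "S3 Glacier": 0.004,
    "S3 Glacier Deep Archive": 0.00099,
}


def monthly_cost_per_tb(price_per_gb):
    """Convert a per-GB monthly price into a per-TB monthly price."""
    return price_per_gb * GB_PER_TB


for tier, price in PRICE_PER_GB_MONTH.items():
    print(f"{tier}: ${monthly_cost_per_tb(price):.2f} per TB per month")
```

At these prices, a terabyte kept in S3 Glacier Deep Archive costs roughly one twenty-third of what the same terabyte costs in standard S3 storage.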