Azure Blob Storage usage scenarios for Archival

Azure Blob Storage is a great way to store large amounts of data. It provides cheap, highly available access to data anytime, anywhere. However, the nature of data often changes, and as a result Azure offers three access tiers under which data can be stored. Developers are encouraged to think of Azure not as a giant hard disk in the cloud but as a more granular storage mechanism suited to different use cases.

Before we proceed, note that there are three storage account types in Azure Storage.

For all intents and purposes, the focus here is on General Purpose V2, since it is newer and offers more functionality. The other options exist mainly for backward compatibility. Changing from one account type to another incurs data transfer charges, so ideally you should start off with General Purpose V2.

Inside General Purpose V2 we have three access tiers: HOT, COOL, and ARCHIVE.

Each tier is meant for a different purpose and has different characteristics when it comes to cost and I/O.

HOT Access Tier: Readily available, with the lowest cost for data access but the highest cost for storage. In practice, when you have files that need to be accessed regularly (read and write), it makes sense to store them in the HOT access tier. Because the data has to be available at short notice, it needs higher availability and is therefore placed on more reliable storage, hence the higher storage cost. The best examples are regularly used files such as company letterheads, official documents and templates, source code, etc.

The above screenshot shows the approximate cost per month for 1 TB of storage.

The total cost of writes and reads against this tier is approximately 1% of the storage cost, or 1600/month.

COOL Access Tier: This tier has slightly lower availability, as the data is moved to slightly less available hardware (a 99% SLA, versus 99.9% for HOT; for RA-GRS reads the figures are 99.9% and 99.99%). The storage cost is significantly lower than for the HOT tier, but there is a slightly higher access cost, and there are penalties for deleting data within 30 days.

The cost of data access in this case is roughly 55% of the storage cost, so it comes to 1700/month.

This tier is best suited to storing data that is large but not required frequently. The best example of this would be backups.
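To make the 30-day deletion penalty on the COOL tier concrete, here is a minimal sketch of how such a charge could be estimated. It assumes the penalty is prorated, that is, you pay for the days remaining out of the 30, and it assumes a 30-day billing month. The function name and the price parameter are illustrative, not part of any Azure API.

```python
COOL_MINIMUM_DAYS = 30  # minimum retention period for the COOL tier


def early_deletion_charge(days_stored, size_gb, cool_price_per_gb_month):
    """Estimate the early-deletion charge for one blob in the COOL tier.

    Assumption: the charge equals the storage cost for the days still
    owed on the 30-day minimum, using a 30-day month.
    """
    if days_stored < 0:
        raise ValueError("days_stored must be non-negative")
    # Days the blob still "owes" before the minimum period is met.
    remaining_days = max(COOL_MINIMUM_DAYS - days_stored, 0)
    return size_gb * cool_price_per_gb_month * remaining_days / COOL_MINIMUM_DAYS
```

Under these assumptions, a blob deleted after 21 days is charged for the remaining 9 days of COOL storage, while a blob kept for 30 days or more incurs no penalty.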

ARCHIVE Access Tier: This tier offers the best cost effectiveness for long-term backups, essentially files that might never be accessed. Typical examples are old full backups taken at the beginning of the year, old system images, and archived documents and images. This tier offers really cheap storage in the long term.

However, the access cost for this tier is prohibitive in the short term.

As you can see, the low cost of storage is offset by the cost of access, which is three times that of the COOL tier. In addition, the data is not readily accessible, since it is kept offline. In other words, if you want to read data stored in the ARCHIVE access tier, you first need to switch it from ARCHIVE to the HOT access tier to bring it online, and only then can you access the data. This operation can take in excess of 10 hours.
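The switch from ARCHIVE back to an online tier can be sketched in Python as follows. This is a rough outline, not a definitive implementation: it assumes `blob_client` behaves like `BlobClient` from the `azure-storage-blob` (v12) package, whose `set_standard_blob_tier` and `get_blob_properties` methods are used here. The function name, polling interval, and retry limit are illustrative choices.

```python
import time


def rehydrate_blob(blob_client, target_tier="Hot", poll_seconds=600, max_polls=100):
    """Move an archived blob to an online tier and wait until it is readable.

    Assumption: blob_client behaves like azure.storage.blob.BlobClient.
    Returns True once the blob is online, False if polling gives up first.
    """
    # Requesting a tier change starts rehydration; the blob stays offline
    # until the move completes, which can take many hours.
    blob_client.set_standard_blob_tier(target_tier)

    for _ in range(max_polls):
        props = blob_client.get_blob_properties()
        # archive_status is set (for example "rehydrate-pending-to-hot")
        # while the move is in progress and is empty once it has finished.
        if props.blob_tier == target_tier and not props.archive_status:
            return True
        time.sleep(poll_seconds)
    return False
```

With the real SDK, the client would typically be created with something like `BlobClient.from_connection_string(conn_str, container_name, blob_name)`. The default polling settings give up after roughly 16 hours, in line with the long wait described above.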

So, as you can see, the storage tier you select has cost implications, and it's important to know when to use which. Here is a simple guide:

Tier    | Need instant access | Need infrequent access | Almost never accessed
HOT     | USE                 | USE                    |
COOL    |                     |                        | USE
ARCHIVE |                     |                        |

You might ask why I have not recommended ARCHIVE for any case. Here is why:

If a blob is moved to a warmer tier (archive->cool, archive->hot, or cool->hot), the operation is billed as a read from the source tier, and the read operation (per 10,000) and data retrieval (per GB) charges of the source tier apply.

The day you finally decide to use the data, you need to switch it from ARCHIVE to COOL, and this will incur read charges, which as you can see from the screenshot come to 4957/-. So the only way for ARCHIVE to break even on cost is to keep the data in it for at least a year before accessing it.
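The break-even reasoning above can be written as a small calculation. This is a simplified sketch: it compares ARCHIVE only against COOL, assumes a single retrieval, and ignores early-deletion fees. The function name is illustrative, and the two monthly storage figures in the example are placeholders rather than real prices; only 4957 comes from the text above.

```python
import math


def break_even_months(retrieval_cost, cool_monthly_cost, archive_monthly_cost):
    """Months of ARCHIVE storage needed before one retrieval pays for itself.

    Returns None if ARCHIVE storage is not cheaper than COOL storage.
    """
    monthly_saving = cool_monthly_cost - archive_monthly_cost
    if monthly_saving <= 0:
        return None  # ARCHIVE never breaks even
    # Round up: a partial month of saving does not yet cover the retrieval.
    return math.ceil(retrieval_cost / monthly_saving)


# 4957 is the read charge quoted above; the monthly storage costs are
# hypothetical placeholders chosen only to illustrate the arithmetic.
months = break_even_months(4957, cool_monthly_cost=800, archive_monthly_cost=380)
print(months)  # prints 12
```

If you expect to read the data sooner than the number of months this returns, COOL is the cheaper choice under these assumptions.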

References

https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-storage-tiers