Microsoft Azure Blob Storage
What is Azure Blob Storage?
Azure Blob Storage is a scalable cloud storage service designed to hold huge amounts of unstructured data, which makes it the go-to storage for unstructured data on Azure.
When to use Azure Blob Storage?
The ideal use case for Azure Blob Storage is to act as an unstructured Data Lake for analytics. A Data Lake should (1) store any type of data and (2) enable parallelized analytics.
Because Azure Blob Storage can store any type of data (documents, CSVs, JSON files, anything) and is exposed to Hadoop through an HDFS-compatible file-system interface (see https://blogs.msdn.microsoft.com/cindygross/2015/02/04/understanding-wasb-and-hadoop-storage-in-azure/), it is a perfect fit.
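Hadoop-compatible tools usually address blobs through the WASB driver discussed in the linked post, using a URI that names the container and the storage account. The snippet below is a minimal sketch of how such a URI is put together; the helper name `wasb_uri` and the account, container and path values are hypothetical and only serve as illustration.

```python
def wasb_uri(account: str, container: str, path: str = "", secure: bool = True) -> str:
    """Build a WASB URI of the form wasb[s]://<container>@<account>.blob.core.windows.net/<path>."""
    # "wasbs" is the TLS-secured variant of the "wasb" scheme.
    scheme = "wasbs" if secure else "wasb"
    # Strip leading slashes so the path is not separated by a double slash.
    return f"{scheme}://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


# Hypothetical storage account "mylake" with a container called "raw".
example = wasb_uri("mylake", "raw", "/sales/2019/orders.csv")
```

An analytics engine with the WASB driver configured can then read from a location like this in much the same way as it would read from an `hdfs://` path.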
When to consider Azure Blob Storage vs. other storage services?
Azure Blob Storage is a perfect fit if you have one or more of the following needs:
- You have unstructured files like images, video, audio, documents, logs, backup data
- You want to have easily accessible data (accessible through multiple tools and API interfaces)
- You need low-cost storage yet want to make sure the data is highly available (possibly across multiple geographic regions)
- You want to use this data for analytics and therefore need it to be available for parallelized analytics
When not to use Azure Blob Storage?
- You need to store relational data (in that case look into Azure SQL Database, Azure Database for PostgreSQL and MySQL, or Azure SQL Data Warehouse)
- You need to perform advanced real-time querying (in that case look into Azure Cosmos DB)
What is the difference between Azure Blob Storage and Azure Data Lake Store?
Azure Data Lake Store Gen2 is built on top of Azure Blob Storage and thus offers the same benefits plus additional functionality.
We recommend using Azure Data Lake Store Gen2 if analytics is your most important need. Azure Data Lake Store Gen2 differs from plain Azure Blob Storage in that it embeds hierarchical namespace functionality within the product, rather than asking users to provide it in their own compute environment. Renaming a folder thus becomes a simple operation in Data Lake Store.
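To see why a hierarchical namespace makes a folder rename simple, the toy model below compares the two layouts in plain Python. It is a sketch under simplifying assumptions, not a description of how the services are implemented: the flat store is modelled as a dictionary keyed by full blob name, the hierarchical store as nested dictionaries, and both function names are hypothetical.

```python
def rename_flat(blobs: dict, old_prefix: str, new_prefix: str) -> int:
    """Flat namespace: a 'folder' is only a name prefix, so every blob under it is moved."""
    affected = [name for name in blobs if name.startswith(old_prefix)]
    for name in affected:
        # Each blob is re-keyed individually (in a real store: copy, then delete).
        blobs[new_prefix + name[len(old_prefix):]] = blobs.pop(name)
    return len(affected)  # number of per-blob operations


def rename_hierarchical(tree: dict, old_name: str, new_name: str) -> int:
    """Hierarchical namespace: the folder is a real node, so only one entry changes."""
    tree[new_name] = tree.pop(old_name)
    return 1  # a single metadata operation, independent of folder size


flat = {"raw/a.csv": b"1", "raw/b.csv": b"2", "other/c.csv": b"3"}
tree = {"raw": {"a.csv": b"1", "b.csv": b"2"}, "other": {"c.csv": b"3"}}

flat_ops = rename_flat(flat, "raw/", "staged/")
tree_ops = rename_hierarchical(tree, "raw", "staged")
```

In the flat model the work grows with the number of blobs under the prefix, while in the hierarchical model it stays at one operation.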
What about the cost?
The pricing tier of the blob storage can be chosen based on how frequently the data in it is accessed:
- Data that is rarely accessed is best stored in the Archive storage tier. Bear in mind that while the storage cost is reduced, the cost of accessing the data is higher.
- Data that is infrequently accessed should be stored in the Cool storage tier. Storage costs are lower than for Hot or Premium storage, but the availability of the data is lower as well.
- Frequently accessed data can be stored in Hot storage or Premium storage (preview).
The storage costs for Hot storage are higher than for the Cool and Archive tiers, while the costs for access are much lower. We recommend Hot storage as a solid default when using Azure Blob Storage as a Data Lake.
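The tier guidance above can be written down as a small decision helper. This is only a sketch: the function name `suggest_tier` and both thresholds are hypothetical values chosen for illustration, not figures published by Microsoft, and a real decision should be based on the current price list and your own access patterns.

```python
# Hypothetical thresholds, expressed as expected reads per blob per month.
RARE_READS_PER_MONTH = 0.1        # at or below this, treat the data as rarely accessed
INFREQUENT_READS_PER_MONTH = 1.0  # at or below this, treat the data as infrequently accessed


def suggest_tier(reads_per_month: float) -> str:
    """Map an expected access frequency to a storage tier name."""
    if reads_per_month < 0:
        raise ValueError("reads_per_month must not be negative")
    if reads_per_month <= RARE_READS_PER_MONTH:
        return "Archive"  # cheapest storage, most expensive access
    if reads_per_month <= INFREQUENT_READS_PER_MONTH:
        return "Cool"     # cheaper storage than Hot, lower availability
    return "Hot"          # highest storage cost of the three, cheapest access


tiers = [suggest_tier(reads) for reads in (0, 0.5, 30)]
```

For a Data Lake that is queried regularly, a helper like this lands on Hot, which matches the default recommended above.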
How can you interact with Azure Blob Storage?
You can interact with Blob Storage through any of the following:
- AzCopy – a command-line interface to be downloaded locally
- Azure Data Factory
- Azure SDKs (.NET, Java, Python etc.) – allowing you to interact with Azure Storage directly within Python or R
- Azure Data Box Disk
- Azure Import/Export service
For simple operations you can also leverage the portal interface (‘Azure Explorer’) or the Azure Storage Explorer (which you can install on your laptop).
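As an illustration of the SDK route, the sketch below uploads a small text blob and lists blob names from Python. It assumes the `azure-storage-blob` package (version 12) is installed and that you have a connection string for your storage account; the helper names and the container and blob names in the usage note are hypothetical.

```python
def connect(connection_string: str, container: str):
    """Return a container client for an existing container."""
    # Imported lazily so the helpers below can be used without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    return service.get_container_client(container)


def upload_text(container_client, blob_name: str, text: str) -> None:
    """Upload a string as a UTF-8 encoded blob, replacing any existing blob of that name."""
    container_client.upload_blob(name=blob_name, data=text.encode("utf-8"), overwrite=True)


def list_names(container_client, prefix: str = "") -> list:
    """Return the names of all blobs whose name starts with the given prefix."""
    blobs = container_client.list_blobs(name_starts_with=prefix or None)
    return [blob.name for blob in blobs]
```

With a real account this could be used as `client = connect(my_connection_string, "raw")`, followed by `upload_text(client, "logs/2019-01-01.txt", "hello")` and `list_names(client, "logs/")`.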
Some screenshots showing interactions with Azure Blob Storage:
element61 has used Azure Blob Storage in various types of projects. We are Data Platform experts, and with our knowledge of Azure we can help your organization when you are in doubt about which cloud storage solution best fits your needs.
Azure Blob Storage is a highly scalable object storage service for unstructured data like images, videos, audio, documents, etc. It can store massive amounts of data with high availability and make that data accessible from anywhere in the world.
More information is available at the Microsoft website.
Contact us for more information on Azure Blob Storage!