What is Azure Blob Storage?
Azure Blob Storage is a scalable cloud storage service designed to store huge amounts of unstructured data. It uses a flat namespace and is the go-to storage for massive volumes of such data.
When to use Azure Blob Storage?
Azure Blob Storage is the recommended storage for general-purpose workloads. You can store any type of media data for streaming, text or binary data from applications, logs and backups. It fully supports analytical workloads (batch and real-time data, machine learning data, IoT, etc.), but Azure Data Lake Storage Gen2 is recommended for these use cases.
When to consider Azure Blob Storage vs. other storage services?
Azure Blob Storage is a perfect fit if you have one or more of the following needs:
- You have unstructured files like images, video, audio, documents, logs or backup data
- You want general-purpose storage
- You need low-cost storage yet want to make sure the data is highly available (possibly across multiple geographic regions)
When not to use Azure Blob Storage?
- You want to use the data for analytical purposes and need to guarantee that it can be used for parallelized analytics (in that case look into Azure Data Lake Storage Gen2)
- You need to store relational data (in that case look into Azure SQL Database, Azure Database for PostgreSQL or MySQL, or Azure SQL Data Warehouse)
- You need to perform advanced real-time querying (in that case look into Azure Cosmos DB)
What is the difference between Azure Blob Storage and Azure Data Lake Storage?
Azure Data Lake Storage Gen2 is built on top of Azure Blob Storage and thus offers the same benefits plus additional functionality.
We recommend Azure Data Lake Storage Gen2 if analytics is your most important need. Azure Blob Storage is a flat-namespace storage in which users can create virtual directories, while Azure Data Lake Storage Gen2 has hierarchical namespace functionality built in. Renaming a folder in Data Lake Gen2 thus becomes a simple atomic operation, avoiding the need to iterate over all the files in it.
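To make the flat versus hierarchical distinction concrete, here is a minimal in-memory sketch. It is a toy model, not the Azure API: the dictionaries and the function names `rename_virtual_directory` and `rename_directory` are hypothetical and only illustrate why a folder rename touches every file in a flat namespace but only one entry in a hierarchical one.

```python
# Flat namespace: every blob is a single key; "folders" are only name prefixes.
flat_store = {
    "logs/2023/a.txt": b"a",
    "logs/2023/b.txt": b"b",
    "images/cat.png": b"c",
}


def rename_virtual_directory(store, old_prefix, new_prefix):
    """Rename a virtual directory by rewriting every blob name under the prefix."""
    touched = 0
    for name in list(store):  # the full listing has to be scanned
        if name.startswith(old_prefix):
            store[new_prefix + name[len(old_prefix):]] = store.pop(name)
            touched += 1
    return touched


# Hierarchical namespace: directories are real nodes that contain their children.
hier_store = {
    "logs": {"2023": {"a.txt": b"a", "b.txt": b"b"}},
    "images": {"cat.png": b"c"},
}


def rename_directory(tree, old_name, new_name):
    """Rename a directory by re-pointing one entry in its parent node."""
    tree[new_name] = tree.pop(old_name)
    return 1


flat_ops = rename_virtual_directory(flat_store, "logs/", "archive/")
hier_ops = rename_directory(hier_store, "logs", "archive")
print(f"flat namespace touched {flat_ops} blobs, hierarchical touched {hier_ops} entry")
```

In the flat model the work grows with the number of files under the prefix; in the hierarchical model it stays constant, which is the behaviour described above for Data Lake Gen2.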
What about the cost?
The pricing tier of the blob storage can be chosen based on how frequently the data is accessed:
- If you need storage for data that is rarely accessed, the Archive storage tier is the best option. Bear in mind that while the storage cost is reduced, the cost of accessing the data is higher.
- Data that is infrequently accessed should be stored in Blob storage with the Cool storage tier. Storage costs are lower than for Hot or Premium storage, but the availability of the data is also lower.
- Frequently accessed data can be stored in Hot storage or Premium storage (preview).
The storage costs for Hot storage are higher than for the Cool and Archive tiers, while the costs for access are much lower. We recommend Hot storage as a solid default when using Azure Blob Storage as a Data Lake.
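The trade-off between storage cost and access cost can be sketched as a small calculation. The prices below are made-up placeholders, not Azure's actual rates; only their ordering follows the text (cheaper storage comes with more expensive access). The names `Tier`, `monthly_cost` and `cheapest_tier` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    storage_per_gb: float  # placeholder monthly price per GB stored
    read_per_gb: float     # placeholder price per GB read


# Placeholder numbers for illustration only; check the official pricing page.
TIERS = [
    Tier("Hot", 0.020, 0.001),
    Tier("Cool", 0.010, 0.010),
    Tier("Archive", 0.002, 0.050),
]


def monthly_cost(tier, stored_gb, read_gb):
    """Total monthly cost: what you pay to keep the data plus what you pay to read it."""
    return tier.storage_per_gb * stored_gb + tier.read_per_gb * read_gb


def cheapest_tier(stored_gb, read_gb):
    """Pick the tier with the lowest total cost for a given access pattern."""
    return min(TIERS, key=lambda t: monthly_cost(t, stored_gb, read_gb))


for read_gb in (0, 500, 5000):
    best = cheapest_tier(stored_gb=1000, read_gb=read_gb)
    print(f"1000 GB stored, {read_gb} GB read per month: {best.name}")
```

With these placeholder rates, the cheapest option moves from Archive to Cool to Hot as the monthly read volume grows, which mirrors the guidance above.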
How can you interact with Azure Blob Storage?
You can interact with Blob Storage through any of the following:
- AzCopy – a command-line interface to be downloaded locally
- Azure Data Factory
- Azure Databricks
- Azure SDKs (.NET, Java, Python, etc.) – allowing you to interact with Azure Storage directly from within Python or R
- Azure Data Box Disk
- Azure Import/Export service
For simple operations you can also leverage the portal interface (‘Azure Explorer’) or the Azure Storage Explorer (which you can install on your laptop).
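As an illustration of the SDK route, the sketch below uploads a local file and lists blobs under a prefix with the `azure-storage-blob` Python package. It assumes the package is installed and that you have a connection string for your storage account; the helper names `blob_name` and `upload_and_list`, and all arguments passed to them, are hypothetical.

```python
import os


def blob_name(*parts):
    """Join parts into a blob name; '/' only creates a virtual directory."""
    return "/".join(p.strip("/") for p in parts if p)


def upload_and_list(connection_string, container_name, local_path, prefix):
    """Upload one local file under a prefix and return the blob names under that prefix."""
    # Imported lazily so that blob_name() works even without the SDK installed.
    from azure.storage.blob import BlobServiceClient

    service = BlobServiceClient.from_connection_string(connection_string)
    container = service.get_container_client(container_name)

    name = blob_name(prefix, os.path.basename(local_path))
    with open(local_path, "rb") as data:
        # overwrite=True replaces an existing blob with the same name.
        container.upload_blob(name=name, data=data, overwrite=True)

    # Listing by prefix is how a virtual directory is browsed in a flat namespace.
    return [blob.name for blob in container.list_blobs(name_starts_with=prefix)]


print(blob_name("logs/", "2023", "app.log"))
```

A call such as `upload_and_list(conn_str, "my-container", "app.log", "logs/2023")` would need a real storage account, so it is not executed here.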
element61 has used Azure Blob Storage in various types of projects. We are Data Platform experts, and with our knowledge of Azure we can help your organization decide which cloud storage solution best fits your needs.
Azure Blob Storage is a highly scalable object storage for unstructured data like images, videos, audio, documents, etc. It can store massive amounts of data with high availability and make them accessible from anywhere in the world.
More information is available at the Microsoft website.
Contact us for more information on Azure Blob Storage!