What is Azure Stream Analytics?
Azure Stream Analytics is a fully managed, serverless engine from Microsoft for real-time analytics. It lets you analyze multiple streams of data at once, from sources such as sensors, web data sources, social media and other applications.
When to use Azure Stream Analytics?
If you have incoming live streaming data that you simply want to store, report on with Power BI, or transform to gain insights, then Azure Stream Analytics might be the service you are looking for. It is a good fit if you want a fully managed service where you don’t have to worry about any infrastructure setup and you pay only for what you use.
Azure Stream Analytics use cases:
- real-time dashboarding with Power BI (monitoring purposes)
- store streaming data to make it available to other cloud services for further analysis, logging, reporting, etc.
- transform and analyze data in real-time
- trigger workflows on certain conditions (e.g. run Azure Functions from a Stream Analytics job)
- send alerts
- make decisions in real time
- machine learning (e.g. risk analysis, predictive maintenance, fraud detection, trend prediction), although its usefulness for more advanced analytics is limited
Azure Stream Analytics can be used if the input data is in AVRO, JSON or CSV format and the application logic can be expressed in a SQL-like query language. Programming an Azure Stream Analytics job is entirely declarative, so you don’t need to be an expert programmer.
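To make the format requirement a little more concrete, here is a minimal Python sketch of the same two sensor events serialized as line-delimited JSON and as CSV. The field names (`deviceId`, `temperature`) and the payloads are made up for illustration, and the code only uses the Python standard library; it is not how Stream Analytics itself deserializes input.

```python
import csv
import io
import json

# Hypothetical payloads: the same two events in two of the supported formats.
JSON_PAYLOAD = (
    '{"deviceId": "sensor-1", "temperature": 21.5}\n'
    '{"deviceId": "sensor-2", "temperature": 35.0}'
)
CSV_PAYLOAD = "deviceId,temperature\nsensor-1,21.5\nsensor-2,35.0\n"


def parse_json_lines(payload):
    """Parse one JSON object per non-empty line."""
    return [json.loads(line) for line in payload.splitlines() if line.strip()]


def parse_csv(payload):
    """Parse CSV with a header row; CSV has no types, so cast explicitly."""
    rows = csv.DictReader(io.StringIO(payload))
    return [
        {"deviceId": row["deviceId"], "temperature": float(row["temperature"])}
        for row in rows
    ]


json_events = parse_json_lines(JSON_PAYLOAD)
csv_events = parse_csv(CSV_PAYLOAD)
```

Both parsers produce the same list of records. The main practical difference the sketch highlights is that CSV values arrive as text and need an explicit type conversion, while JSON already distinguishes numbers from strings.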
Alternatives for stream processing use cases: Azure Functions, HDInsight with Spark Streaming or Storm, Apache Spark in Azure Databricks.
How to get started with Azure Stream Analytics?
You need an Azure subscription to get started with Azure Stream Analytics. A job can be set up in a few minutes through the Azure portal, PowerShell or Visual Studio. To run real-time analytics, you create an Azure Stream Analytics job.
An Azure Stream Analytics job is defined by:
- Input source of streaming data
- Query in a SQL-like language to transform data
- Output sink for the results of the data transformations
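The three parts of a job listed above can be pictured with a small Python sketch. This is only an analogy under stated assumptions: `run_job`, `hot_readings` and the sample events are hypothetical names invented here, and a real job expresses the query step in the SQL-like language rather than in Python.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, object]


def run_job(
    source: Iterable[Event],
    query: Callable[[Iterable[Event]], Iterable[Event]],
    sink: Callable[[Event], None],
) -> None:
    """Wire an input source, a transformation and an output sink together."""
    for result in query(source):
        sink(result)


def hot_readings(events: Iterable[Event]) -> Iterable[Event]:
    # Roughly what an illustrative query such as
    #   SELECT deviceId, temperature FROM input WHERE temperature > 30
    # would express declaratively.
    for event in events:
        if event["temperature"] > 30:
            yield {"deviceId": event["deviceId"], "temperature": event["temperature"]}


# Hypothetical input stream (in practice: an event source such as IoT Hub).
readings: List[Event] = [
    {"deviceId": "sensor-1", "temperature": 21.5, "humidity": 40},
    {"deviceId": "sensor-2", "temperature": 35.0, "humidity": 55},
    {"deviceId": "sensor-3", "temperature": 31.2, "humidity": 48},
]

# Hypothetical output sink: here just an in-memory list.
results: List[Event] = []
run_job(readings, hot_readings, results.append)
```

After the run, `results` holds only the two readings above 30 degrees, reduced to the selected fields. Swapping the source, the query or the sink does not affect the other two parts, which is the point of defining a job in these three pieces.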
Key features:
- You can combine data coming from multiple streams
- You can use declarative SQL-based queries for data transformations
- You can stream the data to real-time dashboards with Power BI
- You can integrate with Azure IoT Hub
- You only pay for streaming units used
- You don’t need to handle infrastructure
- You will automatically benefit from writing different partitions in parallel (increased throughput)
- Your jobs can be visually monitored
- You have recovery capabilities
- You can perform operations on data in temporal windows such as tumbling, hopping, sliding and session windows
- You have built-in geospatial functions
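The temporal windows mentioned in the list above can be illustrated with a short Python sketch of the two simplest kinds, tumbling (fixed, non-overlapping) and hopping (fixed size, overlapping). The function names are hypothetical, and the sketch assumes start-inclusive, end-exclusive windows aligned to time zero as a simplification; the boundary convention used by the service may differ.

```python
from collections import defaultdict


def tumbling_window_counts(events, window_seconds):
    """Count events per (window_start, key) in non-overlapping windows.

    `events` is an iterable of (timestamp_seconds, key) pairs.
    """
    counts = defaultdict(int)
    for timestamp, key in events:
        # Each event falls into exactly one tumbling window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


def hopping_window_starts(timestamp, window_seconds, hop_seconds):
    """Return the start times of every hopping window containing `timestamp`.

    Windows start at multiples of `hop_seconds` (never before 0) and span
    `window_seconds`, so one event can belong to several windows.
    """
    starts = []
    start = (timestamp // hop_seconds) * hop_seconds
    while start > timestamp - window_seconds and start >= 0:
        starts.append(start)
        start -= hop_seconds
    return sorted(starts)


# Hypothetical events: (timestamp in seconds, device key).
sample_events = [(0, "a"), (4, "a"), (9, "b"), (10, "a"), (19, "b")]
per_window = tumbling_window_counts(sample_events, window_seconds=10)
overlapping = hopping_window_starts(12, window_seconds=10, hop_seconds=5)
```

With 10-second tumbling windows, the events at 0, 4 and 9 seconds land in the window starting at 0, and those at 10 and 19 seconds in the window starting at 10. With a 10-second window hopping every 5 seconds, the event at second 12 belongs to the windows starting at 5 and at 10.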
Limitations:
- It only supports SQL (you are limited to transformations that can be expressed in SQL)
- Your input data needs to be AVRO, JSON or CSV
- You can only use blob storage to add static data
- You can only integrate with Azure services
- Dynamic reference data joins are not supported
- There is no automatic scaling (you scale the job manually in the Azure portal)
Alternatives for Streaming Analytics
- Apache Kafka Streams
Kafka is an open-source product that can run on Azure through HDInsight. It has real-time stream processing functionality (Kafka Streams), yet this only works if you use Kafka itself as the event broker (instead of, for example, Azure Event Hubs). Additionally, because it is open source, you are responsible for configuration and maintenance.
- Azure Functions
Azure Functions is a serverless PaaS service within Azure that lets users write functions in Python, .NET or JavaScript. It scales automatically and is designed for high availability. With Python, you can apply a broad set of transformations.
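As a hedged illustration of the kind of per-event transformation you might write in Python, here is a plain function that converts a temperature and adds a flag. It deliberately leaves out the Azure Functions bindings and trigger configuration; `transform_event`, the field names and the 30 degree threshold are assumptions made for this example.

```python
import json


def transform_event(raw: str) -> str:
    """Convert a JSON event from Fahrenheit to Celsius and flag overheating."""
    event = json.loads(raw)
    celsius = round((event["temperatureF"] - 32) * 5 / 9, 1)
    return json.dumps(
        {
            "deviceId": event["deviceId"],
            "temperatureC": celsius,
            # Hypothetical threshold, chosen only for the example.
            "overheated": celsius > 30.0,
        }
    )


# Hypothetical incoming messages.
boiling = json.loads(transform_event('{"deviceId": "d1", "temperatureF": 212}'))
mild = json.loads(transform_event('{"deviceId": "d2", "temperatureF": 50}'))
```

In an actual function app, a function like this would be called from the handler that receives each message. Arbitrary logic of this kind is where a general-purpose language offers more freedom than a purely SQL-based job.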
Our expertise
element61 has worked with Azure Stream Analytics on many real-time analytics projects. We have a solid understanding of when Azure Stream Analytics is the best solution for your organization’s real-time data analytics, and we can help you design an end-to-end solution.
Conclusion
Azure Stream Analytics is a fully managed, serverless engine for performing real-time analytics on many different real-time data streams such as sensors, web sources, IoT devices, etc. It offers an easy-to-use UI for building real-time data streams, without the need to worry about setting up clusters, networking, security, etc. It’s a great engine to get started with real-time analytics in the cloud.
More information is available at the Microsoft website.
Contact us for more information on Azure Stream Analytics!