Process Data from Event Hubs Leveraging Azure Stream Analytics
Event Hubs is a streaming and event-ingestion service on the Microsoft Azure platform capable of receiving and processing millions of events per second. The service allows you to process and analyze the massive amounts of data produced by your connected devices and applications.
By now, you may already be familiar with Event Hubs from the other tips. With Event Hubs, you can ingest large volumes of events and stream them to Azure Stream Analytics, an Azure service that can quickly ingest, process, and perform real-time analysis on data.
To connect Stream Analytics to your Event Hub, navigate in the Azure Portal to the Event Hub inside your namespace. Next, select Process data and choose the Enable real-time insights from events tile.
You will then see a query pane appear, with your Event Hub as the input and an alias as the output, which represents the results of the query. To test the query, you need data in your Event Hub; you can use an event producer, as explained in the tip on sending data to Event Hubs. Select Create in the Input preview pane, as shown in the image below, and you will see a snapshot of your data. The serialization format of your data (CSV or JSON) is detected automatically.
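If you do not yet have an event producer, a minimal sketch in Python could look like the following. It assumes the azure-eventhub package (pip install azure-eventhub); the connection string, hub name, and the telemetry schema (deviceId, temperature, timestamp) are all placeholders for illustration, not values from this tip.

```python
# Minimal sketch of an event producer for testing the Stream Analytics
# query pane. Assumes: pip install azure-eventhub.
import json
import random
import datetime

def make_reading(device_id: str) -> str:
    """Build one JSON-encoded telemetry event (hypothetical schema)."""
    return json.dumps({
        "deviceId": device_id,
        "temperature": round(random.uniform(18.0, 30.0), 1),
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    })

def send_sample_events(connection_string: str, eventhub_name: str,
                       count: int = 10) -> None:
    """Send a small batch of sample events to the Event Hub."""
    from azure.eventhub import EventHubProducerClient, EventData
    producer = EventHubProducerClient.from_connection_string(
        connection_string, eventhub_name=eventhub_name)
    with producer:
        batch = producer.create_batch()
        for i in range(count):
            # Rotate over a few fake device IDs so the preview shows variety.
            batch.add(EventData(make_reading(f"device-{i % 3}")))
        producer.send_batch(batch)
```

Once a batch like this has been sent, the Input preview pane should be able to detect and display the JSON events.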
Note that you can switch to another format only if the data allows it; e.g., if your event data is in JSON format, you cannot visualize it as CSV. You can, however, switch to the raw format regardless of the format of your event data.
Finally, you can test the default query or a query you create with the Stream Analytics Query Language.
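As a minimal sketch, a query like the following computes a per-device average over a 30-second tumbling window; the input/output aliases and column names are hypothetical and should be replaced with your own:

```sql
-- Hypothetical aliases and columns; replace with your own.
SELECT
    deviceId,
    AVG(temperature) AS avgTemperature,
    System.Timestamp() AS windowEnd
INTO
    [output-alias]
FROM
    [eventhub-input] TIMESTAMP BY EventEnqueuedUtcTime
GROUP BY
    deviceId,
    TumblingWindow(second, 30)
```

TIMESTAMP BY tells Stream Analytics which field to use as the event time; EventEnqueuedUtcTime is the arrival time that Event Hubs records for each event.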
Once you are satisfied with your query, you can deploy it, which opens a new pane on the right-hand side. Specify the job name and other details, and click Create. The Stream Analytics job will be created for you, and you can modify it afterwards by adding one or more outputs.
Through the Event Hubs namespace, you can quickly define an event data pipeline with Stream Analytics for each Event Hub and query data in near real time for further analysis; for example, you can push the output of your query to a storage account for further processing.
Note that a Stream Analytics job uses three streaming units (SUs) by default, which you can scale up if necessary.