By Sam Debruyn
Welcome to the fifth part of a 5-part series on an end-to-end use case for Microsoft Fabric. This post will focus on the analytics engineering part of the use case.
In this series, we will explore how to use Microsoft Fabric to ingest, transform, and analyze data using a real-world use case. The series focuses on data engineering and analytics engineering. We will be using OneLake, Notebooks, Lakehouse, SQL Endpoints, Data Pipelines, dbt, and Power BI.
All posts in this series
This post is part of a 5-part series on an end-to-end use case for Microsoft Fabric:
- Fabric end-to-end use case: overview & architecture
- Fabric end-to-end use case: Data Engineering part 1 - Spark and Pandas in Notebooks
- Fabric end-to-end use case: Data Engineering part 2 - Pipelines
- Fabric end-to-end use case: Analytics Engineering part 1 - dbt with the Lakehouse
- Fabric end-to-end use case: Analytics Engineering part 2 - Reports
Use case introduction: the European energy market
If you’re following this series, feel free to skip this section as it’s the same introduction every time. 🙃
Since Russia invaded Ukraine, energy prices across Europe have increased significantly. This has led to a surge of alternative and green energy sources, such as solar and wind. However, these sources are not always available, and the energy grid needs to be able to handle the load.
Therefore, most European energy markets are converging towards a model with dynamic energy prices. In a lot of European countries, you can already opt for a dynamic tariff where the price of electricity changes every hour. This brings challenges, but also lots of opportunities. By analyzing the prices, you can optimize your energy consumption and save money. The flexibility and options improve a lot with the installation of a home battery. With some contracts, you could even earn money by selling your energy back to the grid at peak times or when the price is negative.
In this use case, we will be ingesting Epex Spot (European Energy Exchange) day-ahead energy pricing data. Energy companies buy and sell energy on this exchange. The price of energy is announced one day in advance. The price can even become negative when there will be too much energy on the grid (e.g. it’s sunnier and windier than expected and some energy plants cannot easily scale down capacity).
Since it’s quite a lot of content with more than 1 hour of total reading time, I’ve split it up into 5 easily digestible parts.
We want to ingest this data and store it in OneLake. At one point, we could combine this data with weather forecasts to train a machine learning model to predict the energy price.
After ingestion, we will transform the data and model it for dashboarding. In the dashboard, we will have simple advice on how to optimize your energy consumption and save money by smartly using a home battery.
All data is publicly available, so you can follow along in your own Fabric Workspace.
I am by far not the best at creating dashboards and reports. I know how to get something going and visualize what I want to see, but there are lots of people more experienced at this than I am. Nevertheless, to showcase how easy it is to create a Report with the Power BI experience in Fabric, I will create a paginated report to visualize our electricity pricing data. This is an excellent way to conclude our end-to-end use case through Fabric.
I've put this under the umbrella of analytics engineering as typically an analytics engineer will create the visualizations that go with the data marts. Of course, this is not set in stone and you might find different kinds of roles doing this task across different organizations.
Creating a report
Let's start by creating a new report. We can do this by clicking the + New report button in the top left corner of the Power BI experience in Fabric. Fabric then asks us to select a dataset to use for the report. Every Lakehouse on Fabric is automatically exposed as a dataset as well, so just look for the name of your Lakehouse from the previous posts and select it. Fabric can even automatically suggest a report based on your data. For now, we will just create a blank report. You can find the button to create a blank report under the chevron 🔽 button in the bottom right corner.
Our report is going to have multiple pages and on the first page, we're going to visualize the pricing data for Europe for the current month.
Our first visualization is going to be a map with circles indicating the average monthly electricity price for that country. This should give us a helicopter view of the pricing data for Europe. Click the Map 🌍 icon in the Visualizations pane to create a map. Then, on the Data pane, expand the
mrt_avg_price_per_month data mart and drag the
country column to the Location field that popped up in the Visualizations pane. Drag the
month_name_short to Legend. You can use the chevron 🔽 on the right to select Rename for this visual and type month to make sure users just see month there instead of the actual column name. Drag the
avg_month_price to Bubble size. Click on the chevron 🔽 next to the
avg_month_size you just dragged and make sure the Average aggregation is selected.
Next, switch the tab on top of the Visualizations pane from the Build visual to the Format visual tab. New tabs appear under a search bar. In the first tab, Visual, you can change the style of the map under Map settings. I picked Light. Under General and Title, you can change the title of the visualization to something like Average price per country per month.
This should give you a map that looks like this:
I'm happy with that result for now, so let's move on to the next visualization.
Average daily price per market in a line chart
The map visualization is great at giving us an impression of how electricity pricing in different countries compares to each other, but it doesn't tell us anything about the price fluctuations, how prices evolve over time, or how prices between different markets are related to each other.
In the next visualization, we're going to use a Line chart to visualize the pricing evolution on a daily basis per market. Click the Line chart icon 📈 to put a line chart on the canvas. Then, open the
mrt_avg_price_per_day data mart under Data and drag the
date column to the X-axis,
avg_day_price to Y-axis, and
market to Legend. Make sure the selected aggregation for the
avg_day_price is Sum.
Then switch the Visualization pane from the Build visual to the Format visual tab. Click on the General tab that just appeared and fill in an appropriate title like Average daily price by market.
This should give you a line chart that looks like this:
Resize and drag the visualizations to fill the page and you end up with a first page on the report that looks like this:
Right away, we notice that most prices follow the same trends, but that the Scandinavian countries have significantly lower prices than the rest of Europe.
Details for a specific market with a multi-row card
Let's add a second page to the report that shows us the details for a specific market. We can do this by clicking the ➕ sign at the bottom of the canvas. By right-clicking on the pages, you can also rename them. I renamed the first page to Europe - month and the second page to Belgium - day.
I would like to get a quick overview of the prices and for that, I chose a Multi-row card. So click on the icon for a Multi-row card to add it to the canvas. Then open the
mrt_simple_advice view under Data and drag the same
price_cent_kwh column 4 times to the Fields section on the Visualizations pane. Then, use the chevrons 🔽 to set the aggregations to Minimum, Average, Median, and Maximum respectively and rename the fields to Lowest price (cent/kWh), Average price (cent/kWh), Median price (cent/kWh), and Highest price (cent/kWh).
The problem we have right now is that this visualization shows us the prices for all markets and all days. We only want to see the prices for Belgium (or for your market if applicable). Therefore, expand the Filters pane to the left of the Visualizations pane and drag the
date and the
market columns to the Filters on this page section. For the
date, you can set the Filter type to Relative date and set the 2 dropdowns to is in this and day. For the
market, you can check the Require single selection checkbox and select your market.
Displaying advice using a table
For our last visualization, we're using a Table to display the advice itself. Click the Table icon to add a table icon to the canvas and drag the columns
simple_advice to the Columns. Use the chevrons 🔽 to rename the columns to start time, price (cent/kWh), and simple consumption advice. Also, make sure to set the aggregation for the
price_cent_kwh column to Don't summarize.
We then end up with our second page looking like this:
Finish your work by clicking Save report from the File menu and open the report in Reading view to see the final result.
This concludes our end-to-end use case for Microsoft Fabric. We've seen how to ingest and transform data, model it into data marts using dbt, and visualize it using Power BI. I have a special bonus prepared to showcase an interesting use case for the Data Activator in Fabric, but since that is still in private preview, I will wait until it is publicly available before publishing that post.
I know you could do a lot more with this data, the modeling, and the visualizations, but I hope this gives you a good impression of how easy it is to get started with Fabric and how you can use it to build a data platform that is easy to use for both data engineers and data analysts.