Microsoft PowerPivot: Microsoft’s answer to self serviced BI ?

PowerPivot, then still codenamed Project Gemini, was announced at the second annual Microsoft BI Conference in Seattle on October 6, 2008. The reason behind the pre-announced PowerPivot months earlier than it might normally have been done by Microsoft, was thanks to the competitive pressure from successful smaller, new vendors like QlikTech, as well as the recent acquisition of in-memory products by IBM (Applix TM1), Oracle (TimesTen) and SAP (BI Accelerator). PowerPivot is one component of the broader Business Intelligence & Data Warehousing Roadmap at Microsoft. In this insight, we take a closer look at the solution and what it can bring to the Business Intelligence user.

What is self serviced BI with Microsoft?

Building on its backend data warehousing strengths and in trying to address the remaining BI front-end weaknesses, Microsoft wants to give more power for creating report models to final business end-users. Microsoft's promise is to make it possible for business users to collect data from several sources, link them together on their desktop and create personal, self-serviced Business Intelligence, including additional calculations. Given the success of Microsoft Excel as a personal productivity tool, report creation on top of this data ideally can be done in the same favourite tool: Microsoft Excel. This is the exact rationale of the newly released Microsoft PowerPivot.

PowerPivot consist of 2 components:

  • The PowerPivot client is an application that is closely linked to Excel, but isn't actually part of Excel, it's an add-in to Microsoft Excel. This has as an advantage that not all users will automatically get the add-in when installing MS Excel and that PowerPivot and MS Excel can have their own life cycles and versions. So PowerPivot can launch a new version without having to wait for a new version of Excel. There is however a small limitation for the first version of the add-in; it requires the new Excel 2010 to work.


    With the PowerPivot add-in, you will create the report model by adding the necessary table and relationships, load all information and also create reports. You could call this the "development application” for PowerPivot.
Figure 1 : Microsoft PowerPivot report modeling & development environment

 
  • PowerPivot server is an application that is added to the SharePoint 2010 environment. With the server component you are able to publish the model to Microsoft SharePoint. This way other users can use the existing model to create their own reports. When centrally stored it is also possible to refresh the data automatically.

    Next to the added value for the users there is also a big advantage for IT. When the models are saved within the SharePoint environment, it is possible to monitor their execution and use.

    But also without the PowerPivot component for SharePoint it is possible to publish the Excel workbook, resulting from a PowerPivot analysis, to the server, as you would with any other Excel file.
     

How can PowerPivot add value to the existing data warehousing environment ?

Is the launch of PowerPivot then a plea from Microsoft to use Excel to store and collect all business data and do all reporting on top of Excel-spreadsheets? That whilst, data warehousing professionals have been evangelising the corporate world that Excel is not the right place to store company data and have been convincing management to invest in a corporate data warehouse, instead of living the "Excel hell”.

So, did Microsoft turn their back to classical Data Warehousing ? No, most definitely not! PowerPivot is in no perspective a replacement for a corporate data warehouse. The data warehouse is not only a collection of data from several sources. It is a central storage of data that is cleansed, aligned, loaded according corporate definitions and the values residing in these warehouses are validated and approved as corporate history. No local Excel-file can replace the value this brings to an organisation.

Figure 2: Microsoft PowerPivot report

 
 

So what benefits does PowerPivot bring, also to existing data warehouse environments ?

One of the problems business users are often facing today in a conventional BI environment is the time needed to add new information to the data warehouse, the OLAP cubes or the relational models and thus to the resulting company reports. Business users need to contact their Business Intelligence Competence Centre (BICC) or IT department to make a request for change. When IT or the BICC then is available to finally document requirements and finalise a proposal for creating the models, the need for the model can be long gone. Adding new data sources to a corporate data warehouse is and remains a time and effort-consuming task. This is where PowerPivot could come in the picture. With this "Self Serviced” solution it is possible for business users to create models based upon data in the corporate data warehouse, and if needed this data can be complemented with data from other sources, potentially locally held. This way the business user himself can create a new report model when needed. Another advantage of working like this is that business users will have a better grasp of what information is available, and what they want to do with it. This will also assist in creating better requirements for future data warehouse projects or enhancements.

Another use case is when data is acquired only once for a specific analysis needed at that time. Let's say, the customer relations manager acquired the results of a customer survey, and he wants to combine this information with the actual sales data from these customers. This might not be the kind of information you want to permanently load in your data warehouse. For this kind of volatile information, likely not to be reused later, it is difficult –in the current Microsoft tool stack- to easily combine the dataset with actual data from the data warehouse. With PowerPivot it is perfectly possible to combine both sources and this without the help of IT. The business user creates the links between both sources and can add additional calculated measures, if needed. Finally, he is able to create a nice report and/or dashboard based on that data. When the resulting analysis needs to be available for multiple people, he can even publish his PowerPivot results to the SharePoint environment.

Another unaddressed requirement in existing Microsoft Business Intelligence environments is the need to take the information offline. With the current solutions, you could create an Excel report or PDF file from a Reporting Services report, but then the information becomes static. Or you could create an offline cube file in Excel, but when you already get around the limitations of such a cube file, you still are challenged with very slow performance. And this is where PowerPivot is at its best. You can load information from your current corporate DWH system into a fast, special In Memory cube engine, called Vertipaq. So when an off-line solution is needed you can take the subset of information needed, build reports on top of that information, and use that information anywhere and anyhow you want, at the speed as if it was online. Of course, information would need to be reloaded/refreshed when new data becomes available online, but the model and reports can stay the same.
 

Warning: Danger !

Of course, giving this kind of freedom to the business user community also comes with a big danger. Business user creating their own reports, based on data that comes from ‘somewhere' can lead to believe that data warehouses are not needed anymore, and everyone starts making their own ‘versions of the truth' (again?). This could be taking us back to the pre data warehouse era where everyone had a different value for a company metric or measure and data quality was doubtable. This is not the road you want to take, also not when considering PowerPivot! The data warehouse still needs to remain the central data storage where as close to 100% as possible of all data needs to be stored that is used in corporate reporting. It can act as a perfect source for PowerPivot as it provides clean data and aligned metadata and master data, providing correct and user-friendly input to PowerPivot reports.

Therefore it will be important to develop a good policy on how PowerPivot will be used and introduced in existing environments with a data warehouse. The valid scenarios in which PowerPivot can be successfully introduced need to be managed in terms of this clear danger involved.

When someone has made a model in PowerPivot make sure it is published to the SharePoint environment. This way multiple people can make use of the same model and there will be no multiple versions on the network or in mail boxes. This also makes it possible for IT to monitor the models and keep track of the usage. If IT notices that a model is used very frequently by multiple users and over a longer periods of time, it could contact the owner of the model and decide to make it part of the corporate data warehouse.

Figure 3: Microsoft PowerPivot report with sparklines

 

 
As in most data warehouse environments you will have 2 groups of business users, the people who only consume the information, i.e. "Information Consumers” and the people who create the models/reports/dashboards/..., i.e. "Information Producers”. The first group of people will use the created reports that are stored in the SharePoint environment. For them, it is of no interest where the information resided, they only want to see the correct information when they need it. It is also not this group of users we need to target for the "PowerPivot policy”.
The group the policy on the PowerPivot use should be targeting will be the information producers, the Business Analysts and Power Users who make the models and reports. They have the possibility to use the new functionality to empower the organisation, or take the company back to the "Excel hell”.
 

Conclusion

Within the Microsoft BI community a lot of noise is being made about the new PowerPivot as a Managed Self Service Business Intelligence solution. According to Microsoft, PowerPivot will change the way business users will make use of Business intelligence.

From the side of BI professionals there were some doubts and questions like "Will this replace the Analysis Services component of SQL Server?”, "Will ‘giving power to the business users' impact the nature of the data warehousing work and architecture?” and finally "Is PowerPivot an alternative to a data warehouse” ?

First of all, Microsoft PowerPivot will not in any way replace the current or future needs for a data warehouse. Reporting is still best done on a single version of the truth, and with data that is the same for everyone.

But PowerPivot can help in speeding up getting information to the people who need it, when they need it. It can have an added value in co-existence with the existing data warehouse. You can use PowerPivot to take information off-line, to add information to your analysis from data sources that you do not want in your data warehouse or that you do not have in the data warehouse at that moment. It can help jump-start business to get quick information, instead of needing to wait for IT to provide the necessary sources or metadata-modelling changes in the data warehouse.

But it is imperative to watch out for misuse of the tool. We do not want to re-experience the Excel nightmare where the same data is living in multiple spreadsheets, with multiple calculations and/or sources for the same definition, all residing outside the management of ICT. So keep on building those data warehouses and use PowerPivot as an addition –to a clearly defined and controlled user-community of power users- to make your BI environment even more agile and allowing the business users to get the information they need, at the moment they need it.