Go to Google and type in the words "Why do BI projects fail?".
You can select any of the sites listed in your result set, and every single one will mention data quality issues as one of the top reasons. In many of the Business Analytics or even Performance Management (Planning & Budgeting, Consolidation) projects we execute at element61, we are confronted with the importance of high-quality information and its impact on project success. Poor quality in the customer's data, combined with the customer underestimating the problem, often leads at best to a considerable delay in project go-live.
When looking into the main issues with data quality, Master Data is often the root cause. In this Insight, we will explain what Master Data is, why these data sets often cause problems and which solutions are available to resolve these issues.
First of all it is important to understand what exactly Master Data is.
There are three main types of data that are being captured and maintained within organizations:
- Transactional Data - Data that is being generated by applications in supporting business processes of the organization
- Analytical Data - Data that is calculated and/or derived from transactional information to support the decision making of the organization
- Master Data - Data that represents the business objects on which transactions are performed and the dimensions along which analysis is conducted
Maybe it is best explained by using a sentence as an example.
Consider the following sentence:
"Customer XYZ placed an order for 5 Product ABC on August 8th 2014 for a total of 250."
This sentence describes a business process of a company that just sold 5 units of a product to a customer. In other words, it describes a fact, which can be considered a transaction. The customer and the product give the transaction its context and are, as such, key to understanding the fact that is described. "Customer" and "Product" can be considered master data. The invoice number, the quantity, the total amount, the VAT amount, etc. could all be considered transactional data. Examples of "analytical data" might be the average order value, the average quantity ordered and the DSO (Days Sales Outstanding), calculated by customer.
These three types of data thus interrelate to one another in various forms.
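To make the distinction concrete, the example sentence can be sketched in a few lines of Python. This is only an illustration: the class and field names are assumptions made for the sketch, and the second order is invented so that the calculated average is not trivial.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


# Master data: the business objects that transactions refer to.
@dataclass(frozen=True)
class Customer:
    customer_id: str
    name: str


@dataclass(frozen=True)
class Product:
    product_id: str
    name: str


# Transactional data: a fact recorded by a business process.
@dataclass(frozen=True)
class Order:
    customer_id: str
    product_id: str
    order_date: date
    quantity: int
    total_amount: float


customers = {"XYZ": Customer("XYZ", "Customer XYZ")}
products = {"ABC": Product("ABC", "Product ABC")}

orders = [
    # The order from the example sentence.
    Order("XYZ", "ABC", date(2014, 8, 8), 5, 250.0),
    # A second, invented order (assumption, not in the example sentence).
    Order("XYZ", "ABC", date(2014, 9, 1), 3, 150.0),
]


# Analytical data: a value derived from transactions, grouped by master data.
def average_order_value(customer_id: str) -> float:
    amounts = [o.total_amount for o in orders if o.customer_id == customer_id]
    return mean(amounts)


print(average_order_value("XYZ"))  # 200.0
```

The orders only carry the keys of the customer and the product. Everything else about those objects lives in the master data, which is why its quality affects every transaction and every analysis built on top of it.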
The picture below gives a possible high-level architecture of these types of information in an enterprise:
It is clear that Master Data plays a crucial role in the capturing, processing and understanding of data within companies. Various types of Master Data exist, and they will of course vary depending on the company and the business it is in.
However, a lot of Master Data elements will often return, like the ones shown in the picture below.
These business objects can be considered as fairly generic and will be applicable in some form or another within most companies.
Master Data Management
Now that we understand what Master Data is, we can move on to the problems that arise when trying to manage this set of information across the divisions of a company or organisation. As Master Data is used in various applications (ERP, SCM, CRM, ...) and business processes throughout an organization, issues are bound to occur if it is not carefully managed, and they can have a big impact.
The problems arising with Master Data can generally be classified into the following categories:
- Data Redundancy
As the master data is so critical in various business processes, it is often maintained in different applications by different people, as they all require this information for different purposes. Customer information maintained by sales people, for example, will have different attributes than the customer information required by finance. As a result, a lot of redundant information is stored and maintained, increasing the overall cost for the organization.
- Data Inconsistencies
Next to the data redundancy problem, another typical problem is that the data is inconsistent between different applications. The root cause is the same as for data redundancy, but resolving inconsistencies is often more time-consuming than dealing with redundancy. Typically, these problems emerge when consolidating information from different applications, for example when loading the information into a data warehouse.
- Business Inefficiency
When information is inaccurate and redundant, business processes suffer the consequences, which can drive up the cost and reduce the performance of organizations. Especially when looking at the problem holistically, considering the end-to-end process flow of a company, these problems can have a big impact. Consider the order-to-ship, ship-to-bill and bill-to-cash process flows, which may all end up being executed based on different versions of the master data. Examples of business inefficiency could be a shipment made to a wrong shipping address or an invoice remaining unpaid because it was sent to a wrong billing address. Quantifying these inefficiencies is not always easy, but Gartner estimates the cost can be as high as 20% of the operational budget for an average organization.
- Supporting Business Change
As the Master Data is often maintained in various applications by various people, a change in business concepts causes significant workload. Change has become the norm for organizations: products and services are introduced and withdrawn, companies are acquired and sold, new technologies appear, companies are restructured or new legislation comes into play. These disruptive events cause a constant stream of changes to master data, and without a way of managing these changes, the issues of data redundancy, data inconsistency, and business inefficiency are exacerbated even further.
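The first two categories can be illustrated with a minimal Python sketch. It assumes two hypothetical applications (a CRM and a finance system) that happen to share a customer key. In practice keys rarely line up this neatly, and matching records across systems is a large part of the work. All names and values below are invented for the illustration.

```python
# Hypothetical customer records as held by two applications.
crm_customers = {
    "C001": {"name": "Customer XYZ", "city": "Antwerp", "segment": "Retail"},
    "C002": {"name": "Customer DEF", "city": "Ghent", "segment": "Wholesale"},
}
finance_customers = {
    "C001": {"name": "Customer XYZ", "city": "Antwerpen", "payment_terms": "30 days"},
    "C003": {"name": "Customer GHI", "city": "Brussels", "payment_terms": "60 days"},
}


def find_inconsistencies(system_a: dict, system_b: dict) -> list[tuple]:
    """Return (key, attribute, value_a, value_b) for every attribute that
    both systems store for the same customer but with different values."""
    issues = []
    for key in sorted(system_a.keys() & system_b.keys()):
        record_a, record_b = system_a[key], system_b[key]
        for attribute in sorted(record_a.keys() & record_b.keys()):
            if record_a[attribute] != record_b[attribute]:
                issues.append((key, attribute, record_a[attribute], record_b[attribute]))
    return issues


# Redundancy: customers that are maintained in both systems.
redundant_keys = sorted(crm_customers.keys() & finance_customers.keys())
print(redundant_keys)

# Inconsistency: shared attributes whose values disagree.
print(find_inconsistencies(crm_customers, finance_customers))
```

Here customer C001 is maintained twice (redundancy), and the two copies disagree on the city (inconsistency). Each system also holds attributes the other does not, which is exactly what makes consolidation into a data warehouse laborious.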
The art (because yes, it can be considered just that) of managing this master data is referred to as Master Data Management. Various definitions of the term exist, but below are some of the most frequently used:
Master data management (MDM) is a technology-enabled discipline in which business and IT work together to ensure the uniformity, accuracy, stewardship, semantic consistency and accountability of the enterprise's official shared master data assets.
Master data management (MDM) is a comprehensive method of enabling an enterprise to link all of its critical data to one file, called a master file, which provides a common point of reference. When properly done, MDM streamlines data sharing among personnel and departments.
In addition, MDM can facilitate computing in multiple system architectures, platforms and applications.
Important to note in these definitions is that MDM needs to be considered as an ongoing process that has to be put in place, not as a one-off project with a fixed beginning and end that will solve all problems once and for all. It is a process that requires close collaboration between IT and Business, across different departments and functions.
Different architecture models of MDM
As every company is unique, with its own set of challenges, IT landscape and business processes, various architecture models exist for implementing MDM.
Each of these has its advantages and drawbacks, and making the right choice is not always straightforward. The available budget, the IT landscape, the existing organizational structure, and the people involved and their skill set all need to be carefully considered when deciding how to move forward with Master Data Management in an organisation. In general, three main MDM architectures can be distinguished.
- Registry Architecture
This architecture provides a read-only view of master data for downstream systems which need to read but not modify master data. This implementation architecture is useful to remove duplicates and provide a consistent (in many cases federated) access path to master data.
The data in the MDM System is often only a thin slice of all the master data attributes: those required to enforce uniqueness, plus cross-reference information pointing to the application system that holds the complete master data record. In this scenario, all master data attributes except those persisted in the MDM System remain in the application systems, of low quality and without harmonization. Thus, the master data in the MDM System is neither consistent nor complete across all attributes. The advantage of this architecture is that it is usually quick to deploy and cheaper than the other architectures. Also, there is less intrusion into the application systems, as the MDM System only provides read-only views of the master data records in the IT landscape.
- Hybrid Architecture
This architecture fully materializes all master data attributes in the MDM System. Authoring of Master Data can happen in the MDM System as well as in the application systems. From a completeness perspective, all attributes are there. However, from a consistency perspective, only convergent consistency is given. The reason for this is that there is a delay before updates made to master data in the application systems are synchronized to the MDM System. In other words, consistency is pending until the update has been propagated. The smaller the window of propagation, the more this implementation architecture moves towards absolute consistency.
The cost of deploying this architecture is higher because all attributes of the master data model need to be harmonized and cleansed before being loaded into the MDM System, which makes the master data integration phase more costly. Also, the synchronization between the MDM System and the application systems that change master data is not free. However, this approach has multiple benefits that are not possible with the Registry Architecture implementation:
- The master data quality is significantly improved.
- The access is usually quicker because there is no need for federation anymore.
- Workflows for collaborative authoring of master data can be deployed much more easily.
- Reporting on master data is easier as now all master data attributes are centralized.
- Repository Architecture
With this architecture, master data is consistent, accurate and complete at all times. The key difference from the Hybrid Architecture is that both read and write operations on Master Data are now done through the MDM System. Achieving this means that all applications that need to change master data invoke the MDM services offered by the MDM System to do so.
As a result, absolute consistency of master data is achieved, because the delay caused by propagating changed master data no longer exists. Deploying an MDM solution with this architecture might require deep intrusion into the application systems: either business transactions are intercepted in such a way that they interact with the MDM System for master data changes, or a global transaction mechanism such as a two-phase commit infrastructure is deployed.
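Going back to the first of the three models, the idea behind the Registry Architecture can be sketched as follows. This is a simplified illustration under stated assumptions, not how any particular MDM product works: the system names, keys and the `federated_view` helper are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class RegistryEntry:
    """Thin slice of a master record: identifying attributes plus
    cross-references to the systems holding the complete record."""
    master_id: str
    name: str
    # Maps a source system name to the record key used in that system.
    cross_references: dict[str, str] = field(default_factory=dict)


# Stand-ins for application systems; each keeps its own full record.
source_systems = {
    "CRM": {"C-17": {"name": "Customer XYZ", "segment": "Retail"}},
    "ERP": {"0000421": {"name": "Customer XYZ", "payment_terms": "30 days"}},
}

registry = {
    "M-1": RegistryEntry("M-1", "Customer XYZ", {"CRM": "C-17", "ERP": "0000421"}),
}


def federated_view(master_id: str) -> dict:
    """Assemble a read-only view by following each cross-reference.
    Attributes not persisted in the registry are taken as-is from the
    source systems, so they are not harmonized."""
    entry = registry[master_id]
    view = {"master_id": entry.master_id}
    for system, local_key in entry.cross_references.items():
        view.update(source_systems[system][local_key])
    # The registry's own attribute is applied last, as it is the cleansed one.
    view["name"] = entry.name
    return view


print(federated_view("M-1"))
```

In a Hybrid or Repository Architecture, the attributes that this sketch fetches from the source systems at read time would instead be stored, cleansed and maintained in the MDM System itself.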
The MDM market has been rapidly evolving over the last couple of years, and the reason for that is the broad range of requirements that an MDM system should be able to handle. As the nature of the data, the types of technology used, the different front-end requirements related to data quality and data stewardship vary heavily between companies, the MDM market is very wide.
For that reason, Gartner for example has split up MDM into two Magic Quadrants: one for Customer Data Solutions and one for Product Data Solutions. The Customer Data Solutions focus on delivering high-quality customer information across systems. These solutions are mainly used in companies with many customers, mostly in B2C environments.
As the name suggests, the Product Data Solutions are used to maintain product or other "thing-like" types of information. Gartner does not provide an overall MDM Magic Quadrant, as they believe the MDM market is still too diverse. Furthermore, there are very few MDM solutions that can cover all of the aspects of MDM.
The IBM offering of MDM products is situated under the wings of InfoSphere, as with all other data-related products they offer. As a result, these products also follow the broader vision and strategy of InfoSphere to develop a comprehensive data platform, including Big Data, Data Quality, Data Governance, Integration and Master Data Management. This vision and strategy appeals more to larger organizations, which have to manage large amounts of scattered information. It can be stated that IBM currently has the only true multi-domain MDM solution that can cover all the aspects of MDM.
Their products include InfoSphere MDM Advanced Edition, Collaborative Edition and Enterprise Edition. Each of these product lines has specific characteristics and pricing. The Collaborative Edition in particular has a strong focus on handling complex workflows in combination with a flexible data model, so it can be used in a variety of situations.
Informatica is mainly known for producing excellent ETL tools. However, they also offer a broad range of other data management products, including MDM solutions. In 2013 they acquired Heiler, a company specialized in MDM and data quality, and they still deliver its products under that company name. Informatica itself also had its own multi-domain MDM solution, which focuses on customer or party information but can also be used to handle other types of information such as products. They are now shifting their vision from delivering MDM products towards delivering a true MDM platform. As such, their solution can be seen as multi-domain, but the data governance aspect is not covered to the same extent as in, for example, the IBM offering.
Oracle has a wide range of MDM products, and as this business is growing at a nice pace for them (10% in 2012), they are continuing to invest in this area. Oracle Product Hub is their tool for managing product information; it is usually deployed within Oracle E-Business Suite (EBS), but it can also be deployed stand-alone. Customer Data Hub is used more for managing party information, or you can also choose Siebel Universal Customer Master (UCM) or Fusion MDM. It is currently unclear what Oracle's strategy for MDM will be, which products will be kept and which ones will disappear. Their portfolio is very broad, but that breadth can also be considered their weak point.
The biggest worldwide ERP vendor, SAP, also offers MDM tools to its customers. Master Data Governance for Customers (MDG-C) is their main product for handling party information. They also sell NetWeaver Data Management for consolidation, and Information Steward for data stewardship. For product information, Master Data Governance for Material (MDG-M) is offered, which can also be expanded with Information Steward capabilities. All of these products are also available on the HANA platform. Important to note is that these products are only offered to SAP ERP customers and not as stand-alone products. Also, SAP's product strategy regarding MDM tools is not always clear to its customers.
Even though they are not listed as such by Gartner, Microsoft has been expanding its SQL Server platform with Master Data Services (MDS) since version 2008 R2, and the product became mature as of version 2012. It has its own data store and is very flexible in deployment, allowing it to be used for a variety of applications. As it is a component of SQL Server, the license cost is very acceptable, and for some SQL Server editions (SQL Server Business Intelligence & SQL Server Enterprise) it is included in the price. As the system is flexible and uses an Excel-like front-end for its users, it can be deployed quickly.
For more information on Master Data Services 2012, see also our Insight "Master Data Management in SQL Server 2012: Use case of managing Data warehouse Dimensions with Master Data Services".
Role of Master Data Management in Business Analytics
Both MDM and BA consolidate information that is considered crucial for an organization.
Yet both have their own specifics and can exist with or without each other. MDM deals solely with Master Data, while Business Analytics also requires transactional information to produce analytical information. Master Data objects typically produce the "dimensions" of our analytics and potentially end up as such in the Data Warehouse star schemas.
As such, Business Analytics can be considered a stakeholder of MDM and one of its main beneficiaries. In fact, a Data Warehouse initiative is often what causes an organisation to discover (the magnitude of) its Master Data problems. It is a missed opportunity if the Master Data issues are resolved only for the purposes of the DWH (e.g. in the Data Quality layer of the ETL).
Because the data is consolidated into a single repository (depending, of course, on the architecture that is selected), MDM often serves as a source for the data warehouse.
Important to note is that when MDM takes care of the master data, for example by creating "golden records" or a "360-degree view" of customers, the data warehouse will greatly benefit. But next to MDM, data quality and data governance also play a crucial role in the data management processes. MDM without proper data governance on top runs a high risk of failing.
Data Quality is also a broad term, and its scope spans all forms of data in an organization, including the transactional information used in Business Analytics. It is hence important that all of these aspects are considered when developing a Data Management Strategy, which can then be rolled out in phases. All of these domains are linked, and considering them separately often leads to project failures.
As MDM brings benefits to BA, so can BA bring benefits to MDM. Business Analytics can provide insight into the MDM processes, and into potential issues and improvements over time. BA can for example monitor how well the master data is being managed: How many new customers have been created? How many attributes have been updated? How many unique customers do we have? Combining the two disciplines can also bring business benefits. For example, once MDM creates golden records for customers, BA can easily provide answers to questions regarding cross-sell or up-sell opportunities that may have gone unnoticed before.
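The monitoring questions above can be answered with very simple analytics once changes to the master data are logged. The sketch below assumes a hypothetical change log with one row per change; the row layout, identifiers and dates are invented for the illustration.

```python
from datetime import date

# Hypothetical change log of a customer master:
# (change date, master id, action, changed attribute or None).
change_log = [
    (date(2014, 8, 1), "M-1", "create", None),
    (date(2014, 8, 3), "M-2", "create", None),
    (date(2014, 8, 5), "M-1", "update", "billing_address"),
    (date(2014, 8, 9), "M-1", "update", "phone"),
    (date(2014, 9, 2), "M-3", "create", None),
]


def stewardship_metrics(log, start: date, end: date) -> dict:
    """Answer the three monitoring questions for the period [start, end]."""
    in_period = [row for row in log if start <= row[0] <= end]
    return {
        # How many new customers have been created in the period?
        "new_customers": sum(1 for row in in_period if row[2] == "create"),
        # How many attributes have been updated in the period?
        "attribute_updates": sum(1 for row in in_period if row[2] == "update"),
        # How many unique customers exist at the end of the period?
        "unique_customers": len({row[1] for row in log if row[0] <= end}),
    }


print(stewardship_metrics(change_log, date(2014, 8, 1), date(2014, 8, 31)))
```

Tracked over time, figures like these show whether the master data is actively maintained and where stewardship effort is concentrated.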
Implementing Master Data Management is a complex and often costly experience. Therefore it is crucial that a good architecture is selected, along with a matching technology that fits the requirements of an organization and also fits in with the existing IT landscape.
Next to implementing MDM, it is just as important to give proper attention to Data Governance and Data Quality, preferably by creating an overall vision and strategy for Data Management. As MDM can have an impact throughout an entire organization, it is equally important to work in phases when deploying MDM.
element61 has implemented MDM at several of our customers, including Euroports, Essers, Isabel, Weber Europe, Woonhaven Antwerpen, ... .
If you would like to receive further information on MDM or any other Business Analytics topic, feel free to contact us.