Best Practices in Modeling IBM Cognos 10.2 Semantic Layers in Framework Manager

In October 2008, we published the Insight "Best Practices in Modelling IBM Cognos 8 Semantic Layers" on our website. Since then, this page has become the most visited page on our website, with over 50.000 pageviews since publication, clearly showing the need for the topic. Given IBM Cognos 10 now is on the market for a while, we decided to update the Insight to incorporate new IBM Cognos 10 functionality. We thus are proud to present : "Best Practices in Modelling IBM Cognos 10.2 Semantic Layers".

IBM Cognos Business Intelligence Server offers report authors a single platform to create reports, dashboards, events and perform analysis on multidimensional data.

All users connect to the Cognos BI server using a zero footprint web portal: IBM Cognos Connection. Zero Footprint means no additional software or applets are installed on the client PC. It provides a single point of entry for all corporate data and the tools to analyse this data. The portal contains all available reports, analysis, dashboards and offers advanced sharing, publishing and security features. IBM Cognos Connection will provide interaction with other modules from the Cognos family such as IBM Cognos Insight or IBM Cognos Controller.

Figure 1: Cognos 10.2.1 Cognos Connection Portal

From IBM Cognos Connection all the end user applications can be launched. Each one has a specific functionality focus:

  • Cognos Workspace: build corporate dashboards using pre-made components
  • Cognos Workspace Advanced: perform multi-dimensional analysis and create basic reports
  • Report Studio: perform advanced, pixel perfect reporting with complex queries
  • Event Studio: create agents to follow-up on triggers
  • Cognos Insight: do self-service analysis and share these insights with the enterprise
  • Query Studio (legacy product): perform basic reporting using basic queries and formatting
  • Analysis Studio (legacy product): perform multidimensional analysis

All these tools share the same semantic layer built with IBM Cognos Framework Manager.

Semantic Layers

The purpose of a semantic layer is to create a business representation of corporate data. This representation hides database complexity to the end-user by creating an intuitive model. The semantic layer maps complex data into familiar business terms and shields cryptic database language from the end-user. This makes it very easy for a business user to create his own reports as the terminology used is very recognizable.

Business users are insulated from underlying data complexity while IT maintains governance over the use of data sources. By using a single version of the truth and the use of consistent terminology, end-user productivity is increased as the self-servicing aspect of business intelligence is strengthened.

A semantic layer can handle multilingual features and consolidate different database sources and/or OLAP cubes. This enables the use of different databases even from different vendors- or OLAP cubes in a single semantic layer, enabling the ability to use these transparently in a single report.

IBM Cognos 10 Business Intelligence uses 2 metadata tools: Framework Manager and Cube Designer. The metadata modeling tools within Cognos Business Intelligence are client-server applications. All end-user based tools are accessed from Cognos Connection.

  • Framework Manager is used to create relational and dimensionally modeled relational models (DMR), called frameworks.
  • Cube Designer will model Dynamic Cubes, a recently introduced cubing technology that replaces Powercubes. Cube Designer is beyond the scope of this insight, but in the near future a new insight concerning Dynamic Cubes will be released.
Figure 2: Cognos 10.2.1 Framework Manager

    Figure 3: Cognos 10.2.1 Cube Designer

    Best Practices in Modeling IBM Cognos 10.2 Semantic Layers in Framework Manager
    click to enlargeBest Practices in Modeling IBM Cognos 10.2 Semantic Layers in Framework Manager

    Flexible Models

    Model flexibility can be defined from two different points of view. How easily can the model be adapted to changing conditions and how easily can the user generate ad hoc query requests? Both questions can be answered by using star schema modeling.

    The dimensionally modeled database is ideal for reporting and is often referred to as a data warehouse. In a data warehouse facts and dimensions are established and data is stored at the lowest granular level. In every data warehouse a number of star schema's are present. The central table represents the fact table and only contains numeric and additive measures. The satellite tables represent the set of dimensions that can be used to look at the measures from different angles.

    By using conformed dimensions, a "data warehouse bus is established. Conformed dimensions are dimensions used by multiple fact tables. This method of modeling enables executing multi-fact, multi-grain queries ensuring a predictable, clean set of results. When new facts or dimensions are added, they can be quite easily added to the model, representing a new star schema.

    However not all IBM Cognos Frameworks need to be build on dimensionally modeled databases. Sometimes a data warehouse is not available and reporting is enabled directly on an OLTP (On Line Transaction Processing)-database, used in an operational system like an invoicing or order entry system, or an operational data store. These types of databases are modeled relationally and are highly normalized.

    There are a number of drawbacks to do reporting on a relational model. The first drawback is query performance, a highly normalized model will lead to dozens of tables in a single SQL statement, leading to large execution plans and slow performance. Doing such queries on a production environment could even lead to problems with the applications operational performance. Relational data sources also pose a number of modeling challenges for the framework modeler to create predictable query results.

    Therefore it is recommended to always use a data warehouse with star schemas as source for reporting.

    Query Flavour

    When a framework is published, a compiled version of it is made available on Cognos Connection, called a package. This package can support 2 query modes: Relational Querying and OLAP-style reporting.

    With Relational Querying, IBM Cognos will build an SQL statement when a user drags and drops objects on a report. All objects available in the database can be easily queried at the lowest grain. Drill up/down is not possible in this query mode.

    The other way of querying is OLAP styled reporting based on a cube. A cube is a multidimensional store of data. The most common use of a cube is to do drill up/down analysis. The drawback is that a cube usually does not contain all the fields available in the database. Most often figures in a cube are summarized so the lowest grain is not available. Reporting on a cube is however very fast. Frameworks provide a mechanism that allow for OLAP styled reporting without the need of an actual physical cube. Cognos will emulate OLAP behaviour and will write SQL to retrieve the data in the background. These types of frameworks are called Dimensionally Modeled Relational or in short DMR. With the introduction of Dynamic Query Mode, performance of DMR models can be boosted to the level of native cubes by using the advanced caching features Dynamic Query Mode offers.

    Both relational models and DMR models can be supported from a single framework.

    Figure 4: Cognos 10.2.1 OLAP Style Reporting

     

    Best Practices in Modeling IBM Cognos 10.2 Semantic Layers in Framework Manager

     

    Figure 5: Cognos 10.2.1 Relational Style Reporting

     

    Framework Objects

    A framework uses a number of objects to create a structured model. A namespace creates a qualifying container for objects, avoiding naming conflicts. Within a namespace, the modeler can use folders to group standalone filters or query subjects. Namespaces will structure frameworks. In a namespace a number of query subjects are added. They represent the tables in a framework. There are three different types of query subjects:

    • Data Source query subjects: performs a query on the underlying data source
    • Model query subjects: refers to an existing query subject in the model
    • Stored Procedure query subjects: used to retrieve data from stored procedures.

    Standalone filters are pre-designed filters that can easily be re-used in the reporting tools by the author.

    Figure 6: Cognos 10.2.1 Measure Dimension and Regular Dimension

     

    For OLAP functionality, two additional objects are available: a Measure Dimension and a Regular Dimension. A Measure Dimension contains a collection of numeric values. The Regular Dimension provides the accompanying set of descriptions and identifiers. The Measure and Regular Dimensions are linked with Scope Relationships to define the level at which the measures are available for reporting.

    Creating Durable Models

    While creating a model it is important to create a proper structure. The use of a multi-tier structure will shield the end-user from changes at database level such as migration to a different database technology, or changes to column or table names. By creating an efficient layered structure, relational models can be modeled into virtual star schemas, providing predictable and reliable query results to the end user.

    The first step in creating the Framework Model is importing the metadata. This can be handled by using the Metadata Wizard. It is good practice to create a separate namespace for every data source that is needed in the framework. On top of the namespace for the data source, a global namespace should be created: the Data Foundation View.

    Query subjects are linked together using Relationships. When all data source objects are imported, the model should be scrutinized to verify all relations between the query subjects are correct. It is good practice not to blindly import the relationships. By manually creating the relationships, a much higher level of control is achieved. Relations should always and only be created in the Database Foundation View. Mixing relationships at different levels will only cause confusion and incorrect results.

    A query subject can be edited by replacing the standard SQL with custom written SQL. For maintenance purposes, it is however recommended never to make any changes in freehand SQL. If you do, the query subject has to be manually adjusted if changes are made at database level. When changes are made in the database, importing is by far the easiest way to update the query subjects. You can also use the Update command in the Tools menu to update a single query subject.

    Although it is possible to import data from different data sources, the reflection should be made that there is a performance penalty in doing this. When the data sources are on different servers or use different technologies, IBM Cognos will not be able to write SQL-statements that will contain objects from both data sources. Instead, IBM Cognos will write 2 queries and stitch these together locally at the IBM Cognos BI server. Therefore it is highly recommended to use only 1 data source per physical database platform.

    In the Data Foundation view some other tasks need to be done. By using the proper tab pages, calculations and determinants can be added to the query subjects without making changes to the SQL code. Embedded/standalone filters should be added and database column names are translated to more understandable business names.

    Figure 7: Cognos 10.2.1 Calculations on database query subjects

     

    For every query item, the modeler should check if the usage is set correctly. The usage of a field can be an identifier, attribute or fact. Facts are numeric, usually additive or semi-additive data. All indexed columns or columns containing business keys should be set as identifier. Attributes are typically all other strings.
    For every fact column, the aggregate should be set. Other options that should be set are the format, screen tip, description These properties are inherited by derived objects at a later stage in the modeling process.

    Model for predictable results

    The greatest challenge for the model developer is creating a model that returns proper query results at all times, no matter what columns were selected in the report by the user. When importing from a relational data source, cardinality is detected based on a set of rules.

    IBM Cognos uses the following rules:

    • cardinality is always applied in the scope of a query performed by the user
    • 1 or 0-to-n relationship implies a fact but only if all relationships to that query subject are 1 or 0-to-n
    • 1 or 0 to 1 implies a dimension

    This means it is possible that a query subject will behave as a dimension in one query and as fact in another query. This is typically the case with snowflake dimensions. This situation can be handled by using model query subjects. The model query subject will logically condense the snowflake into one object, thus enforcing the correct context in every query. However, there is a performance drawback. Condensing multiple tables in a single model query subject will force Cognos to retrieve the entire snowflake even when no fields are needed from the underlying tables. Therefore it is better not to condense the snowflake using a model query subject. Instead, model the snowflakes with 1:1 relationships. Tables in the snowflake can be joined using 1:1 relationships instead of 1:n relationships. This will allow the usage of Minimized SQL, retrieving only the objects that are needed and ensure the proper usage of the query subjects.

    Figure 8: Cognos 10.2.1 Context Explorer

    Handle Ambiguous Relationships

    There are two types of relationships that can provide inconsistent result sets if not handled by the modeler. The first occurs when there are multiple valid relationships. This typically occurs between facts and dimensions. In a fact table, a different dates are present: invoice date, ship date, order date all point to the date dimension. Combining multiple dates in a single query will no longer return results.
    Another issue occurs when handling recursive relationships. The classic example is the manager employee relation. An employee has a manager. The manager is an employee and also has a manager that again is an employee.

    These situations can be handled by creating multiple model query subjects for every occurrence. You would however have to reset all the properties of every model query subject created leading to unnecessary work. A convenient solution to this problem is using shortcuts. There are two types:

    • Regular Shortcut: reference to the source objects but inherits all properties including relationships
    • Alias Shortcut: behaves independently of the source object, so different relationships can be set

    The creation of multiple alias shortcuts on a table that use different relationships will handle these ambiguous relationships graciously. Regular shortcuts will be used while creating the Presentation View.

    Multi fact multi grain queries

    A determinant is needed to identify levels of aggregation within the query subjects. This is a particularly useful feature when dealing with multi-fact, multi-grain queries. When you have a sales fact at day level and a target fact at month level, combining both facts in a single query would lead to incorrect results. The targets would be multiplied several times as they are stored at month level and not at day level. Determinants will change the default behaviour of the query. Cognos will recognise the difference in grain and will write 2 queries that will be stitched together to return proper results at the proper grain.

    Figure 9: Cognos 10.2.1 Determinants

    The key element in performing multi-fact multi-grain queries is by using a conformed dimension shared between both fact tables. When retrieving 2 measures from two different fact tables using a different granularity, Cognos can determine the correct aggregation when determinants are specified. A determinant will specify what set of columns will uniquely define a set of columns. Each level is specified identifying the key and attributes that belong to a level. The lowest level is marked unique.

    This will enable the report developer to create a report showing revenue at week level versus month figures without double counting the lowest grain fact. Cognos uses the mechanism of stitch queries to perform these types of requests. A stitch query will perform a full outer join to break queries into multiple selects, one for each fact table and then stitch the data back together. Determinants are specified at the Data Foundation Layer.

    Consolidate

    When all data related issues and reporting traps are handled, the next step in the modeling process is creating a
    Consolidation View. The consolidation view usually is split up into two namespaces: a Relational View and a Dimensionally Modeled Relational (DMR) view of the metadata.

    The first is used for normal reporting and generates SQL that is fired to the database. The second is used in multi-dimensional analysis and resembles an OLAP cube. The main difference between a Dimensionally Modeled Relation model (DMR) and an OLAP-cube is that the latter is physically stored in a multidimensional way. The DMR-model is a virtual way of modeling the data source and does not physically stores data.

    Figure 10: Cognos 10.2.1 Consolidation View

    Relational

    In this view model query subjects will be created using the query subjects in the Data Foundation View. The structure will, unlike the Data Foundation View, not resemble the database. The main goal of this layer is to provide an easy to understand structure that is recognisable to business users. It is perfectly okay to combine a snowflake into a single model query subject as this would be a logical point of view of the business users. Normally technical meaningless objects such as load dates or sequence numbers like primary keys and foreign keys to dimensions should be removed or hidden. Facts should only contain measures and degenerate dimensions. All foreign keys to dimensions should be hidden to the business user. It is good practice not to remove these technical fields but to keep them separated/hidden in a subfolder in the model query subject. While debugging reports, it can be quite handy to be able to include the primary key of a table to identify exactly which record has issues.

    In this layer, no relationships between query subjects should be laid, ever. A model query subject is also the best place to use macro functions and parameter maps to handle multilingual tables. By using the calculations in Model Query subjects, modelers can avoid entering freehand SQL in data source query subjects, which should be avoided at all times for maintenance purposes.

    DMR

    Dimensionally Modeled Relational models are virtual OLAP cubes. The data is presented in an OLAP-style, but is not physically stored on the server. Instead at every user request, a query is executed. This style of modeling is used when you want to enable analysis, using drill up / drill down in Analysis Studio. Since the introduction of Dynamic Query Mode that also supports relational databases like Microsoft SQL Server, Oracle and IBM DB2, an advanced caching mechanism was put in place. This caching mechanism will provide similar performance as a physical cube when primed correctly.

    DMR models are made up out of Regular Dimensions (dimensions) and Measure Dimensions (facts). A Regular Dimension consists of one or more defined hierarchies containing levels, keys, captions and attributes. Level information is used to roll up the measures. Each level should have the key and caption defined. If there is a Unique key, the bottom level should be marked Unique, otherwise, the combination of all upper levels is used to identify a member. If the star schema is modeled in its final form, Regular Dimensions can be quickly generated by using Merge in New Regular Dimension on the relational view of the Consolidation view.

    Figure 11: Cognos 10.2.1 Detail of regular dimension

    If a Regular Dimension is based on a query subjects that has determinants specified, it is recommended that one level corresponds to each determinant and that the order of levels is equal to the order of determinants. Create a determinant for every level needed. Multiple hierarchies can be specified, but you cannot use them together in a single report query. If this would be a requirement, create a regular dimension for every hierarchy.

    A Measure Dimension is a logical grouping of facts which enables OLAP-styled querying of a relational database. Measure Dimensions and Regular Dimensions are joined through scope relations. These scope relations are only logical, the underlying query subject joins remain in use. A scope relation will specify what levels of a dimension are in scope' for a certain measure. A scope relation is mandatory and will be created automatically using the underlying query subject joins.

    Presentation

    The final step in modeling a framework is creating a Presentation Layer. The Presentation Layer is built from several Star Schema Groupings. Star Schema Groupings make the model more intuitive to the end user by showing only related facts and dimensions. For every star schema a different namespace is created, showing the end user functional business areas of which to select elements in the query. Using the wizard, star schema groupings can be created quickly.

    Multilingual

    Framework Manager allows the modeler to translate static report content such as field names and descriptions. This is done by adding languages to the framework. Not only static content can be translated transparently for the end-user quite often product descriptions are kept in multiple languages in the data source.

    There are two methods of storing this information:

    • separate column for every language for example PRODUCT_EN, PRODUCT_DU, PRODUCT_FR
    • separate row for every language

    Whatever solution was chosen, macro functions enable the modeler to create the proper SQL at runtime, by using the language options set by the user.

    When there are multiple columns for every language, the modeler can specify that the column name retrieved at runtime is dependent of the user language. Sometimes the languages are not mapped correctly. Therefore a mapped value is chosen from the Language_lookup parameter map.

    • #'[NAMESPACE].[QUERY SUBJECT].[QUERY ITEM_'+ $Language_lookup{$runLocale}+ '] '#

    If the multilingual data is stored in rows, a filter can be added:

    • [NAMESPACE].[QUERY SUBJECT].[LANGUAGE] = #$Language_lookup{$runLocale}+ '] '#
    Figure 12: Cognos 10.2.1 Translations

      Choosing the proper Design Language at the start of a project is crucial. Once set, it cannot be easily changed without modifying the source XML of the framework. It is recommended to always choose a dialect as design language. For example if the main language of the framework is English, use English Zimbabwe as design language. Also keep the original database column name in the design language in the Data Foundation View and Consolidation View. Doing so will enable you to see what database columns are in a report when debugging the report. Off course you will have to change your language in Cognos Connection to the design language of the framework.

      When column names are changed in the database, only change the column names in the Data Foundation View. Changing the column name in the design language in the Consolidation View would break the report. This only applies to the design language (English Zimbabwe), all other languages (English) can be changed freely without affecting the report. So clever use of the design language will allow you to be able to easily change column names without breaking existing reports.

      If multiple languages are used, translation files (Projects-Actions-Export) will make it easy to translate the model. An Excel-file can be exported containing all language values. When properly translated, the file can be re-imported just as easily.

      Enhance performance

      Aggregate tables are probably the single most cost effective measure in boosting reporting and data warehouse performance. Unlike competing products, Framework Manager does not have native functionality to facilitate aggregates. However nearly all major vendors offer functionality within the database to transparently rewrite queries to aggregate tables. In Oracle for example, this functionality is called query rewrite and materialized views, in DB2 these are called materialized query tables. By using a query rewrite functionality, the database will transparently rewrite the query to the aggregate or detail table, depending on what level of detail the user requested.

      Part of the aggregate lacuna in Framework Manager can be filled by using Dynamic Cubes. Dynamic cubes are a new in-memory cubing technology. Currently only star schemas are supported as a data source, which is perfectly fine in our data warehouse context. What is different about this cubing technology is that it is self-tuning. After using the cube for a while, in memory aggregates will be proposed and built automatically.

      It is also possible to allow Cognos to use caching mechanisms to cache recurrent query results by leveraging Dynamic Query Mode. To enable this, the governor Allow usage of local cache should be enabled at the framework. The cache of DMR models is stored until it is cleared or refreshed. The cache for relational models is stored as long as the data source connection is open, which is typically 5 minutes. This caching mechanism greatly improves performance of relational data sources, even to a comparable level as cubes.

      To enable Dynamic Query Mode a switch should be set at either package level or project level. As switching from compatible to Dynamic Query Mode could cause issues, careful report migration testing is needed. It is also possible to create 2 packages: 1 using Dynamic Query Mode and the other still using Compatible Query Mode. Legacy report will continue running and new reports can be built using DQM. From the maintenance perspective, all packages use the same metadata.

      Figure 13: Cognos 10.2.1 Setting DQM at package level

      While developing the model, it can be annoying to query the full database while testing the model. For that purpose it is possible to use Design Mode filters that can be accessed from the query subjects Filter tab.

      Reuse metadata

      Framework Manager Packages can be used in all IBM Cognos client tools, making it the universal glue that binds everything together. IBM Cognos Framework Query subjects can be directly re-used in Cognos Insight to build self-service insights. Even when you are using Microsoft Office as the front end tool, the framework package is used to retrieve data. It is however a pity that is not possible to generate a dynamic cube by using the DMR definitions. In essence both describe an OLAP layer, so inter-changeability would be a nice feature for a future release.

      Security

      Using the same framework, multiple package can be defined each containing one or more star schemas. Granting or denying access to a package a very effective and easy way to implement a basic level of data security. However, each individual object at any level can be secured.

      Figure 14: Cognos 10.2.1: Setting data security

      Data security will restrict users to query data they are not allowed to. For example, a district manager will only be able to query data of his district. Row level security can be put in place in two different ways. It is possible to hard code the values for every group. However a more generic approach is to create embedded filters using security macro functions such as the LDAP username.

      Segments

      Segments are a very interesting feature to minimize maintenance. A segment is a new project that is linked by a shortcut to the main project. The master project has access to the entire model, including all segments. Suppose an organisation has a centralized data warehouse. The data warehouse team created a framework. The solution is very popular and other teams want to use part of the framework model or the framework model entirely in their environment, likely making minor changes such as adding new facts or dimensions.

      Instead of copying the framework for every team, a segment can be created for the other teams. This has a number of advantages:

      • when the centralized data warehouse changes, changes only need to propagated once to the Framework Model
      • the master project has access to the entire model, the central data warehouse team can analyse impact immediately
      • the centralized team can see how other teams are using the original framework

      In short, segmenting allows a durable solution for similar models used by different teams.

      Conclusion

      IBM Cognos 10 Framework Manager manages the semantic layer of a Cognos 10 deployment. The main focus is creating a user-friendly, comprehensive model that provides in repeatable and predictable query results. Modeling a metadata layer requires an extensive business and underlying data source knowledge. To create a model that offers repeatable and predictable results, the model should always, physically or virtually be conceived as one or multiple star schemas. By structuring the framework in a layered approach, any downstream effects of database changes can be minimized. IBM Cognos Framework Manager offers a well featured metadata modeling tool that allows a durable approach in modeling semantic layers. The introduction of Dynamic Query Mode is a game changer concerning performance. Using advanced caching mechanisms, performance of DMR models can be boosted up to the level of physical cubes.