By: Alex Berson and Larry Dubov
Service provider takeaway: A business rules engine (BRE) helps manage and enforce business rules. This section of the chapter excerpt from the book Master Data Management and Customer Data Integration for a Global Enterprise provides an overview of business rules engines for MDM consultants.
A business rules engine (BRE) is a software application or system designed to manage and enforce business rules in response to a specified stimulus, for example, an event such as a change in an attribute value. Business rules engines are usually architected as pluggable software components that separate the business rules from the application code. This separation helps reduce the time, effort, and cost of application maintenance by allowing business users to modify the rules as necessary without the need for application changes.
In general, a BRE may help register, classify, and manage the business rules it is designed to enforce. In addition, a BRE can provide functionality that detects inconsistencies within individual business rules (for example, a rule that violates business logic), as well as within rule sets. A rule set is a collection of rules that apply to a particular event and must be evaluated together.
In the context of the CDI Data Hub, BRE software manages the rules that define how to reconcile the conflicts of bidirectional synchronization. For example, if a date-of-birth attribute is changed both in the CRM system supporting the service center and in the self-service web channel, an organization may define a business rule that requires changes to this attribute that came from the self-service channel to take precedence over any other changes. A more complex rule may dictate that changes to the date of birth be accepted only if the resulting age of the customer does not exceed 65. Another business rule may require management approval when the resulting age is greater than 65. The BRE would evaluate and enforce all rules that apply to a particular event.
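The date-of-birth example above can be sketched as a small rule set evaluated together for one event. This is only an illustration under stated assumptions: the `Change` type, the channel names `"self_service"` and `"crm"`, and the decision strings are hypothetical and do not come from any particular BRE product.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Change:
    source: str          # hypothetical channel name, e.g. "self_service" or "crm"
    date_of_birth: date  # the proposed new attribute value


def age_on(dob: date, today: date) -> int:
    """Return the age in whole years on the given day."""
    had_birthday = (today.month, today.day) >= (dob.month, dob.day)
    return today.year - dob.year - (0 if had_birthday else 1)


def reconcile(changes: list[Change], today: date) -> tuple[Change, str]:
    """Evaluate the rule set for conflicting changes; return (winner, decision)."""
    # Rule 1: a change from the self-service channel takes precedence.
    winner = next((c for c in changes if c.source == "self_service"), changes[0])
    # Rules 2 and 3: accept the change only if the resulting age does not
    # exceed 65; otherwise route it to management approval.
    if age_on(winner.date_of_birth, today) > 65:
        return winner, "needs_approval"
    return winner, "accepted"
```

Keeping the rules in one function mirrors the idea of a rule set that must be evaluated together; a real engine would load such rules from its repository rather than hard-code them.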
At a minimum, a full-function BRE will include the following components:
- Business Rule Repository: A database that stores the business rules defined by the business users
- Business Rule Designer/Editor: An intuitive, easy-to-use, front-end application and a user interface that allows users to define, design, document, and edit business rules
- A Query and Reporting Component: Allows users and rules administrators to query and report existing rules
- Rules Engine Execution Core: Actual code that enforces the rules
There are several types of business rules engines available today that differ along at least two dimensions: the way they enforce the rules and the types of rules they support. The first dimension differentiates the engines that interpret business rules in a way similar to a script execution from the engines that "compile" business rules into an internal executable form to drastically increase the performance of the engine. The second dimension is driven by the types of rules -- inference rules and reaction rules:
- Inference Engines support complex rules that require an answer to be inferred based on conditions and parameters. For example, an Inference BRE would answer a question like "Should this customer be offered an increased credit line?"
- Reaction Rules Engines evaluate reaction rules automatically based on the context of the event. The engine would provide an automatic reaction in the form of a real-time message, directive, feedback, or alert to a designated user. For example, if the customer age in the Data Hub was changed to qualify for mandatory retirement distribution, the reaction BRE would initiate the process of the retirement plan distribution by contacting an appropriate plan administrator.
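A reaction rule of the kind described above can be sketched as a condition and an action bound to an event type. The class names, the event payload shape, and the `DISTRIBUTION_AGE` threshold below are all hypothetical, chosen only to illustrate the pattern; they are not taken from the book or from any specific engine.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable

DISTRIBUTION_AGE = 72  # hypothetical threshold, for illustration only


@dataclass
class ReactionRule:
    event_type: str                     # event this rule listens for
    condition: Callable[[dict], bool]   # when the rule applies
    action: Callable[[dict], str]       # the reaction, returned as a message


@dataclass
class ReactionEngine:
    rules: list[ReactionRule] = field(default_factory=list)

    def register(self, rule: ReactionRule) -> None:
        self.rules.append(rule)

    def handle(self, event: dict) -> list[str]:
        """Fire every rule registered for this event whose condition holds."""
        return [
            rule.action(event)
            for rule in self.rules
            if rule.event_type == event["type"] and rule.condition(event)
        ]


engine = ReactionEngine()
engine.register(
    ReactionRule(
        event_type="age_changed",
        condition=lambda e: e["age"] >= DISTRIBUTION_AGE,
        action=lambda e: f"notify plan administrator: start distribution for {e['customer_id']}",
    )
)
```

Calling `engine.handle({"type": "age_changed", "customer_id": "C1", "age": 73})` returns the alert message, while an event with a lower age returns an empty list.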
Advanced BRE solutions support both types of business rules in either translator / interpreter or compilation mode. In addition, these engines support rules conflict detection and resolution, simulation of business rules execution for "what-if" scenarios, and policy-driven access controls and rule content security. Clearly, such an advanced BRE would be useful in supporting complex data synchronization and conflict reconciliation requirements of the Data Hub. Architecturally, however, a BRE may be implemented as a component of a Data Hub or as a general business rules engine that serves multiple lines of business and many applications. The former approach leads to a specialized BRE that is fine-tuned to effectively process reconciliation rules of a given style and context of the Data Hub. The latter is a general-purpose shared facility that may support a variety of business rules and applications, an approach that may require the BRE to support more complex rules-definition language syntax and grammar, and higher scalability and interoperability with the business applications. To isolate Data Hub design decisions from the specifics of the BRE implementation, we strongly recommend that companies take full advantage of the service-oriented approach to building a Data Hub environment and to encapsulating the BRE and its rules repository as a set of well-defined services that can be consumed by the Data Hub on an as-needed basis.
Data Delivery and Metadata Concerns
The complexities and issues of populating the Data Hub give rise to a different set of concerns. These concerns have to be solved in order to enable data consumers (systems, applications, and users) to find and use the right data and attest to its quality and accuracy. The Information Consumer zone addresses these concerns by providing a set of services that help find the right data, package it into the right format, and make the required information available to the authorized consumers. While many of these concerns are typical for any data management and data delivery environment, it is important to discuss these concerns in the context of the CDI Data Hub and its data location service.
As we look at the overall enterprise data landscape, we can see the majority of data values spread across the Data Hub and many heterogeneous source systems. Each of these systems may act as a master of some data attributes, and in extreme cases, it is even possible that some data attributes have many masters. Every time a consumer requests a particular data record or a specific data attribute value, this request can be fulfilled correctly only when the requesting application "knows" what system it should query to get the requested data. This knowledge of the master relationship for each data attribute, as well as the knowledge of the name and location of the appropriate masters, is the responsibility of the Attribute Locator service. Architecturally, it is an enterprise-wide data service that hides the implementation details from the applications and other service consumers. Conceptually, this service acts as a directory for all data attributes under management, and this directory is active; that is, the directory is continuously updated as data attributes migrate or get promoted from old masters to the Data Hub -- a natural evolution of a CDI environment from a Registry style to the Transaction Hub. Logically, however, this service is a subset of a larger service framework that represents an enterprise-wide metadata repository -- a key component of any enterprise-wide data strategy and data architecture. As we mentioned in Chapter 5, the Metadata Repository role is much broader than just providing support for the Attribute Locator service, and also includes such internal Data Hub services as Record Locator and even Key Generation services.
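The directory role of the attribute location service described above can be sketched as a mapping from attribute name to its current master system. This is a minimal sketch, assuming exactly one master per attribute; the class name, method names, and system names such as `"crm"` and `"data_hub"` are hypothetical.

```python
from __future__ import annotations


class AttributeLocator:
    """Active directory of which system masters each data attribute."""

    def __init__(self) -> None:
        self._masters: dict[str, str] = {}

    def register(self, attribute: str, system: str) -> None:
        """Record the system that currently masters an attribute."""
        self._masters[attribute] = system

    def promote(self, attribute: str, new_master: str = "data_hub") -> None:
        """Update the directory when an attribute migrates to a new master."""
        if attribute not in self._masters:
            raise KeyError(f"unknown attribute: {attribute}")
        self._masters[attribute] = new_master

    def locate(self, attribute: str) -> str:
        """Tell a consumer which system to query for this attribute."""
        return self._masters[attribute]


locator = AttributeLocator()
locator.register("date_of_birth", "crm")
locator.register("email", "self_service_web")
locator.promote("date_of_birth")  # attribute is now mastered by the Data Hub
```

Consumers call only `locate`, so they never need to change when an attribute is promoted from a source system to the Data Hub.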
Although a detailed discussion of metadata is beyond the scope of this book, we briefly discuss the basic premises behind metadata and the metadata repository in the section that follows. This section describes how a metadata repository helps enable just-in-time data delivery capabilities of some Data Hub implementations as well as some end-user applications such as real-time or operational Business Intelligence applications.
In simple terms, metadata is "data about data," and if managed properly, it is generated whenever data is created, acquired, added to, deleted from, or updated in any data store and data system in scope of the enterprise data architecture.
Metadata provides a number of very important benefits to the enterprise, including:
- Consistency of definitions: Metadata contains information about data that helps reconcile differences in terminology such as "clients" and "customers," "revenue" and "sales," etc.
- Clarity of relationships: Metadata helps resolve ambiguity and inconsistencies when determining the associations between entities stored throughout the data environment. For example, if a customer declares a "beneficiary" in one application, and this beneficiary is called a "participant" in another application, metadata definitions would help clarify the situation.
- Clarity of data lineage: Metadata contains information about the origins of a particular data set and can be granular enough to define information at the attribute level; metadata may maintain allowed values for a data attribute, its proper format, location, owner, and steward. Operationally, metadata may maintain auditable information about users, applications, and processes that create, delete, or change data, the exact timestamp of the change, and the authorization that was used to perform these actions.
There are three broad categories of metadata:
- Business metadata includes definitions of data files and attributes in business terms. It may also contain definitions of business rules that apply to these attributes, data owners and stewards, data quality metrics, and similar information that helps business users to navigate the "information ocean." Some reporting and business intelligence tools provide and maintain an internal repository of business-level metadata definitions used by these tools.
- Technical metadata is the most common form of metadata. This type of metadata is created and used by the tools and applications that create, manage, and use data. For example, some best-in-class ETL tools maintain internal metadata definitions used to create ETL directives or scripts. Technical metadata is a key metadata type used to build and maintain the enterprise data environment. Technical metadata typically includes database system names, table and column names and sizes, data types and allowed values, and structural information such as primary and foreign key attributes and indices. In the case of CDI architecture, technical metadata will contain subject areas defining attribute and record location reference information.
- Operational metadata contains information that is available in operational systems and run-time environments. It may contain data file size, date and time of last load, updates, and backups, names of the operational procedures and scripts that have to be used to create, update, restore, or otherwise access data, etc.
All these types of metadata have to be persistent and available in order to provide necessary and timely information to manage often heterogeneous and complex data environments such as those represented by various Data Hub architectures. A metadata management facility that enables collection, storage, maintenance, and dissemination of metadata information is called a metadata repository.
Topologically, a metadata repository architecture follows one of the following three styles:
- Centralized Metadata repository
- Distributed Metadata repository
- Federated or Hybrid Metadata repository
The centralized architecture is the traditional approach to building a metadata repository. It offers efficient access to information, adaptability to additional data stores, scalability to capture additional metadata, and high performance. However, like any other centralized architecture, a centralized metadata repository is a single point of failure. It requires continuous synchronization with the participants of the data environment, may become a performance bottleneck, and may negatively affect the quality of metadata. Indeed, the need to copy information from various applications and data stores into the central repository may compromise data quality if the proper data validation procedures are not a part of the data acquisition process.
A distributed architecture avoids the concerns and potential errors of maintaining copies of the source metadata by accessing up-to-date metadata from all systems' metadata repositories in real time. Distributed metadata repositories offer superior metadata quality since the users see the most current information about the data. However, since distributed architecture requires real-time availability of all participating systems, a single system failure may potentially bring the metadata repository down. Also, as source systems configurations change, or as new systems become available, a distributed architecture needs to adapt rapidly to the new environment, and this degree of flexibility may require a temporary shutdown of the repository.
A federated or a hybrid approach leverages the strengths and mitigates the weaknesses of both distributed and centralized architectures. Like a distributed architecture, the federated approach can support real-time access of metadata from source systems. It can also centrally and reliably maintain metadata definitions or at least references to the proper locations of the accurate definitions in order to improve performance and availability.
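One way to picture the federated style is a repository that reads metadata from the source system in real time and falls back to a centrally maintained copy when that source is unavailable. The sketch below assumes each source is exposed as a callable that raises `ConnectionError` when it is down; the class and its method names are hypothetical.

```python
from __future__ import annotations

from typing import Callable

# A source is modeled as a function: attribute name -> metadata definition.
SourceLookup = Callable[[str], dict]


class FederatedMetadataRepository:
    def __init__(self, sources: dict[str, SourceLookup]) -> None:
        self._sources = sources
        # Central copy of definitions, keyed by (system, attribute).
        self._central: dict[tuple[str, str], dict] = {}

    def get(self, system: str, attribute: str) -> dict:
        key = (system, attribute)
        try:
            # Distributed behavior: ask the source for current metadata.
            definition = self._sources[system](attribute)
        except ConnectionError:
            # Centralized behavior: serve the last known definition.
            if key in self._central:
                return self._central[key]
            raise
        self._central[key] = definition  # refresh the central copy
        return definition
```

The trade-off is the one the text describes: users see current metadata while sources are reachable, and availability no longer depends on every source being up, at the cost of possibly serving a slightly stale definition.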
Regardless of the architecture style of the metadata repository, any implementation should recognize and address the challenge of semantic integration. This is a well-known problem in metadata management that manifests itself in the system's inability to integrate information properly because some data attributes may have similar definitions but completely different meanings. The reverse is also true. A trivial example is the task of constructing an integrated view of the office staff hierarchy for a company that was formed by a merger of two entities. If you use job titles as a normalization factor, a "Vice President" in one company may be equal to a "Partner" in another. Without these details explained clearly in context, integration becomes a difficult problem to solve systematically. The degree of difficulty grows with the diversity of the context. Among the many approaches to solving this challenge is the metadata repository design that links the context to the information itself and the rules by which this context should be interpreted.
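The job-title example can be illustrated by keying each definition on its context as well as its value. The mapping below is purely hypothetical (the company names, the canonical level, and the equivalence itself are assumptions for illustration); the point is only that the title alone is not enough to integrate the two hierarchies.

```python
from __future__ import annotations

# Hypothetical interpretation rules: (context, title) -> canonical level.
CANONICAL_LEVEL: dict[tuple[str, str], str] = {
    ("company_a", "Vice President"): "senior_management",
    ("company_b", "Partner"): "senior_management",
}


def normalize_title(company: str, title: str) -> str:
    """Interpret a job title in the context of the company it came from."""
    return CANONICAL_LEVEL.get((company, title), "unmapped")
```

A title with no rule for its context comes back as `"unmapped"` rather than being guessed, which keeps unresolved semantics visible to a data steward.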
Enterprise Information Integration and Integrated Data Views
Enterprise Information Integration (EII) is a set of technologies that leverage information collected and stored in the enterprise metadata repository to deliver accurate, complete, and correct data to all authorized consumers of such information without the need to create or use persistent data storage facilities. The fundamental premise of EII is to give authorized users just-in-time, transparent access to all information they are entitled to.
Conceptually, EII technologies complement other solutions found in the Information Consumer zone by defining and delivering virtualized views of integrated data that can be distributed across several data stores including a Data Hub.
EII data views are based on the data requests and metadata definitions of the data under management. These views are independent from the technologies of the physical data stores used to construct these views.
Moreover, advanced EII solutions can support information delivery across a variety of channels, including the ability to render the result set on any computing platform, including various mobile devices. Looking at EII from a CDI Data Hub architecture viewpoint, and applying service-oriented architecture principles, we can categorize EII technologies as components of the Information Consumer zone. The EII components that deliver requested data views to the consumers (users or applications) should be designed, implemented, and supported in conjunction with the data location and delivery services depicted in Figure 6-2.
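The idea of a virtualized view, assembled on request from metadata rather than from a persistent copy, can be sketched as follows. In-memory dictionaries stand in for the physical data stores, and the function name, store names, and attribute names are all hypothetical.

```python
from __future__ import annotations


def build_view(
    customer_id: str,
    attributes: list[str],
    locations: dict[str, str],
    stores: dict[str, dict[str, dict]],
) -> dict:
    """Assemble an integrated view for one customer at request time.

    locations maps each attribute to the store that holds it (the metadata);
    stores maps each store name to its records, keyed by customer id.
    Nothing is persisted: the view exists only as the returned value.
    """
    view = {}
    for attribute in attributes:
        store = stores[locations[attribute]]
        view[attribute] = store[customer_id][attribute]
    return view


# Hypothetical metadata and data stores, for illustration only.
locations = {"name": "data_hub", "email": "crm"}
stores = {
    "data_hub": {"C1": {"name": "Pat Lee"}},
    "crm": {"C1": {"email": "pat@example.com"}},
}
view = build_view("C1", ["name", "email"], locations, stores)
```

Because the caller names only attributes, the view stays independent of which physical store holds each one; moving an attribute means changing `locations`, not the consumer.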
Although, strictly speaking, EII is not a mandatory part of the Data Hub architecture, it is easy to see that using EII services allows a Data Hub to deliver the value of an integrated information view to the consuming applications and users more quickly, at a lesser cost, and in a more flexible and dynamic fashion.
In other words, a key part of any CDI Data Hub design is the capability of delivering data to consuming applications periodically and on demand in agreed-upon formats. But being able to deliver data from the Data Hub is not the only requirement for the Information Consumer zone. Many organizations are embarking on the evolutionary road to a Data Hub design and implementation that makes the Data Hub a source for analytical and operational data management including support for the Business Intelligence and Servicing CRM systems. This approach expands the role of the Data Hub from the data integration target to the master data source that feeds value-added business applications. This expanded role of the Data Hub and the increased information value of data managed by the Data Hub require an organizational recognition of the importance of enterprise data strategy, broad data governance, clear and actionable data quality metrics with specially appointed data stewards that represent business units, and the existence and continuous support of an enterprise metadata repository.
The technical, business, and organizational concerns of data strategy, data governance, data management and data delivery that were discussed in this and the previous chapter are some of the key factors necessary to make any CDI initiative a useful, business-value-enhancing proposition.
About the book
Master Data Management and Customer Data Integration for a Global Enterprise explains how to grow revenue, reduce administrative costs, and improve client retention by adopting a customer-focused business framework. Learn to build and use customer hubs and associated technologies, secure and protect confidential corporate and customer information, provide personalized services, and set up an effective data governance team. Purchase the book from McGraw-Hill Osborne Media.
Reprinted with permission from McGraw-Hill from Master Data Management and Customer Data Integration for a Global Enterprise by Alex Berson and Larry Dubov (McGraw-Hill, 2007).