Leveraging SAP’s Enterprise Data Management tools to enable ML/AI success
Background
In our previous blog post, “Master Your ML/AI Success with Enterprise Data Management”, we outlined the need for Enterprise Data Management (EDM) and ML/AI initiatives to work together in order to deliver the full business value and expectations of ML/AI. We made a set of high-level recommendations to increase EDM maturity and in turn enable higher value from ML/AI initiatives. A graphical summary of these recommendations is shown below.
Figure 1 – High level recommendations to address EDM challenges for ML/AI initiatives
In this post, we will present a specific instantiation of technology for bringing those concepts to life. There are countless examples that could be shown, but for the purposes of this post, we will present a solution within the SAP toolset. The end result is an implementation environment where the EDM technologies work hand-in-hand with ML/AI tools to help automate and streamline both these processes.
SAP’s preferred platform for ML/AI is SAP Data Intelligence (DI). When it comes to EDM, SAP has a vast suite of tools that store, transfer, process, harness, and visualize data. We will focus on the four tools that we believe have the most significant impact on ML/AI initiatives implemented on DI: SAP Master Data Governance (MDG), the Metadata Explorer component of SAP Data Intelligence (DI), to a smaller extent SAP Information Steward (IS), and SAP Data Warehouse Cloud (DWC), which can be used to bring all the mastered and cleansed data together and to store and visualize the ML outputs.
Architecture
As with any other enterprise data solution, the challenge is to integrate a set of tools that delivers the needed value without the overhead of moving and storing data in multiple places, or the added infrastructure, usage and support costs. For enterprises that run on SAP systems, a high-level architecture that would achieve these benefits is shown below, followed by descriptions of the tools.
Figure 2 – High-level MDG/DI architecture and data flow
1. SAP MDG (Master Data Governance) with MDI (Master Data Integration)
SAP MDG and MDI go hand in hand. MDI is provided with the SAP Cloud Platform. By establishing One Domain Model (ODM), it enables communication across various SAP applications and a consistent view of master data across end-to-end scenarios.
SAP MDG is available in S/4HANA-based and ERP-based versions. This tool helps ensure high-quality, trusted master data, both initially and on an ongoing basis. It can become a key part of the enterprise MDM and data governance program. Both active and passive governance are supported. Based on business needs, certain domains are prioritized out of the box in MDG. MDG provides capabilities such as Consolidation, Mass Processing and Central Governance, coupled with governance workflows for Create-Read-Update-Delete (CRUD) processes.
SAP has recently announced SAP MDG, cloud edition. While it is not a replacement for MDG on S/4HANA, MDG cloud edition is planned to come with core MDG capabilities like Consolidation, Centralization and Data Quality Management to centrally manage core attributes of Business Partner data. This is a useful “very quick start” option for customers who have never used MDG, but it can also help customers already using MDG on S/4HANA to build out their landscape to a federated MDG approach that better balances centralized and decentralized master data.
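To make the idea of Consolidation more concrete, the following is a minimal, tool-agnostic sketch in Python. It is not MDG code and does not use any SAP API. The field names, the match key (normalized name plus country) and the "first non-empty value wins" survivorship rule are all assumptions chosen for illustration.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Build a crude match key: lowercase, keep only letters and digits."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def consolidate(records):
    """Group records by match key and merge each group into one golden record."""
    groups = defaultdict(list)
    for rec in records:
        # Hypothetical matching rule: same normalized name and same country.
        groups[(normalize(rec["name"]), rec["country"])].append(rec)

    golden = []
    for group in groups.values():
        merged = {"sources": [r["source"] for r in group]}
        for field in ("name", "country", "tax_id"):
            # Hypothetical survivorship rule: first non-empty value wins.
            merged[field] = next((r[field] for r in group if r.get(field)), None)
        golden.append(merged)
    return golden


# Illustrative Business Partner records from two hypothetical source systems.
records = [
    {"source": "ERP", "name": "Acme Corp.", "country": "US", "tax_id": ""},
    {"source": "CRM", "name": "ACME Corp", "country": "US", "tax_id": "12-3456789"},
    {"source": "CRM", "name": "Globex GmbH", "country": "DE", "tax_id": "DE999"},
]
golden_records = consolidate(records)
```

In this sketch the two Acme records collapse into a single golden record that keeps the tax ID found in only one source. A governed tool adds what the sketch leaves out: configurable match rules, approval workflows and an audit trail.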
2. Data Intelligence (with Metadata Explorer component)
SAP IS and MDG are the pathways that make enriched, trusted data available to Data Intelligence, which is used to actually build the ML/AI models. We can reuse SAP IS rules and metadata terms directly in SAP DI. This is achieved in DI by utilizing its data integration, orchestration, and streaming capabilities. DI’s Metadata Explorer component also facilitates the flow of business rules, metadata, glossaries, catalogs, and definitions to tools like IS (on-prem) to ensure consistency and governance of data. Metadata Explorer is geared towards the discovery, movement and preparation of data assets that are spread across diverse and disparate enterprise systems, including cloud-based ones.
3. Information Steward (IS)
Information Steward is an optional tool, useful for profiling data, especially in on-prem situations. The data quality effort can be initiated by creating the required data quality business rules, then profiling the data and running Information Steward to assess data quality. This is the first step towards initial data cleansing, and from there data remediation, using a passive governance approach via quality dashboards and reports. (Many of these features are also available in MDG and DI.) SAP IS helps an enterprise address general data quality issues before using specialized tools like SAP MDG to address master data issues. It can be an optional part of any ongoing data quality improvement initiative for an enterprise.
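The rule-then-profile sequence described above can be sketched generically in Python. This is not Information Steward syntax. The rule names, the record fields and the pass-rate scorecard are hypothetical, and show only the kind of result a data quality dashboard would report.

```python
import re

# Hypothetical data quality business rules: each takes a record, returns True on pass.
RULES = {
    "email_format": lambda r: bool(
        re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", r.get("email") or "")
    ),
    "country_present": lambda r: bool(r.get("country")),
    "postal_code_5_digits": lambda r: bool(
        re.fullmatch(r"\d{5}", r.get("postal_code") or "")
    ),
}


def profile(records, rules):
    """Return the percentage of records that pass each rule."""
    total = len(records)
    scores = {}
    for name, check in rules.items():
        passed = sum(1 for rec in records if check(rec))
        scores[name] = round(100.0 * passed / total, 1) if total else 0.0
    return scores


# Illustrative customer records (made up for this sketch).
customers = [
    {"email": "a@example.com", "country": "US", "postal_code": "30301"},
    {"email": "not-an-email", "country": "", "postal_code": "3030"},
    {"email": "b@example.org", "country": "DE", "postal_code": "10115"},
    {"email": None, "country": "US", "postal_code": "60601"},
]
scorecard = profile(customers, RULES)
# scorecard maps each rule name to its pass rate in percent
```

A scorecard like this supports passive governance: it reports where the data falls short of each rule and leaves remediation to a follow-up cleansing step.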
4. Data Warehouse Cloud (DWC)
Data Warehouse Cloud is used in this architecture to bring all the mastered and cleansed data together into the cloud, perform any other data preparation or transformations needed, and model the data into the format needed by the ML models in DI. DWC is also used to store the results of the ML models, and to create visualizations of these results for data consumers.
Figure 3 – Summary of Functionality of SAP tools used for EDM
While there are some overlaps in functionality between these tools, Data Intelligence is more focused on the automation aspects of these capabilities. DI is primarily intended as an ML platform, and therefore has functionality such as the ability to create data models and organize the data in a format that facilitates the ML/AI process (ML Data Manager). This architecture capitalizes on the EDM strengths of MDG and IS. It is also consistent with the strategic direction of SAP, that is, providing a comprehensive “Business Transformation as a Service” approach, leading with cloud services. Together, these tools complement one another (for hybrid on-prem plus cloud scenarios) to make trusted data available to AI/ML.
Conclusion
In summary, the SAP ecosystem has several EDM tools that can help address the data quality and data prep challenges of the ML/AI process. SAP tools like MDG and the DI Metadata Explorer component have features and integration capabilities that can be leveraged while ML/AI use cases are underway, or even before they begin. If used in conjunction with the general EDM maturity recommendations summarized above, these tools will help to deliver the full business value and expectations of ML/AI use cases.
In our next post, we will continue our discussion on EDM tools, some of their newer features, how they have evolved, and how ML/AI has been part of their own evolution. As a reminder, if you missed the first post in this series, you can find it here: “Master Your ML/AI Success with Enterprise Data Management”.
Inspired Intellect is an end-to-end service provider of data management, analytics and application development. We engage through a portfolio of offerings ranging from strategic advisory and design, to development and deployment, through to sustained operations and managed services. Learn how Inspired Intellect’s EDM and ML/AI strategy and solutions can help bring greater value to your analytics initiatives by contacting us at [email protected].
LinkedIn https://www.linkedin.com/company/inspired-intellect/
Editor’s Note – I co-authored this blog with my colleague, Pravin Bhute, who serves as an MDM Architect for our partner organization, WorldLink.