The Surprising Dangers in a Data Migration
A data readiness and migration project is a critical part of virtually all enterprise-wide initiatives, such as digital transformation. The initiative may be a new on-premises application, a set of data mapped for loading into a cloud application, or a Big Data program. Invariably it also requires ongoing data integration procedures. The collective data requirement is sometimes large and complicated (think of digital transformation), and sometimes small and straightforward (think of a new set of BI dashboards).
Regardless of how big or small it is, many IT professionals underestimate how unique and risky a data readiness and migration project can be, especially when compared to a typical application code development project. How often have data professionals heard the phrase, ‘it’s only data’? Underestimating the scope and complexity of data readiness and migration can lead to disaster for the enterprise.
We don’t do data
When considering the dangers associated with data migration, we must start by recognizing the risk of treating data-focused work as an afterthought or a side project. What if I were to tell you that I’m delivering your new car next month, but the fuel system is not in scope, and I cannot provide the production status or guarantees? Further, I don’t know if it will be electric or hybrid. That would be crazy talk. And, yet, faced with a data-centric project, most IT implementation teams echo those same sentiments. They do one or more of the following:
1. Assign the data readiness scope to another group or outsource it.
2. Explicitly call out that ‘we don’t do data’, leaving the scope undefined.
3. Expect the application owner, aka ‘the business’, to suddenly learn how to analyze and cleanse data for a new application or business process.
So, if you were about to purchase a new automobile, would you accept the salesman saying, ‘we don’t build the fuel system?’ Of course not. Yet, data is the fuel for your ERP, CRM and BI applications, and migration is the system that delivers the data. Moreover, data is the fuel that powers digital transformation and, of course, it is the fuel for big data initiatives.
Fit for use
When it comes to delivering a data set that is complete and ready to migrate into a new application, data readiness experts understand there are proven best practices that must be followed. The following are four of many conventions that should be considered during the initial planning phases and throughout a typical data readiness project:
1. Delivering fit for use data. Because data readiness and migration are usually part of comprehensive IT projects with far-reaching implications (e.g., digital transformation, acquisition of a separate company, global expansion), the data team must be fully integrated as part of the overall development or discovery project, with clear accountability for delivering data that is ‘fit for use’. Data readiness should be incorporated in every phase of the project, and the requirements should be clearly understood long before testing begins. The requirements must reflect the end data product, what you are delivering, in addition to the strategy for moving the data.
2. Project-specific approach. The approach to integration differs by type of project. A few examples include:
Large ERP re-implementation – Data migration is based on transforming source data into target-ready data sets. This process requires data quality mitigation at multiple steps: at the source, as part of the transformation, and often at the target after go-live. Mapping requirements need to be drafted early and maintained, as they will change over time. Typically, there will be a data lead assigned to every functional area, to ensure that data migration keeps pace with any requirements updates. For example, when mapping customer and vendor data to a new financial application, such as SAP S/4 Central Finance, you will ensure the quality of the source data, you will transform and harmonize to a ‘business partner’ entity that represents both customer and vendor, and you will add detail as part of value mapping at load. This level of complexity requires a highly integrated team of professionals addressing both the data migration and the business requirements associated with the business partner data.
Agile development projects – Data readiness requirements can change with each development sprint, so the data work must be factored into every scope and delivery decision. For example, if we were developing a series of analytics dashboards, we would include specific data validation requirements in our planning for each sprint.
Discovery programs, including big data – Data readiness focuses on data curation, the ability to represent the quality of the source data and needed harmonization across sources. Data analysts and scientists as data consumers expect a clear understanding of baseline quality, which is defined early in the data readiness effort. For example, if you are working with Automated Patient Chart data, there is a mandate that you cannot change the data. The data readiness team works with the data discovery team to determine what data quality thresholds are needed, and what data quality metrics apply to each record. The data need not change. Data scientists only need access to the associated data quality measures, at the record level.
3. Keep the end in mind. Data quality needs to be considered early in any project; you can never start too soon. Creating a data set that is fit for use is not an easy task, nor is it inexpensive, even with useful software tools for data quality and integration. Above all, this exercise requires a fundamental understanding that you are delivering data; you are not delivering code.
Irrespective of the type of project undertaken, you will use software to transform and move your data. However, software is only a means to an end. Useful data is the end objective. Parameters for success should not be related to testing code, or the number of records moved. Success metrics will focus on whether the data is fit for use, or not.
4. Going fast does not save time. Too many teams underestimate this effort and ‘rush to code’ in the belief that this will speed things up. It will not. Coding to incomplete requirements always slows things down, and the impact is large: project delays and sometimes outright failure. We cannot design a car, much less build one, without knowing what type of fuel system is required. Yet when it comes to data readiness, that is precisely what some project teams attempt to do, even to this day. You want to know that your car will work, not that the factory that built it is working.
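To make the ERP example above more concrete, the harmonization of customer and vendor records into a single ‘business partner’ entity can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (id, name), the BusinessPartner class and the rule of matching on a normalized name are all hypothetical. They are not taken from SAP or any other product, and a real project would agree on richer matching rules (tax IDs, addresses) as part of the mapping requirements.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessPartner:
    """Hypothetical target entity that can act as customer, vendor, or both."""
    name: str
    roles: set = field(default_factory=set)
    source_ids: list = field(default_factory=list)


def clean_name(name: str) -> str:
    """Trim the name and collapse repeated whitespace."""
    return " ".join(name.split())


def harmonize(customers, vendors):
    """Merge customer and vendor rows into business partners.

    Assumption: two rows describe the same partner when their cleaned,
    lowercased names are equal. Real matching rules are usually stricter.
    """
    partners = {}
    for role, rows in (("customer", customers), ("vendor", vendors)):
        for row in rows:
            display_name = clean_name(row["name"])
            key = display_name.lower()
            partner = partners.setdefault(key, BusinessPartner(name=display_name))
            partner.roles.add(role)
            # Keep lineage back to the source systems for later reconciliation.
            partner.source_ids.append(f"{role}:{row['id']}")
    return list(partners.values())


# Illustrative source rows, not real data.
customers = [{"id": "C100", "name": "Acme  Corp"}, {"id": "C101", "name": "Globex"}]
vendors = [{"id": "V900", "name": "acme corp"}]

partners = harmonize(customers, vendors)
for partner in partners:
    print(partner.name, sorted(partner.roles), partner.source_ids)
```

Even in a toy version like this, the matching rule is a business decision rather than a coding decision, which is why the mapping requirements need to be drafted early and owned by a data lead.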
Data that takes you where you need to go
The risks associated with data readiness and migration are not insignificant. They range from project delays and cost overruns, to implementations that do not meet business needs, to data that cannot support business processes and, on occasion, canceled projects. For example, if your CRM objective is to improve customer relationships, then you need to have good customer data. Bad data might lead to contacting the same customer multiple times because of duplicate records, not knowing the contact name, or having no visibility into history.
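The duplicate-record risk in the CRM example can be checked before load rather than discovered after go-live. The following is a minimal sketch, assuming customer records arrive as dictionaries with hypothetical id, name and email fields, and assuming that two records are duplicates when they share an email address or, failing that, a normalized name. Actual matching rules would come from the business requirements.

```python
from collections import defaultdict


def match_key(record):
    """Build a comparison key: the email if present, otherwise the normalized name."""
    email = record.get("email", "").strip().lower()
    if email:
        return ("email", email)
    name = " ".join(record.get("name", "").lower().split())
    return ("name", name)


def find_duplicates(records):
    """Return groups of record ids that share the same match key."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Illustrative records, not real data.
records = [
    {"id": 1, "name": "Ana Ruiz", "email": "ana@example.com"},
    {"id": 2, "name": "A. Ruiz", "email": "ANA@example.com "},
    {"id": 3, "name": "Bo Chen", "email": ""},
    {"id": 4, "name": "bo  chen", "email": ""},
    {"id": 5, "name": "Cy Doe", "email": "cy@example.com"},
]

duplicate_groups = find_duplicates(records)
print(duplicate_groups)
```

The check reports candidate groups and leaves the decision about which record survives to the data owner.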
Quality data is imperative to success. Managing data readiness projects requires distinct approaches to staffing, detailed analysis of content, a comprehensive understanding of how the data will support business needs, exact timing/sequencing, and, of course, testing.
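One way to make ‘fit for use’ testable is to attach quality measures to each record without altering the record itself, as in the discovery example earlier where the source data may not be changed. The sketch below is an assumption-laden illustration: the required field names, the completeness measure and the threshold are hypothetical, and a real project would define its own measures with the data consumers.

```python
# Hypothetical required fields; a real project defines these with the data consumers.
REQUIRED_FIELDS = ("patient_id", "visit_date", "diagnosis_code")


def score_record(record):
    """Return quality measures for one record without modifying it."""
    missing = [f for f in REQUIRED_FIELDS if not str(record.get(f) or "").strip()]
    completeness = 1 - len(missing) / len(REQUIRED_FIELDS)
    return {"completeness": round(completeness, 2), "missing": missing}


def assess(records, threshold=0.95):
    """Score every record and decide whether the data set meets the threshold.

    The threshold is the share of fully complete records required for the
    data set to count as fit for use (an assumed definition).
    """
    scores = [score_record(r) for r in records]
    complete = sum(1 for s in scores if not s["missing"])
    share = complete / len(records) if records else 0.0
    return {"scores": scores, "share_complete": share, "fit_for_use": share >= threshold}


# Illustrative records, not real data.
records = [
    {"patient_id": "P1", "visit_date": "2020-01-05", "diagnosis_code": "J10"},
    {"patient_id": "P2", "visit_date": "", "diagnosis_code": "E11"},
]

report = assess(records, threshold=0.9)
print(report["share_complete"], report["fit_for_use"])
for score in report["scores"]:
    print(score)
```

The measure of success here is a statement about the data (the share of records that meet the agreed definition), not a count of records moved or code tests passed.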
Your most impactful implementations will have data that is fit for use, supports business process needs, and possibly enables new operational insights. We can build an entire car in a few weeks, but if we want it to provide transportation, the fuel system needs to be functional, reliable and integrated with the rest of the components. The same principle applies to data readiness and migration.