Data Migration is defined as the process of selecting, preparing, extracting, and transforming data and permanently transferring it from one computer storage system to another.
Data migration has a reputation for being risky and difficult, and it is certainly not an easy process. It is time-consuming, involves many planning and implementation steps, and always carries risk, as any project of this magnitude does.
Without a sufficient understanding of both source and target, transferring data into a more sophisticated application will amplify the negative impact of any incorrect or irrelevant data, perpetuate any hidden legacy problem and increase exposure to risk. A data migration project can be a challenge because administrators must maintain data integrity, time the project correctly to minimize the impact on the business and keep an eye on costs.
However, following a structured methodology will reduce the pain of managing complex data migration.
On May 14, Arithmos hosted a complimentary webinar on performing a successful data migration in Life Sciences. The webinar was conducted by Alessandro Longoni, Arithmos Senior Project Manager & Data Analyst, and focused on the challenges in data migration and ways of overcoming them.
Continue reading for tips on performing a successful data migration in Life Sciences.
Tip 1 – Understanding the data
Before starting the data migration, you have to prepare your data: assess what is present in the legacy system, understand clearly which data needs to be migrated, avoid duplication, and promote quality and standardization.
We can divide the assessment of the legacy system into two macro categories:
- Assessment of the data meaning
- Assessment of the data quality
Every piece of data that you move has to be validated, cleaned and transformed. In data migration projects, migrating only relevant data ensures efficiency and cost control.
Understanding how the source data will be used in the target system is necessary for defining what to migrate. It is important to look at how people are using existing data elements. Are people using specific fields in a variety of different ways that need to be considered when mapping out the new system’s fields?
The second macro area is the assessment of the quality of the data. It is very important to define a process for measuring data quality early in the project, in order to obtain details of every single piece of data and to identify unused fields or obsolete records that may have undergone multiple migrations. It is also important to avoid migrating duplicate or irrelevant records.
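As a rough illustration of what an early data quality measurement could look like, the sketch below computes a fill rate per field and flags duplicate candidates. The record layout, the field names and the `profile_records` helper are hypothetical and only serve the example; a real assessment would run against the actual legacy extract.

```python
from collections import Counter


def profile_records(records, key_fields):
    """Return the fill rate per field and the groups of duplicate candidates."""
    total = len(records)
    if total == 0:
        return {}, {}
    fields = sorted({name for rec in records for name in rec})
    # Share of records in which each field holds a non-empty value.
    fill_rate = {
        f: sum(1 for rec in records if rec.get(f) not in (None, "")) / total
        for f in fields
    }
    # Records that share the same normalized key are duplicate candidates.
    keys = Counter(
        tuple(str(rec.get(k, "")).strip().lower() for k in key_fields)
        for rec in records
    )
    duplicates = {key: count for key, count in keys.items() if count > 1}
    return fill_rate, duplicates


# Hypothetical legacy records, for illustration only.
legacy = [
    {"case_id": "C-001", "country": "IT", "fax": ""},
    {"case_id": "c-001 ", "country": "IT", "fax": ""},
    {"case_id": "C-002", "country": "", "fax": ""},
]
fill_rate, duplicates = profile_records(legacy, key_fields=["case_id"])
# Fields that are never filled are candidates for exclusion from the migration.
unused_fields = [f for f, rate in fill_rate.items() if rate == 0.0]
```

Here the `fax` field is never filled and the first two records collapse to the same key, so both would be raised with the business users before any mapping work starts.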
The quality analysis typically leads to a data cleaning activity.
Cleaning the source data is a key element of reducing data migration effort. This is usually the client’s responsibility. However, the client can be supported by the provider, who will perform specific data extractions, aggregations and normalizations in order to reduce the client’s effort.
Another way to clean data is to use migration scripts for cleaning purposes. It is important to understand that this kind of activity could create validation issues: by modifying the source data, we leave the perimeter of a pure data migration and create potential data integrity issues.
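One possible way to limit the validation concern described above is to make every scripted change traceable. The sketch below is an assumption about how that might be done, not a prescribed method: it applies cleaning rules to a copy of the records and logs each modification. The `clean_with_audit` helper, the rule set and the field names are hypothetical.

```python
import copy


def clean_with_audit(records, rules):
    """Apply cleaning rules to a copy of the records and log every change."""
    cleaned = copy.deepcopy(records)  # the source records are never modified
    audit_log = []
    for index, rec in enumerate(cleaned):
        for field, rule in rules.items():
            if field not in rec:
                continue
            old = rec[field]
            new = rule(old)
            if new != old:
                rec[field] = new
                # Keep old and new values so that each change can be reviewed.
                audit_log.append(
                    {"record": index, "field": field, "old": old, "new": new}
                )
    return cleaned, audit_log


# Hypothetical rule: normalize country codes to trimmed upper case.
rules = {"country": lambda value: value.strip().upper()}
source = [{"country": " it "}, {"country": "FR"}]
cleaned, audit_log = clean_with_audit(source, rules)
```

The audit log gives reviewers a list of exactly which values the script altered, which can then be checked against the agreed migration rules.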
Tip 2 – Project Governance
The best way to approach a data migration project is to define roles and responsibilities clearly and to avoid overlapping accountability. This can be done in several steps:
- Define the owner of the data in the target system
- Include the business users in decision-making. They understand the history, the structure and the meaning of the source data. Additionally, if business users are engaged in the data migration project, it will be easier for them to interact with migrated cases after GoLive.
- Rely on data migration experts. Each data migration requires assistance from experts who can fill the gap between business and IT; both are key stakeholders, but they are often unable to understand each other.
Based on our experience, what makes a difference is the presence of a business analyst. This is a person who acts as a bridge between the technical staff involved in the implementation of the migration and the businesspeople. The business analyst can explain technical requirements clearly, which really helps the business to define the migration rules based on how the target system will use the migrated data.
Tip 3 – Roll back & Dry Run
A roll back strategy has to be put in place in order to mitigate the risk of potential failures. The source data has to be accessed in read-only mode. This prevents any kind of data modification and ensures its integrity. Backups have to be performed on the target system so that it can be restored in case of failure.
An accurate data migration dry run makes it possible to execute the validation and production migrations without incidents or deviations. Procedures and processes have to be tested in order to check the completeness of the records and to ensure integrity and authenticity in accordance with the data migration rules and purposes.
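A dry run can be thought of as executing the migration logic without committing anything, and then reporting what would have happened. The following minimal sketch assumes a hypothetical `to_target` transformation and hypothetical field names; it only shows the completeness part of the check.

```python
def to_target(record):
    """Hypothetical transformation rule from the legacy to the target layout."""
    if not record.get("case_id"):
        raise ValueError("missing case_id")
    return {"id": record["case_id"], "status": record.get("status", "").upper()}


def dry_run(source_records, transform, key="case_id"):
    """Run the transformation without writing anything and report the outcome."""
    migrated, rejected = [], []
    for rec in source_records:
        try:
            migrated.append(transform(rec))
        except ValueError as exc:
            # Rejected records are collected instead of stopping the run.
            rejected.append((rec.get(key), str(exc)))
    return {
        "source_count": len(source_records),
        "migrated_count": len(migrated),
        "rejected": rejected,
        "complete": len(migrated) == len(source_records),
    }


# Hypothetical source records, for illustration only.
report = dry_run(
    [
        {"case_id": "C-001", "status": "open"},
        {"case_id": "", "status": "closed"},
    ],
    to_target,
)
```

In this example the report shows one rejected record, so the run is not complete and the rule or the source data would need attention before the validation migration.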
Tip 4 – The Importance of the Data Mapping Specification Document
The Data Mapping Specifications document is the core of data migration. It ensures a complete field mapping and is used to collect all mapping rules and exceptions.
This project phase is usually long and tiring for a number of reasons:
- Volume and amount of data details
- Technical activity with technical documents
- Little knowledge of dynamics of target database
- Compromises that have to be made
The Data Mapping Specifications document details all the rules related to the data that is migrated. The following tips can help you to produce it in the most efficient way:
- Clarify what has to be migrated and what shouldn’t be migrated
- Clean source data – this will reduce the number of fields to migrate
- Liaise with a business analyst who will translate technical requirements and help to explain how data will work in the target system
- Rely on data migration experts who have already performed similar data migrations in the past
- Avoid using an Excel sheet for mapping table fields – a more visual document with pictures will ease the conversation with the business users
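Alongside the visual document shared with business users, the agreed rules can also be kept in a machine-readable form so that migration scripts apply exactly what was specified. The sketch below is one hypothetical way to do this; the `FieldMapping` structure, the field names and the rules are assumptions made for illustration and are not part of any particular tool.

```python
from dataclasses import dataclass
from typing import Callable, Optional


def identity(value):
    """Default rule: migrate the value unchanged."""
    return value


@dataclass(frozen=True)
class FieldMapping:
    source_field: str
    target_field: Optional[str]  # None means "do not migrate"
    rule: Callable[[str], str] = identity
    note: str = ""


# Hypothetical mapping entries, for illustration only.
MAPPING = [
    FieldMapping("CASE_NO", "case_id", str.strip, "trim whitespace"),
    FieldMapping("CNTRY", "country_code", str.upper, "upper-case country code"),
    FieldMapping("FAX", None, note="obsolete, excluded from migration"),
]


def apply_mapping(record, mapping):
    """Build the target record from the fields that the mapping keeps."""
    mapped = {}
    for entry in mapping:
        if entry.target_field is None:
            continue
        mapped[entry.target_field] = entry.rule(record[entry.source_field])
    return mapped


def unmapped_fields(record, mapping):
    """List source fields that the specification does not mention at all."""
    known = {entry.source_field for entry in mapping}
    return sorted(set(record) - known)


target_record = apply_mapping(
    {"CASE_NO": " C-001 ", "CNTRY": "it", "FAX": "123"}, MAPPING
)
gaps = unmapped_fields({"CASE_NO": "x", "NOTES": "y"}, MAPPING)
```

Recording excluded fields explicitly, rather than leaving them out, makes the decision "what shouldn’t be migrated" visible, and `unmapped_fields` highlights source fields for which no decision has been taken yet.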
Tip 5 – Perform comprehensive validation testing
To ensure that the goals of the data migration strategy are achieved, a company needs to develop a solid data migration verification process. The data migration verification strategy needs to include ways to prove that the migration was successfully completed, and data integrity was maintained.
Tools or techniques that can be used for data migration verification include source vs target data integrity checks. A combination of manual and automatic checks can be used to cover all verification needs. The verification can include qualitative and quantitative checks.
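As an example of an automatic source vs target check, the sketch below combines a quantitative check (record counts) with a content check based on a per-record fingerprint. It assumes that both sides have already been expressed with the same field names, for instance after applying the mapping rules to the source extract; the `compare` helper and the sample records are hypothetical.

```python
import hashlib


def record_fingerprint(record, fields):
    """Hash the selected fields of a record into a comparable fingerprint."""
    payload = "|".join(str(record.get(f, "")) for f in fields)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def compare(source, target, key, fields):
    """Compare source and target records by count, presence and content."""
    src = {rec[key]: record_fingerprint(rec, fields) for rec in source}
    tgt = {rec[key]: record_fingerprint(rec, fields) for rec in target}
    return {
        "count_match": len(source) == len(target),
        "missing_in_target": sorted(set(src) - set(tgt)),
        "unexpected_in_target": sorted(set(tgt) - set(src)),
        "content_mismatch": sorted(
            k for k in src.keys() & tgt.keys() if src[k] != tgt[k]
        ),
    }


# Hypothetical records, for illustration only.
expected = [
    {"id": "C-001", "status": "OPEN"},
    {"id": "C-002", "status": "CLOSED"},
    {"id": "C-003", "status": "OPEN"},
]
migrated = [
    {"id": "C-001", "status": "OPEN"},
    {"id": "C-002", "status": "OPEN"},
]
result = compare(expected, migrated, key="id", fields=["id", "status"])
```

A check like this covers the quantitative side; qualitative checks, such as a business user reviewing selected cases in the target system, would still be performed manually.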
In order to carry out data verification, the migrated data must be processed through a workflow. This ensures that migrated data interacts properly with the target system functionalities.
The sampling strategy is defined in the Data migration plan and it should be driven by the risk impact of the data. 100% sampling is generally neither feasible nor appropriate, so standards such as ANSI/AQL are usually used to define the risk-based sampling strategy.
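To show the idea of risk-driven sampling, the sketch below draws a larger share of records for verification where the risk impact is higher. The percentages are placeholders chosen for the example and are NOT taken from ANSI/AQL tables; in a real project the sample sizes would come from the standard referenced in the Data migration plan. The `draw_samples` helper and the risk labels are hypothetical.

```python
import random

# Hypothetical sampling percentages per risk level (not from ANSI/AQL tables).
SAMPLING_PERCENT = {"high": 50, "medium": 20, "low": 5}


def draw_samples(records_by_risk, seed=0):
    """Randomly select records for verification, more where risk is higher."""
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced
    samples = {}
    for risk, records in records_by_risk.items():
        percent = SAMPLING_PERCENT[risk]
        # Round up so that any non-empty group yields at least one record.
        size = (len(records) * percent + 99) // 100
        samples[risk] = rng.sample(records, size)
    return samples


# Hypothetical record identifiers grouped by risk impact.
records_by_risk = {
    "high": [f"H-{i:03d}" for i in range(100)],
    "low": [f"L-{i:03d}" for i in range(100)],
}
samples = draw_samples(records_by_risk)
```

Using a fixed seed means the same sample can be regenerated later, which helps when the verification evidence has to be reviewed or audited.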
This article is based on the webinar “How to Perform a Successful Data Migration in Life Sciences” presented by Alessandro Longoni on May 14. Register today to get access to the webinar recording.