Data harmonisation refers to all efforts to combine data from different sources. It aims to bring together data of various types, levels and sources so that they are compatible, comparable, reconciled and free of duplicates, and thus useful for decision making. Data harmonisation offers a unique opportunity to merge databases from studies with different methodologies and measurements that could not otherwise be merged into one larger database. Data harmonisation techniques include artificial intelligence, machine learning and map projection.
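As a rough illustration of the idea above, the following Python sketch pools two studies that record the same measurements under different conventions. The study records, the field names (`weight_kg`, `weight_lb`, `sex`) and the target schema are all hypothetical and chosen only for this example; real harmonisation projects usually need far richer mapping rules.

```python
# Hypothetical records from two studies that use different conventions.
study_a = [{"id": "A1", "weight_kg": 70.0, "sex": "F"}]
study_b = [{"id": "B7", "weight_lb": 154.0, "sex": "female"}]

LB_TO_KG = 0.45359237  # exact definition of the international pound
SEX_CODES = {"f": "F", "female": "F", "m": "M", "male": "M"}  # assumed coding


def harmonise(record):
    """Map a record from either study onto one shared (assumed) schema."""
    if "weight_kg" in record:
        weight = record["weight_kg"]
    else:
        weight = record["weight_lb"] * LB_TO_KG
    return {
        "id": record["id"],
        "weight_kg": round(weight, 1),
        "sex": SEX_CODES[record["sex"].lower()],
    }


# Once both studies share a schema they can be pooled into one dataset.
pooled = [harmonise(r) for r in study_a + study_b]
```

After this step both records carry weight in kilograms and the same sex coding, so they can be analysed together.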
Data linkage refers to all efforts to integrate data from different sources about the same person or entity to create a new, richer dataset. Data linkage methods are usually deterministic or probabilistic, but can be a combination of both. In deterministic linkage, the records must agree exactly on every character of every key variable to conclude that they correspond to the same entity. Probabilistic data linkage relies on calculating weights or scores for each key variable, based on the agreements and disagreements between the records.
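The difference between the two approaches can be sketched in a few lines of Python. This is a minimal sketch, not a production method: the key variables, the example records and the m/u probabilities are invented for illustration, and the log-ratio weighting follows the commonly used Fellegi-Sunter style of scoring, which is one possible way to compute agreement and disagreement weights.

```python
import math

KEYS = ("first_name", "last_name", "birth_date")  # assumed key variables

# Assumed probabilities per key variable (illustrative values only):
# m = P(values agree | records are the same entity)
# u = P(values agree | records are different entities)
M_U = {
    "first_name": (0.95, 0.01),
    "last_name": (0.95, 0.005),
    "birth_date": (0.98, 0.003),
}


def deterministic_match(a, b):
    """Link only if every key variable agrees exactly."""
    return all(a[k] == b[k] for k in KEYS)


def probabilistic_score(a, b):
    """Sum an agreement or disagreement weight for each key variable."""
    score = 0.0
    for k in KEYS:
        m, u = M_U[k]
        if a[k] == b[k]:
            score += math.log2(m / u)              # agreement weight
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement weight
    return score


rec_a = {"first_name": "Maria", "last_name": "Lopez", "birth_date": "1980-02-14"}
rec_b = {"first_name": "Maria", "last_name": "Lopes", "birth_date": "1980-02-14"}

exact = deterministic_match(rec_a, rec_b)   # False: the surnames differ
score = probabilistic_score(rec_a, rec_b)   # still positive overall
```

Here a single spelling difference in the surname blocks the deterministic link, while the probabilistic score stays positive because the other two variables agree. In practice the score would be compared against thresholds chosen for the data at hand.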
Data linkage and data harmonisation can complement each other to produce the richest possible database to assist with decision making. Both areas are increasingly popular, and the high computational capacity they require is now available. However, consensus on how to address important barriers, such as health data governance, has yet to be reached.