Outlier detection methodologies for alternative data sources: International review of current practices
This research has been funded by the Office for National Statistics as part of the research programme of the Economic Statistics Centre of Excellence (ESCoE). This paper was first published in July 2020: “Outlier detection methodologies for alternative data sources: International review of current practices” (ESCoE TR-07) by Janine Boshoff, Xuxin Mao and Garry Young.
The construction of consumer price indexes (CPI) has historically relied on manually and centrally collected price data. As point-of-sale (POS) scanner data and web-scraped data become more accessible, these alternative data represent a rich new source for producing consumer price information. While outlier detection methodologies are well established for traditional data sources, more research is required to understand the distinct quality and format of alternative data. Several national statistical institutions (NSIs) have already started to research alternative data sources and the outlier detection methodologies that are needed before these data can be incorporated into CPI calculations. This project reviews the outlier detection methodologies adopted by NSIs that have started to incorporate alternative data sources into their calculation of CPI.