Outlier detection methodologies for alternative data sources: International review of current practices

Publication date: 2 Dec 2020 | Publication type: NIESR Discussion Paper | NIESR Author(s): Boshoff, J; Mao, X; Young, G | JEL Classification: C43, E31 | NIESR Discussion Paper Number: 523

This research has been funded by the Office for National Statistics as part of the research programme of the Economic Statistics Centre of Excellence (ESCoE). This paper was first published in July 2020 as "Outlier detection methodologies for alternative data sources: International review of current practices" (ESCoE TR-07) by Janine Boshoff, Xuxin Mao and Garry Young.

 

Abstract

The construction of consumer price indexes (CPI) has historically relied on manually and centrally collected price data. As point-of-sale (POS) scanner data and web-scraped data become more accessible, these alternative data represent a rich new source for producing consumer price information. While outlier detection methodologies are well established for traditional data sources, more research is required to better understand the unique quality and format of the alternative data. Several national statistical institutions (NSIs) have already started to conduct research into alternative data sources and the outlier detection methodologies that are necessary before these data can be incorporated into CPI calculations. This project reviews the outlier detection methodologies adopted by NSIs that have started to incorporate alternative data sources in their calculation of CPI.
 

Keyword tags: 
consumer price index
multilateral indices
outlier detection
scanner data
web-scraped data