Web Scraping for Price Statistics in the Philippines 2021
Authors: Josefina V. Almeda, Ph.D., Manuel Leonard F. Albis, Jessa S. Lopez, Maegan S. Saroca & Shushimita G. Pelayo
Abstract:
Official price statistics in the Philippines are sourced mainly from regular surveys and censuses, which entail high costs. As businesses move onto digital platforms, alternatives to these traditional data sources have become more available. One of these is web scraping, the process of collecting information from the web. With online platforms increasingly used for commerce, web scraping offers a way to increase the frequency of data collection while reducing its cost relative to price surveys. This paper computes an online Consumer Price Index (CPI) for the National Capital Region (NCR) and compares it with the official CPI for NCR calculated by the Philippine Statistics Authority (PSA). The study introduces a hybrid approach to computing the online CPI and presents the results of the five-month official run of the developed web scraping programs. Finally, the paper provides recommendations for future web scraping projects in the Philippines.
Keywords: web scraping, online prices, CPI, PCOICOP, R, RSelenium, rvest