Web Scraping for Price Statistics in the Philippines
Authors: Manuel Leonard F. Albis, Maegan S. Saroca, Shushimita G. Pelayo & Jessa S. Lopez
Official price statistics in the Philippines are mostly sourced from regular surveys and censuses, the conduct of which entails high costs. As businesses move to digital platforms, alternatives to these traditional data sources have become available, one of which is web scraping, the process of collecting information from the web. As digital and online platforms are increasingly used for commerce, web scraping may help reduce the frequency and cost of price surveys, which are subject to time and resource constraints. This paper aims to determine the feasibility of web scraping for collecting price data for the calculation of the Consumer Price Index (CPI) in the Philippines. The study includes a pilot run of a four-stage web scraping process conducted three times over one week (Monday, Wednesday, and Friday) through an automated platform developed in R and designed to scrape prices at regular intervals. The paper also presents a web scraping feasibility assessment for each major division of the Philippine Classification of Individual Consumption According to Purpose (PCOICOP). Finally, it provides recommendations for future web scraping projects in the Philippines.
Keywords: web scraping, online prices, CPI, PCOICOP, R