Supporting an Expert-centric Process of New Product Introduction With Statistical Machine Learning




Keywords: Demand Forecasting, New Product Introduction, Statistical Machine Learning, Gradient Boosting, XGBoost


Industries that sell products with short or seasonal life cycles must regularly introduce new products. Forecasting demand for a New Product Introduction (NPI) is challenging because of fluctuations in factors such as trend and seasonality, as well as external and unpredictable phenomena (e.g., the COVID-19 pandemic). Traditionally, NPI is an expert-centric process. This paper presents a study on automating the forecasting of NPI demand with statistical Machine Learning (namely, Gradient Boosting and XGBoost). We show how to overcome shortcomings of the traditional data preparation that underpins the manual process. Moreover, we illustrate the role of cross-validation techniques in hyper-parameter tuning and model validation. Finally, we provide empirical evidence that statistical Machine Learning can forecast NPI demand better than experts.
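The combination described in the abstract, a gradient-boosted regressor tuned with time-ordered cross-validation, can be sketched as follows. This is a minimal illustration, not the paper's actual pipeline: the features (seasonality proxy, price, demand of a similar existing product) and the synthetic data are assumptions, and scikit-learn's GradientBoostingRegressor stands in for XGBoost, whose usage is analogous.

```python
# Hypothetical sketch: tuning a gradient-boosting demand forecaster with
# time-ordered cross-validation. Feature names and data are illustrative
# assumptions, not the paper's data schema.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

rng = np.random.default_rng(0)
n = 200
season = np.sin(2 * np.pi * np.arange(n) / 52)          # weekly-seasonality proxy
price = rng.uniform(10, 20, n)                          # assumed price feature
similar_demand = 100 + 30 * season + rng.normal(0, 5, n)  # demand of a comparable product
X = np.column_stack([season, price, similar_demand])
y = 50 + 0.8 * similar_demand - 2.0 * price + rng.normal(0, 3, n)

# TimeSeriesSplit keeps each validation fold strictly after its training fold,
# so hyper-parameter tuning never peeks into the future of the demand series.
cv = TimeSeriesSplit(n_splits=5)
search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [2, 3]},
    cv=cv,
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)
print(search.best_params_)
```

Swapping in `xgboost.XGBRegressor` for `GradientBoostingRegressor` leaves the cross-validation logic unchanged, since both expose the scikit-learn estimator interface.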



