K-Means Clustering Analysis For Identifying Product Purchase Patterns Based On Country On E-Commerce Platforms

Authors

  • Ryan Arya Pramudya, Universitas 17 Agustus 1945 Semarang
  • Vinsent Brilian Adiguna, Universitas 17 Agustus 1945 Semarang

DOI:

https://doi.org/10.31004/riggs.v4i4.3296

Keywords:

K-means, Clustering, E-Commerce, Marketing Strategy, Elbow Method

Abstract

E-commerce platforms collect large volumes of transaction data from customers in many countries, whose product preferences differ. Understanding how purchases vary by country and product category is important for designing more effective and efficient marketing strategies. This study identifies product purchasing patterns by country by applying the K-Means clustering algorithm to international e-commerce transaction data. The dataset comprises 6,000 e-commerce transaction records described by two primary variables: country and product category. Preprocessing consisted of three steps: missing values were replaced, nominal (categorical) attributes were converted to numerical form, and a Z-transformation was applied to normalize all attributes to the same scale. K-Means was then run with k = 2, 5, 10, 15, 20, and 25, and the best number of clusters was selected using the average within centroid distance metric together with the elbow method. The elbow analysis indicated k = 10 as the best number of clusters, the point at which the average within centroid distance showed a large drop. The ten resulting K-Means clusters show very specific market segmentation, with each cluster having its own dominant set of countries and product categories.
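
The workflow described in the abstract (replace missing values, convert nominal attributes to numbers, apply a Z-transformation, then run K-Means for several values of k and compare the average within centroid distance) can be sketched roughly as follows. This is a minimal illustration in standard-library Python, not the authors' implementation. The toy records, the function names, the integer coding of categories, the most-frequent-value imputation, and the definition of the metric as the mean Euclidean distance from each record to its nearest centroid are all assumptions made for the example.

```python
import math
import random
from statistics import mean, pstdev

def encode_nominal(values):
    """Fill missing labels with the most frequent one, then map labels to integer codes.
    (Assumption: the abstract does not say which replacement or coding scheme was used.)"""
    present = [v for v in values if v is not None]
    mode = max(sorted(set(present)), key=present.count)
    filled = [mode if v is None else v for v in values]
    codes = {label: i for i, label in enumerate(sorted(set(filled)))}
    return [float(codes[v]) for v in filled]

def z_transform(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma if sigma else 0.0 for x in column]

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd-style K-Means; initial centroids are k distinct data points."""
    rng = random.Random(seed)
    centroids = rng.sample(sorted(set(points)), k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        updated = [
            tuple(mean(dim) for dim in zip(*members)) if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:  # assignments have stabilised
            break
        centroids = updated
    return centroids

def avg_within_centroid_distance(points, centroids):
    """Mean distance from each point to its nearest centroid (assumed definition)."""
    return mean(min(math.dist(p, c) for c in centroids) for p in points)

# Hypothetical toy data: 6 countries x 5 categories, repeated 20 times (600 records).
countries = ["Brazil", "France", "India", "Indonesia", "Japan", "USA"]
categories = ["Books", "Electronics", "Fashion", "Home", "Toys"]
records = [(c, p) for c in countries for p in categories] * 20
records[3] = (None, "Home")     # simulate a missing country
records[10] = ("India", None)   # simulate a missing product category

country_z = z_transform(encode_nominal([r[0] for r in records]))
category_z = z_transform(encode_nominal([r[1] for r in records]))
points = list(zip(country_z, category_z))

# Sweep the k values listed in the abstract and record the metric for each.
scores = {}
for k in (2, 5, 10, 15, 20, 25):
    scores[k] = avg_within_centroid_distance(points, kmeans(points, k))
    print(f"k={k:>2}  avg within centroid distance = {scores[k]:.4f}")
```

The elbow is then read off the printed curve as the k beyond which further increases reduce the distance only slightly. Note that integer coding imposes an arbitrary order on countries and categories; one-hot (dummy) coding is a common alternative, and which of the two the study used is not stated in the abstract.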

Author Biography

Vinsent Brilian Adiguna, Universitas 17 Agustus 1945 Semarang

Digital Business, FEB UNTAG Semarang

Published

03-11-2025

How to Cite

[1]
R. A. Pramudya and V. B. Adiguna, “K-Means Clustering Analysis For Identifying Product Purchase Patterns Based On Country On E-Commerce Platforms”, RIGGS, vol. 4, no. 4, pp. 83–88, Nov. 2025.