K-Means Clustering Analysis For Identifying Product Purchase Patterns Based On Country On E-Commerce Platforms
DOI: https://doi.org/10.31004/riggs.v4i4.3296
Keywords: K-means, Clustering, E-Commerce, Marketing Strategy, Elbow Method
Abstract
E-commerce platforms collect large volumes of transaction data from customers in different countries, whose product preferences vary. Understanding how purchases differ by country and product category is important for designing more effective and efficient marketing strategies. This study identifies product purchasing patterns by country by applying the K-Means clustering algorithm to international e-commerce transaction data. The dataset comprises 6,000 e-commerce transaction records described by two primary variables: country and product category. Preprocessing consisted of three steps: missing values were replaced, nominal (categorical) attributes were converted to numerical values, and a Z-transformation was applied to normalize all attributes to the same scale. The K-Means algorithm was run with k = 2, 5, 10, 15, 20, and 25, and the average within-centroid distance together with the elbow method was used to select the best number of clusters. The elbow analysis indicated k = 10 as the best number of clusters, the point at which the average within-centroid distance dropped sharply. The ten resulting K-Means clusters show very specific market segmentation, with each cluster characterized by its own set of dominant countries and product categories.
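The workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy records, the function names, the use of the most frequent value to fill missing entries, the integer coding of categories, and the definition of average within-centroid distance as the mean Euclidean distance from each record to its nearest centroid are all assumptions made for illustration.

```python
import math
import random
from collections import Counter
from statistics import mean, pstdev

def fill_missing(values):
    """Replace None with the most frequent value (assumed strategy)."""
    mode = Counter(v for v in values if v is not None).most_common(1)[0][0]
    return [mode if v is None else v for v in values]

def encode(values):
    """Map each distinct category to an integer code, in order of first appearance."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values]

def z_transform(column):
    """Rescale a numeric column to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(column), pstdev(column)
    return [(x - mu) / sigma if sigma else 0.0 for x in column]

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; initial centroids are k distinct data points."""
    rng = random.Random(seed)
    centroids = rng.sample(sorted(set(points)), k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members; keep it if the cluster is empty.
        updated = [tuple(mean(dim) for dim in zip(*members)) if members else centroids[i]
                   for i, members in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids

def avg_within_centroid_distance(points, centroids):
    """Mean Euclidean distance from each point to its nearest centroid (assumed definition)."""
    return mean(min(math.dist(p, c) for c in centroids) for p in points)

# Hypothetical toy records standing in for the 6,000 real transactions.
countries = ["ID", "US", "JP", "DE", "BR", "IN"]
categories = ["Fashion", "Electronics", "Books", "Toys", "Beauty"]
rows = [(c, p)
        for i, c in enumerate(countries)
        for j, p in enumerate(categories)
        for _ in range(1 + (i * j) % 4)]
rows += [(None, "Books"), ("US", None)]  # simulated missing values

country_col = z_transform(encode(fill_missing([r[0] for r in rows])))
category_col = z_transform(encode(fill_missing([r[1] for r in rows])))
points = list(zip(country_col, category_col))

# Evaluate the same k values as the study and print the elbow curve.
scores = {k: avg_within_centroid_distance(points, kmeans(points, k))
          for k in (2, 5, 10, 15, 20, 25)}
for k, score in scores.items():
    print(f"k={k:2d}  avg within-centroid distance={score:.4f}")
```

Reading the printed values from top to bottom gives the elbow curve; the k after which further increases yield only small reductions is the candidate number of clusters. Note that integer codes impose an arbitrary order on nominal values, so a real analysis might prefer one-hot encoding; the sketch keeps two numeric columns only to stay close to the two variables named in the abstract.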
License
Copyright (c) 2025 Ryan Arya Pramudya, Vinsent Brilian Adiguna

This work is licensed under a Creative Commons Attribution 4.0 International License.