Distributed Frequent Itemset Mining with Bitwise Method and Using the Gossip-Based Protocol

Hoda Rafieipour, Azadeh Abdollah Zadeh, Mehrdad Mirzaei

Abstract


Distributed systems are now prevalent and practical in network environments. In such systems, pattern recognition helps to extract information from network nodes. Data mining in these systems, however, must account for limited storage and computation time. The primary requirement is a scalable mechanism for distributing tasks across several databases. Moreover, centralized processing requires relocating data from all or some of the nodes to a central node, which carries confidentiality risks and traffic overhead. Distributed data mining in distributed environments therefore needs systematic, structured techniques. In this paper, we propose a new algorithm for extracting frequent itemsets in Wireless Sensor Networks. In this algorithm, each node's local frequent itemsets are obtained with a Bitwise approach, and nodes are grouped into clusters using the Low-Energy Adaptive Clustering Hierarchy (LEACH) algorithm. The cluster heads communicate through a Gossip-based protocol to obtain global support values, from which the globally frequent itemsets are extracted. The proposed algorithm was simulated in various scenarios using Java, and its efficiency was evaluated in terms of execution time and average accuracy. Our algorithm is compared with a Gossip-based algorithm, and the results show improvements in execution time.
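
The two building blocks named in the abstract can be illustrated with a minimal Java sketch. It is not the authors' implementation: the class and method names (`BitwiseGossipSketch`, `localSupport`, `gossipAverage`) are hypothetical, the LEACH clustering step is omitted, and the gossip step is assumed to be plain randomized pairwise averaging among cluster heads. Local support is counted by AND-ing one bit vector per item; the averaged value multiplied by the number of cluster heads then approximates the global support count.

```java
import java.util.BitSet;
import java.util.Random;

public class BitwiseGossipSketch {

    // Bit t of the result is set when transaction t contains the given item.
    static BitSet itemColumn(int[][] transactions, int item) {
        BitSet bits = new BitSet(transactions.length);
        for (int t = 0; t < transactions.length; t++) {
            for (int x : transactions[t]) {
                if (x == item) {
                    bits.set(t);
                    break;
                }
            }
        }
        return bits;
    }

    // Local support count of an itemset: AND the item columns, then count set bits.
    static int localSupport(int[][] transactions, int... itemset) {
        BitSet acc = new BitSet(transactions.length);
        acc.set(0, transactions.length); // start with every transaction selected
        for (int item : itemset) {
            acc.and(itemColumn(transactions, item));
        }
        return acc.cardinality();
    }

    // Assumed gossip scheme: in every round each node averages its value with
    // one random peer. The sum is preserved, so all values drift toward the mean.
    static double[] gossipAverage(double[] values, int rounds, long seed) {
        double[] v = values.clone();
        if (v.length < 2) {
            return v; // nothing to exchange
        }
        Random rnd = new Random(seed);
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < v.length; i++) {
                int j = rnd.nextInt(v.length - 1);
                if (j >= i) {
                    j++; // skip self so that j != i
                }
                double avg = (v[i] + v[j]) / 2.0;
                v[i] = avg;
                v[j] = avg;
            }
        }
        return v;
    }

    public static void main(String[] args) {
        // Toy local database of one node (illustrative data only).
        int[][] transactions = {{1, 2, 3}, {1, 3}, {2, 3}, {1, 2, 3}};
        System.out.println(localSupport(transactions, 1, 3)); // prints 3

        // Hypothetical local support counts held by four cluster heads.
        double[] counts = {3, 5, 1, 7};
        double[] estimates = gossipAverage(counts, 50, 42L);
        // Each head estimates the global count as (its average) * (number of heads).
        System.out.println(estimates[0] * counts.length);
    }
}
```

Pairwise averaging is chosen here because it needs no coordinator: every cluster head ends up holding an estimate of the mean, and an itemset can be declared globally frequent when that estimate, scaled by the number of heads, meets the minimum support threshold.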


Keywords


Frequent Itemset mining, distributed data mining, Gossip-based protocol, Bitwise approach

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.