Determination of the best rule-based analysis results from the comparison of the FP-Growth, Apriori, and TPQ-Apriori Algorithms for recommendation systems
Keywords:
association rule, FP-growth, Apriori, TPQ-Apriori, RapidMiner
Abstract
Apriori and FP-growth are popular association rule algorithms, and both are very familiar to data mining researchers. However, association rule algorithms have several weaknesses, including long dataset scans when counting the frequency of item sets, high memory use, and resulting rules that are sometimes less than optimal. This study compares the FP-growth, Apriori, and TPQ-Apriori algorithms by analyzing the rules that each of the three produces. TPQ-Apriori is an algorithm developed from the Apriori algorithm. In the experiments, the Apriori and FP-growth algorithms were run with the RapidMiner and Weka tools, while the TPQ-Apriori algorithm was run with a self-built application. The dataset is the sales data of the Kopegtel NTB department store, which has been uploaded to Kaggle. The rules were tested on 100%, 50%, and 25% of the total volume of the Kopegtel dataset. The overall results indicate that the RapidMiner FP-growth algorithm gives more optimal results as the dataset grows larger, but not optimal results when the dataset is small. The Apriori and Weka FP-growth algorithms behave differently: their rules are less than optimal when the dataset is large and optimal when it is small. Several rules do not appear in the Weka FP-growth and Apriori results because Weka's tools provide no tolerance value for the support of the rules to be displayed. Meanwhile, the TPQ-Apriori algorithm that has been developed is capable of producing optimal rules for both large and small datasets.
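To make the notions of support, repeated dataset scans, and rule quality concrete, the following is a minimal sketch of the classic Apriori procedure in Python. It is not the TPQ-Apriori algorithm and not the RapidMiner or Weka implementation. The toy transactions, the 0.5 minimum support, and all function names are assumptions chosen for illustration; they are not taken from the Kopegtel dataset.

```python
from itertools import combinations

def support_counts(transactions, candidates):
    """One full pass over the dataset: count the transactions containing each candidate."""
    return {c: sum(1 for t in transactions if c <= t) for c in candidates}

def apriori(transactions, min_support):
    """Return ({itemset: support}, number_of_dataset_scans) for plain Apriori."""
    n = len(transactions)
    items = {item for t in transactions for item in t}
    candidates = {frozenset([i]) for i in items}
    frequent = {}
    scans = 0
    k = 1
    while candidates:
        counts = support_counts(transactions, candidates)
        scans += 1  # each level costs one more scan of the whole dataset
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        keys = list(level)
        candidates = {a | b for a, b in combinations(keys, 2) if len(a | b) == k}
        # Prune step: keep a candidate only if all of its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent, scans

# Hypothetical toy transactions (illustrative only).
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "sugar", "milk"}),
    frozenset({"bread", "sugar"}),
    frozenset({"milk", "sugar"}),
]

frequent, scans = apriori(transactions, min_support=0.5)

# Confidence of the rule bread -> milk: support(bread, milk) / support(bread).
confidence = frequent[frozenset({"bread", "milk"})] / frequent[frozenset({"bread"})]
```

In this sketch every candidate level triggers another pass over all transactions, which is the scanning cost mentioned above, and any itemset whose support falls below the fixed threshold is dropped with no tolerance.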