samplesize meaning in Chinese
- A popular way to improve the speed and scalability of association rule mining is to run the algorithm on a random sample instead of the entire database. But how to effectively define and efficiently estimate the degree of error in the algorithm's outcome, and how to determine the sample size needed, remain unresolved research questions. In this paper, an effective and efficient algorithm based on PAC (probably approximately correct) learning theory is given to measure and estimate sample error.
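Association rule mining is one of the core tasks of data mining. Because of the complexity of the task itself, it usually requires several full scans of the database, and the frequent patterns may lead to a combinatorial explosion. As a result, drawing a sample from the original large-scale data set and searching it for approximate rules of interest to the user has become one of the simple, effective, and practical ways to improve the efficiency and scalability of the algorithm.
- Then, a new adaptive, online, fast sampling strategy, multi-scaling sampling, is presented, inspired by MRA (multi-resolution analysis) and the Shannon sampling theorem, for quickly obtaining acceptably approximate association rules at an appropriate sample size. Both theoretical analysis and empirical study have shown that the sampling strategy achieves a very good speed-accuracy trade-off.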
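However, a sampling strategy must strike a good balance between the efficiency of the algorithm and the accuracy of its results. "How to determine a suitable sample size so that association rule mining run on the sample meets the accuracy requirement (the sampling complexity)" is the key hard problem of this approach.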
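
To make the sample-size question concrete, here is a minimal sketch of one standard worst-case bound. It is not the algorithm from the quoted paper, whose details are not given here. It assumes the two-sided Hoeffding inequality: to estimate the support of a single itemset within an absolute error epsilon with probability at least 1 - delta, a random sample of n >= ln(2/delta) / (2 * epsilon^2) transactions is sufficient. The function name and parameters are hypothetical, chosen only for illustration.

```python
import math


def hoeffding_sample_size(epsilon: float, delta: float) -> int:
    """Sample size sufficient to estimate one itemset's support.

    Assumption (not from the source text): the two-sided Hoeffding bound
    P(|sample_support - true_support| > epsilon) <= 2 * exp(-2 * n * epsilon**2).
    Solving 2 * exp(-2 * n * epsilon**2) <= delta for n gives the formula below.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.ceil(math.log(2.0 / delta) / (2.0 * epsilon ** 2))


if __name__ == "__main__":
    # Support error of at most 0.01, with 95% confidence.
    print(hoeffding_sample_size(epsilon=0.01, delta=0.05))
```

The bound does not depend on the size of the database, which is why sampling can pay off on very large data sets. It is a worst-case bound for a single itemset, so it tends to be conservative; adaptive strategies such as the multi-scaling sampling mentioned above aim to stop at a smaller sample when the data allow it.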