Economy-point.org



Data Mining


Page modified: Friday, June 23, 2006 20:29:56

Data mining, also called data digging, is the systematic (usually automated or semi-automated) discovery and extraction of previously unknown information from large quantities of data.

General information

Enormous data sets arise today in enterprises, research projects, public administration, and on the Internet. Data mining makes it possible to evaluate such volumes of data automatically by means of statistical methods, artificial neural networks, fuzzy clustering methods, or genetic algorithms. The goal is to find rules and patterns, or statistical anomalies. For example, changes in the behavior of customers or customer groups can be detected and business strategies aligned accordingly. Deviating behavior of individual persons can also be recognized. This has drawn the attention of data protection officers, who follow the application of data mining methods critically.
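As a rough illustration of recognizing deviating behavior by a statistical method, the following minimal sketch flags customers whose spending lies far from the group mean, measured in standard deviations (a z-score). The function name, the threshold, and the sample figures are assumptions made up for this example; they do not come from the article.

```python
from statistics import mean, stdev


def find_deviating_customers(monthly_spend, threshold=2.0):
    """Return the IDs of customers whose spend deviates strongly from the mean.

    monthly_spend maps a customer ID to a numeric value. The threshold is
    expressed in sample standard deviations and is an illustrative choice.
    """
    values = list(monthly_spend.values())
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation, needs >= 2 values
    if sigma == 0:
        return []  # all values identical, nothing deviates
    return sorted(
        cid for cid, value in monthly_spend.items()
        if abs(value - mu) / sigma > threshold
    )


# Hypothetical example data: nine similar customers and one unusual one.
spend = {
    "c01": 102.0, "c02": 98.0, "c03": 105.0, "c04": 97.0, "c05": 101.0,
    "c06": 99.0, "c07": 103.0, "c08": 100.0, "c09": 96.0, "c10": 480.0,
}
outliers = find_deviating_customers(spend)
print(outliers)
```

In small samples a single extreme value inflates the standard deviation itself, so a threshold that works here would need to be reconsidered for other data; more robust measures (for example, based on the median) are often preferred in practice.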

Definition

Frank Bensberg understands data mining as an integrated process "that discovers patterns by applying methods to a body of data". The term is defined here from a process-oriented perspective; the goal is the recognition of patterns. The definition deliberately avoids the term "information", since data mining is assigned to the sigmatic level of semiotics. The engagement with information in the sense of a subjective gain in knowledge, which occurs in the context of exploratory data analysis and takes place on the pragmatic level, is assigned to Knowledge Discovery in Databases.

Frank Bensberg dispenses with the restriction to large data sets that is often found in the literature: smaller data sets can also contain significant patterns that data mining can discover and present. However, he draws no boundary between data mining and statistical data analysis, nor does he restrict the methods that can be attributed to data mining.

In the following, data mining is understood, following Bensberg, as an integrated process that discovers and communicates patterns by applying data mining techniques to a body of data. Data mining techniques are techniques that can be assigned to exploratory data analysis. The goal of exploratory data analysis, and thus the defining characteristic of data mining techniques, is, beyond the representation of the data, "the search for structures and characteristics [...]. It is therefore typically used if the question is unclear or the choice of a suitable statistical model is not exactly defined." The interpretation of the discovered patterns is therefore left to the respective recipient; it is not part of the data mining process and conceptually marks the boundary to Knowledge Discovery in Databases. The data mining process thus covers, starting from data selection, all activities necessary to communicate the patterns discovered in a body of data. HUKEMANN, following FAYYAD, PIATETSKY-SHAPIRO and SMYTH, divides this process into the phases: task definition, selection and extraction, preparation and transformation, pattern recognition, evaluation, and presentation.
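To make the idea of searching for structure without a predefined statistical model more concrete, here is a minimal sketch of a one-dimensional k-means clustering. It is a deliberately simple stand-in chosen for illustration, not one of the specific methods the article attributes to any author; the function name, the initialization scheme, and the sample values are assumptions.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Group numbers into k clusters (k >= 2) with a tiny k-means loop.

    Initial centroids are spread evenly between the minimum and maximum
    value, which keeps the example deterministic.
    """
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: put each value into the cluster of its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centroids[j]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments are stable
        centroids = new_centroids
    return centroids, clusters


# Hypothetical measurements with two visibly separate groups.
data = [12, 15, 14, 210, 198, 13, 205]
centroids, clusters = kmeans_1d(data, k=2)
print(centroids)
print(clusters)
```

The code only reports that two groups exist; what the groups mean is left to the recipient, which matches the boundary the text draws between discovering patterns and interpreting them.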

Since the data mining process takes place on the sigmatic level, the question arises to what extent the evaluation of the results can be regarded as part of the data mining process. Whereas the pattern recognition phase examines predictive and descriptive accuracy, the evaluation examines the discovered patterns for their relevance, comprehensibility, usefulness, and usability, as well as for their novelty. The quality functions underlying this evaluation depend strongly on subjective factors and therefore belong to the pragmatic level. The evaluation of the results must thus be assigned to Knowledge Discovery in Databases.

The assumption that the data mining process runs successfully for every question and the task definition derived from it is not tenable. In practice the process regularly fails because of missing or faulty data sources. HIPPNER and WILDE accordingly point out that the individual phases proceed in "intensive interaction with the user and with numerous feedback loops". BERRY and LINOFF dispense entirely with embedding the task definition in the data mining process. This ensures that the data mining process can also represent an undirected search for patterns that is not preceded by any central question. To guarantee effectiveness and efficiency, however, the user must have comprehensive knowledge of the task and its domain context. Only then is it ensured that all possibilities and opportunities offered by the domain-specific question are used and that any problems are solved in an economically sensible way with respect to the overall goal.

In the following, the data mining process is divided into four phases: data selection, data preparation, pattern recognition, and communication.

The interface to Knowledge Discovery in Databases is the communication of the discovered patterns, which can subsequently be evaluated and interpreted.
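The four phases named above (data selection, data preparation, pattern recognition, communication) can be sketched as a small pipeline. This is only an illustration under stated assumptions: the record layout, the function names, the choice of co-occurring item pairs as the "pattern", and the support threshold are all hypothetical and not taken from the cited authors.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw records: (customer ID, region, comma-separated basket).
RAW = [
    ("c1", "north", "Bread, Butter ,milk"),
    ("c2", "north", "bread,butter"),
    ("c3", "south", "beer,chips"),
    ("c4", "north", "bread, butter, jam"),
    ("c5", "north", "milk,jam"),
    ("c6", "north", ""),
]


def select(records, region):
    """Phase 1, data selection: keep only the records relevant to the question."""
    return [r for r in records if r[1] == region]


def prepare(records):
    """Phase 2, data preparation: normalize item names and drop empty baskets."""
    baskets = []
    for _, _, raw_items in records:
        items = {i.strip().lower() for i in raw_items.split(",") if i.strip()}
        if items:
            baskets.append(items)
    return baskets


def recognise(baskets, min_support=0.5):
    """Phase 3, pattern recognition: item pairs found in at least
    min_support (a fraction) of all baskets."""
    counts = Counter()
    for basket in baskets:
        counts.update(combinations(sorted(basket), 2))
    n = len(baskets)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


def communicate(patterns):
    """Phase 4, communication: present the patterns; evaluating and
    interpreting them is left to the recipient."""
    return [f"{a} + {b}: support {s:.2f}" for (a, b), s in sorted(patterns.items())]


report = communicate(recognise(prepare(select(RAW, "north"))))
for line in report:
    print(line)
```

The pipeline deliberately stops at reporting the pattern. Judging whether a frequent pair is relevant, useful, or novel is the evaluation step that the text assigns to Knowledge Discovery in Databases rather than to data mining itself.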

