A. Data Preparation
The main data sets used by our data-mining operation are identified and
cleansed of any data impurities. Because the data in the data warehouse
are integrated and filtered, the data warehouse is the target set for our
data-mining operations.
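As a rough illustration of what cleansing can involve, the sketch below drops
incomplete rows, normalizes field values, and removes duplicates. The field
names (customer_id, amount), the cleanse function, and the sample rows are
hypothetical and chosen only for this example; a real warehouse load would
apply its own rules.

```python
def cleanse(records):
    """Drop incomplete rows, normalize values, and remove duplicates.

    The field names used here are assumptions made for illustration.
    """
    seen = set()
    clean = []
    for rec in records:
        # Skip rows with a missing or empty required field.
        if any(rec.get(f) in (None, "") for f in ("customer_id", "amount")):
            continue
        # Normalize: trim whitespace in the id, store the amount as a float.
        row = {
            "customer_id": str(rec["customer_id"]).strip(),
            "amount": float(rec["amount"]),
        }
        # Skip rows that duplicate one already kept.
        key = (row["customer_id"], row["amount"])
        if key in seen:
            continue
        seen.add(key)
        clean.append(row)
    return clean


# Hypothetical raw rows: one duplicate and one incomplete record.
raw = [
    {"customer_id": " C1 ", "amount": "10.5"},
    {"customer_id": "C1", "amount": 10.5},
    {"customer_id": "C2", "amount": None},
    {"customer_id": "C3", "amount": "7"},
]
cleaned = cleanse(raw)
```

Here cleaned keeps one row for C1 and one for C3; the duplicate C1 row and the
incomplete C2 row are discarded.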
B. Data Analysis and Classification
We study the data to identify common data characteristics or patterns, and
then apply specific algorithms to find:
- Data groupings, classifications, clusters, or sequences.
- Data dependencies, links, or relationships.
- Data patterns, trends, and deviations.
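Two of the findings listed above, groupings and deviations, can be sketched
with very simple techniques. The example below groups records by a shared
attribute and flags values that lie far from the mean. The region and amount
fields, the function names, and the threshold of two standard deviations are
assumptions made for illustration; production tools use far more
sophisticated algorithms.

```python
from collections import defaultdict
from statistics import mean, pstdev


def group_by(records, key):
    """Collect the 'amount' values of records that share the same key value."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key]].append(rec["amount"])
    return dict(groups)


def deviations(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical sales rows used only to exercise the two functions.
sales = [
    {"region": "north", "amount": 10.0},
    {"region": "north", "amount": 12.0},
    {"region": "south", "amount": 11.0},
    {"region": "south", "amount": 9.0},
    {"region": "south", "amount": 10.0},
    {"region": "north", "amount": 50.0},
]
groups = group_by(sales, "region")
outliers = deviations([s["amount"] for s in sales])
```

With this sample, groups holds one list of amounts per region, and outliers
contains only the 50.0 sale.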
C. Knowledge Acquisition
During this phase, we select the appropriate modeling or knowledge
acquisition algorithms. The most typical algorithms used in data mining are
based on neural networks, decision trees, rule induction, genetic algorithms,
classification and regression trees, memory-based reasoning, nearest neighbor,
and data visualization. We can use many of these algorithms, in any
combination, to generate a computer model that reflects the behavior of the
target data set.
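To make one of these techniques concrete, here is a minimal sketch of a
nearest-neighbor classifier, a simple form of memory-based reasoning: the
"model" is the stored training data, and a new point takes the majority label
of its k closest stored examples. The knn_predict function, the two-feature
points, and the "low"/"high" labels are hypothetical and serve only as an
illustration.

```python
from collections import Counter
from math import dist  # Euclidean distance, available in Python 3.8+


def knn_predict(training, point, k=3):
    """Return the majority label among the k stored examples nearest to point."""
    # Sort the stored examples by distance to the query point, keep the first k.
    nearest = sorted(training, key=lambda ex: dist(ex[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: (features, label) pairs.
training = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((2.0, 1.0), "low"),
    ((8.0, 8.0), "high"),
    ((9.0, 7.5), "high"),
    ((8.5, 9.0), "high"),
]
label_a = knn_predict(training, (2.0, 2.0))
label_b = knn_predict(training, (8.0, 8.5))
```

In this sample the first query point falls among the "low" examples and the
second among the "high" examples, so the two predictions differ accordingly.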
D. Prognosis
The data-mining findings are used to predict future behavior and forecast
business outcomes.
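As one very simple example of forecasting, the sketch below fits a straight
line to past values by least squares and extends it one period ahead. The
fit_trend and forecast functions and the monthly_sales figures are
hypothetical; this is only one of many possible forecasting approaches and
assumes at least two observations that follow a roughly linear trend.

```python
def fit_trend(values):
    """Least-squares line through (0, values[0]), (1, values[1]), ...

    Returns (slope, intercept). Requires at least two values.
    """
    n = len(values)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, values))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


def forecast(values, steps_ahead=1):
    """Extend the fitted line steps_ahead periods past the last observation."""
    slope, intercept = fit_trend(values)
    return intercept + slope * (len(values) - 1 + steps_ahead)


# Hypothetical monthly sales figures with a steady upward trend.
monthly_sales = [100.0, 110.0, 120.0, 130.0]
next_month = forecast(monthly_sales)
```

Because the sample figures rise by 10 each month, the fitted line predicts
140 for the following month.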