Data Mining / Nursing Informatics Reading and Sharing

Data Mining: Concepts and Techniques, Third Edition (The Morgan Kaufmann Series in Data Management Systems)

Data mining, also known as knowledge discovery and data mining (KDD), knowledge discovery in data, or knowledge discovery in databases, refers to sorting through large data sets, often held in relational databases, to identify patterns and relationships so that useful information can be extracted and put to use. Data mining allows data to be extracted and transformed, stored in a database, accessed, and analyzed by software programs. The results can then be presented in various forms, such as graphs or tables. The types of relationships commonly found in data mining include classes, clusters, associations, and sequential patterns.

Data mining focuses on producing a solution that generates useful forecasts through a four-phase process:

problem identification
exploration of the data
pattern discovery
knowledge deployment, or application of knowledge to new data to forecast or generate predictions.

Informatics and Nursing

Problem identification

Initial phase of data mining
the problem must be defined, and everyone involved must understand the objectives and requirements of the data mining process they are initiating

Exploration of the data

begins with exploring and preparing the data for the data mining process
might include data access, cleansing, sampling, and transformation
- a technique that can be used for data reduction is clustering.
  - Clustering groups statistical units into clusters (classes) in order to reduce the overall number of statistical units.
  - A cluster is composed of elements that are similar to each other and dissimilar to the elements of other clusters, so clustering is essentially a method of grouping.
  - To determine why the groups are different, a different data reduction technique, factor analysis, must be used.
the goal is to identify the relevant or important variables and determine their nature
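
To make the idea of clustering as data reduction more concrete, here is a minimal sketch in plain Python. It is a toy one-dimensional k-means, not a method taken from the book. The heart-rate values, the starting centers, and the function name `kmeans_1d` are all made up for illustration.

```python
from statistics import mean

def kmeans_1d(values, centers, iterations=10):
    """Toy k-means: group numbers around the nearest center."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        # Assign every value to the center it is closest to.
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        new_centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
        if new_centers == centers:
            break  # assignments are stable, so stop early
        centers = new_centers
    return centers, clusters

# Hypothetical resting heart rates (beats per minute) for ten patients.
heart_rates = [58, 60, 62, 61, 88, 90, 92, 91, 120, 118]

# Assumed starting guesses for three groups.
centers, clusters = kmeans_1d(heart_rates, centers=[50, 85, 130])

for center, members in zip(centers, clusters):
    print(f"center {center}: {members}")
```

Ten individual readings are reduced to three groups, each summarized by one center. The sketch only shows that the groups exist; it says nothing about why they differ, which is where a technique such as factor analysis would come in.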

Pattern discovery

model building/ pattern identification
a complex phase of data mining
different models are applied to the same data to choose the best model for the data set being analyzed
the model chosen should identify the patterns in the data that will support the best predictions
model must be tested, evaluated, and interpreted
this phase ends with a highly predictive model that identifies patterns consistently
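
The idea of applying different models to the same data and keeping the best one could be sketched roughly like this. It is only an illustration: the temperature readings, the labels, and the two candidate models (`fit_majority` and `fit_threshold`) are hypothetical and chosen for simplicity, not taken from the source.

```python
def accuracy(model, rows):
    """Share of rows where the model's prediction matches the label."""
    return sum(model(x) == y for x, y in rows) / len(rows)

def fit_majority(train):
    """Baseline model: always predict the most common training label."""
    labels = [y for _, y in train]
    majority = max(sorted(set(labels)), key=labels.count)
    return lambda x: majority

def fit_threshold(train):
    """Rule model: predict 1 when the value reaches the best cut-off seen in training."""
    cuts = sorted({x for x, _ in train})
    best_cut = max(cuts, key=lambda cut: accuracy(lambda x: int(x >= cut), train))
    return lambda x: int(x >= best_cut)

# Hypothetical data: (body temperature in degrees C, 1 = fever flagged, 0 = not flagged).
train = [(36.5, 0), (36.8, 0), (37.0, 0), (37.2, 0), (38.1, 1), (38.5, 1), (39.0, 1)]
test = [(36.6, 0), (37.1, 0), (38.3, 1), (38.9, 1)]

# Fit every candidate on the same training data, then evaluate on held-out data.
candidates = {"majority": fit_majority, "threshold": fit_threshold}
scores = {name: accuracy(fit(train), test) for name, fit in candidates.items()}
best_name = max(scores, key=scores.get)

print(scores)
print("best model:", best_name)
```

Scoring the models on data that was not used to build them is what makes the comparison a test of prediction and not just a measure of how well each model memorized the training rows.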

Knowledge deployment/ Application of knowledge

takes the pattern and model identified in the pattern discovery phase and applies them to new data to test whether they can achieve the desired outcome.
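
As a rough sketch of this last phase, the snippet below applies an already chosen rule to new records and then checks the predictions against what was later observed. The rule `flag_fever`, its 38.0 cut-off, the readings, and the 90% target are all assumptions made up for the example.

```python
def flag_fever(temp_c, cutoff=38.0):
    """Hypothetical model from the pattern discovery phase: flag readings at or above the cut-off."""
    return int(temp_c >= cutoff)

# New data the model has never seen (made-up temperatures in degrees C).
new_readings = [36.9, 38.4, 37.6, 39.2]
predictions = [flag_fever(t) for t in new_readings]

# Outcomes recorded afterwards, used to test whether the model achieves the desired result.
observed = [0, 1, 0, 1]
hit_rate = sum(p == o for p, o in zip(predictions, observed)) / len(observed)

# Assumed target: at least 90% of predictions should match what was observed.
meets_target = hit_rate >= 0.9

print(predictions, hit_rate, meets_target)
```

If the hit rate falls below the target on new data, that is a signal to go back to the earlier phases instead of continuing to rely on the model.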
