Resource summary
Data pre-processing
- Data cleaning
- Missing values
- Ignore the tuple
- Fill in the missing value manually
- Use a global constant to fill in the missing value
- Use the attribute mean to fill in the missing value
- Use the attribute mean for all samples belonging to the same class as the given tuple
- Use the most probable value to fill in the missing value
- Noisy data
Notes:
- Noise is a random error or variance in a measured variable.
- Binning
- Regression
- Clustering
- Data cleaning as a process
- Data integration and transformation
- Data Integration
- Data Transformation
- Smoothing
- Aggregation
- Generalization
- Normalization
- Attribute construction
- Data reduction
- Data cube aggregation
- Attribute subset selection
- Stepwise forward selection
- Stepwise backward elimination
- Combination of forward selection and backward elimination
- Decision tree induction
- Dimensionality reduction
- Numerosity reduction
- Data discretization and concept hierarchy generation
- Why Preprocess the Data?
- Data Discretization and Concept Hierarchy Generation
- Discretization and Concept Hierarchy Generation for Numerical Data
- Concept Hierarchy Generation for Categorical Data
- Descriptive Data Summarization
- Measuring the Central Tendency
- Measuring the Dispersion of Data
- Graphic Displays of Basic Descriptive Data Summaries
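
Three of the techniques named in the outline can be illustrated with a short Python sketch: filling a missing value with the attribute mean, smoothing noisy data by bin means, and min-max normalization. This is a minimal sketch under assumptions and not taken from the original resource: the function names, the use of `None` to mark a missing value, and the choice of equal-frequency bins are all illustrative.

```python
from statistics import mean


def fill_with_mean(values):
    """Replace missing entries (None, an assumed marker) with the attribute mean."""
    observed = [v for v in values if v is not None]
    attribute_mean = mean(observed)
    return [attribute_mean if v is None else v for v in values]


def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into equal-frequency bins of `bin_size`,
    and replace every value with the mean of its bin."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        smoothed.extend([mean(bin_values)] * len(bin_values))
    return smoothed


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values into the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical; map them to the lower bound.
        return [new_min for _ in values]
    return [new_min + (v - lo) / (hi - lo) * (new_max - new_min) for v in values]


if __name__ == "__main__":
    print(fill_with_mean([1, None, 3]))
    print(smooth_by_bin_means([4, 8, 15, 21, 21, 24, 25, 28, 34], 3))
    print(min_max_normalize([10, 20, 30]))
```

Smoothing by bin means returns the values in sorted order, so it suits a single attribute examined on its own; mapping the smoothed values back to their original rows would need extra index bookkeeping that is omitted here.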