Data pre-processing

Description

Mind Map on Data pre-processing, created by Saravanakumar on 18/02/2015.

Resource summary

Data pre-processing
  1. Why Preprocess the Data?
  2. Descriptive Data Summarization
    1. Measuring the Central Tendency
    2. Measuring the Dispersion of Data
    3. Graphic Displays of Basic Descriptive Data Summaries
  3. Data cleaning
    1. Missing values
      1. Ignore the tuple
      2. Fill in the missing value manually
      3. Use a global constant to fill in the missing value
      4. Use the attribute mean to fill in the missing value
      5. Use the attribute mean for all samples belonging to the same class as the given tuple
      6. Use the most probable value to fill in the missing value
    2. Noisy data
      Annotation: Noise is a random error or variance in a measured variable.
      1. Binning
      2. Regression
      3. Clustering
    3. Data cleaning as a process
  4. Data integration and transformation
    1. Data Integration
    2. Data Transformation
      1. Smoothing
      2. Aggregation
      3. Generalization
      4. Normalization
      5. Attribute construction
  5. Data reduction
    1. Data cube aggregation
    2. Attribute subset selection
      1. Stepwise forward selection
      2. Stepwise backward elimination
      3. Combination of forward selection and backward elimination
      4. Decision tree induction
    3. Dimensionality reduction
    4. Numerosity reduction
    5. Data discretization and concept hierarchy generation
  6. Data Discretization and Concept Hierarchy Generation
    1. Discretization and Concept Hierarchy Generation for Numerical Data
    2. Concept Hierarchy Generation for Categorical Data
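
A few of the techniques named in the outline (filling missing values with the attribute mean or the per-class mean, smoothing noisy data by bin means, and min-max normalization) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not part of the original mind map: the function names are hypothetical, missing values are assumed to be represented as `None`, and binning is assumed to be equal-frequency.

```python
from statistics import mean


def fill_with_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    attribute_mean = mean(observed)
    return [attribute_mean if v is None else v for v in values]


def fill_with_class_mean(values, labels):
    """Replace None with the mean of observed values that share the tuple's class.

    Falls back to the overall attribute mean when a class has no observed
    values (an assumption made for this sketch).
    """
    by_class = {}
    for value, label in zip(values, labels):
        if value is not None:
            by_class.setdefault(label, []).append(value)
    overall = mean(v for v in values if v is not None)
    class_means = {label: mean(vs) for label, vs in by_class.items()}
    return [
        class_means.get(label, overall) if value is None else value
        for value, label in zip(values, labels)
    ]


def smooth_by_bin_means(values, bin_size):
    """Sort the values, split them into bins of bin_size items, and replace
    every value with the mean of its bin (the last bin may be smaller)."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), bin_size):
        current_bin = ordered[start:start + bin_size]
        smoothed.extend([mean(current_bin)] * len(current_bin))
    return smoothed


def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values into the range [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so map them all to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [(v - lo) * scale + new_min for v in values]
```

For example, `smooth_by_bin_means([4, 8, 15, 21, 21, 24, 25, 28, 34], 3)` produces three bins whose values are replaced by the bin means 9, 22, and 29. Smoothing by bin medians or bin boundaries would follow the same structure with a different replacement rule.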
