Data Mining from Big Data 4V-s

Description

For more information, see the course HKPolyUx: ISE101x Knowledge Management and Big Data in Business on edx.org: https://courses.edx.org/courses/course-v1:HKPolyUx+ISE101x+3T2017/course/
Mind Map by Prohor Leykin, updated more than 1 year ago
Created by Prohor Leykin about 6 years ago

Resource summary

Data Mining from Big Data 4V-s
  1. Volume - the simplest
    1. High dimensions
      1. Large number of records
        1. New sources
        2. Velocity - harder
          1. Interaction with a customer
            1. Capture data, learn and act
              1. Enhance customer's journey
                1. Iteratively improve user experience
          2. Variety - the hardest
            1. Number of data owners exploded
            2. Value
              1. KYC (know your customer)
                1. Difficult for internet companies - they never see their customers
                2. Being able to exploit all the data available
                  1. Age of analytics
                    1. Access to Data
                      1. SQL
                        1. Look-up a few records
                          1. Populate standard report
                          2. OLAP, mining
                            1. Create new report
                            2. Data Mining
                              1. Locate a problem
                                1. Optimize business process
                                  1. Answer a tough question
                                    1. Understand something new
                                      1. Finding interesting structure in data
                                        1. Interesting patterns
                                          1. Segmentation, data clustering
                                          2. Predictive models
                                            1. Classification, regression
                                            2. Hidden relations
                                              1. Affinity (summarization) - relation between fields, associations
                                              2. Work for Data scientist
                                                1. Understands business needs
                                                  1. Able to close those gaps
                                                    1. Algorithms
                                                      1. Knows more about statistics than a programmer
                                                      2. Data logic
                                                        1. Knows more about programming than a statistician
                                                  2. Technologies
                                                    1. Summarization
                                                      1. Variable correlation
                                                        1. Frequent itemsets
                                                          1. Association rules
                                                          2. Clustering
                                                            1. Distance
                                                              1. Partition
                                                                1. Sequence analysis
                                                                2. Classification / prediction
                                                                  1. Decision trees
                                                                    1. Neural networks
                                                                      1. Bayes nets
                                                                        1. Regression
                                                                          1. Support vector machines
                                                                      2. But
                                                                        1. End user is not a statistician
                                                                          1. Lack data warehousing expertise
                                                                            1. IT focus is to keep running
                                                                              1. Building data warehouse is too expensive
                                                                          2. Proliferating analytics throughout the organization
                                                                            1. Make every part of business smarter
                                                                              1. Embedding analytics into every area
                                                                                1. Significant business value
                                                                              2. Acquire and enhance actions
                                                                                1. Marketing and sales
                                                                                  1. Identify potential customer
                                                                                    1. Establish campaign effectiveness
                                                                                    2. Manufacturing process
                                                                                      1. Causes of manufacturing problems
                                                                                      2. Customer behaviour
                                                                                        1. Affinities, propensities
                                                                                        2. Fraudulent transaction detection
                                                                                          1. Loan approval
                                                                                            1. Establish creditworthiness of customer
                                                                                            2. Web analytics and metrics
                                                                                              1. Model user preferences
                                                                                                1. Recommendation, targeting
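
The "Frequent itemsets" and "Association rules" nodes above can be illustrated with a small sketch. This is not taken from the course material: the toy transactions, the function names `frequent_itemsets` and `association_rules`, and the thresholds are all assumptions made for illustration, and the brute-force counting is only suitable for tiny inputs.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy shopping baskets (illustrative data only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets of up to max_size items
    whose support (fraction of transactions containing them) is at
    least min_support. Brute-force counting, fine for toy data."""
    n = len(transactions)
    counts = Counter()
    for t in transactions:
        for k in range(1, max_size + 1):
            for combo in combinations(sorted(t), k):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def association_rules(itemsets, min_confidence):
    """Return sorted (antecedent, consequent, confidence) tuples derived
    from the frequent pairs, where
    confidence = support(pair) / support(antecedent)."""
    rules = []
    for itemset, support in itemsets.items():
        if len(itemset) != 2:
            continue
        for a in itemset:
            antecedent = frozenset([a])
            (consequent,) = itemset - antecedent
            confidence = support / itemsets[antecedent]
            if confidence >= min_confidence:
                rules.append((a, consequent, confidence))
    return sorted(rules)


itemsets = frequent_itemsets(transactions, min_support=0.5)
rules = association_rules(itemsets, min_confidence=0.9)
for antecedent, consequent, confidence in rules:
    print(f"{antecedent} -> {consequent} (confidence {confidence:.2f})")
```

With this toy data the only rule that clears the 0.9 confidence threshold is "butter -> bread", since every basket containing butter also contains bread. Production work would normally use an Apriori or FP-growth implementation rather than enumerating all combinations.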