Data lake (warehouse)


For more information check out the course HKPolyUx: ISE101x Knowledge Management and Big Data in Business on
Mind Map by Prohor Leykin, updated more than 1 year ago
Created by Prohor Leykin over 6 years ago

Resource summary

Data lake (warehouse)
  1. Data mining from Big Data (4 V's)
    1. 4 V's
      1. Volume
        1. The simplest
      2. Variety
        1. That is what really makes the data Big
      3. Velocity
        1. Delays significantly decrease the value of information
      4. Value
    2. Finding patterns, clusters, correlations, predicting variables
    3. ETL (extract, transform, load)
              1. Hadoop (open source): bring the algorithm to the data (MapReduce)
                1. To load or not to load data to a lake?
                2. Streaming instead of batch update
                3. Data governance
                  1. Or just liabilities?
                    1. Data lake ->
                      1. Lake may leak
                        1. Data swamp ->
                          1. Toxic swamp
                    2. Information assets
                      1. All kinds of information collected by an organization
                        1. Structured
                          1. Unstructured
                            1. Semi-structured information
                          2. Business objectives
                            1. Know your customer (KYC)
                            2. OLAP (online analytical processing)
                              1. OLTP (online transaction processing)
                                1. Collect and process the data at the moment of its creation
                                  1. OLA (online action) driven by the user's activity, patterns, clusters, value prediction
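
The "bring the algorithm to the data" idea noted next to Hadoop in the outline can be illustrated with a minimal, single-process sketch in Python. This is not Hadoop code: the partition contents and the function names map_partition and reduce_counts are hypothetical, chosen only for illustration. On a real cluster the map step would run on the nodes that already store each partition.

```python
from collections import Counter
from functools import reduce

# Hypothetical data partitions; on a real cluster each would sit on a different node.
partitions = [
    ["lake", "swamp", "lake"],
    ["swamp", "lake"],
    ["stream", "lake"],
]


def map_partition(records):
    """Map step: count words locally, where the partition is stored."""
    return Counter(records)


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


# Each partition is processed on its own; only the small partial counts are combined.
partial_counts = [map_partition(p) for p in partitions]
total = reduce(reduce_counts, partial_counts, Counter())
print(dict(total))
```

Only the compact partial counts move between the two steps, not the raw records, which is why shipping the computation to the data tends to be cheaper than shipping the data to the computation.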