SPARK

Description

Mind Map on SPARK, created by BOGDAN SHEVCHENKO on 18/10/2016.

Resource summary

SPARK
  1. RDD
    1. Transformations
      1. Set
        1. intersection(otherSet)
        2. union(otherSet)
        3. cartesian(otherSet)
      2. Functional
        1. filter(func)
        2. map(func)
        3. distinct
    2. Actions
      1. saveAsTextFile(path)
      2. array
        1. collect()
        2. take(n)
        3. count
        4. drop(n)
        5. reduce(function)

          Annotations:

          • the function must be commutative and associative
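
The annotation on reduce(function) can be illustrated without a cluster. The following is a plain-Python sketch, not Spark code: `partitioned_reduce` and its contiguous chunking are hypothetical stand-ins for how a distributed reduce first combines values inside each partition and then combines the partial results.

```python
from functools import reduce
from operator import add, sub


def partitioned_reduce(func, data, num_partitions):
    """Reduce each chunk locally, then reduce the partial results.

    Hypothetical stand-in for a distributed reduce; not part of any Spark API.
    """
    size = -(-len(data) // num_partitions)  # ceiling division
    partitions = [data[i:i + size] for i in range(0, len(data), size)]
    partials = [reduce(func, part) for part in partitions]
    return reduce(func, partials)


data = [1, 2, 3, 4, 5, 6, 7, 8]

# Addition is commutative and associative: every partitioning gives one result.
sums = {partitioned_reduce(add, data, n) for n in (1, 2, 4)}

# Subtraction is neither: the result depends on how the data was split.
diffs = {partitioned_reduce(sub, data, n) for n in (1, 2, 4)}

print(sums)   # a single value
print(diffs)  # several different values
```

The sketch keeps partitions in order, so it only exposes the associativity requirement. In a real cluster the partial results may also arrive in any order, which is why commutativity is needed as well.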
  2. MapReduce
    1. WorkFlow
      1. SparkContext
        1. pyspark.sql.SparkSession (sparkContext)
  3. Modules
    1. pyspark.sql

      Annotations:

      • http://spark.apache.org/docs/latest/api/python/pyspark.sql.html
      1. functions
        1. udf
    2. pyspark.streaming
    3. pyspark.ml
    4. pyspark.mllib

      Annotations:

      • https://habrahabr.ru/company/mlclass/blog/251471/
      1. linalg
        1. Vectors
          1. dense
          2. sparse
      2. stat
        1. Statistics
          1. colStats
            1. mean
            2. numNonzeros
            3. variance
          2. corr
      3. feature
        1. StandardScaler

          Annotations:

          • scaler = StandardScaler(withMean=True, withStd=True).fit(features)
            scaler.transform(features.map(lambda x: x.toArray()))
      4. classification
        1. LogisticRegressionWithSGD
        2. NaiveBayes
      5. regression
        1. RidgeRegressionWithSGD
      6. tree
        1. DecisionTree
        2. RandomForest
      7. clustering
        1. KMeans
      8. recommendation
        1. ALS

          Annotations:

          • Collaborative filtering
  4. Shuffle

    Annotations:

    • https://0x0fff.com/spark-architecture-shuffle/
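
The StandardScaler annotation in the outline sets withMean=True and withStd=True, which means each feature column is centered on its mean and scaled to unit standard deviation. The arithmetic can be sketched in plain Python, without Spark. The helper `standardize` and the sample rows are hypothetical, the use of the sample standard deviation (n - 1 denominator) is an assumption about the estimator, and mapping constant columns to 0.0 is a choice made only for this sketch.

```python
from statistics import mean, stdev


def standardize(rows):
    """Center each column on its mean and scale it to unit standard deviation.

    Hypothetical helper for illustration; not the pyspark.mllib implementation.
    """
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # Assumption: sample standard deviation (n - 1 in the denominator).
    stds = [stdev(col) for col in columns]
    return [
        # A constant column has zero deviation; this sketch maps it to 0.0.
        [(x - m) / s if s != 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled = standardize(rows)
for row in scaled:
    print(row)
```

Both columns have very different ranges before scaling and the same range afterwards, which is the reason for applying this step before the gradient-based models listed under pyspark.mllib.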