SPARK

Description

Mind map on SPARK, created by BOGDAN SHEVCHENKO on 18-10-2016.

Resource Summary

SPARK
  1. RDD
    1. Transformations
      1. Set
        1. intersection(otherSet)
        2. union(otherSet)
        3. cartesian(otherSet)
      2. Functional
        1. filter(func)
        2. map(func)
        3. distinct
    2. Actions
      1. saveAsTextFile(path)
      2. array
        1. collect()
        2. take(n)
      3. count
      4. drop(n)
      5. reduce(function)

        Notes:

        • the function must be commutative and associative
  2. MapReduce
  3. WorkFlow
    1. SparkContext
      1. pyspark.sql.SparkSession (sparkContext)
  4. Modules
    1. pyspark.sql

      Notes:

      • http://spark.apache.org/docs/latest/api/python/pyspark.sql.html
      1. functions
        1. udf
    2. pyspark.streaming
    3. pyspark.ml
    4. pyspark.mllib

      Notes:

      • https://habrahabr.ru/company/mlclass/blog/251471/
      1. linalg
        1. Vectors
          1. dense
          2. sparse
      2. stat
        1. Statistics
          1. colStats
            1. mean
            2. numNonzeros
            3. variance
          2. corr
      3. feature
        1. StandardScaler

          Notes:

          • scaler = StandardScaler(withMean=True, withStd=True).fit(features); scaler.transform(features.map(lambda x: x.toArray()))
      4. classification
        1. LogisticRegressionWithSGD
        2. RidgeRegressionWithSGD
        3. NaiveBayes
      5. tree
        1. DecisionTree
        2. RandomForest
      6. clustering
        1. KMeans
      7. recommendation
        1. ALS

          Notes:

          • Collaborative filtering
  5. Shuffle

    Notes:

    • https://0x0fff.com/spark-architecture-shuffle/
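
The note on reduce(function) in the outline requires the function to be commutative and associative. The sketch below illustrates why, in plain Python rather than PySpark so that it runs without a Spark installation. The helper partitioned_reduce is hypothetical (it is not a Spark API): it splits the data into chunks, reduces each chunk, then reduces the partial results, which only roughly mirrors how a distributed reduce combines per-partition results. The data, the chunking scheme, and the partition counts are assumptions chosen for illustration.

```python
from functools import reduce


def partitioned_reduce(data, func, num_partitions):
    """Hypothetical helper: reduce each chunk, then reduce the partial results."""
    size = -(-len(data) // num_partitions)  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    partials = [reduce(func, chunk) for chunk in chunks]
    return reduce(func, partials)


data = [1, 2, 3, 4, 5, 6]


def add(a, b):
    # Commutative and associative: safe for a distributed reduce.
    return a + b


def sub(a, b):
    # Neither commutative nor associative: result depends on the partitioning.
    return a - b


# Collect the distinct results obtained with 1, 2 and 3 partitions.
sum_results = {partitioned_reduce(data, add, n) for n in (1, 2, 3)}
sub_results = {partitioned_reduce(data, sub, n) for n in (1, 2, 3)}

print(sum_results)  # a single value: addition is stable under repartitioning
print(sub_results)  # several values: subtraction is not
```

With addition every partitioning yields the same total, while subtraction yields a different value for each partition count, which is the kind of nondeterminism the note warns about.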
