Resource summary
Big Data using Hadoop Analytics
- Definition
Notes:
- Big data refers to datasets that have grown too large to be handled
with traditional methods. These methods include the capture, storage, and
processing of the data in a tolerable amount of time. Although the term
big data was once applied to the concept of data warehouses, it now refers
to large-scale processing architectures that focus on capacity, throughput,
and genericity of processing.
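To make "a tolerable amount of time" concrete, the following is a rough back-of-the-envelope sketch in Python. It estimates how long one full read of a dataset takes when the data is split evenly across nodes. The 100 MB/s per-node read rate and the function name `scan_time_seconds` are assumptions chosen for illustration, not figures from these notes, and the model ignores network, scheduling, and replication overhead.

```python
def scan_time_seconds(dataset_bytes: float, nodes: int,
                      bytes_per_second_per_node: float) -> float:
    """Time to read the whole dataset once if it is split evenly across nodes."""
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    return dataset_bytes / (nodes * bytes_per_second_per_node)


PETABYTE = 10**15          # decimal petabyte, in bytes
THROUGHPUT = 100 * 10**6   # ASSUMPTION: 100 MB/s sequential read per node

single = scan_time_seconds(PETABYTE, 1, THROUGHPUT)
cluster = scan_time_seconds(PETABYTE, 1000, THROUGHPUT)

print(f"1 node:     {single / 86400:.1f} days")
print(f"1000 nodes: {cluster / 3600:.1f} hours")
```

Under these assumed numbers, a single node needs roughly 116 days to scan one petabyte, while 1,000 nodes working in parallel need under three hours, which is the gap that distributed architectures aim to close.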
- Hadoop
Notes:
- Hadoop refers to the specific software framework
developed under the Apache Project for massively distributed data processing.
Its design supports a highly scalable network of thousands of nodes backed by
petabytes of data. Hadoop was originally written in the Java™
language, but today it also supports many other languages for scripting.
Understand the architectures possible with Hadoop and the benefits of their
use.
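As an illustration of scripting Hadoop from a language other than Java, here is a minimal word-count sketch in the style of Hadoop Streaming, where the mapper and reducer exchange tab-separated text records. In a real Streaming job the two functions would live in separate scripts that read standard input and write standard output. Here they take plain Python iterables so the example runs on its own, and the `sorted` call only stands in for the framework's sort and shuffle step. The function names and sample text are hypothetical.

```python
from itertools import groupby
from typing import Iterable, Iterator


def mapper(lines: Iterable[str]) -> Iterator[str]:
    """Emit one 'word<TAB>1' record per word in the input lines."""
    for line in lines:
        for word in line.lower().split():
            yield f"{word}\t1"


def reducer(sorted_records: Iterable[str]) -> Iterator[str]:
    """Sum the counts per word; records must arrive grouped by key."""
    pairs = (record.split("\t", 1) for record in sorted_records)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        total = sum(int(count) for _, count in group)
        yield f"{word}\t{total}"


text = ["big data big cluster", "data node"]   # hypothetical sample input

# Stand-in for the framework's sort/shuffle: order the records by key.
shuffled = sorted(mapper(text), key=lambda record: record.split("\t", 1)[0])
result = list(reducer(shuffled))

for record in result:
    print(record)
```

The reducer relies on its input being sorted by key, which is why the grouping can be done in a single pass without holding all records in memory.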
- Problem-solving with Hadoop
Notes:
- Inspired by Google's MapReduce usage model,
Hadoop is a generic application framework for the processing of massive
amounts of data. Learn about the use of Hadoop in artificial intelligence with
Apache Mahout, Hadoop with Java technology, and combining Hadoop with the
Dojo toolkit for data visualization.
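The MapReduce model mentioned above can be sketched in a few lines of single-process Python: a map function emits key/value pairs, the values are grouped by key, and a reduce function folds each group into a result. The example job counts how often two items appear in the same basket, a simple building block for recommendations. This is only an in-memory illustration of the model. It does not use the Hadoop or Apache Mahout APIs, and `map_reduce`, `emit_item_pairs`, `sum_counts`, and the basket data are hypothetical names and values.

```python
from collections import defaultdict
from itertools import combinations


def map_reduce(records, map_fn, reduce_fn):
    """Run map, group-by-key (shuffle), and reduce in one process."""
    groups = defaultdict(list)
    for record in records:                 # map phase
        for key, value in map_fn(record):
            groups[key].append(value)      # shuffle: collect values per key
    # reduce phase: fold each key's values into a single result
    return {key: reduce_fn(key, values) for key, values in groups.items()}


def emit_item_pairs(basket):
    """Map: emit ((item_a, item_b), 1) for each pair of distinct items."""
    for pair in combinations(sorted(set(basket)), 2):
        yield pair, 1


def sum_counts(_pair, counts):
    """Reduce: total number of baskets containing the pair."""
    return sum(counts)


baskets = [                                # hypothetical sample data
    ["camera", "tripod", "sd-card"],
    ["camera", "sd-card"],
    ["tripod", "camera"],
]

pair_counts = map_reduce(baskets, emit_item_pairs, sum_counts)
for pair, count in sorted(pair_counts.items()):
    print(pair, count)
```

Because the map and reduce functions are passed in as arguments, the same driver can run other jobs, which reflects the sense in which the framework is described as generic.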
- Big data and cloud computing
- Hadoop Analytics
Notes:
- Hadoop is at the core of the Big Data revolution. Vendors such as Cloudera, MapR, and Hortonworks have taken this open source software, which enables distributed parallel processing of huge amounts of data, and added important data management and support capabilities.
- Capabilities for Hadoop Analytics
- Fastest platform to build analytics
Notes:
- Sophisticated but accessible predictive and spatial tools, combined in a simple workflow design environment
- Simple sharing of Big Data analytics
Notes:
- Single-click sharing of analytic applications that can be used by any decision maker
- All relevant data
Notes:
- Access to, integration of, and cleaning of data sources as varied as
Hadoop (including Cloudera and MapR), NoSQL (MongoDB), Excel, and
Teradata
- About Hadoop software
Notes:
- The Intel Distribution for Apache Hadoop software is described by Intel as the only distribution built from the silicon up to enable the widest range of data analysis on Apache Hadoop, and as the first with hardware-enhanced performance and security capabilities.
- Big data in the private sector
- eBay
Notes:
- eBay.com uses two data warehouses, at 7.5 PB and 40 PB, as well as a 40 PB
Hadoop cluster for search, consumer recommendations, and merchandising.
- Facebook
Notes:
- Facebook handles 50 billion photos from its user base.