Big data refers to the size of a dataset that has grown
too large to be manipulated through traditional methods. These methods
include capture, storage, and processing of the data in a tolerable amount of
time. Although the term big data was once applied to the concept of
data warehouses, it now refers to large-scale processing architectures that focus
on capacity, throughput, and genericity of processing.
Hadoop
Annotations:
Hadoop refers to the specific software framework
developed under the Apache Project for massively distributed data processing.
Its design supports a highly scalable network of thousands of nodes backed by
petabytes of data. Hadoop was originally designed using the Java™
language but today has extended itself to many other languages for scripting.
Understand the architectures possible with Hadoop and the benefits of their
use.
Problem-solving with
Hadoop
Annotations:
Hadoop was inspired by Google's MapReduce usage model,
Hadoop is a generic application framework for the processing of massive
amounts of data. Learn about the use of Hadoop in artificial intelligence with
Apache Mahout, Hadoop with Java technology, and combining Hadoop with the
Dojo toolkit for data visualization.
Big data and cloud computing
Hadoop
Analytics
Annotations:
Hadoop is at the core of the Big Data revolution. Vendors such as Cloudera, MapR, and Hortonworks have taken this open source software that enables distributed parallel processing of huge amounts of data, and included important data management and support capabilities.
capabilities for Hadoop
Analytics
Fastest Platform to build
analytics
Annotations:
Sophisticated but accessible predictive and spatial tools, combined in a simple, workflow design environment
Simple sharing of Big Data analytics
Annotations:
Single click sharing of analytic applications that can be used by any decision maker
All relevant data
Annotations:
Access, integration, and cleaning of sources of data as varied as
Hadoop (including Cloudera & MapR) or NoSQL (MongoDB) and Excel or
Teradata
About Hadoop
software
Annotations:
Apache Hadoop software is the only distribution built from silicon up to enable the widest range of data analysis on Apache Hadoop. It is the first with hardware-enhanced performance and security capabilities
Big data in Private
sectors
Ebay
Annotations:
Ebay.com uses two data warehouses at 7.5 petabytes and 40PB as well as a 40PB
Hadoop cluster for search, consumer recommendations, and merchandising
FaceBook
Annotations:
Facebook handles 50 billion photos from its user base.