The origin of Big Data cannot be dated with certainty, but the use of data to solve practical problems
runs throughout history. During the Peloponnesian War in ancient Greece, the besieged Plataeans counted
the courses of bricks in the enemy wall to estimate its height, built scaling ladders of the right
length, and escaped. In ancient Mesopotamia, records were kept to track crops and livestock; it was
rudimentary accounting, but it already showed the need to organize information.
In the 1660s, John Graunt, often regarded as a founder of statistics, analyzed mortality records in
London to detect epidemic patterns; he was one of the first to apply systematic data analysis. In the
1880s, Herman Hollerith invented a machine that read punched cards, which was used to process the 1890
United States census; this advance marked the beginning of automation in data management.
In the 1990s, data warehouses and data mining techniques emerged. The term "Big Data" began to be
discussed as a concept, although it was not yet widely known. From 2000 onward, with the rise of the
internet, social media, and mobile devices, data generation skyrocketed. This gave rise to the need for
new technologies to process massive amounts of information in real time.
Implications
The implications of Big Data span several dimensions, of which I highlight the following:
Health sector: In 2009, the world experienced the H1N1 flu pandemic, also known as
swine flu. Google Flu Trends estimated the spread of flu in near real time from the
frequency of flu-related keyword searches on Google's search engine, although the accuracy
of its estimates was later questioned. In short, Big Data can support personalized medicine,
early disease detection, and epidemic management.
Technology sector: Massive data analysis cross-references a user's purchasing patterns with the
purchasing data of other users in order to create personalized ads. Powerful infrastructures (such as
cloud computing and distributed systems) are required to handle such large volumes of data.
Ethical and legal implications: The massive use of personal data poses risks to information protection
and consent. Therefore, clear legal frameworks are required to regulate data access, use, and
ownership.
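
The cross-referencing of purchasing patterns described under the technology sector can be illustrated with a minimal sketch. It assumes a simple co-purchase count as the scoring rule, which is only one of many possible approaches and not a description of how any real ad platform works. The user names, items, and the functions `build_co_purchase_counts` and `recommend` are hypothetical and exist only for this example.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical purchase histories (illustrative data only): user -> items bought.
purchases = {
    "ana": {"laptop", "mouse", "backpack"},
    "ben": {"laptop", "mouse"},
    "carla": {"laptop", "backpack", "headphones"},
    "dan": {"mouse", "keyboard"},
}


def build_co_purchase_counts(purchases):
    """Count how often each pair of items appears in the same user's history."""
    co_counts = defaultdict(Counter)
    for items in purchases.values():
        for a, b in combinations(sorted(items), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return co_counts


def recommend(user, purchases, co_counts, top_n=2):
    """Rank items the user has not bought by how often others bought them
    together with items the user already owns."""
    owned = purchases[user]
    scores = Counter()
    for item in owned:
        for other, count in co_counts[item].items():
            if other not in owned:
                scores[other] += count
    # Sort by score (descending), then by name, so the result is deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


co_counts = build_co_purchase_counts(purchases)
print(recommend("ben", purchases, co_counts))  # ['backpack', 'headphones']
```

At real scale the same counting would typically be spread across the distributed infrastructure mentioned above rather than held in a single in-memory dictionary.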
Definitions
Big Data typically refers to the application of a
practical, scientific approach to solving problems with
data, where the aim is to examine three attributes:
volume (the quantity of data), variety (the range of data
sources and formats), and velocity (the speed at which
data is generated and consumed).
Big Data is the result of collecting information at its
most granular level: it is what is obtained when a system
is instrumented and all the data that the
instrumentation can collect is retained.
The analysis of data that is truly messy, or for which the
right questions and answers are not known in advance, in
order to find patterns, anomalies, or structures amid
otherwise chaotic or complex data points.
Data that cannot be processed using standard
databases because it is too large, moves too quickly, or
is too complex for traditional data processing tools.
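
Two of the definitions above, finding anomalies in messy data and handling data too large or fast for traditional tools, can be illustrated together with a small sketch. It assumes a simple z-score rule on top of Welford's online algorithm, so each reading is processed once and nothing is stored beyond three running values. The class `RunningStats`, the function `flag_anomalies`, the threshold, the warm-up length, and the sample readings are all hypothetical choices made for illustration, not a method prescribed by these definitions.

```python
import math


class RunningStats:
    """Welford's online algorithm: mean and variance in one pass, O(1) memory."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Sample standard deviation; 0.0 until at least two values are seen.
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))


def flag_anomalies(stream, threshold=3.0, warmup=5):
    """Yield (index, value) for readings more than `threshold` standard
    deviations away from the running mean of the earlier readings."""
    stats = RunningStats()
    for i, x in enumerate(stream):
        if stats.count >= warmup and stats.std > 0:
            if abs(x - stats.mean) / stats.std > threshold:
                yield i, x
                continue  # keep the outlier out of the baseline
        stats.update(x)


# Hypothetical sensor readings; any iterable (file, socket, queue) would work.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 35.0, 20.1, 19.7]
print(list(flag_anomalies(readings)))  # [(6, 35.0)]
```

Because `flag_anomalies` is a generator over an arbitrary iterable, the same code could consume a stream that never fits in memory, which is the situation the last definition describes.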
Motivations
The motivations for Big Data stem from the desire to leverage
the enormous volume of information generated daily to improve
decision-making, optimize processes, personalize services, and
anticipate trends. Businesses, governments, and scientists use it
to innovate, reduce costs, better understand people, and solve
complex problems in real time. Technological advancement,
global digitalization, and interconnectivity have made it possible
to analyze massive amounts of data rapidly, making Big Data a
key driver of transformation across all sectors.