Calculo Diferencial - Module 1
"Probability and Statistics"(2)
1 Alejandro Baruch Saucedo Esparza - A01400284
1.1 Lic. Saúl Garcia
2 Probability and Statistics
2.1 Statistics: The science of collecting, describing, analyzing, and interpreting data.
2.1.1 Descriptive Statistics
2.1.1.1 Collection, organization,
summary, and presentation of
sample data. Main tools:
tables of numbers, graphs,
and calculated quantities.
2.1.2 Inferential Statistics
2.1.2.1 Obtaining inferences or
conclusions (via conjectures)
about populations based on
information taken from the
samples.
2.1.2.2 Concepts
2.1.2.2.1 Population or Universe: The
totality of the elements or
things under consideration
2.1.2.2.1.1 Parameter: Summary
measure that is
calculated to describe
a characteristic of a
population.
2.1.2.2.2 Sample: The
portion of the
Population that is
selected for the
analysis.
2.1.2.2.2.1 Statistic: Summary measure
that is calculated
to describe a characteristic
of a single sample of the
population.
2.1.2.2.3 Random Variable:
A phenomenon,
characteristic, or value
that is subject to variation
due to chance.
2.1.2.2.3.1 Qualitative random variables
2.1.2.2.3.1.1 Produce categorical responses
to describe an element of a
population
2.1.2.2.3.2 Quantitative random variables
2.1.2.2.3.2.1 Discrete: Numerical
responses arising from a
process of counting.
2.1.2.2.3.2.2 Continuous:
Responses arising
from a process of
measurement.
2.2 Measurement Scales
2.2.1 Properties
2.2.1.1 Identity
2.2.1.1.1 Each value on the
measurement scale has a
unique meaning.
2.2.1.2 Magnitude
2.2.1.2.1 Values on the measurement scale
have an ordered relationship to one
another. That means some values are
larger and some are smaller.
2.2.1.3 Equal Intervals
2.2.1.3.1 Scale units along the scale are
equal to one another. This means
that the difference between 1 and 2
would be equal to the difference
between 19 and 20.
2.2.1.4 Minimum value of zero
2.2.1.4.1 The scale has a true zero
point, below which no
values exist.
2.2.2 Nominal Scale of Measurement
2.2.2.1 Satisfies the identity property of
measurement. Values assigned to
variables represent a descriptive
category, but have no inherent
numerical value with respect to
magnitude.
2.2.2.1.1 Example: Gender.
Individuals may be classified
as "male" or "female", but
neither value represents
more or less "gender" than
the other.
2.2.3 Ordinal Scale of Measurement
2.2.3.1 Scale has the property of
both identity and magnitude.
Each value on the ordinal
scale has a unique meaning,
and it has an ordered
relationship to every other
value on the scale.
2.2.3.1.1 Example: The results of a
horse race, reported as
"win", "place", and "show".
We know the rank order in
which horses finished the
race.
2.2.4 Interval Scale of Measurement
2.2.4.1 Has the properties of identity,
magnitude, and equal intervals. With
an interval scale, you know not only
whether different values are bigger or
smaller, you also know how much
bigger or smaller they are.
2.2.4.1.1 Example: The Fahrenheit scale to measure
temperature. The scale is made up of equal
temperature units, so that the difference
between 40 and 50 degrees Fahrenheit is
equal to the difference between 50 and 60
degrees Fahrenheit.
2.2.5 Ratio Scale of Measurement
2.2.5.1 Satisfies all four of the properties
of measurement: identity,
magnitude, equal intervals, and a
minimum value of zero.
2.2.5.1.1 Example: The weight of an object. Each value
on the weight scale has a unique meaning,
weights can be rank ordered, units along the
weight scale are equal to one another, and the
scale has a minimum value of zero.
2.3 Sampling methods
2.3.1 The way that observations are
selected from a population to be in
the sample for a sample survey.
2.3.1.1 Population parameter. A population
parameter is the true value of a population
attribute.
2.3.1.2 Sample statistic. A sample statistic is
an estimate, based on sample data, of
a population parameter. Its accuracy is strongly
affected by the way that sample
observations are chosen; that is, by the
sampling method.
2.3.2 Sample survey: A study that estimates
the value of some
attribute of a population.
2.3.3 Probability samples
2.3.3.1 Each population element has a
known (non-zero) chance of being
chosen for the sample.
2.3.4 Non-probability samples
2.3.4.1 We do not know the probability that
each population element will be chosen,
and/or we cannot be sure that each
population element has a non-zero
chance of being chosen.
2.3.4.1.1 Advantages: convenience and cost.
Disadvantages: they do not allow you to
estimate the extent to which sample
statistics are likely to differ from
population parameters.
2.3.4.2 Voluntary sample
2.3.4.2.1 A voluntary sample is
made up of people who
self-select into the survey.
2.3.4.3 Convenience sample
2.3.4.3.1 A convenience sample is
made up of people who are
easy to reach.
2.3.5 Probability sampling methods:
They help ensure that the sample
chosen is representative of the
population, which supports the
validity of the statistical
conclusions.
2.3.5.1 Simple random sampling
2.3.5.1.1 Any sampling method that has the
following properties is called simple
random sampling: (1) the population
consists of N objects; (2) the sample
consists of n objects; (3) all possible
samples of n objects are equally likely
to occur.
2.3.5.1.1.1 A good example would be the
lottery method. Each of the N
population members is assigned a
unique number. The numbers are
placed in a bowl and thoroughly mixed.
Then, a blindfolded researcher selects
n numbers.
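The lottery method can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical population numbered 1 to N; the function name and the optional `seed` parameter are not from the notes.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every subset of size n is equally likely."""
    rng = random.Random(seed)  # seed is optional, used only for reproducibility
    return rng.sample(population, n)

# Hypothetical population: members numbered 1..N, like the numbers in the bowl
N = 50
population = list(range(1, N + 1))
sample = simple_random_sample(population, n=5, seed=1)
print(sample)
```

`random.sample` draws without replacement, which matches pulling numbers out of the bowl one at a time.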
2.3.5.2 Stratified sampling
2.3.5.2.1 The population is divided into groups, based on some
characteristic. Then, within each group, a probability sample
(often a simple random sample) is selected. In stratified
sampling, the groups are called strata.
2.3.5.2.1.1 Ex: a national survey. Divide the population
into groups or strata, based on geography -
north, east, south, and west. Then, within
each stratum, we might randomly select
survey respondents
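A minimal sketch of the national-survey example, assuming four made-up regional strata and a simple random sample of equal size inside each one. The names `stratified_sample` and `strata` are hypothetical.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Take a simple random sample of n_per_stratum members from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical strata based on geography, 20 people in each
strata = {
    "north": [f"N{i}" for i in range(20)],
    "east": [f"E{i}" for i in range(20)],
    "south": [f"S{i}" for i in range(20)],
    "west": [f"W{i}" for i in range(20)],
}
picked = stratified_sample(strata, n_per_stratum=3, seed=1)
for region, people in picked.items():
    print(region, people)
```

Every stratum contributes respondents, which is the defining feature of this method.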
2.3.5.3 Cluster sampling
2.3.5.3.1 Every member of the population is assigned to one,
and only one, group. Each group is called a cluster. A
sample of clusters is chosen, using a probability
method (often simple random sampling). Only
individuals within sampled clusters are surveyed.
2.3.5.3.1.1 The difference between cluster sampling and
stratified sampling: with stratified sampling, the
sample includes elements from each stratum.
With cluster sampling, in contrast, the sample
includes elements only from sampled clusters.
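For contrast with stratified sampling, here is a minimal sketch of cluster sampling. It assumes hypothetical clusters (for example, schools) and picks whole clusters by simple random sampling; the names are illustrative only.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose n_clusters clusters and survey everyone inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # sample cluster names
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical clusters: each person belongs to one, and only one, school
clusters = {
    "school_a": ["a1", "a2", "a3"],
    "school_b": ["b1", "b2"],
    "school_c": ["c1", "c2", "c3", "c4"],
    "school_d": ["d1", "d2"],
}
surveyed = cluster_sample(clusters, n_clusters=2, seed=1)
print(surveyed)
```

Only the sampled clusters appear in the result; the other clusters contribute no respondents at all.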
2.3.5.4 Multistage sampling
2.3.5.4.1 We select a sample by
using combinations of
different sampling
methods.
2.3.5.5 Systematic
random sampling
2.3.5.5.1 We create a list of every member of the population. From the list,
we randomly select the first sample element from the first k
elements on the population list. Thereafter, we select every kth
element on the list.
2.3.5.5.1.1 This method is different from simple random sampling since
not every possible sample of n elements is equally likely.
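A minimal sketch of systematic random sampling, assuming the population is already available as an ordered list. The function name and the sample data are hypothetical.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random start among the first k elements, then every kth element."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # index 0..k-1, i.e. one of the first k elements
    return population[start::k]

# Hypothetical population list of 100 members
population = list(range(1, 101))
sample = systematic_sample(population, k=10, seed=1)
print(sample)
```

Only k different samples are possible here (one per starting position), which is why the samples are not all equally likely.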
2.4 Frequency Distribution
2.4.1 Data sets with a large number of values
2.4.1.1 Grouped Frequency Distribution
2.4.1.1.1 Data
2.4.1.1.1.1 lower class limits, upper class limits, class width, class mark
2.4.1.1.2 Guidelines
2.4.1.1.2.1 Make sure each data item will fit into one class
2.4.1.1.2.2 Try to make all classes the same width
2.4.1.1.2.3 Make sure the classes do not overlap
2.4.1.1.2.4 Use from 5 to 12 classes (too few or too many classes can
obscure the tendencies in the data)
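The guidelines above can be sketched as a small routine that builds a grouped frequency distribution. This is an illustration for integer-valued data only; the scores are made up, and the class width is taken as the difference between consecutive lower class limits.

```python
def grouped_frequency(data, lower, width, n_classes):
    """Count integer data into n_classes non-overlapping classes of equal width."""
    table = []
    for i in range(n_classes):
        lo = lower + i * width       # lower class limit
        hi = lo + width - 1          # upper class limit (integer data assumed)
        mark = (lo + hi) / 2         # class mark: midpoint of the class
        freq = sum(1 for x in data if lo <= x <= hi)
        table.append((lo, hi, mark, freq))
    return table

# Made-up exam scores, grouped into 5 classes of width 10
scores = [52, 57, 61, 64, 68, 70, 73, 75, 77, 79, 81, 84, 88, 93, 99]
table = grouped_frequency(scores, lower=50, width=10, n_classes=5)
for lo, hi, mark, freq in table:
    print(f"{lo}-{hi}  mark={mark}  frequency={freq}")
```

Because the classes share no values and together cover the whole range, each score is counted exactly once.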
2.4.2 Frequency
2.4.2.1 How many times an event happens
2.4.3 Relative Frequency
2.4.3.1 Frequency divided by the total number of
elements, usually given as a percentage
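A minimal sketch of frequency and relative frequency, using made-up survey answers. The function name is hypothetical.

```python
from collections import Counter

def relative_frequencies(items):
    """Map each value to (frequency, relative frequency in percent)."""
    counts = Counter(items)
    total = len(items)
    return {value: (f, 100 * f / total) for value, f in counts.items()}

# Made-up survey answers
answers = ["yes", "no", "yes", "yes", "undecided", "no", "yes", "no"]
freq_table = relative_frequencies(answers)
for value, (f, pct) in freq_table.items():
    print(f"{value}: frequency={f}, relative frequency={pct}%")
```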
2.4.4 Visual displays of data
2.4.4.1 Line
Graphs (frequency
polygons)
2.4.4.1.1 Show how a
quantity changes with
respect to another quantity.
2.4.4.2 Circle Graphs
2.4.4.2.1 Uses a circle to represent
the total of all the categories
and divides the circle into
sectors whose sizes show
the relative magnitudes of
the categories. 360° = 100%
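Since the full circle (360 degrees) stands for 100%, each sector's central angle is its share of the total times 360. A minimal sketch with made-up budget categories:

```python
def sector_angles(frequencies):
    """Central angle in degrees for each category of a circle graph."""
    total = sum(frequencies.values())
    return {category: 360 * f / total for category, f in frequencies.items()}

# Made-up monthly budget, in percent of income
budget = {"rent": 50, "food": 30, "other": 20}
angles = sector_angles(budget)
print(angles)
```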
2.4.4.3 Bar Graphs
2.4.4.3.1 Shows the frequency distribution of non-numerical
observations. The bars do not touch one
another and are sometimes arranged
horizontally rather than vertically.
2.4.4.4 Stem and leaf graphs
2.4.4.4.1 Numbers grouped by leading digit (e.g., minutes taken to solve an exam)
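A minimal sketch of a stem and leaf display, assuming non-negative whole numbers where the tens digit is the stem and the ones digit is the leaf. The exam times are made up.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole numbers by tens digit (stem); ones digits are the leaves."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Made-up data: minutes each student took to solve an exam
minutes = [23, 35, 31, 42, 27, 38, 45, 33, 29, 40]
plot = stem_and_leaf(minutes)
for stem, leaves in plot.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```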
2.4.4.5 Histogram
2.4.4.5.1 A series of rectangles, whose
lengths represent the
frequencies, placed next to
one another.
2.5 Measures of central tendency
2.5.1 Mean
2.5.1.1 Average: The most common measure.
Add all the items and then divide
the sum by the number of items.
2.5.2 Weighted mean
2.5.2.1 Sum of the products of each item and its
weighting factor, divided by
the sum of all weighting
factors.
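A minimal sketch of the mean and the weighted mean as described above. The grades and credit weights are made-up values.

```python
def mean(items):
    """Sum of all items divided by the number of items."""
    return sum(items) / len(items)

def weighted_mean(items, weights):
    """Sum of item * weight products divided by the sum of the weights."""
    if len(items) != len(weights):
        raise ValueError("items and weights must have the same length")
    return sum(x * w for x, w in zip(items, weights)) / sum(weights)

# Made-up course grades, weighted by credit hours
grades = [90, 80, 70]
credits = [3, 4, 1]
plain = mean(grades)
weighted = weighted_mean(grades, credits)
print(plain, weighted)
```

With equal weights the two measures coincide; here the 4-credit course pulls the weighted mean toward 80.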
2.5.3 Median
2.5.3.1 Less sensitive to extreme
values than the mean. Divides a group of
numbers into two parts, with
half the numbers below the
median and half above it.
2.5.3.1.1 Position of the median
2.5.3.1.1.1 Frequency Distribution
2.5.3.1.2 Steps to find it
2.5.3.1.2.1 1. Rank the items (arrange them in numerical order from
least to greatest). 2. If the number of items is odd, the median is the
middle item in the list. 3. If the number of items is even, the median is the mean of the 2 middle
items.
2.5.4 Mode
2.5.4.1 The value that occurs most often
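The median steps and the mode can be sketched as follows. This is a minimal illustration; returning every tied value from `modes` is an assumption about how ties are handled.

```python
from collections import Counter

def median(items):
    """Rank the items, then take the middle one (or the mean of the two middle)."""
    ranked = sorted(items)
    n = len(ranked)
    mid = n // 2
    if n % 2 == 1:
        return ranked[mid]
    return (ranked[mid - 1] + ranked[mid]) / 2

def modes(items):
    """All values that occur most often (more than one if there is a tie)."""
    counts = Counter(items)
    top = max(counts.values())
    return [value for value, count in counts.items() if count == top]

print(median([7, 1, 5]))            # odd number of items
print(median([7, 1, 5, 3]))         # even number of items
print(median([1, 2, 3, 4, 1000]))   # an extreme value barely moves the median
print(modes([2, 3, 3, 5, 2, 3]))
```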
2.6 Measures of dispersion
2.6.1 Range
2.6.1.1 A straightforward measure of dispersion.
Range = (the greatest value in the set) - (the
least value in the set).
2.6.2 Standard deviation
2.6.2.1 Based on the deviations of
the data values from the mean
2.6.2.2 Steps
2.6.3 Coefficient of variation
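A minimal sketch of the three dispersion measures listed so far. It assumes the sample standard deviation (divisor n - 1) and the coefficient of variation expressed as the standard deviation divided by the mean, times 100%; the data values are made up.

```python
import math

def value_range(data):
    """Greatest value minus least value."""
    return max(data) - min(data)

def sample_std_dev(data):
    """Square root of the sum of squared deviations from the mean over n - 1."""
    n = len(data)
    m = sum(data) / n
    squared_deviations = [(x - m) ** 2 for x in data]
    return math.sqrt(sum(squared_deviations) / (n - 1))

def coefficient_of_variation(data):
    """Standard deviation as a percentage of the mean (mean must be non-zero)."""
    return 100 * sample_std_dev(data) / (sum(data) / len(data))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values with mean 5
r = value_range(data)
s = sample_std_dev(data)
cv = coefficient_of_variation(data)
print(r, round(s, 3), round(cv, 2))
```

If the data are a whole population rather than a sample, the divisor would be n instead of n - 1.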
2.6.4 Chebyshev's theorem
2.6.4.1 For any set of numbers, regardless of how they are
distributed, the fraction of them that lie within k standard
deviations of their mean (where k > 1) is at least 1 - 1/k^2
2.6.4.1.1 k = number of
standard deviations
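A minimal sketch that compares the Chebyshev lower bound, 1 - 1/k^2, with the fraction actually observed in a small made-up data set. The population standard deviation (divisor n) is assumed here.

```python
import math

def chebyshev_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

def fraction_within(data, k):
    """Observed fraction of values strictly within k standard deviations."""
    n = len(data)
    m = sum(data) / n
    sd = math.sqrt(sum((x - m) ** 2 for x in data) / n)  # population SD
    return sum(1 for x in data if abs(x - m) < k * sd) / n

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values
for k in (1.5, 2, 3):
    print(k, chebyshev_bound(k), fraction_within(data, k))
```

The observed fraction is never below the bound, but it is often well above it, because the theorem makes no assumption about the shape of the distribution.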
2.7 Measures of position
2.7.1 z-score
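A minimal sketch of the z-score under its usual definition: the number of standard deviations that a value x lies above (positive) or below (negative) the mean, z = (x - mean) / s. The sample standard deviation and the data values are assumptions for illustration.

```python
import math

def z_score(x, data):
    """Number of sample standard deviations that x lies from the mean of data."""
    n = len(data)
    m = sum(data) / n
    s = math.sqrt(sum((v - m) ** 2 for v in data) / (n - 1))
    return (x - m) / s

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values with mean 5
print(z_score(9, data))  # above the mean: positive
print(z_score(5, data))  # equal to the mean: zero
print(z_score(2, data))  # below the mean: negative
```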
2.7.2 Percentiles
2.7.2.1 If approximately n percent of the items in a distribution
are less than the number x, then x is the nth percentile
of the distribution, denoted Pn
2.7.3 Deciles and Quartiles
2.7.3.1 Deciles are the nine values (denoted D1-D9) along the scale that divide
a data set into ten (approximately) equal-sized parts, and quartiles are
the three values (Q1-Q3) that divide a data set into 4 (approximately)
equal-sized parts.
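A minimal sketch of quartiles, deciles, and percentiles using `statistics.quantiles` from the Python standard library (Python 3.8 or later). Textbooks differ on the exact interpolation rule, so the values may not match a hand method exactly; the data set is made up.

```python
import statistics

# Made-up data set: the whole numbers 1..100
data = list(range(1, 101))

quartiles = statistics.quantiles(data, n=4)      # Q1, Q2, Q3
deciles = statistics.quantiles(data, n=10)       # D1 .. D9
percentiles = statistics.quantiles(data, n=100)  # P1 .. P99

print("quartiles:", quartiles)
print("deciles:", deciles)
print("P50:", percentiles[49])
```

Q2, D5, and P50 all mark the same position, which is the median.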