Calculo Diferencial - Module 1 "Probability and Statistics"(2)

Alejandro Baruch Saucedo Esparza - A01400284
1. Lic. Saúl Garcia
Probability and Statistics
1. Statistics: The science of collecting, describing, analyzing, and interpreting data.
  1. Descriptive Statistics
    1. Collection, organization, summary, and presentation of sample data. Main tools: tables of numbers, graphs, and calculated quantities.
  2. Inferential Statistics
    1. Obtaining inferences or conclusions (via conjectures) about populations based on information taken from the samples.
    2. Concepts
      1. Population or Universe: The totality of the elements or things under consideration.
        Parameter: A summary measure calculated to describe a characteristic of a population.
      2. Sample: The portion of the population that is selected for analysis.
        Statistic: A summary measure calculated to describe a characteristic of a single sample of the population.
      3. Random Variable: A phenomenon, characteristic, or value that is subject to variation due to chance.
        Qualitative random variables: Produce categorical responses to describe an element of a population.
        Quantitative random variables
          Discrete: Numerical responses arising from a process of counting.
          Continuous: Numerical responses arising from a process of measurement.
2. Measurement Scales
  1. Properties
    1. Identity
      1. Each value on the measurement scale has a unique meaning.
    2. Magnitude
      1. Values on the measurement scale have an ordered relationship to one another. That means some values are larger and some are smaller.
    3. Equal Intervals
      1. Scale units along the scale are equal to one another. This means that the difference between 1 and 2 would be equal to the difference between 19 and 20.
    4. Minimum value of zero
      1. The scale has a true zero point, below which no values exist.
  2. Nominal Scale of Measurement
    1. Satisfies the identity property of measurement. Values assigned to variables represent a descriptive category, but have no inherent numerical value with respect to magnitude.
      1. Example: Gender. Individuals may be classified as "male" or "female", but neither value represents more or less "gender" than the other.
  3. Ordinal Scale of Measurement
    1. Scale has the property of both identity and magnitude. Each value on the ordinal scale has a unique meaning, and it has an ordered relationship to every other value on the scale.
      1. Example: The results of a horse race, reported as "win", "place", and "show". We know the rank order in which horses finished the race.
  4. Interval Scale of Measurement
    1. Has the properties of identity, magnitude, and equal intervals. With an interval scale, you know not only whether different values are bigger or smaller, you also know how much bigger or smaller they are.
      1. Example: The Fahrenheit scale to measure temperature. The scale is made up of equal temperature units, so that the difference between 40 and 50 degrees Fahrenheit is equal to the difference between 50 and 60 degrees Fahrenheit.
  5. Ratio Scale of Measurement
    1. Satisfies all four of the properties of measurement: identity, magnitude, equal intervals, and a minimum value of zero.
      1. Example: The weight of an object. Each value on the weight scale has a unique meaning, weights can be rank ordered, units along the weight scale are equal to one another, and the scale has a minimum value of zero.
3. Sampling methods
  1. The way that observations are selected from a population to be in the sample for a sample survey.
    1. Population parameter. A population parameter is the true value of a population attribute.
    2. Sample statistic. A sample statistic is an estimate, based on sample data, of a population parameter. It is strongly affected by the way that sample observations are chosen; that is, by the sampling method.
  2. Sample survey: A study that obtains data from a sample in order to estimate the value of some attribute of a population.
  3. Probability samples
    1. Each population element has a known (non-zero) chance of being chosen for the sample.
  4. Non-probability samples
    1. We do not know the probability that each population element will be chosen, and/or we cannot be sure that each population element has a non-zero chance of being chosen.
      1. Advantages: Convenience and cost. Disadvantages: They do not allow you to estimate the extent to which sample statistics are likely to differ from population parameters.
    2. Voluntary sample
      1. A voluntary sample is made up of people who self-select into the survey.
    3. Convenience sample
      1. A convenience sample is made up of people who are easy to reach.
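  5. Probability sampling methods: Because each element has a known chance of selection, these methods make it far more likely that the sample is representative of the population, and they let us estimate how much sample statistics are likely to differ from population parameters.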
    1. Simple random sampling
      1. Any sampling method that has the following properties: the population consists of N objects; the sample consists of n objects; and all possible samples of n objects are equally likely to occur.
        A good example would be the lottery method. Each of the N population members is assigned a unique number. The numbers are placed in a bowl and thoroughly mixed. Then, a blind-folded researcher selects n numbers.
    2. Stratified sampling
      1. The population is divided into groups, based on some characteristic. Then, within each group, a probability sample (often a simple random sample) is selected. In stratified sampling, the groups are called strata.
        Example: A national survey. We divide the population into groups, or strata, based on geography (north, east, south, and west). Then, within each stratum, we might randomly select survey respondents.
    3. Cluster sampling
      1. Every member of the population is assigned to one, and only one, group. Each group is called a cluster. A sample of clusters is chosen, using a probability method (often simple random sampling). Only individuals within sampled clusters are surveyed.
        The difference between cluster sampling and stratified sampling: with stratified sampling, the sample includes elements from each stratum. With cluster sampling, in contrast, the sample includes elements only from the sampled clusters.
    4. Multistage sampling
      1. We select a sample by using combinations of different sampling methods.
    5. Systematic random sampling
      1. We create a list of every member of the population. From the list, we randomly select the first sample element from the first k elements on the population list. Thereafter, we select every kth element on the list.
        This method is different from simple random sampling, since not every possible sample of n elements is equally likely.
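
The four main probability sampling methods above can be sketched with Python's standard `random` module. This is only an illustrative sketch: the population of 100 numbered members, the four-region grouping, the sample sizes, and all function names are assumptions made for the example, not part of the notes.

```python
import random


def simple_random_sample(population, n, rng):
    """Draw n items so that every possible sample of size n is equally likely."""
    return rng.sample(population, n)


def systematic_sample(population, k, rng):
    """Pick a random start among the first k items, then take every kth item."""
    start = rng.randrange(k)
    return population[start::k]


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every member of the chosen ones."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


rng = random.Random(0)  # fixed seed, used only so the sketch is repeatable

population = list(range(1, 101))  # hypothetical population of N = 100 members
srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)

# Hypothetical geographic grouping of the same 100 members.
regions = {
    "north": population[0:25],
    "east": population[25:50],
    "south": population[50:75],
    "west": population[75:100],
}
stratified = stratified_sample(regions, 3, rng)  # every region contributes
clustered = cluster_sample(regions, 2, rng)      # only two regions are surveyed
```

Note how `stratified` contains respondents from all four regions, while `clustered` contains every member of just two regions, which mirrors the difference between stratified and cluster sampling.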
4. Frequency Distribution
  1. Data sets with a large number of items
    1. Grouped Frequency Distribution
      1. Data
        lower class limits, upper class limits, class width, class mark
      2. Guidelines
        Make sure each data item will fit into one, and only one, class.
        Try to make all classes the same width.
        Make sure the classes do not overlap.
        Use from 5 to 12 classes (too few or too many classes can obscure the tendencies in the data).
  2. Frequency
    1. How many times an event happens
  3. Relative Frequency
    1. Frequency divided by the total number of elements, often given as a percentage
  4. Visual displays of data
    1. Line Graphs (frequency polygon)
      1. Show how a quantity changes with respect to another quantity, such as time.
    2. Circle Graphs
      1. Uses a circle to represent the total of all the categories and divides the circle into sectors whose sizes show the relative magnitudes of the categories. 360º = 100%
    3. Bar Graphs
      1. Show the frequency distribution of non-numerical observations. The bars do not touch one another and are sometimes arranged horizontally rather than vertically.
    4. Stem and leaf graphs
      1. Numbers are grouped by their leading digits (the stem), with the final digit as the leaf. Example: minutes taken to solve an exam.
    5. Histogram
      1. A series of rectangles, placed next to one another, whose lengths represent the frequencies.
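
A grouped frequency distribution, with relative frequencies and a stem-and-leaf grouping, can be sketched as follows. The data set (minutes taken to solve an exam) is made up for illustration, and the function names and the choice of five classes of width 10 are assumptions. The upper class limit is computed as `lower + width - 1`, which only makes sense for whole-number data.

```python
from collections import Counter


def grouped_frequency(data, lower, width, num_classes):
    """Return rows of (lower limit, upper limit, class mark, frequency, relative %)."""
    counts = Counter((x - lower) // width for x in data)  # class index of each item
    total = len(data)
    rows = []
    for i in range(num_classes):
        lo = lower + i * width
        hi = lo + width - 1          # upper class limit, assuming integer data
        mark = (lo + hi) / 2         # class mark = midpoint of the class
        freq = counts.get(i, 0)
        rows.append((lo, hi, mark, freq, 100 * freq / total))
    return rows


def stem_and_leaf(data):
    """Group two-digit numbers by tens digit (stem); units digits are the leaves."""
    stems = {}
    for x in sorted(data):
        stems.setdefault(x // 10, []).append(x % 10)
    return stems


# Hypothetical data: minutes taken to solve an exam.
minutes = [12, 15, 21, 23, 23, 27, 30, 31, 34, 35,
           38, 41, 42, 44, 47, 49, 52, 55, 58, 59]

table = grouped_frequency(minutes, lower=10, width=10, num_classes=5)
for lo, hi, mark, freq, rel in table:
    print(f"{lo}-{hi}  mark={mark}  f={freq}  rel={rel:.0f}%")

plot = stem_and_leaf(minutes)
```

Each item falls into exactly one class, all classes have the same width, and the classes do not overlap, in line with the guidelines above.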
5. Measures of central tendency
  1. Mean
    1. Average: The most common measure. Add all the items and then divide the sum by the number of items.
  2. Weighted mean
    1. Sum of the products of the items and their weighting factors, divided by the sum of all the weighting factors.
  3. Median
    1. Not as sensitive to extreme values as the mean. Divides a group of numbers into two parts, with half the numbers below the median and half above it.
      1. Position of the median
        In a frequency distribution: position = (n + 1) / 2, where n is the sum of the frequencies (the total number of items).
      2. Steps to find it
        1. Rank the items (arrange them in numerical order from least to greatest). 2. If the number of items is odd, the median is the middle item in the list. 3. If the number of items is even, the median is the mean of the two middle items.
  4. Mode
    1. The value that occurs most often.
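
The measures of central tendency above can be checked with Python's standard `statistics` module. The scores and weights below are made-up values for illustration, and `weighted_mean` is a helper name chosen for this sketch (the module does not provide one under that name).

```python
import statistics


def weighted_mean(items, weights):
    """Sum of item * weight, divided by the sum of the weights."""
    return sum(x * w for x, w in zip(items, weights)) / sum(weights)


scores = [70, 85, 85, 90, 100]  # hypothetical data, already ranked

mean = statistics.mean(scores)        # (70 + 85 + 85 + 90 + 100) / 5
median = statistics.median(scores)    # odd count: the middle item
modes = statistics.multimode(scores)  # list of the most frequent value(s)

# Even count: the median is the mean of the two middle items (85 and 90).
even_median = statistics.median([70, 85, 90, 100])

# Hypothetical grades weighted by credit hours 2, 3 and 5.
grade = weighted_mean([80, 90, 70], [2, 3, 5])

print(mean, median, modes, even_median, grade)
```

`statistics.multimode` returns a list because a data set can have more than one mode.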
6. Measures of dispersion
  1. Range
    1. A straightforward measure of dispersion. Range = (the greatest value in the set) - (the least value in the set).
  2. Standard deviation
    1. Based on the deviations of the data values from the mean
    2. Steps
  3. Coefficient of variation
  4. Chebyshev's theorem
    1. For any set of numbers, regardless of how they are distributed, the fraction of them that lie within k standard deviations of their mean (where k > 1) is at least 1 - 1/k^2.
      1. k = number of standard deviations
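
The measures of dispersion can be sketched as below. The notes do not say which standard deviation formula is intended, so this sketch assumes the sample standard deviation (dividing by n - 1), which is what `statistics.stdev` computes. It also assumes the usual definition of the coefficient of variation as the standard deviation divided by the mean, expressed as a percentage. The data values and helper names are made up for the example.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set

value_range = max(data) - min(data)  # greatest value minus least value
mean = statistics.mean(data)
s = statistics.stdev(data)           # sample standard deviation (assumed, n - 1)
cv = s / mean * 100                  # coefficient of variation, in percent


def chebyshev_bound(k):
    """Minimum fraction of items within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem requires k > 1")
    return 1 - 1 / k**2


def fraction_within(values, k):
    """Observed fraction of items within k standard deviations of the mean."""
    m = statistics.mean(values)
    sd = statistics.stdev(values)
    inside = [x for x in values if abs(x - m) <= k * sd]
    return len(inside) / len(values)


print(value_range, round(s, 3), round(cv, 1))
print(chebyshev_bound(2), fraction_within(data, 2))
```

For k = 2 the theorem guarantees at least 3/4 of the items; the observed fraction for any data set should never fall below that bound.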
7. Measures of position
  1. z-score
  2. Percentiles
    1. If approximately n percent of the items in a distribution are less than the number x, then x is the nth percentile of the distribution, denoted Pn.
  3. Deciles and Quartiles
    1. Deciles are the nine values (denoted D1-D9) along the scale that divide a data set into ten (approximately) equal-sized parts, and quartiles are the three values (Q1-Q3) that divide a data set into four (approximately) equal-sized parts.
  4. Box plot
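
The measures of position can be sketched with the standard `statistics` module. Two assumptions are made here: the z-score is taken as (x - mean) / standard deviation, using the sample standard deviation, and quartiles and deciles are computed with `statistics.quantiles` in its default mode. Textbooks use several slightly different conventions for quartiles, so hand-computed values may differ a little. The data set is made up.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data set


def z_score(x, values):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - statistics.mean(values)) / statistics.stdev(values)


# Three cut points dividing the data into four roughly equal-sized parts.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Nine cut points (D1-D9) dividing the data into ten roughly equal-sized parts.
deciles = statistics.quantiles(data, n=10)

# A box plot is drawn from this five-number summary.
five_number_summary = (min(data), q1, q2, q3, max(data))

print(z_score(9, data))
print(five_number_summary)
```

The second quartile Q2 coincides with the median, and the box of a box plot spans Q1 to Q3 with whiskers reaching toward the minimum and maximum.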

	Created by Alejandro Baruch over 9 years ago