Stats

Mixed Factorial ANOVA
1. Experiment designs: between and within subjects
  Annotations:
  - Between subjects: different participants in each condition; looks at the differences between groups.
  - Within subjects: same participants in each condition; looks at the differences between treatments.
  - The dependent variable is measured in exactly the same way in each design.
  1. Problems for between
    Annotations:
    - Participant variables.
    - Large group of participants required, which can be impractical.
    - Biases lead to false conclusions: assignment, observer-expectancy, subject-expectancy.
    - It is possible to assess the baseline measure.
  2. Problems for within
    Annotations:
    - Practice effects (lack of naivety): the more you do the task, the better you get.
    - Longer testing sessions when there are many conditions.
  3. Factorial Designs
    Annotations:
  - One dependent variable, two or more independent variables. Used when we suspect that more than one IV contributes to a DV. Allows exploration of complicated relationships between IVs and a DV.
    1. Main effect: how each IV individually affects the DV - overall trend
    2. Interactions: how IV factors combine to affect the DV
    3. Between Subjects factorial ANOVA
    4. Within Subjects factorial ANOVA
    5. Mixed factorial ANOVA
      Annotations:
      - Efficient use of participant numbers and individual participant time; reduces the cons of the other designs. One of the most common types of design. Mixed factorial ANOVA assumptions and formulae are the same as for factorial ANOVA.
      1. mix of between and within factors
        Annotations:
        at least one between subjects factor and one within subjects factor
        Increasing between subjects factors rapidly makes high cost studies non-viable
        Main effect and Interaction formula
        Annotations:
        F values, MS values, SS values. Reported as: F(between df, within df) = F value, p = p value
        Within subjects
        Between Subjects
        F(between df, within df) = F value, p = p value
        F values, MS values, SS values
      2. Assumptions
        Annotations:
        Interval/ratio data.
        Normal distribution: check with a histogram.
        Homogeneity of variance (between subjects): Levene's test.
        Sphericity of covariance (within subjects): Mauchly's test.
        No non-parametric alternatives if these are violated.
        1. interval/ ratio data
        2. normal distribution
        3. Homogeneity of variance (between, Levene's)
        Annotations:
        want it to be non-significant
        4. Sphericity of covariance (within, Mauchly's)
        Annotations:
        want it to be non-significant
        no non-parametric alternatives if violated
      3. TWO RULES
        use between subjects formulae for between subjects effects and within subjects for within subjects effects
        if there is a conflict e.g. in interactions, use within subjects
      4. N = total number of scores
      5. n = number of scores within the condition
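The notes list SS, MS and F values without saying how they relate. A minimal sketch of the standard relationship (MS = SS / df, F = MS for the effect / MS for the error term) follows. The SS and df numbers are hypothetical, and which error term to use is decided by the TWO RULES above.

```python
def mean_square(ss: float, df: int) -> float:
    """Mean square: a sum of squares divided by its degrees of freedom."""
    return ss / df


def f_ratio(ss_effect: float, df_effect: int, ss_error: float, df_error: int) -> float:
    """F = MS of the effect divided by MS of the matching error term."""
    return mean_square(ss_effect, df_effect) / mean_square(ss_error, df_error)


# Hypothetical values, for illustration only (not from a real data set).
ss_effect, df_effect = 40.0, 2
ss_error, df_error = 90.0, 18

f_value = f_ratio(ss_effect, df_effect, ss_error, df_error)

# The p value needs the F distribution and is left out of this sketch.
report = f"F({df_effect}, {df_error}) = {f_value:.2f}"
print(report)
```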
Correlation
1. Tests of Association
  Annotations:
  - Tests of the relationship between two variables, usually performed on continuous variables. They apply where there is shared variance between a given pair of variables. We are looking for an association between the samples, not a difference (as in an independent samples t-test).
  1. Pearson's (parametric); Spearman's (nonparametric)
    Annotations:
    - Also point-biserial correlation (one continuous variable and one categorical variable with 2 levels), simple linear regression, and multiple linear regression.
    1. Pearson's Correlation Assumptions (parametric)
      1. 1. linear relationship between variables
        Annotations:
        A linear relationship means that a given change in x leads to the same change in y at any point along the line. If the scatterplot shows a clear nonlinear relationship, do not run a Pearson's correlation: data with a curving, nonlinear relationship are not suitable for this analysis.
      2. 2. variables measure interval/ ratio data which are normally distributed
        Annotations:
        as the mean and s.d. only accurately describe the average and dispersal of the data when the data are normally distributed. If the frequency distributions show a non-normal distribution, do not run a Pearson's correlation.
      3. 3. Data should be free of statistical outliers
        Annotations:
        because outliers have disproportionate influence on the correlation statistic or correlation coefficient (r). There is a misrepresentation of data if outliers are included. Either exclude them or use a Spearman's correlation (nonparametric) if they are more systematic.
    2. Spearman's Correlation Assumptions (nonparametric)
      1. 1. monotonic relationship between variables
        Annotations:
        either a positive, negative or curved relationship. Not a bell curve.
        relationship that goes in one direction
      2. 2. works on ordinal/ interval/ ratio data - no need to worry about the distribution
      3. 3. outliers can be included in Spearman's analysed data
        Annotations:
        they do not exert as much influence, because Spearman's correlations use ranks rather than means or s.d.s.
  2. tell us whether variables covary with other variables
  3. Pearson's correlation formula
    Annotations:
    - a. Numerator: for each case, subtract the mean of X from the X score; do the same for Y; multiply these two differences, then add together the products for all cases.
    - b. Denominator: for each case, subtract the mean of X from the X score and square this difference; add together the squared values for all cases, then find the square root. Repeat for the Y variable and multiply the two square roots.
    - r is a divided by b.
    1. Df = no. of pairs - 2
    2. r(df) = r value, p = p value
    3. r = correlation coefficient
      1. indication of the strength of the relationship
    4. r2 = coefficient of determination
      1. measure of the strength of the relationship, describes the amount of variance explained
      2. effect size
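A minimal sketch of the Pearson's formula described in words above (steps a and b), with df and r2 alongside. It also computes Spearman's coefficient as Pearson's r on the ranks, which is the standard definition. The data and the function names are hypothetical, chosen only for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: sum of cross-products of deviations (step a)
    divided by the product of the root sums of squares (step b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    return numerator / denominator


def average_ranks(values):
    """Rank from 1 (smallest) upwards; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's coefficient: Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical paired scores.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
df = len(xs) - 2          # Df = no. of pairs - 2
r_squared = r ** 2        # coefficient of determination
rho = spearman_rho(xs, ys)

print(f"r({df}) = {r:.2f}, r2 = {r_squared:.2f}, Spearman's rho = {rho:.2f}")
```

Because Spearman's works on ranks, an extreme score only ever counts as the highest or lowest rank, which is why outliers influence it less.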
2. Scatterplots
  Annotations:
  - typically show relationships between pairs of variables. Each point represents one pair of observations at each measurement point
  1. Bottom left to top right = positive
  2. Top left to bottom right = negative
  3. the spread gives an indication of the strength of the relationship
    1. Direction and Strength
      1. If there is low or no spread between the data points then there is a very strong correlation between the variables
        Annotations:
        If there is a reasonable spread, then there is a strong correlation between the variables.
        r value = 1/ -1
        Annotations:
        direct diagonal line. When there is a greater spread, the r value deviates from 1/-1.
      2. If there is a high spread then there is low or no correlation
3. Interpreting correlations; facts about correlation coefficients
  1. range from -1 to 1.
  2. no units
  3. they are the same for x and y as for y and x
  4. positive values: as one variable increases so does the other
  5. negative values: as one variable increases, the other decreases
  6. positive relationship - as one value decreases, so does the other
  7. the more spread out the data are, the more the value will deviate from 1 or -1
  8. how close a value is to -1 or 1 indicates how close the two variables are to being perfectly linearly related
4. R values
  1. Estimating r values
    Annotations:
    - 1. Plot your scatterplot and divide it into quadrants at the mean x and mean y values in order to estimate r.
    - 2. Count up the number of points in each quadrant. A positive correlation will populate the +ve quadrants more than the -ve quadrants, and vice versa.
  2. Calculating r values - determining whether two values are associated.
    1. 1. Plot the raw values against one another
      Annotations:
      - Scaling problems: the variables have different means and SDs. We don't care about the means etc., only the relationship. If all the values lie along the bottom of the plot, we must look at the data in a way that accounts for the differing means and SDs of each axis, so convert to z scores.
    2. 2. Z scores give you values which have a mean of 0 and an SD of 1.
      Annotations:
      - z score = (score - mean) / SD
      - No scaling or unit problems. Converting raw scores into z scores allows direct comparisons between scores even if they are measured on different scales, and thus enables a comparison of the relative probabilities of each.
      - Z scores are referred to as standard scores because measurement scales are converted into a standardised format (mean = 0, SD = 1).
    3. 3. r = the adjusted average of the product for each standardised x-y coordinate pair
      Annotations:
      - Points in the top-right and bottom-left quadrants produce positive products; points in the other two quadrants produce negative products, which gives r its sign. Each product is the area of the rectangle between the point and the means; calculate it for every pair of values you are looking for a relationship between. Bigger area = larger contribution to r, because the point is further away from the means. Outliers would therefore artificially inflate the correlation value (r).
      1. the closer to the diagonal a point is, the more it contributes to the r value.
        The further away from both means a point is, the more it contributes to r.
      2. r = Σ(zX × zY) / (N - 1)
        Annotations:
        where zX = (X - x̄) / Sx
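A minimal sketch of the z-score route to r: standardise each variable, multiply the paired z scores, sum the products and divide by N - 1. It assumes the sample SD (dividing by N - 1), which is what makes this version agree with Pearson's r. The height and score data are hypothetical and deliberately on very different scales.

```python
import math


def z_scores(values):
    """Standardise scores: (score - mean) / SD, using the sample SD (N - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def r_from_z(xs, ys):
    """r = sum of the products of paired z scores, divided by N - 1."""
    zx = z_scores(xs)
    zy = z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)


# Hypothetical data measured on very different scales.
heights_cm = [150, 160, 170, 180, 190]
scores = [2, 4, 5, 4, 5]

r = r_from_z(heights_cm, scores)

# Re-expressing the heights in metres changes the raw plot but not r.
heights_m = [h / 100 for h in heights_cm]
r_rescaled = r_from_z(heights_m, scores)

print(f"r = {r:.3f}, r after rescaling = {r_rescaled:.3f}")
```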
5. Limitations
  1. Correlation does not equal causation
    Annotations:
    - there can be a causal link, but correlation analyses do not allow us to conclude this. To establish causation, a controlled experiment would be needed.
Regression
1. what is regression?
  1. a family of inferential statistics
  2. Test of association
  3. Help make predictions about data
  4. used when causal relationships are likely
2. Correlation does not tell you how much to intervene
3. line of best fit
  1. formula of the line gives the exact answer
4. Predictions
  1. it is possible to make predictions about how predictor variables will affect outcome variables
  2. regression gives an indication of the:
    1. unstandardised relationship
    2. between outcome (y-axis) and predictor (x-axis) variables
    3. using calculations of the intercept and gradient
      1. expressed in the form Y = a + bX
        a = intercept/ constant
        b = gradient/ coefficient
        in order to determine a, you need to calculate b first
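A minimal sketch of why b has to come first in Y = a + bX. It uses the usual least-squares formulas, which the notes do not spell out: b is the sum of cross-products of deviations divided by the sum of squared X deviations, and a is then the mean of Y minus b times the mean of X. The data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept (a) and gradient (b) for Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Gradient first: cross-products over squared deviations of X.
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    # The intercept depends on b, so it can only be calculated second.
    a = mean_y - b * mean_x
    return a, b


# Hypothetical predictor (x) and outcome (y) scores.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
predicted = a + b * 6  # prediction for a new predictor score of 6

print(f"Y = {a:.1f} + {b:.1f}X, predicted Y at X = 6: {predicted:.1f}")
```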
5. Assumptions
  1. 1. the data are linearly related
  2. 2. Homoscedasticity of data
    1. residuals
      1. residuals are the difference between the actual outcome score and the predicted outcome score
      2. residuals need the same degree of variation across all predictor variable scores
      3. if data are heteroscedastic, a regression isn't the appropriate analysis
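A minimal sketch of what a residual is, with a rough numeric stand-in for the visual homoscedasticity check. Comparing the average size of the residuals in the lower and upper half of the predictor scores is an assumption made here for illustration, not a formal test. The data and helper names are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares intercept (a) and gradient (b) for Y = a + bX."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return mean_y - b * mean_x, b


def residuals(xs, ys, a, b):
    """Residual = actual outcome score minus predicted outcome score."""
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def spread_by_half(xs, res):
    """Mean absolute residual for the lower and upper half of predictor scores."""
    pairs = sorted(zip(xs, res))
    half = len(pairs) // 2
    lower = [abs(r) for _, r in pairs[:half]]
    upper = [abs(r) for _, r in pairs[-half:]]
    return sum(lower) / len(lower), sum(upper) / len(upper)


# Hypothetical data.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 4, 5, 4, 5, 7]

a, b = fit_line(xs, ys)
res = residuals(xs, ys, a, b)
lower_spread, upper_spread = spread_by_half(xs, res)

# Similar values suggest similar variation across the predictor range;
# a large difference would be a hint of heteroscedasticity.
print(f"lower half: {lower_spread:.2f}, upper half: {upper_spread:.2f}")
```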
6. simple regression
  1. predicting one outcome variable from one predictor variable
  2. Y = a + bX
  3. SPSS output
    1. 1. descriptive statistics
    2. 2. correlation coefficient
    3. 3. variables entered and removed
      1. variable entered = predictor variable
      2. dependent variable = outcome variable
    4. 4. model summary (R values)
    5. 5. Check assumptions - graph tests of homoscedasticity
      1. 3 charts at the bottom
        frequency plot of standardised residuals
        histogram of residual values
        want normal distribution
        bars should approx fit the normal curve
        good indication of homoscedasticity
        normal plot of regression standardised residual
        points should follow the diagonal line
        scatterplot of regression standardised residual and regression standardised predicted value
        DV = change
        plots standardised predicted y values (x axis) against their corresponding residuals
        want to see a diffused cloud - no distinct patterns
    6. Determining whether the regression model is statistically valid - 3 R values
      1. R = pooled correlation
      2. R2 = amount of variance in the data that is explained by the model (%)
        most important value
      3. adjusted R2 = R2 corrected for the number of predictors and the sample size, so that variance explained by chance is not counted
      4. ANOVA table
        test of whether the regression model is better than using the mean outcome value (y) for all cases
        is the model significantly better at predicting the outcome than the mean alone
        report R2, then the ANOVA result
    7. Reporting Results
      1. 1. Check descriptives and correlations
      2. 2. Check that predictor and outcome variables show a linear relationship (scatterplot)
      3. 3. Check that homoscedasticity assumption is not violated
      4. Report the R2 in the text, and the ANOVA results
        R2 = , F( , )= , p <
      5. Report the coefficients in a table
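A minimal sketch of where the reported values come from, using the standard formulas rather than SPSS output: R2 = 1 - SS residual / SS total, adjusted R2 = 1 - (1 - R2)(n - 1) / (n - k - 1), and the ANOVA F = (SS regression / k) / (SS residual / (n - k - 1)), where k is the number of predictors. The data are hypothetical and the p value is left out.

```python
def simple_regression_summary(xs, ys):
    """R2, adjusted R2 and the ANOVA F (with its df) for Y = a + bX."""
    n = len(xs)
    k = 1  # number of predictors in a simple regression
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    a = mean_y - b * mean_x

    # SS total: error when the mean outcome is used as the prediction for all cases.
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    # SS residual: error that remains when the regression line is used instead.
    ss_residual = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_regression = ss_total - ss_residual

    r2 = 1 - ss_residual / ss_total
    adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    f_value = (ss_regression / k) / (ss_residual / (n - k - 1))
    return {"R2": r2, "adjusted R2": adjusted_r2, "F": f_value, "df": (k, n - k - 1)}


# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

summary = simple_regression_summary(xs, ys)
df1, df2 = summary["df"]
print(f"R2 = {summary['R2']:.2f}, F({df1}, {df2}) = {summary['F']:.2f}")
```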
7. Multiple Regression
  1. Predicting one outcome variable from more than one predictor variable
  2. Formula: Y = a + b1X1 + b2X2 +b3X3
  3. many participants are needed
  4. Methods
    1. predictors can be entered in many different orders
    2. Simultaneous
      1. all predictors are entered at the same time
      2. use for exploratory analysis
    3. Hierarchical
      1. predictors are entered in a pre-defined order
      2. used when regressions are informed by well-defined theory
    4. Stepwise
      1. predictors are entered in an order driven by how well they correlate with the outcome
      2. not often used, as the results are unstable
  5. SPSS output
    1. 1. Descriptive Statistics
    2. 2. Correlations
    3. 3. Assumptions - visual tests for homoscedasticity
    4. 4. Model
      1. summary
        how good the model is, R2
      2. ANOVA significance
  6. Reporting Results
    1. 1. Check descriptives and correlations
    2. 2. Difficult to check for linear relationships
    3. 3. Check that homoscedasticity assumption is not violated
    4. 4. Report the R2 value
      1. R2 = , F(df, df) = , p =
    5. 5. Report the coefficients in a table
  7. Multicollinearity occurs when predictor variables are highly correlated with each other. This is undesirable.
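A minimal sketch of a multiple regression with two predictors, Y = a + b1X1 + b2X2, fitted by solving the least-squares equations directly. It also reports the correlation between the two predictors as a simple multicollinearity check: as that correlation approaches 1 or -1, the determinant in the solution approaches 0 and the coefficients become unstable. The data are hypothetical and built so that Y = 1 + 2X1 + 3X2 exactly, so the right answer is known in advance.

```python
import math


def fit_two_predictors(x1, x2, y):
    """Least-squares a, b1, b2 for Y = a + b1X1 + b2X2, plus r between X1 and X2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]

    # Sums of squares and cross-products of the deviations.
    s11 = sum(a * a for a in d1)
    s22 = sum(a * a for a in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * b for a, b in zip(d1, dy))
    s2y = sum(a * b for a, b in zip(d2, dy))

    # Correlation between the predictors: the multicollinearity check.
    r12 = s12 / math.sqrt(s11 * s22)

    # Solve the two simultaneous equations for b1 and b2.
    det = s11 * s22 - s12 ** 2  # close to 0 when the predictors are highly correlated
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    a = my - b1 * m1 - b2 * m2
    return a, b1, b2, r12


# Hypothetical data constructed so that Y = 1 + 2*X1 + 3*X2 exactly.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 6]
y = [9, 8, 19, 18, 29]

a, b1, b2, r12 = fit_two_predictors(x1, x2, y)
print(f"Y = {a:.1f} + {b1:.1f}X1 + {b2:.1f}X2, r between predictors = {r12:.2f}")
```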
8. Summary
  1. Regression analyses allow us to make predictions about outcome variables using predictor variables
  2. All regressions assume homoscedasticity
  3. Simple (bivariate) regression uses one predictor variable. Multiple regression uses more than one.
  4. To report regressions:
    1. i) report R2 and the ANOVA in the text
    2. ii) report the coefficients in a table
Correlation is used to examine the relationship between variables
1. Regression is used to make predictions about scores on one variable based on knowledge of the values of others

	Created by nb43 over 11 years ago