Reliability
An experiment cannot be valid if it is not reliable
reliability is the relative consistency of test scores and other educational and psychological measures
however, no measurement is perfect
unreliable measurement ultimately becomes an issue of validity
Reliability of Measures
This refers to whether the measures used produce consistent results for what is being tested
Internal consistency/reliability (Cronbach's alpha)
a study that does not report Cronbach's alpha leaves the internal consistency of its measure unverified
Poor test-retest performance (correlation)
Low inter-rater reliability (correlation)
multiple raters used
with only a single rater, inter-rater reliability cannot be assessed
Alternate/parallel forms (correlation)
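As an illustration of the internal consistency measure listed above, here is a minimal sketch of Cronbach's alpha using the standard formula alpha = (k / (k - 1)) * (1 - sum of item variances / variance of total scores). The function name `cronbach_alpha` and the sample data are hypothetical and are not taken from these notes.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for rows of respondents and columns of items."""
    k = len(item_scores[0])  # number of items on the scale
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Sample variance of each item (column) across respondents
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    # Sample variance of each respondent's total score
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 5 respondents answering a 3-item scale
scores = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 2],
    [5, 5, 4],
    [3, 3, 4],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 3))
```

Alpha rises when items vary together across respondents, which is why it is read as evidence that the items measure the same thing.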
The reliability coefficient (Pearson's r)
.90+ = excellent
.80-.89 = good
.70-.79 = adequate
< .70 = may have limited applicability
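The bands above can be applied to a correlation computed from two sets of scores, for example test and retest scores. The sketch below is illustrative only: `pearson_r`, `interpret_reliability`, and the sample scores are hypothetical names and data. It also assumes that a value falling between two listed bands (such as .895) belongs to the lower band.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Square roots of the sums of squared deviations
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (spread_x * spread_y)


def interpret_reliability(r):
    """Map a reliability coefficient to the bands listed in the notes."""
    if r >= 0.90:
        return "excellent"
    if r >= 0.80:
        return "good"
    if r >= 0.70:
        return "adequate"
    return "may have limited applicability"


# Hypothetical test and retest scores for five participants
time1 = [10, 12, 15, 18, 20]
time2 = [11, 13, 14, 19, 21]
r = pearson_r(time1, time2)
print(round(r, 2), interpret_reliability(r))
```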
Reliability of treatment implementation from participant to participant
Lack of standardization in the study protocol introduces the chance that observed covariation may not be related to treatment
If the groups are treated differently, the study cannot attribute any change to the manipulation of the IV
Any lack of control over test conditions increases the chance that observed covariation may not be related to treatment
If the groups are in different test conditions then the possible change cannot be attributed to the manipulation of the IV
examples:
missing or inconsistently delivered instructions
test conditions not the same
Regression to the mean
Essentially means that extreme scores tend to be followed by scores closer to the average
extreme scores are rare and usually move toward the mean on retest
This can be problematic when conditions of an experiment are based on the extreme scores
high and low IQ scores
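A small simulation can make regression to the mean concrete. The sketch below assumes each observed score is a stable true score plus independent random error, and every parameter (sample size, standard deviations, the top 10% cutoff) is made up for illustration. A group selected for high scores on the first test scores closer to the mean on retest even though no treatment is given.

```python
import random
from statistics import mean


def simulate_retest(n=5000, true_sd=10.0, error_sd=10.0, top_fraction=0.10, seed=0):
    """Return (first-test mean, retest mean) for the top scorers on test 1."""
    rng = random.Random(seed)
    # Each participant has a stable true score around 100
    true_scores = [rng.gauss(100, true_sd) for _ in range(n)]
    # Each testing occasion adds independent measurement error
    test1 = [t + rng.gauss(0, error_sd) for t in true_scores]
    test2 = [t + rng.gauss(0, error_sd) for t in true_scores]
    # Select the participants with the most extreme (highest) first-test scores
    top_count = int(n * top_fraction)
    ranked = sorted(range(n), key=lambda i: test1[i], reverse=True)
    top = ranked[:top_count]
    return mean([test1[i] for i in top]), mean([test2[i] for i in top])


first, second = simulate_retest()
print(f"top group, test 1 mean: {first:.1f}")
print(f"top group, retest mean: {second:.1f}")
```

The retest mean falls because part of each extreme first score was luck (error) that does not repeat, which is the risk when groups are formed from extreme scores.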
Random heterogeneity of Participants
Individual differences among participants that are related to the dependent variable and can cause issues
some participants might be more impacted by the treatment than others because of this
solutions:
use people from the same group (homogeneous)
e.g., college students, or people from a single region such as the South or the North
can be a disadvantage because results can't be generalized
Random assignment
within-subjects design
matching participants
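Of the solutions listed above, random assignment is the simplest to sketch in code. The example below is a minimal illustration, not a prescribed procedure; the function name `randomly_assign`, the group labels, and the participant IDs are hypothetical. Shuffling before dealing participants into groups means individual differences are spread across conditions by chance rather than by any systematic rule.

```python
import random


def randomly_assign(participants, groups=("treatment", "control"), seed=None):
    """Shuffle participants, then deal them into groups of near-equal size."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    # Deal participants round-robin so group sizes differ by at most one
    for index, person in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(person)
    return assignment


# Hypothetical participant IDs
result = randomly_assign(["P01", "P02", "P03", "P04", "P05", "P06"], seed=42)
for group, members in result.items():
    print(group, members)
```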
Construct validity - the extent to which the operational definition measures the characteristics it intends to measure
constructs are abstractions of concepts that are discussed in social and behavioral studies
e.g., social status, power, intelligence
constructs can be measured in many ways
there are several concrete representations of a construct
Variables are not constructs
constructs need to be broken down to be measurable
Variables need operational definitions
OD - define variables for the purpose of research and replication
Any construct can be measured in multiple ways
e.g., power is the construct
variables of power
amount of influence a person has at work, at home, and in the neighborhood
each would need to be measured
all give indications of power, but no single one fully represents power
Problems with operational definitions
every observation is affected by other factors that have no relationship to the construct
contains some error
other sources of influence
social interaction of interview
interviewer's appearance
respondent's fear of strangers
assessment of anxiety
vocabulary comprehension
expectations
different understandings of key terms
Nomological Network
the set of construct-to-construct relationships derived from the relevant theory and stated at an abstract theoretical level
basically, what relationships do you expect to see?
typically the starting point for operational definitions
Types of Validity
face validity: extent to which a test is subjectively viewed as covering the concept it says it is measuring
does it look like it tests what it says it does?
content validity: the extent to which the items reflect an adequate sampling of the characteristic
Do the test items cover all aspects of the construct as it is defined?
criterion validity: the extent to which people's scores on a measure are correlated with other variables that one would expect them to be correlated with
two types:
concurrent validity: the extent to which test scores correlate with behavior the test supposedly measures when the construct is measured at the same time as the criterion. Can also assess how well a new test agrees with an existing test
Predictive validity: extent to which the test scores predict a future behavior