Psychology 2113, Research Methods I: Statistics

Welcome to Psychology 2113

Statistics in Research

Welcome to Psychology 2113

•      About the course

–   Where does it fit?

•    2113 is a prerequisite for most upper division courses

•    Take 3114, Research Methods II: Applications and Experimental Design, the next semester after 2113

•    3114 (Experimental) is a prerequisite for capstone and most 4000 level courses

•    Take 3003, Advanced Undergraduate Statistics, if you are planning to go to graduate school

–   Outline

–   Grades

–   Academic misconduct

Welcome..., cont.

•      Questions about the course?

•      Assistants

–   Graduate assistant

–   Undergraduate volunteers

•      Labs/Times/Colors

–   T 10:30-11:20 (Section 12): Pink

–   T 11:30-12:20 (Section 11): Yellow

–   W 12:30-1:20 (Section 14): Blue

–   W 1:30-2:20 (Section 13): Green

•      Textbook errors

Statistics in Research

•      Hamstring stretch example

•    four groups of runners, randomly assigned to groups

•    groups differed by amount of time holding the stretch

•    interested in flexibility after six weeks of stretching

–   What do we want to know?

•    The effect on flexibility due to time holding the stretch

•    The best time to hold the stretch

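–   Results: control = 15s < 30s = 60s (flexibility was about the same for control and 15s, and higher but about the same for 30s and 60s)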

Statistics in Research, cont.

•      Another example: hamstring stretch, but we simply measure a group of runners on the length of time they hold their stretch and on their flexibility.

Statistics in Research, cont.

•      Other examples

•   Trait aggressiveness and hockey penalties

•   Piano lessons and math performance

•   Age and music purchases

•   Teenagers and gambling

•   Violent video games and risk of heart trouble

Statistics in Research, cont.

•      Another example

–   Delay of gratification and SAT scores

•    4.5 year old children

•    rewards visibly exposed (E) vs. obscured (O)

•    coping ideas (I) vs. none (N)

•    four groups: EI, EN, OI, ON

•    delay time in seconds

Statistics in Research, Cont.

–   Delay of gratification and SAT scores

–   Results:

•    EI 517s, EN 365s, OI 585s, ON 590s

Psychology 2113

Overview of Statistics

Preview of Inferential Statistics

Research and Research Design

Overview of Statistics

•      Population

–   Target group for inference

–   Examples

–   Parameter: numerical characteristic of population. Example: population mean is μ (mu).

•      Sample

–   Subgroup of the population

–   Examples

–   Statistic: numerical characteristic of sample. Example: sample mean is X̄ (X-bar).

Overview..., cont.

•      Sampling

–   Process of selecting sample from population

•      Random sampling

–   Independent selection

–   As contrasted to “evenness”

•      Descriptive vs. Inferential Statistics

–   Descriptive: primary purpose is to describe some aspect of the data

–   Inferential: primary purpose is to infer (to estimate or to make a decision, test a hypothesis)

Preview of Inferential Statistics

•      All inferential statistics have the following in common:

   Use of Some Descriptive Statistic
(Remember…)

Use of Probability

•      Coin toss example

–   Fair coin? p(Head)=.5

–   10 tosses, ten Heads,

•    p(10 H|fair coin)=1/1024 =.00098,

•    Reject idea (hypothesis) of fair coin

–   10 tosses, six heads,

•    p(6 or more H|fair coin) =386/1024=.377,

•    Retain hypothesis of fair coin
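The two coin toss probabilities above can be checked with a short Python sketch. The function name `p_at_least` is an illustrative choice, not part of the course materials; it simply adds up binomial probabilities under the fair coin hypothesis.

```python
from math import comb

def p_at_least(k, n, p=0.5):
    """Probability of k or more heads in n tosses when p(Head) = p."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

p_ten_heads = p_at_least(10, 10)    # 1/1024: ten heads in ten tosses
p_six_or_more = p_at_least(6, 10)   # 386/1024: six or more heads in ten tosses

print(round(p_ten_heads, 5), round(p_six_or_more, 3))
```

A probability near .001 makes the fair coin hypothesis hard to believe, while .377 gives no reason to doubt it.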

   Potential for Estimation
(Remember…)

Sampling Variability and Sampling Distributions

If we do the delay of gratification study again, we likely won’t get 365 for the mean of the EN group. The value of the new mean might be close, like 355. A third study might yield 375, etc.

Use of a Theoretical Distribution

Two Hypotheses, Two Decisions, Two Types of Error

•      Two hypotheses

–   Coin is fair, or coin is not fair

–   Kids in the EN group score lower, or they don’t

–   30s hamstring stretch is better than 15s, or it isn’t

•      Two decisions

–   Reject or retain a hypothesis

•      Two types of error

–   Reject a true hypothesis, retain a false hypothesis

Preview of Inferential Statistics

•      All inferential statistics have the following in common:

–   use of some descriptive statistic

–   use of probability

–   potential for estimation

–   sampling variability

–   sampling distributions

–   use of a theoretical distribution

–   two hypotheses, two decisions, two types of error

Research and Research Design

•      Research defined

–   Structured Problem Solving

•      Comparative vs. Absolute

•      Scientific methods: steps (cyclic)

–   1. encounter and identify problem

–   2. formulate hypotheses, define variables

–   3. think through consequences of hypotheses

–   4. design & run study, collect data, compute statistics, test hypotheses

–   5. draw conclusions

Research: Variables

•      Variable: entity that is free to take on different values

–   Independent variable (IV): its values are manipulated by the researcher, comes first in time

–   Dependent variable (DV): measured by researcher, follows the IV in time

–   Extraneous variable (EV): controlled by researcher

•    randomization of subjects to groups

•    keep all subjects constant on EV

•    include EV in the design of the experiment

Research: Variables

•      Variable, continued:

–   Predictor variable (PV): comes first in time but there is no manipulation, analogous to IV.

–   Criterion variable (CV): follows PV in time, analogous to DV.

 Research: Examples

•      Example: first hamstring stretch study.

–   IV was time holding stretch: control, 15s, 30s, 60s.

–   DV was flexibility.

–   EVs were age, gender, etc.

•      Example: delay of gratification study.

–   IVs were rewards (exposed, obscured) and coping ideas (ideas, none).

–   DV was delay time.

–   EV was age (controlled by ________).

Research: Examples

•      Example: second hamstring stretch study.

–   PV was time holding stretch: 12s, 30s, 25s, etc.

–   CV was flexibility: 47, 34, etc.

–   EVs were age, gender, etc.

•      Operational definition.

–   The type that is assigned to a variable depends on how it is defined and used in any given research project.

–   Some variables can be any one of the five types, depending on how they were used in the research.

–   Anxiety example from book.

Research: Relationships

•      Types of relationships

–   Causal relationship: IV causes the DV

•    e.g. different times holding a hamstring stretch causes differences in flexibility

•    key is manipulation of IV

–   Predictive relationship: PV predicts the CV

•    e.g. different times holding a hamstring stretch merely predicts differences in flexibility

•    key is no manipulation

Types of Research

•      Types of research

–   True experiment

•    manipulation of IV

•    randomization of subjects to groups

•    causal relationship between IV and DV

–   Observational research

•    no manipulation

•    minimal control of EV

•    predictive relationship between PV and CV

•      Data: quantitative and qualitative

Psychology 2113

Pictorial Descriptions

Stem and Leaf Display

Pictorial Descriptions

•      Frequency distribution

•      Stem and leaf display

•      Bar graph and histogram

•      Frequency polygon

•      Pie chart

•      Scatterplot

•      Skewness and kurtosis

Stem and Leaf Display

•      The first digit(s) of a score form the stem, the last digit(s) form the leaf

–   An age of 38 could be shown as stem 3, leaf 8 (3 | 8)

Stem and Leaf Display, cont.

•      We want a total of 10-20 stems.

•      Number of stems per digit depends on total number of stems: can do 1, 2, or 5 stems per digit.
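A minimal Python sketch of a stem and leaf display, assuming non-negative whole-number scores and one stem per tens digit. The ages are made up for illustration, and the function name `stem_and_leaf` is hypothetical. With so few scores the display has only four stems; real data would be split to reach the 10-20 stems recommended above.

```python
from collections import defaultdict

def stem_and_leaf(scores):
    """Return display lines: tens digit(s) as the stem, ones digit as the leaf."""
    leaves = defaultdict(list)
    for x in sorted(scores):
        stem, leaf = divmod(x, 10)
        leaves[stem].append(str(leaf))
    lines = []
    # Walk every stem from lowest to highest so empty stems still show up.
    for stem in range(min(leaves), max(leaves) + 1):
        lines.append(f"{stem} | {''.join(leaves.get(stem, []))}")
    return lines

ages = [38, 21, 45, 33, 38, 52, 29, 41, 36, 27]  # made-up ages
display = stem_and_leaf(ages)
print("\n".join(display))
```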

 

Description With Statistics

•      Aspects or characteristics of data that we can describe are

–   Middle

–   Spread

–   Skewness

–   Kurtosis

•      Statistics that measure/describe

–   middle are mean, median, mode

–   spread are range, variance, standard deviation, midrange

 

 

Description With Statistics

•      Middle = central tendency, location, center

–   Measures of middle are mean, median, mode (keywords)

•      Spread = variability, dispersion

–   Measures of spread are range, variance, standard deviation, midrange (keywords)

•      Skewness = departure from symmetry

–   Positive skewness = tail (extreme scores) in positive direction

–   Negative skewness = tail (extreme scores) in negative direction

•      Kurtosis = peakedness relative to normal curve

Skewness

 

 

 

 

Kurtosis

 

 

 

 

 

Psychology 2113

Describing Middle

Describing Spread

Describing the Middle of Data

•      Another name for “middle” is

    _________.

•      “Middle” is the aspect of data

    we want to describe.

•      We describe/measure the middle of data in a sample with the statistics:

–   Mean.

–   Median.

–   Mode.

•      We describe/measure the middle of data in a population with the parameter μ (‘mu’); we usually don’t know μ, so we estimate it with X-bar.

Sample Mean

•      The sample mean is the sum of the scores divided by the number of scores, and is symbolized by X-bar: X̄ = ΣX/N

•      For example, for X1=4, X2=1, X3=7: N=3, ΣX=12, and X̄ = ΣX/N = 12/3 = 4

•      Characteristics:

–   X-bar is the balance point

 

Sample Median

•      The median is the middle of the ordered scores, and is symbolized as X50.

•      Median position (as distinct from the median itself) is (N+1)/2 and is used to find the median.

•      Example: X1=4, X2=1, X3=7, then N=3.

–   Median position is (3+1)/2 = 4/2 = 2.

–   Place the scores in order, 1 4 7.

–   X50 is the score in position/rank 2.

–   So X50 = 4.

Sample Median, cont.

•      Another example: X1=4, X2=1, X3=7, X4=6, and N=4.

–   Median position is (N+1)/2 = (4+1)/2 = 5/2 = 2.5.

–   Place the scores in order, 1 4 6 7.

–   X50 is the score in position/rank 2.5.

–   So X50 = (4+6)/2 = 10/2 = 5.

•      Characteristics:

–   Depends on only one or two middle values.

–   For quantitative data when distribution is skewed.

–   Minimizes Σ|X-X50|.

Sample Mode

•      The mode is the most frequent score.

•      Examples:

–   1 1 4 7, the mode is 1.

–   1 1 4 7 7, there are two modes, 1 and 7.

–   1 4 7, there is no mode.

•      Characteristics:

–   Has problems: more than one, or none; maybe not in the middle; little info re data.

–   Best for qualitative data, e.g. gender.

–   If it exists, it is always one of the scores.

–   Is rarely if ever used.
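The three measures of middle can be sketched in Python using the slide examples as data. The function names are illustrative. The median follows the median position rule, (N+1)/2, and `modes` returns an empty list when every score occurs once, matching the "no mode" convention above.

```python
from collections import Counter

def mean(scores):
    return sum(scores) / len(scores)

def median(scores):
    ordered = sorted(scores)
    position = (len(ordered) + 1) / 2           # median position (1-based)
    lower = ordered[int(position) - 1]          # score at the whole-number part
    upper = ordered[int(position + 0.5) - 1]    # next score if position ends in .5
    return (lower + upper) / 2

def modes(scores):
    counts = Counter(scores)
    top = max(counts.values())
    if top == 1:
        return []                               # every score occurs once: no mode
    return sorted(x for x, c in counts.items() if c == top)

print(mean([4, 1, 7]), median([4, 1, 7]), median([4, 1, 7, 6]))
print(modes([1, 1, 4, 7]), modes([1, 1, 4, 7, 7]), modes([1, 4, 7]))
```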

Describing the Spread of Data

•      Another name for “spread” is _________.

•      “Spread” is the aspect of data we want to describe.

•      Any statistic that describes/measures spread should have these characteristics: it should

–   Equal zero when the spread is zero.

–   Increase as spread increases.

–   Measure just spread, not middle.

Describing the Spread of Data, cont.

•      We describe/measure the spread of data in a sample with the statistics:

–   Range = high score-low score.

–   Midrange, MR.

–   Sample variance, s*².

–   Sample standard deviation, s*.

–   Unbiased variance estimate, s².

–   s.

•      We describe/measure the spread of data in a population with the parameter σ (‘sigma’) or σ²; we usually don’t know σ or σ², so we estimate them with one of the statistics.

Midrange (MR)

•      Formula is MR=UH-LH.

–   UH=upper hinge

–   LH=lower hinge

–   Hinges cut off 25% of the data in each tail

•      Hinge position is ([median position]+1)/2.

–   [median position] is the whole number part of the median position (remember, median pos.=(N+1)/2)

•      Use hinge position to count in from the tails to find the hinges.

Midrange (MR), cont.

•      Example: 4 1 5 3 3 6 1 2 6 4 5 3 4 1, N=14

–   Arrange data in order: 1 1 1 2 3 3 3 4 4 4 5 5 6 6

–   Compute median position =  (N+1)/2=(14+1)/2=15/2=7.5

–   Compute hinge position =

   ([median position]+1)/2=(7+1)/2=8/2=4

–   Count in to the 4th score from each tail to find UH and LH

•    UH=5 and LH=2

–   MR=UH-LH=5-2=3
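The hinge-counting steps above can be sketched in Python. This is only an illustration of the rule on the slide (hinge position from the whole-number part of the median position); the function name `hinges` is a hypothetical choice.

```python
def hinges(scores):
    """Return (lower hinge, upper hinge) by counting in from each tail."""
    ordered = sorted(scores)
    n = len(ordered)
    median_position = (n + 1) / 2
    hinge_position = (int(median_position) + 1) / 2
    lo_idx = int(hinge_position) - 1            # 0-based index of the whole part
    hi_idx = int(hinge_position + 0.5) - 1      # next index if position ends in .5
    lower_hinge = (ordered[lo_idx] + ordered[hi_idx]) / 2
    upper_hinge = (ordered[n - 1 - lo_idx] + ordered[n - 1 - hi_idx]) / 2
    return lower_hinge, upper_hinge

data = [4, 1, 5, 3, 3, 6, 1, 2, 6, 4, 5, 3, 4, 1]
lh, uh = hinges(data)
midrange = uh - lh
print(lh, uh, midrange)
```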

Sample Variance, s*²

•      Definitional formula: s*² = Σ(X-X̄)²/N, the average squared deviation from X-bar.

•      Example: 1 2 3

–   N=3, X̄ = ΣX/N = 6/3 = 2

–   Σ(X-X̄)² = (1-2)²+(2-2)²+(3-2)² = 1+0+1 = 2

–   s*² = 2/3 = .6667

•      Computational formula: s*² = [NΣX²-(ΣX)²]/N²

–   ΣX² = 1²+2²+3² = 1+4+9 = 14, ΣX=6, N=3

–   s*² = [3(14)-(6)²]/3² = [42-36]/9 = 6/9 = 2/3 = .6667

•      s*² is in squared units of measure.

Sample Standard Deviation, s*

•      Formula: s* = √s*²

•      Example: 1 2 3

–   N=3, X̄ = ΣX/N = 6/3 = 2

–   Σ(X-X̄)² = (1-2)²+(2-2)²+(3-2)² = 1+0+1 = 2

–   s*² = 2/3 = .6667

–   s* = √.6667 = .8165

•      s* is in original units of measure.

Unbiased Variance Estimate, s²

•      Definitional formula: s² = Σ(X-X̄)²/(N-1)

•      Example: 1 2 3

–   N=3, X̄ = ΣX/N = 6/3 = 2

–   Σ(X-X̄)² = (1-2)²+(2-2)²+(3-2)² = 1+0+1 = 2

–   s² = 2/2 = 1.0

•      Computational formula:

   s² = [NΣX²-(ΣX)²]/[N(N-1)]

–   ΣX² = 1²+2²+3² = 1+4+9 = 14, ΣX=6, N=3

–   s² = [3(14)-(6)²]/[3(2)] = [42-36]/6 = 6/6 = 1.0

•      s² is in squared units of measure

s

•      Formula: s = √s²

•      Example: 1 2 3

–   N=3, X̄ = ΣX/N = 6/3 = 2

–   Σ(X-X̄)² = (1-2)²+(2-2)²+(3-2)² = 1+0+1 = 2

–   s² = 1.0

–   s = √1 = 1.0

•      s is in original units of measure.

•      s has no official name, it is the square root of the unbiased variance estimate, s².

•      SAS
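For readers without SAS, the four spread statistics can be sketched in Python from the computational formulas. The function and dictionary key names are illustrative assumptions; the data are the 1 2 3 example used above.

```python
from math import sqrt

def spread_statistics(scores):
    n = len(scores)
    sum_x = sum(scores)
    sum_x2 = sum(x * x for x in scores)
    numerator = n * sum_x2 - sum_x ** 2              # N*SumX^2 - (SumX)^2
    sample_variance = numerator / n ** 2             # s*^2, divides by N
    unbiased_variance = numerator / (n * (n - 1))    # s^2, divides by N-1
    return {
        "s*2": sample_variance,
        "s*": sqrt(sample_variance),
        "s2": unbiased_variance,
        "s": sqrt(unbiased_variance),
    }

stats = spread_statistics([1, 2, 3])
print({name: round(value, 4) for name, value in stats.items()})
```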

 

Box-plots

•      A pictorial description that uses a box to show the middle of the data and lines called whiskers to show the tails of a distribution.

•      Box

–   Upper end is at the UH, lower end is at the LH

–     Line across the middle is X50

•      Example: 4 1 5 3 3 6 1 2 6 4 5 3 4 1, N=14

–   Arrange data in order: 1 1 1 2 3 3 3 4 4 4 5 5 6 6

–   Compute median position = (N+1)/2=(14+1)/2=15/2=7.5

–   Median, X50, is (3+4)/2=7/2=3.5

Box-plots, cont.

–   Compute hinge position =

   ([median position]+1)/2=(7+1)/2=8/2=4

–   Count in to the 4th score from each tail to find UH and LH, 1 1 1 2 3 3 3 4 4 4 5 5 6 6

•    UH=5 and LH=2

–   Draw the box.

Box-plots, cont.

–   Whiskers are lines drawn from the ends of the box (the hinges) to adjacent values, UAV & LAV.

–   Adjacent values are the first real data values inside the inner fences.

–   Inner fences, upper and lower

•    Upper, UIF=UH+1.5MR

•    Lower, LIF= LH-1.5MR

–   Example: MR=UH-LH=5-2=3

•    UIF=UH+1.5MR=5+1.5(3)=9.5

•    LIF=LH-1.5MR=2-1.5(3)=-2.5

•    1 1 1 2 3 3 3 4 4 4 5 5 6 6

•    UAV=6

•    LAV=1

–   SAS

Box-plots, cont.

•      Another example: 1 1 1 2 3 3 3 4 4 4 4 6 9 10 11

–   N=15, median position=(N+1)/2=(15+1)/2=16/2=8.

–   Hinge position=([median position]+1)/2=(8+1)/2=4.5.

–   Hinges, X50, and midrange:

•    UH=(4+6)/2=10/2=5

•    LH=(2+3)/2=5/2=2.5

•    X50=4

•    MR=UH-LH=5-2.5=2.5

–   Inner fences:

•    UIF=UH+1.5MR=5+1.5(2.5)=5+3.75=8.75

•    LIF= LH-1.5MR=2.5-1.5(2.5)=2.5-3.75=-1.25

•    1 1 1 2 3 3 3 4 4 4 4 6 9 10 11

•    UAV=6

•    LAV=1

–   Draw box-plot

–   Outliers: outside whiskers, marked with *
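The numbers needed to draw a box-plot can be gathered in one Python sketch. It follows the hinge, inner fence, and adjacent value rules from these slides; the function name `boxplot_summary` and the dictionary keys are illustrative choices. The data are the second example above.

```python
def boxplot_summary(scores):
    ordered = sorted(scores)
    n = len(ordered)

    def at(position):
        """Score at a 1-based position; positions ending in .5 average two scores."""
        lo = ordered[int(position) - 1]
        hi = ordered[int(position + 0.5) - 1]
        return (lo + hi) / 2

    median_position = (n + 1) / 2
    hinge_position = (int(median_position) + 1) / 2
    lh = at(hinge_position)                 # count in from the low tail
    uh = at(n + 1 - hinge_position)         # count in from the high tail
    mr = uh - lh
    uif, lif = uh + 1.5 * mr, lh - 1.5 * mr
    return {
        "X50": at(median_position), "LH": lh, "UH": uh, "MR": mr,
        "LIF": lif, "UIF": uif,
        "LAV": min(x for x in ordered if x >= lif),   # first real value inside the fence
        "UAV": max(x for x in ordered if x <= uif),
        "outliers": [x for x in ordered if x < lif or x > uif],
    }

summary = boxplot_summary([1, 1, 1, 2, 3, 3, 3, 4, 4, 4, 4, 6, 9, 10, 11])
print(summary)
```

Here the scores 9, 10, and 11 fall beyond the upper inner fence, so they would be the ones marked with *.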

 

 

 

Psychology 2113

z Scores

Normal Distributions

Standard Normal Distribution

z Scores

•      The aspect of the data we want to describe/measure is relative position.

•      z  scores are statistics that describe the relative position of something in its distribution.

•      Verbal formula: z is something minus its mean divided by its standard deviation.

•      Formulas:

–   For X in sample, z=(X-X̄)/s*

–   For X in population, z=(X-μ)/σ

z Scores, cont.

•      Characteristics:

–   The mean of a distribution of z scores is zero.

–   The variance of a distribution of z scores is one.

–   The shape of a distribution of z scores is reflective, the shape is the same as the shape of the distribution of the Xs.

•      Example: Compute z

–   Sample, if X=34, X̄=40, and s*²=9, then

    z=(X-X̄)/s*=(34-40)/3=-6/3=-2.

–   Population, if X=10, μ=8, and σ²=7, then

    z=(X-μ)/σ=(10-8)/2.6457…=2/2.6457…=.7559
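A short Python sketch of the z score formula, checked against the two examples above. It also converts a small set of scores (4, 1, 7, reused from the mean example) to z scores with s* to illustrate that their mean is zero and their variance is one. The function name `z_score` is an illustrative choice.

```python
from math import sqrt

def z_score(x, mean, variance):
    """Something minus its mean, divided by its standard deviation."""
    return (x - mean) / sqrt(variance)

z_sample = z_score(34, 40, 9)        # sample example
z_population = z_score(10, 8, 7)     # population example

scores = [4, 1, 7]
n = len(scores)
x_bar = sum(scores) / n
s_star_sq = sum((x - x_bar) ** 2 for x in scores) / n    # s*^2, divides by N
zs = [z_score(x, x_bar, s_star_sq) for x in scores]
z_mean = sum(zs) / n
z_variance = sum((z - z_mean) ** 2 for z in zs) / n

print(z_sample, round(z_population, 4), round(z_mean, 4), round(z_variance, 4))
```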

Normal Distributions

•      Family of theoretical distributions; there are many different normal distributions.

•      Characteristics:

–   Symmetric, continuous, unimodal.

–   Bell-shaped.

–   Scores range from -∞ to +∞.

–   Mean, median, and mode are all the same value.

–   Each distribution has two parameters, μ and σ².

 

Normal Distributions, cont.

•      Examples:

–   IQ is normally distributed with μ=100 and σ²=225.

–   Height of American males is normally distributed with μ=69 and σ²=9.

•      The standard normal (or unit normal) distribution has μ=0 and σ²=1.

•      We can transform any normal distribution to the standard normal distribution by computing z scores: the resulting distribution of z scores will have a shape that is normal.

 

 

Standard Normal Distribution

•      We use this distribution to get probabilities associated with a z score (probability, proportion, and area under the curve are synonymous).

•      Example:

–   If Joe is 72 inches tall, what is the probability that any randomly selected man is his height or taller?

–   For height, μ=69 and σ²=9, so

    z =(X-μ)/σ=(72-69)/3=3/3=1

–   From Table A.2, p(z>1)=.1587
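The table lookup for Joe's height can be checked with Python's standard library, which is used here as a stand-in for Table A.2. The variable names are illustrative.

```python
from statistics import NormalDist

heights = NormalDist(mu=69, sigma=3)      # variance 9, so standard deviation 3
z = (72 - 69) / 3
p_taller = 1 - heights.cdf(72)            # area to the right of 72 inches

print(z, round(p_taller, 4))
```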

 

 

Standard Normal Distribution, cont.

•      Using Table A.2, there are two key facts:

–   Total area equals one

–   Symmetry

Standard Normal Distribution, cont.

•      More examples: if z is normal and p(z>1.645)=.05

•      Compute p(z<1.645)

–   Want all the area to the left of 1.645, a large area.

–   The area is in the middle and left tail.

–   Use total area=1 to get

    p(z<1.645) = 1-.05 = .95

 

 

Standard Normal Distribution, cont.

•      More examples: if z is normal and p(z>1.645)=.05

•      Compute p(z<-1.645)

–   Want all the area to the left of -1.645, a small area.

–   The area is in the left tail.

–   Use symmetry to get

    p(z<-1.645) = .05

 

 

Standard Normal Distribution, cont.

•      More examples: if z is normal and p(z>1.645)=.05

•      Compute p(z>-1.645)

–   Want all the area to the right of -1.645, a large area.

–   The area is in the middle and right tail.

–   Use symmetry and total area=1 to get

    p(z>-1.645) = .95
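The three areas worked out above from p(z>1.645)=.05 can be confirmed numerically. This sketch uses the standard library's normal distribution instead of Table A.2; the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()                       # mean 0, variance 1

p_above = 1 - std_normal.cdf(1.645)             # p(z > 1.645), about .05
p_below = std_normal.cdf(1.645)                 # total area = 1, so about .95
p_left_tail = std_normal.cdf(-1.645)            # symmetry, so about .05
p_right_of_neg = 1 - std_normal.cdf(-1.645)     # symmetry and total area, about .95

print(round(p_above, 2), round(p_below, 2),
      round(p_left_tail, 2), round(p_right_of_neg, 2))
```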

 

 

Psychology 2113

Correlation

Regression

Correlation and Regression

•      Both examine linear (straight line) relationships.

•      Both work with a pair of scores, one on each of two variables, X and Y.

•      Correlation:

–   Defined as the degree of linear relationship between X and Y.

–   Is measured/described by the statistic r.

•      Regression:

–   Is concerned with the prediction of Y from X.

–   Forms a prediction equation to predict Y from X.

 

Correlation

•      The aspect of the data that we want to describe/measure is the degree of linear relationship between X and Y.

•      The statistic r describes/measures the degree of linear relationship between X and Y.

•      r=Σ(zX·zY)/N, the average product of z scores for X and Y

–   Works with two variables, X and Y

–   -1 ≤ r ≤ 1, r measures positive or negative relationships

–   Measures only the degree of linear relationship

–   r²=proportion of variability in Y that is explained by X

–   r is undefined if X or Y has zero spread

–   r is dimensionless
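The definitional formula for r, the average product of z scores, can be sketched in Python. The paired scores are made up for illustration and the function names are hypothetical. The z scores use s* (dividing by N), and the code raises an error when X or Y has zero spread, since r is undefined in that case.

```python
from math import sqrt

def z_scores(scores):
    n = len(scores)
    mean = sum(scores) / n
    sd = sqrt(sum((x - mean) ** 2 for x in scores) / n)   # s*, divides by N
    if sd == 0:
        raise ValueError("z is undefined when the standard deviation is zero")
    return [(x - mean) / sd for x in scores]

def correlation(xs, ys):
    """r = average product of paired z scores."""
    zx, zy = z_scores(xs), z_scores(ys)
    return sum(a * b for a, b in zip(zx, zy)) / len(xs)

x = [1, 2, 3, 4, 5]          # made-up scores
y = [2, 4, 5, 4, 5]
r = correlation(x, y)
print(round(r, 4), round(r ** 2, 4))
```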

 

 

Correlation: -1 ≤ r ≤ 1.

•      The sign of r shows the type of linear relationship between X and Y. We can use the definitional

Correlation: Linear

•      If there is a curvilinear relationship between X and Y, then r will not detect it. The value of r will be zero if there is no linear relationship between X and Y.

Correlation: r²

•      r²=proportion of variability in Y that is explained by X.

–  If r=.5, r²=.25, so the proportion of variability in Y that is explained by X is .25 (as a percentage, this shows 25% explained by X, 75% unexplained).

•      Scatterplots:

r=.5, r²=.25       r=.7, r²=.49       r=.9, r²=.81

 

 

 

 

Venn Diagrams: r² is represented by the proportion of overlap between the Y and X circles.

 

 

 

 

Correlation: Undefined

•      If there is no spread in X or Y, then r is undefined. Note that any z is undefined if the standard deviation is zero, and r=Σ(zX·zY)/N.

 

Correlation, cont.

•      r

–   Computational formula (p. 176)

–   Example, p. 177-178

–   (Excel)

–   (SAS)

•      Population correlation coefficient, ρ (rho).

•      Impact on r:

–   Restriction of range.

–   Combined data.

–   Extreme scores (outliers).

•      Correlation does not imply causation.

Regression

•      Regression is concerned with forming a prediction equation to predict Y from X.

•      Uses the formula for a straight line, Y’=bX+a.

–   Y’ is the predicted Y score on the criterion variable.

–   b is the slope, b=ΔY/ΔX=rise/run.

–   X is a score on the predictor variable.

–   a is the Y-intercept, where the line crosses the Y axis, the value of Y’ when X=0, a=Ȳ-bX̄.

•      Example: if b=2, a=8, and X=23,

–   then Y’=2(23)+8=54.

 

 

Regression, cont.

•      Linear only.

•      Generalize only for X values in your sample.

•      Actual observed Y is different from Y’ by an amount called error, e, that is, Y=Y’+e.

•      Error in regression is e=Y-Y’.

•      Many different potential regression lines.

 

 

Regression: Best Line

•      The statistics b and a are computed so as to minimize the sum of squared errors,

–   Σe²=Σ(Y-Y’)² is a minimum

–   This is called the Least Squares Criterion.

Regression: b and a

 

•      Computing b and a:

–   b=[NΣXY-(ΣX)(ΣY)]/[NΣX²-(ΣX)²]

–   a=Ȳ-bX̄

•      Example, p.209 (SAS)

–   For males, Anxiety’=.04Age+6.65

–   For females, Anxiety’=-.16Age+15.71
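The computing formulas for the slope and intercept can be sketched in Python. The paired scores below are made up for illustration (they are not the anxiety data from the book), and the function names are hypothetical.

```python
def regression_line(xs, ys):
    """Least squares slope b and intercept a for predicting Y from X."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * (sum_x / n)         # a = Y-bar minus b times X-bar
    return b, a

def predict(x, b, a):
    return b * x + a                        # Y' = bX + a

x = [1, 2, 3, 4, 5]                         # made-up predictor scores
y = [2, 4, 5, 4, 5]                         # made-up criterion scores
b, a = regression_line(x, y)
print(round(b, 2), round(a, 2), predict(23, 2, 8))
```

The last value reproduces the earlier slide example with b=2, a=8, and X=23.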

 

Regression: sy.x

•      Standard error of estimate is a statistic that measures/describes spread of errors or Y scores in regression.

•      sy·x is the standard deviation of errors in regression

–   sy·x = √[Σe²/(N-2)] = √[Σ(Y-Y’)²/(N-2)].

–   sy·x = √[(N-1)/(N-2)]·sy·√(1-r²)

•    As r² increases, sy·x decreases. For example, if N=100 and sy=4

•    r       sy·x

   .2   3.94

   .4   3.68

   .6   3.22

   .8   2.41

   .9   1.75
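The tabled sy·x values can be reproduced with a short Python sketch of the second formula. The listed values come out when the first column (.2 through .9) is taken as r, so that r² runs from .04 to .81. The function name is an illustrative choice.

```python
from math import sqrt

def standard_error_of_estimate(r, s_y, n):
    """sy.x = sqrt((N-1)/(N-2)) * sy * sqrt(1 - r^2)."""
    return sqrt((n - 1) / (n - 2)) * s_y * sqrt(1 - r ** 2)

table = {r: round(standard_error_of_estimate(r, 4, 100), 2)
         for r in (0.2, 0.4, 0.6, 0.8, 0.9)}
for r, value in table.items():
    print(r, round(r ** 2, 2), value)
```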

 

 

 

Regression: Partitioning

•      Partition total spread

–   Total = Explained + Not Explained

–   This is true for proportion of spread and amount of spread.

•    Proportion: 1 = r² + (1-r²)

•    Amount: s²y = s²y·r² + s²y(1-r²)

–   Example: r=.5, s²y=200

                 Total    Expl.    Not Expl.

   Proportion    ____     ____     ____

   Amount        ____     ____     ____
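One way to check your answers for the partitioning example is a small Python sketch. The function name `partition` and the dictionary keys are illustrative assumptions; each tuple is ordered as total, explained, not explained.

```python
def partition(r, total_variance):
    r2 = r ** 2
    return {
        "proportion": (1.0, r2, 1 - r2),
        "amount": (total_variance, total_variance * r2, total_variance * (1 - r2)),
    }

result = partition(0.5, 200)
print(result["proportion"])
print(result["amount"])
```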

Psychology 2113

Probability

Probability

•      Defined as relative frequency of occurrence.

•      Basic definitions:

–   Sample space: all possible outcomes of an experiment.

–   Elementary event: a single member of the sample space.

–   Event: any collection of elementary events.

–   Probability:

•    p(elementary event)=1/(total number)

•    p(event)=(number in the event)/(total number)

–   Conditional probability:

•    p(A|B)=(number in [A and B])/(number in B)

•    The probability of A in the redefined (reduced) sample space of B.

Probability: The Juror Example

•      The sample space is all 48 jurors.

•      An elementary event is any one of the 48 jurors.

•      An event is any subgroup of the 48: e.g. the 31 who gave an award.

•      Probabilities:

–   p(elementary event)=1/48

–   p(award)=31/48=.65

•      Conditional probability: p(Award|Auth.)=18/20

–   The 20 Auth. are the reduced sample space.

–   Always do the denominator first.

–   Out of the 20 Auth., the 18 who gave an award go in the numerator.
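The juror probabilities can be computed from the counts implied by these slides: 48 jurors, 20 Auth. (18 gave an award) and 28 Egal. (13 gave an award). The dictionary layout and function names in this Python sketch are illustrative choices.

```python
from fractions import Fraction

counts = {
    ("Award", "Auth."): 18, ("No Award", "Auth."): 2,
    ("Award", "Egal."): 13, ("No Award", "Egal."): 15,
}
total = sum(counts.values())                    # 48 jurors: the sample space

def p(award=None, group=None):
    """p(event) = (number in the event) / (total number)."""
    n = sum(c for (a, g), c in counts.items()
            if (award is None or a == award) and (group is None or g == group))
    return Fraction(n, total)

def p_given(award, group):
    """p(award | group): the group is the reduced sample space (denominator first)."""
    in_group = sum(c for (a, g), c in counts.items() if g == group)
    return Fraction(counts[(award, group)], in_group)

print(p("Award"), float(p("Award")), p_given("Award", "Auth."))
```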

Probability, cont.
(The Big Three)

–   Independence (1): events A and B are independent if

•    p(A|B)=p(A)

•    The A probability is not changed by reducing the sample space to B.

–   Multiplication (And) Rule (2):

•    p(A and B)=p(A)p(B|A)=p(A|B)p(B)

–   Mutually exclusive:

•    Events A and B do not have any elementary events in common.

•    Events A and B cannot occur simultaneously.

•    p(A and B)=0

–   Addition (Or) Rule (3):

•    p(A or B)=p(A)+p(B)-p(A and B)

Probability: The Juror Example: Independence

•      The first of the Big 3: Independence.

–  Is Award independent of Auth.? It is if p(Award)=p(Award|Auth.).

–  p(Award)=31/48=.65

–  p(Award|Auth.)=18/20=.90

–  No, Award is not independent of Auth. because .65≠.90.

•      Another example.

–  Is No Award independent of Egal.? It is if

   p(No Award)=p(No Award|Egal.).

–  p(No Award)=17/48=.35

–  p(No Award|Egal.)=15/28=.54

–  No, No Award is not independent of Egal. because .35≠.54.
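The independence check amounts to comparing a marginal probability with a conditional probability. A minimal Python sketch with the juror fractions from above (the function name `is_independent` is a hypothetical choice):

```python
from fractions import Fraction

def is_independent(p_a, p_a_given_b):
    """A is independent of B when p(A) equals p(A|B)."""
    return p_a == p_a_given_b

award_vs_auth = is_independent(Fraction(31, 48), Fraction(18, 20))
no_award_vs_egal = is_independent(Fraction(17, 48), Fraction(15, 28))

print(award_vs_auth, no_award_vs_egal)
```

Both comparisons come out False, in line with the conclusions on the slide.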

Probability: More About Independence

•      Independence:

–   A is independent of B if p(A)=p(A|B) or if p(B)=p(B|A).

•      Examples:

–   If p(A)=.3 and p(A|B)=.4, A and B are not independent because .3≠.4.

–   Given: p(A)=.1, p(B)=.2, p(A|B)=.3, and p(B|A)=.4. Is A independent of B? Explain. No, A and B are not independent because .1≠.3 (or .2≠.4).

–   Given: p(A)=.1, p(B)=.2, p(A|B)=.1, and p(B|A)=.2. Is A independent of B? Explain. Yes, A and B are independent because .1=.1 (or .2=.2).

–   Given: p(A)=.47, p(B)=.34, and p(B|A)=.49. Which of these is the reason that A is not independent of B? .47≠.34, .47≠.49, .34≠.49.

Probability: More About Ind., cont.

•      People in a restaurant were asked before and after their meal if they thought they would like a dessert. Is Dessert independent of Before?

•    p(Dessert)=55/104=.53 and p(Dessert|Before)=34/52=.65

•    No, Dessert is not independent of Before because .53≠.65, or

•    p(Before)=52/104=.50 and p(Before|Dessert)=34/55=.62

•    No, Dessert is not independent of Before because .50≠.62.

–   Which two probabilities would you have to examine to determine if No Dessert is independent of After?

•    One way: see if p(No Dessert) equals p(No Dessert|After).

•    Or, see if p(After) equals p(After|No Dessert).

•    Now, what are the two probabilities for each of these?

Probability: The Juror Example: Multiplication Rule

•      The second of the Big 3: Multiplication (And) Rule:  p(A and B)=p(A)p(B|A)=p(A|B)p(B).

–   This is the product of two probabilities: one about A, one about B, one a marginal probability, one a conditional probability.

–   Compute p(Award and Auth.). We know the answer to this is 18/48=.375.

•    p(Award and Auth.)=p(Award)p(Auth.|Award)=(31/48)(18/31)=18/48

•    p(Award and Auth.)=p(Award|Auth.)p(Auth.)=(18/20)(20/48)=18/48

–   Compute p(Award and Egal.). We know the answer to this is 13/48=.27.

•    p(Award and Egal.)=p(Award)p(Egal.|Award)=(31/48)(13/31)=13/48

•    p(Award and Egal.)=p(Award|Egal.)p(Egal.)=(13/28)(28/48)=13/48

Probability: More About Multiplication Rule

•      Multiplication (And) Rule: p(A and B)=p(A)p(B|A)=p(A|B)p(B).

–   This is the product of two probabilities: one about A, one about B, one a marginal probability, one a conditional probability.

–   Other examples:

•    p(A)=.2, p(B)=.6, p(A|B)=.3, p(A and B)=p(A|B)p(B)=(.3)(.6)=.18

•    p(A)=.5, p(B)=.1, p(B|A)=.2, p(A and B)=p(A)p(B|A)=(.5)(.2)=.10

–   If all four probabilities are given, you can do the problem two ways and you should get the same answer: p(A)=.2, p(B)=.4, p(A|B)=.3, p(B|A)=.6,

•    p(A and B)=p(A)p(B|A)=(.2)(.6)=.12

•    p(A and B)=p(A|B)p(B)=(.3)(.4)=.12
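A small Python sketch showing that both forms of the Multiplication Rule give the same joint probability. Exact fractions are used to avoid rounding; the function name `p_and` is an illustrative choice.

```python
from fractions import Fraction

def p_and(p_marginal, p_conditional):
    """p(A and B) = one marginal probability times one conditional probability."""
    return p_marginal * p_conditional

# Last example above: p(A)=.2, p(B)=.4, p(A|B)=.3, p(B|A)=.6
way_1 = p_and(Fraction(2, 10), Fraction(6, 10))      # p(A)p(B|A)
way_2 = p_and(Fraction(3, 10), Fraction(4, 10))      # p(A|B)p(B)

# Juror example: p(Award and Egal.) both ways
egal_1 = p_and(Fraction(31, 48), Fraction(13, 31))   # p(Award)p(Egal.|Award)
egal_2 = p_and(Fraction(13, 28), Fraction(28, 48))   # p(Award|Egal.)p(Egal.)

print(float(way_1), float(way_2), egal_1, egal_2)
```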

Probability: The Juror Example: Mutually Exclusive

•      Not one of the Big 3 and not the same as independence.

•      Are No Award and Auth. mutually exclusive? No, because p(No Award and Auth.)=2/48≠0. If none of the 48 jurors had responded in this cell, then No Award and Auth. would have been mutually exclusive.

•      Are Award and No Award mutually exclusive? Yes, because all 48 jurors have been classified as one of the two, and p(Award and No Award)=0.

Probability: The Juror Example: Addition Rule

•      The third of the Big 3: Addition (Or) Rule: p(A or B)=p(A) + p(B) - p(A and B). This is the most complicated of the Big 3 because it uses the Multiplication Rule to get its answer.

–   Compute p(Award or Auth.).

•    p(Award or Auth.)=p(Award)+p(Auth.)-p(Award and Auth.) = (31/48)+(20/48)-(18/48)=33/48=.69

–   Compute p(Award or Egal.).

•    p(Award or Egal.)=p(Award)+p(Egal.)-p(Award and Egal.) = (31/48)+(28/48)-(13/48)=46/48=.96

Probability: More About the Addition Rule

•      Addition (Or) Rule: p(A or B)=p(A)+p(B)-p(A and B).

–  Note that you add together two marginal probabilities and subtract off a joint (And) probability.

•      Other examples:

–   p(A)=.2, p(B)=.6, p(A|B)=.3.

•    First, compute p(A and B): p(A and B) = p(A|B)p(B)=(.3)(.6)=.18.

•    Now, compute p(A or B): p(A or B) = p(A)+p(B)-p(A and B) =     .2+.6-.18=.62.

–   p(A)=.5, p(B)=.1, p(B|A)=.2.

•    First, compute p(A and B): p(A and B) = p(A)p(B|A)=(.5)(.2)=.10.

•    Now, compute p(A or B): p(A or B) = p(A)+p(B)-p(A and B) =     .5+.1-.10=.50.
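The Addition Rule examples can be checked with a short Python sketch. The function name `p_or` is an illustrative choice, and exact fractions are used so the results match the slide arithmetic without rounding error.

```python
from fractions import Fraction

def p_or(p_a, p_b, p_a_and_b):
    """p(A or B) = p(A) + p(B) - p(A and B)."""
    return p_a + p_b - p_a_and_b

# First example: p(A)=.2, p(B)=.6, p(A|B)=.3
joint_1 = Fraction(3, 10) * Fraction(6, 10)          # p(A|B)p(B) = .18
either_1 = p_or(Fraction(2, 10), Fraction(6, 10), joint_1)

# Second example: p(A)=.5, p(B)=.1, p(B|A)=.2
joint_2 = Fraction(5, 10) * Fraction(2, 10)          # p(A)p(B|A) = .10
either_2 = p_or(Fraction(5, 10), Fraction(1, 10), joint_2)

# Juror example: p(Award or Auth.)
award_or_auth = p_or(Fraction(31, 48), Fraction(20, 48), Fraction(18, 48))

print(float(either_1), float(either_2), round(float(award_or_auth), 2))
```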

Probability: Review

•      Defined as relative frequency of occurrence.

•      Basic definitions:

–   Sample space: all possible outcomes of an experiment.

–   Elementary event: a single member of the sample space.

–   Event: any collection of elementary events.

–   Probability:

•    p(elementary event)=1/(total number)

•    p(event)=(number in the event)/(total number)

–   Conditional probability:

•    p(A|B)=(number in A and B)/(number in B)

•    The probability of A in the redefined (reduced) sample space of B.

•    The only probability that does not have total number in the denominator.

 

Probability, cont.
(The Big Three)

–   Independence (1): events A and B are independent if

•    p(A|B)=p(A)

•    The A probability is not changed by reducing the sample space to B.

–   Multiplication (And) Rule (2):

•    p(A and B)=p(A)p(B|A)=p(A|B)p(B)

–   Mutually exclusive:

•    Events A and B do not have any elementary events in common.

•    Events A and B cannot occur simultaneously.

•    p(A and B)=0

–   Addition (Or) Rule (3):

•    p(A or B)=p(A)+p(B)-p(A and B)

 

 

Psychology 2113

Sampling Distributions

        Introduction

          Types of Distributions

          Sampling Distribution of X-bar

          Other Sampling Distributions

          Estimation

Sampling Distributions: Introduction

•      Pivotal subject: distributions of statistics. “Foundation…linchpin…important…crucial.”

•      You need sampling distributions to make inferences:

–   To get probabilities of statistics for decision making about parameters.

–   To get information necessary to estimate parameters.

•      A distribution that could be formed by drawing all possible samples of a given size N from some population, computing the statistic for each sample, and arranging these statistics in a distribution.

•      Every statistic has a sampling distribution.

Types of Distributions

•      Population:

–   Distribution of all possible scores, X’s;

–   Usually large, unobtainable, and hypothetical;

–   Has parameters μ and σ², the values of which are usually unknown;

–   Unknown shape;

–   We want to infer to one of the parameters or to the distribution itself.

Types of Distributions, cont.

•      Sample:

–   Distribution of the N scores that we actually have, X’s;

–   Usually a manageable size, already obtained, and real;

–   Contained in what we will call our real world;

–   Has known statistics like X̄ and s²;

–   Known shape;

–   We want to infer from one of the statistics to a parameter.

Types of Distributions, cont.

•      Sampling distribution:

–   Distribution of a statistic over all possible samples, for example, X-bar’s;

–   Shows the variability of the statistic;

–   Theoretical;

–   Has parameters and usually a known shape;

–   The bridge for the inference from the sample to the population, from the statistic to the parameter;

–     Where we get the probabilities of the statistic so we can make decisions about the parameter.

Types of Distributions, cont.

Sampling Distribution of X-bar

•      The sampling distribution of X-bar

–   Has the purpose of any sampling distribution: to obtain probabilities…

–   Has the definition of any sampling distribution: the distribution of a statistic.

–   Has specific characteristics:

•    Mean: μ_X-bar = μ

•    Variance: σ²_X-bar = σ²/N

•    Shape is normal if either

–   Population is normal, or
–   N is large (Central Limit Theorem)
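The mean and variance of the sampling distribution of X-bar can be verified by brute force on a tiny case. This is a minimal sketch that assumes a hypothetical four-score population and sampling with replacement; it draws every possible sample of size N, computes X-bar for each, and summarizes the resulting distribution.

```python
from itertools import product
from statistics import mean, pvariance

population = [2, 4, 6, 8]   # hypothetical population of scores
N = 2                       # sample size

mu = mean(population)            # population mean
sigma2 = pvariance(population)   # population variance (divides by population size)

# Every possible sample of size N (with replacement), and its X-bar
sample_means = [mean(sample) for sample in product(population, repeat=N)]

mean_of_means = mean(sample_means)           # should equal mu
variance_of_means = pvariance(sample_means)  # should equal sigma2 / N
```

With only four scores and N=2 the shape is not yet normal; the sketch checks only the mean and variance.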

Sampling Distribution of X-bar (Review)

•      The sampling distribution of X-bar

–   Purpose is_______________________ _______________________________.

–   Definition is______________________ _______________________________.

–   Has specific characteristics:

•    Mean: μ_X-bar = ___

•    Variance: σ²_X-bar =_______

•    Shape is __________ if

–   Population is ____________
–   N is _______ (_______________ Theorem)

Sampling Distribution of X-bar : Use Of zX-bar

•      IQ of deaf children:

–   What is the mean of this population distribution? Is it 100, like for the population of all IQ scores (μ=100 and σ²=225)?

–   What is the probability of getting X-bar=88.07 or less if μ=100 (and σ²=225)?

–   To get this probability, we need a new statistic, z_X-bar=(X-bar-μ)/√(σ²/N).

•    z_X-bar=(88.07-100)/√(225/59) = -6.11

•    p(X-bar<88.07)=p(z<-6.11)<.00003
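The computation above can be reproduced in a few lines. This sketch uses Python's standard-library `NormalDist` for the standard normal probability; the function name `z_for_mean` is just an illustrative label.

```python
from math import sqrt
from statistics import NormalDist

def z_for_mean(x_bar, mu, sigma2, n):
    """z statistic for a sample mean when the population variance is known."""
    return (x_bar - mu) / sqrt(sigma2 / n)

# IQ of deaf children: X-bar = 88.07, N = 59, hypothesized mu = 100, sigma^2 = 225
z = z_for_mean(88.07, 100, 225, 59)

# Probability of a z this small or smaller in the standard normal distribution
p_lower = NormalDist().cdf(z)
```

Rounded to two decimals, `z` matches the -6.11 on the slide, and `p_lower` is far below .00003.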

Use Of zX-bar

•   IQ of deaf children:

–   So what does this look like and how does it help us decide about μ=100? Is the mean of the IQ of deaf children 100?

–   Because the probability of getting X-bar=88.07 or less if μ=100 is so small, less than .00003, we reject the idea that μ=100.

–   It is very unlikely to get the data that led to X-bar=88.07 from a population with μ=100.

Other Sampling Distributions

•      The sampling distribution of X-bar is the first sampling distribution we learn, but it is not the only one (all statistics have sampling distributions).

•      All sampling distributions have in common:

–   Purpose: to obtain probabilities…

–   Definition: the distribution of a statistic.

•      But each sampling distribution has specific characteristics like mean, variance, and shape.

Other Sampling Distributions, cont.

•      Sampling distributions of s*² and s²:

–   Both have shapes that are positively skewed.

–   The mean of s*² is [(N-1)/N]σ², always smaller than σ².

–   The mean of s² is σ².

     

Other Sampling Distributions, cont.

•      Sampling distributions of r, s*, and s:

–   r: the mean is ρ (rho) if ρ=0, and the shape is symmetric but not normal.

–   s* and s: neither has a mean equal to σ.

Estimation

•      You need sampling distributions to make inferences:

–   To get probabilities of statistics for decision making about parameters.

–   To get information necessary to estimate parameters.

•      Estimation is the calculation of an approximate value of a parameter.

–   Point estimation is the use of a statistic as a single value (point) to estimate a parameter.

–   Any statistic can be used to estimate any parameter.

–   Some statistics are good, and logical, estimates of particular parameters, such as X-bar as an estimate of m.

–   “Unbiased estimate” is one definition of “good estimate.”

Estimation: Unbiased

•      Unbiased estimate: A statistic is an unbiased estimate of a parameter if the mean of its sampling distribution is equal to the parameter: μ_statistic=desired parameter.

•      The following statistics are unbiased estimates of their corresponding parameters:

–   X-bar is an unbiased estimate of μ because μ_X-bar=μ.

–   s² is an unbiased estimate of σ² because μ_s²=σ².

–   r is an unbiased estimate of ρ because μ_r=ρ if ρ=0.

•      Note that the statistic and parameter can change, but the definition of unbiased is μ_statistic=parameter.

Estimation: Unbiased, cont.

•      The following statistics are not unbiased estimates of their corresponding parameters (each is a biased estimate):

–   s*² is a biased estimate of σ² because μ_s*²≠σ².

–   s* is a biased estimate of σ because μ_s*≠σ.

–   s is a biased estimate of σ because μ_s≠σ.

•      Note that the statistic and parameter can change, but the definition of unbiased is μ_statistic=parameter, always μ of the statistic, and this μ equals the desired parameter.
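Bias can be seen directly by averaging a statistic over every possible sample. The sketch below assumes a hypothetical four-score population sampled with replacement; it compares the mean of s² (divides by N-1), s*² (divides by N), and s with the parameters they estimate.

```python
from itertools import product
from math import sqrt
from statistics import mean, pvariance, variance

population = [2, 4, 6, 8]   # hypothetical population of scores
N = 2                       # sample size
sigma2 = pvariance(population)

samples = list(product(population, repeat=N))   # all possible samples

mean_s2 = mean(variance(s) for s in samples)         # mean of s^2  (N-1 in denominator)
mean_s_star2 = mean(pvariance(s) for s in samples)   # mean of s*^2 (N in denominator)
mean_s = mean(sqrt(variance(s)) for s in samples)    # mean of s
```

In this small case the mean of s² equals σ², the mean of s*² equals [(N-1)/N]σ², and the mean of s falls below σ, in line with the slides.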

Estimation: Confidence Intervals

•      Sampling distributions give information necessary to estimate parameters.

•      Estimation is the calculation of an approximate value of a parameter.

–   Interval estimation allows you to obtain an interval of potential values for a parameter.

–   For the problem with IQ of deaf children, we found that X-bar=88.07 for our sample mean. We know that X-bar is a good (unbiased) estimate of μ, but we also know that X-bar has variability so it is unlikely that μ=88.07. However, 88.07 should be close to μ.

Estimation: Confidence Intervals, cont.

•      A confidence interval for μ gives an interval of values around X-bar that are likely to include the true value of μ.

–   A 95% confidence interval for μ is given by

           X-bar-1.96√(σ²/N) to X-bar+1.96√(σ²/N).

–   For the problem with IQ of deaf children, X-bar=88.07, N=59, and σ²=225. So the 95% confidence interval for μ is

          88.07-1.96√(225/59) to 88.07+1.96√(225/59)

                                  84.24 to 91.90.

–   We can say that we are 95% confident that the μ of the IQ of deaf children is between 84.24 and 91.90. Or, we can say that 95% of intervals like this one will include the true value of the μ of the IQ of deaf children.
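The interval arithmetic can be checked with a short function. This is a sketch for the known-variance case only; the function name is illustrative, and 1.96 is the two-tailed critical value for 95% confidence used on the slide.

```python
from math import sqrt

def ci_known_variance(x_bar, sigma2, n, z_crit=1.96):
    """Confidence interval for mu when the population variance is known."""
    half_width = z_crit * sqrt(sigma2 / n)
    return x_bar - half_width, x_bar + half_width

# IQ of deaf children: X-bar = 88.07, sigma^2 = 225, N = 59
lower, upper = ci_known_variance(88.07, 225, 59)
```

Rounded to two decimals, the limits agree with 84.24 and 91.90.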

 

 

 

Estimation: Confidence Intervals, cont.

We can say that 95% of intervals like 84.24 to 91.90 will include the true value of the μ of the IQ of deaf children. The true value of this μ is unknown, but many intervals, each from a different sample, would cluster around the true mean.

 

 

 

 

 

 

 

Psychology 2113

Hypothesis Testing

        Introduction

          Examples

          Key Terms

Hypothesis Testing: Introduction

•      This is the last of the seven topics common to all inferential statistics, and so it integrates all of the other six. Principally, hypothesis testing uses probability and the sampling distribution of a statistic to make decisions about a parameter.

•      Hypothesis testing is the process of testing tentative guesses about relationships between variables in populations. These relationships between variables are evidenced in a statement, a hypothesis, about a population parameter.

Hypothesis Testing: Examples

•      Rat-shipment example: are the rats defective? Or are they OK? If μ=33 and σ²=361, is the X-bar=44.4 from the sample of N=25 rats significantly different from 33?

•      Compute z_X-bar=(X-bar-μ)/√(σ²/N)=(44.4-33)/√(361/25)=11.4/3.8=3.00.

•      Find p(X-bar>44.4)=p(z>3)=.0013. It is unlikely the rats came from a population with μ=33 for the mean run time. So we decide that μ≠33 and that the rats are defective.

Hypothesis Testing: Examples, cont.

•      IQ of deaf children example: are the deaf children lower in IQ? Or are they average? If μ=100 and σ²=225, is the X-bar=88.07 from the sample of N=59 deaf children significantly lower than 100?

•      Compute z_X-bar=(X-bar-μ)/√(σ²/N)=(88.07-100)/√(225/59)=-11.93/1.95=-6.11.

•      Find p(X-bar<88.07)=p(z<-6.11)<.00003. It is unlikely the deaf children came from a population with μ=100 for the mean IQ. So we decide that μ<100 and that the children have lower IQ scores (remember, this is due to the fact that their language, ASL, is not English, so they score lower on the verbal part of the total IQ test).

Hypothesis Testing: Key Terms

•      Test statistic: a statistic used only for the purpose of testing hypotheses; e.g. z_X-bar.

•      Assumptions: conditions placed on a test statistic necessary for its valid use in hypothesis testing;

–    for z_X-bar, the assumptions are that the population is normal in shape and that the observations are independent.

•      Null hypothesis: the hypothesis that we test; Ho.

•      Alternative hypothesis: where we put what we believe; H1. Both Ho and H1 are stated in terms of a parameter.

Hypothesis Testing: Key Terms, cont.

•      Significance level: the standard for what we mean by a “small” probability in hypothesis testing; α.

•      Directional and non-directional hypotheses.

•      One- and two-tailed tests, critical values, and rejection values.

•      Decision rules:

–   Critical value decision rules.

–   p-value decision rules.


HO and H1

•      Rat-shipment example:

–   We start with H1. We believe that there is something wrong with the rats, or that μ≠33. So we have H1: μ≠33.

–   Next, we state Ho. The null is always the opposite of the alternative. Taken together, Ho and H1 usually cover all possible values of the parameter being tested. The null hypothesis usually has the “equals” in it. So we have Ho: μ=33.

•      IQ of deaf children example:

–   Again, we start with H1. We believe that the deaf children will score lower on the IQ test because English is not their native language, or that μ<100. So we have H1: μ<100.

–   Next, we state Ho. It is the opposite of H1 and has the “equals” in it, so we have Ho: μ≥100.

Significance Level

•      The significance level is the small probability used in hypothesis testing to determine an unusual event that leads you to reject Ho.

–   The significance level is symbolized by α (alpha).

–   The value of α is almost always set at α=.05.

–   The value of α is chosen before data are collected.

–   If Ho is rejected when α=.05, here are examples of what you say:

•    “The mean of the IQ of deaf children, X-bar=88.07, is significantly lower than 100, z=-6.11, p<.00003.”

•    “The mean of the run times, X-bar=44.4, is significantly different from 33, z=3.00, p=.0013.”

Directional and Non-directional Hypotheses

•      Directional hypotheses specify a particular direction for values of the parameter.

–   IQ of deaf children example: Ho: μ≥100, H1: μ<100.

•      Non-directional hypotheses do not specify a particular direction for values of the parameter.

–   Rat shipment example: Ho: μ=33, H1: μ≠33.

•      Another example:

–   Suppose you believe that dancers are more introverted than other people. You have N=26 dancers and know that for this age group with your male/female ratio that μ=19.15.

–   So you have Ho: μ≤19.15 and H1: μ>19.15.

One- and Two-Tailed Tests, Critical Values, and Rejection Values

•      One- and two-tailed tests:

–   A one-tailed test is a statistical test that uses only one tail of the sampling distribution of the test statistic.

–   A two-tailed test is a statistical test that uses two tails of the sampling distribution of the test statistic.

•      Critical values are values of the test statistic that cut off α or α/2 in the tail(s) of the theoretical reference distribution.

•      Rejection values are the values of the test statistic that lead to rejection of Ho.

One- and Two-Tailed Tests, Critical Values, and Rejection Values, cont.

•      Rat shipment example: Ho: μ=33, H1: μ≠33

•      Two-tailed test

•      Critical values are z_crit=-1.96 and z_crit=1.96

•      Rejection values are <-1.96 and >1.96

One- and Two-Tailed Tests, Critical Values, and Rejection Values, cont.

•      IQ of deaf children example: Ho: μ≥100, H1: μ<100.

•      One-tailed test

•      Critical value is z_crit=-1.645

•      Rejection values are <-1.645

One- and Two-Tailed Tests, Critical Values, and Rejection Values, cont.

•      Introversion of dancers example: Ho: μ≤19.15, H1: μ>19.15.

•      One-tailed test

•      Critical value is z_crit=1.645

•      Rejection values are >1.645

Critical Value Decision Rules

•      Rat shipment example: Ho: μ=33, H1: μ≠33.

•      Reject Ho if the observed z_X-bar<-1.96 or if z_X-bar>1.96.

•      The observed z_X-bar was 3.00. Because 3.00>1.96, we reject Ho.

Critical Value Decision Rules, cont.

•      IQ of deaf children example: Ho: μ≥100, H1: μ<100.

•      Reject Ho if the observed z_X-bar<-1.645.

•      The observed z_X-bar was -6.11. Because -6.11<-1.645, we reject Ho.

Critical Value Decision Rules, cont.

•      Introversion of dancers example: Ho: μ≤19.15, H1: μ>19.15.

•      Reject Ho if the observed z_X-bar>1.645.

•      The observed z_X-bar was .76. Because .76 is not greater than 1.645, we retain Ho.

Compute zX for Introversion of Dancers

•      Remember, you believe that dancers are more introverted than other people. You have N=26 dancers and know that for this age group with your male/female ratio that μ=19.15. So you have Ho: μ≤19.15 and H1: μ>19.15.

•      Introversion of dancers example: are the dancers higher in introversion? Or are they average? If μ=19.15 and σ=4.32, is the X-bar=19.79 from the sample of N=26 dancers significantly higher than 19.15?

•      Compute z_X-bar=(X-bar-μ)/√(σ²/N)=(19.79-19.15)/√(18.6624/26)=.64/.85=.76.
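The critical value decision rules for the three examples can be written as one small function. This is a sketch, not a general-purpose routine: the critical values are hard-coded for α=.05 as given in the slides, and the `tail` labels are illustrative names.

```python
from math import sqrt

def z_for_mean(x_bar, mu0, sigma2, n):
    """z statistic for a sample mean when the population variance is known."""
    return (x_bar - mu0) / sqrt(sigma2 / n)

def reject_by_critical_value(z, tail):
    """Critical value decision rule at alpha = .05.

    tail: 'two'   for H1: mu != mu0 (critical values -1.96 and 1.96)
          'lower' for H1: mu <  mu0 (critical value -1.645)
          'upper' for H1: mu >  mu0 (critical value  1.645)
    """
    if tail == "two":
        return z < -1.96 or z > 1.96
    if tail == "lower":
        return z < -1.645
    if tail == "upper":
        return z > 1.645
    raise ValueError("tail must be 'two', 'lower', or 'upper'")

z_rats = z_for_mean(44.4, 33, 361, 25)               # rat shipment
z_iq = z_for_mean(88.07, 100, 225, 59)               # IQ of deaf children
z_dancers = z_for_mean(19.79, 19.15, 4.32 ** 2, 26)  # introversion of dancers

decisions = {
    "rats": reject_by_critical_value(z_rats, "two"),
    "iq": reject_by_critical_value(z_iq, "lower"),
    "dancers": reject_by_critical_value(z_dancers, "upper"),
}
```

A value of `True` means reject Ho; `False` means retain Ho.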

p-Value Decision Rules

•      Rat shipment example: Ho: μ=33, H1: μ≠33.

•      Reject Ho if the SAS (2-tailed) p-value is <α=.05.

•      The SAS p-value is .0026.

•      Reject Ho: μ=33 because .0026<.05 (p-value is <α).

p-Value Decision Rules, cont.

•      IQ of deaf children example: Ho: μ≥100, H1: μ<100.

•      Reject Ho if

–   ½ the SAS p-value <α, and

–   the observed z_X-bar is in the tail specified by H1.

•      ½ the SAS p-value is <.00003 and the observed z_X-bar was in the left tail (as in H1).

•      So, reject Ho: μ≥100.

 

p-Value Decision Rules, cont.

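•      Introversion of dancers example: Ho: μ≤19.15, H1: μ>19.15.

•      Reject Ho if

–   ½ the SAS p-value <α, and

–   the observed z_X-bar is in the tail specified by H1.

•      ½ the SAS p-value is .2236 and the observed z_X-bar was in the right tail (as in H1). But .2236 is not less than α=.05.

•      So, retain Ho: μ≤19.15.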
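The p-value decision rules can be sketched the same way. Here the two-tailed p-value that SAS would report is computed from the standard normal distribution with `statistics.NormalDist`; the function names and `tail` labels are illustrative, and the z values are the rounded ones from the slides.

```python
from statistics import NormalDist

def two_tailed_p(z):
    """Two-tailed p-value for an observed z (what SAS reports by default)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

def reject_by_p_value(z, tail, alpha=0.05):
    """p-value decision rule.

    Two-tailed test: reject if p < alpha.
    One-tailed test: reject if p/2 < alpha AND z is in the tail named by H1.
    """
    p = two_tailed_p(z)
    if tail == "two":
        return p < alpha
    in_predicted_tail = z < 0 if tail == "lower" else z > 0
    return p / 2 < alpha and in_predicted_tail

reject_rats = reject_by_p_value(3.00, "two")        # H1: mu != 33
reject_iq = reject_by_p_value(-6.11, "lower")       # H1: mu < 100
reject_dancers = reject_by_p_value(0.76, "upper")   # H1: mu > 19.15
```

The exact two-tailed p for z=3.00 is about .0027; the .0026 on the slide is consistent with doubling the rounded table value .0013. The second condition matters: a large z in the tail opposite to H1 would have a small ½p but would not lead to rejection.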

 

 

 

 

Psychology 2113

Types of Error

Power

Factors That Influence Power

Types of Errors

•      Two hypotheses, two decisions, two types of error: this was one of the seven topics common to all inferential methods.

•      The two hypotheses are Ho and H1, and the two decisions are to Reject Ho and to Retain Ho.

•      Now we come to the errors that you can make in hypothesis testing:

–   A Type I error: to reject Ho when Ho is true.

–   A Type II error: to retain Ho when Ho is false (H1 is true).

Types of Errors, cont.

•      Each of these types of errors has a probability of occurring:

–   p(Type I error)=p(reject Ho | Ho true)=α.

–   p(Type II error)=p(retain Ho | H1 is true)=β.

•      We summarize these in a 2x2 box:

Types of Errors, cont.

•      Now let’s see what this looks like with a picture of the two distributions.

–   p(Type I error)=p(reject Ho | Ho true)=α.

–   p(Type II error)=p(retain Ho | H1 is true)=β.

Types of Errors, cont.

•      If you have already rejected Ho, the only error you can make is a Type I error, and because you have not retained Ho, then β=0 (after the fact).

•      If you have already retained Ho, the only error you can make is a Type II error, and because you have not rejected Ho, then α=0 (after the fact).

Types of Errors, cont.

•      It is extremely important to keep the probabilities of both types of errors small.

–   We keep α small by definition, α=.05. We have direct control over α.

–   However, we do not have direct control over β. To keep β small, we keep 1-β=power large by using the influence of several factors, thus indirectly controlling power (and β).

Power

•      Power = p(rejecting Ho | Ho is false) = 1-β

•      We keep β small and 1-β large indirectly by using the influence of several factors: effect size, N, σ², α, and type of hypotheses.

Power: Effect Size

•      Effect size for z_X-bar is γ=(μ-μ0)/σ, the difference between the true mean and the mean given in Ho, divided by the population standard deviation.

•      As effect size increases, power increases.

Power: Sample Size, N

•      N, sample size, is the factor that gives you the greatest control over power. You usually can choose N, and N has a great influence on power.

•      As N increases, power increases.

Power: σ²

•      σ², population variance, offers you little control over power. You usually have controlled σ² through good research methods.

•      As σ² decreases, power increases.

Power: α

•      α, p(Type I error), usually set at .05, also offers you little control over power. You can choose to use .01 or smaller, but will rarely use α larger than .05.

•      As α increases, power increases.

Power: Type of Hypotheses

•      Directional hypotheses have greater power if you are correct in predicting direction, but virtually zero power if you are wrong.

Power: Type of Hypotheses, cont.

•      Non-directional hypotheses have good power in either direction, but lower power than that for a directional hypothesis in the correct direction.
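The influence of each factor can be explored numerically. The sketch below is an assumption-laden illustration, not a formula from the slides: it uses the standard result that, when the true effect size is γ, z_X-bar is normally distributed with mean γ√N and variance 1, so power is the area of that distribution in the rejection region. The function names and example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def effect_size(mu, mu0, sigma2):
    """gamma = (mu - mu0) / sigma."""
    return (mu - mu0) / sqrt(sigma2)

def power_z(gamma, n, alpha=0.05, tail="upper"):
    """Power of the z test for a mean, given true effect size gamma.

    Assumes z ~ Normal(gamma * sqrt(n), 1) when the true effect size is gamma.
    """
    shift = gamma * sqrt(n)
    if tail == "upper":      # H1: mu > mu0
        z_crit = STD_NORMAL.inv_cdf(1 - alpha)
        return 1 - STD_NORMAL.cdf(z_crit - shift)
    if tail == "lower":      # H1: mu < mu0
        z_crit = STD_NORMAL.inv_cdf(alpha)
        return STD_NORMAL.cdf(z_crit - shift)
    if tail == "two":        # H1: mu != mu0
        z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
        return (1 - STD_NORMAL.cdf(z_crit - shift)) + STD_NORMAL.cdf(-z_crit - shift)
    raise ValueError("tail must be 'upper', 'lower', or 'two'")

# Hypothetical scenario: true mean 105, Ho mean 100, sigma^2 = 225, N = 25
gamma = effect_size(105, 100, 225)
power_directional_correct = power_z(gamma, 25, tail="upper")
power_non_directional = power_z(gamma, 25, tail="two")
power_directional_wrong = power_z(gamma, 25, tail="lower")
```

In this scenario the directional test in the correct direction has the most power, the non-directional test somewhat less, and the directional test in the wrong direction has power near zero.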

Power: Review

•      Power = p(______________) = ____

•      We keep β small and 1-β large indirectly by using the influence of several factors:

•      effect size=_________________: as effect size increases, power_____________.

•      N: as N increases, power_____________

•      σ²: as σ²________________, power increases.

•      α: as α increases, power_____________.

•      A directional hypothesis gives greater power if______________________________________.

•      A non-directional hypothesis gives good power in__________________________.


Psychology 2113

New Test Statistics

One-Sample t-test

Test of Correlation

New Test Statistics

•      All test statistics (inferential methods) have some things in common: use of descriptive statistics, use of probability…all of the basics of hypothesis testing. For example, all have a null hypothesis, all use α, and for all, increasing N increases power.

•      But some things are different. For every new test statistic, we will cover four topics:

–   Situation, including the hypotheses.

–   Test statistic.

–   Theoretical reference distribution, critical values, and decision rules.

–   Assumptions.

New Test Statistics, cont.

•      I encourage you to start a chart. Put the four topics on the left side (rows) and the test statistics on the top (columns). Start with zX.

One-Sample t-Test

1. Situation/hypotheses

 

2. Test statistic

 

3. Distribution

 

4. Assumptions

 

t Distributions

•      t distributions have the following characteristics:

–   Theoretical distribution that is symmetric, smooth, unimodal, and has μ=0.

–   Looks like the standard normal distribution, but has longer tails and more variability.

–   The greater variability is due to t statistics having not only a mean, X-bar, that varies from sample to sample, but also a variance, s².

t Distributions, df

•      t distributions have only one parameter, df (degrees of freedom). The formula for df can change from one t statistic to the next.

•      The working definition for df is “In a sample variance, df=number of independent components   – number of parameters estimated.”

•      The one-sample t has the unbiased sample variance, s², in its formula. In s²=Σ(X - X-bar)²/(N-1), there are N values of X, the independent components, and 1 statistic, X-bar, that estimates the 1 parameter, μ. So df=N-1 for the one-sample t.

 

df, One-Sample t

•      The df for the one-sample t is N-1. Note that the whole concept of df came with the t-test. There was no concept of df associated with z_X-bar. So whatever changed from z_X-bar to t is what brought with it the concept of df. So how does t differ from z_X-bar? t has s² where z_X-bar has σ².

t Table

•      Now we can use df and α=.05 to find a critical value for t, t_crit. The t table is organized by df for the rows and α for one- and two-tailed tests for the columns. If N=10, then for a one-sample t, df=N-1=10-1=9. For a two-tailed test with α=.05 and df=9, the critical values are ±2.262.

One-Sample t-Test: Example

•      Are people who are interrupted in a task accurate in estimating how long they have spent on the task? People who were given 20 3-letter anagrams to solve (e.g. arn is ran) were interrupted after doing 10 of them and asked to estimate how long they had worked on the task. The researchers formed a ratio of estimated to actual time, and μ_ratio should be 1 if the people are accurate in estimating time.

•      The ratios for the N=10 people are .911  1.011  1.807  2.010  1.911  2.156  1.251  1.516  2.730  1.160

•      Get the sum and the sum of squares of the ratios:

•      ΣX=16.463 and ΣX²=30.119405

One-Sample t-Test: Example

•      Now compute the mean, X-bar, and the unbiased variance, s², and s.

•      X-bar=1.646, s²=.3352, s=.579.

•      Ho:μ=1, H1:μ≠1. So we are now ready to compute t=(X-bar-μ)/√(s²/N) = (1.646-1)/√(.3352/10)=3.53

•      Using a critical value decision rule, the upper t_crit is 2.262 and 3.53>2.262. Using a p-value decision rule, the SAS p-value was .0064<α=.05.

•      So both decision rules lead us to reject Ho:μ=1. What does this look like in the sampling distribution of t?

One-Sample t-Test: Example, cont.

•    Find the observed t=3.53 and the upper t_crit=2.262 in the distribution below. Because 3.53>2.262, or because .0064<.05, we reject Ho:μ=1. People interrupted in a task significantly overestimate the time spent in the task.
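The whole one-sample t example can be recomputed from the ten ratios. This is a minimal sketch using only the standard library; because the standard library has no t distribution, the critical value 2.262 is taken from the t table as quoted in the slides rather than computed.

```python
from math import sqrt
from statistics import mean, variance

# Ratios of estimated to actual time for the N = 10 people
ratios = [0.911, 1.011, 1.807, 2.010, 1.911, 2.156, 1.251, 1.516, 2.730, 1.160]
mu0 = 1                       # value of mu under Ho

n = len(ratios)
sum_x = sum(ratios)                     # sum of X
sum_x2 = sum(x * x for x in ratios)     # sum of X squared
x_bar = mean(ratios)
s2 = variance(ratios)         # unbiased variance, divides by N-1
t = (x_bar - mu0) / sqrt(s2 / n)
df = n - 1

t_crit = 2.262                # two-tailed, alpha = .05, df = 9 (from the t table)
reject_ho = abs(t) > t_crit
```

The intermediate values match the slides: ΣX=16.463, ΣX²=30.119405, X-bar=1.646, s²=.3352, and t=3.53.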

Test of Correlation: r

•      Continuing with your chart, we will add a new test statistic to zX and the one-sample t. You already know it as a descriptive statistic, but here it will be used to test hypotheses.

Test of Correlation: df

•      The df for r is N-2. It can be shown why df=N-2 from the standard error of estimate. The standard error of estimate is a statistic that describes spread of errors or Y scores in correlation and regression. So, in sY.X we look for independent components and statistics (that estimate parameters).

r Critical Values

•      Now we can use df and α=.05 to find a critical value for r, r_crit. The table of r_crit is organized by df for the rows and α for one- and two-tailed tests for the columns. If N=10, then for r, df=N-2=10-2=8. For a two-tailed test with α=.05 and df=8, the critical values are ±.632.

Test of Correlation: Example

•      Researchers believed stress for police officers increased as number of hours spent moonlighting on a second job increased. For 28 officers, r was .45. Is r significantly larger than zero?

Confidence Intervals for μ

•    Remember, interval estimation allows you to obtain an interval of potential values for a parameter.

•    For the problem about the ratio of estimated time to actual time for interrupted anagram solvers, we found X-bar=1.646 for our sample mean. We know that X-bar is a good (unbiased) estimate of μ, but we also know that X-bar has variability so it is unlikely that μ=1.646. However, 1.646 should be close to μ. Now we will see how to get an interval for μ when we don’t know σ².

Confidence Intervals for μ, cont.

•      A confidence interval for μ gives an interval of values around X-bar that are likely to include the true value of μ.

–  A 95% confidence interval for μ is given by

                           X-bar-t_crit√(s²/N) to X-bar+t_crit√(s²/N).

–  For the problem about the ratio of estimated time, X-bar=1.646, N=10, s²=.3352, df=N-1=9, and t_crit=±2.262. So the 95% confidence interval for μ is

  1.646-2.262√(.3352/10) to 1.646+2.262√(.3352/10)

                                 1.23 to 2.06.

 

Confidence Intervals for μ, cont.

•      So for the 95% confidence interval for μ of

                             1.23 to 2.06

we can say that we are 95% confident that the μ of the ratio of estimated to actual time is between 1.23 and 2.06. Or, we can say that 95% of intervals like this one will include the true value of the μ of the ratio of estimated time to actual time for people interrupted after solving 10 of 20 anagrams. Note that 1 is not in the interval, so we reject Ho: μ=1.
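The t-based interval and its link to the hypothesis test can be checked in a few lines. This sketch plugs in the rounded summary values from the slides; the function name is illustrative, and the critical value 2.262 again comes from the t table.

```python
from math import sqrt

def ci_unknown_variance(x_bar, s2, n, t_crit):
    """Confidence interval for mu when sigma^2 is unknown and s^2 is used instead."""
    half_width = t_crit * sqrt(s2 / n)
    return x_bar - half_width, x_bar + half_width

# Anagram example: X-bar = 1.646, s^2 = .3352, N = 10, t_crit = 2.262 (df = 9)
lower, upper = ci_unknown_variance(1.646, 0.3352, 10, 2.262)

# The two-tailed test of Ho: mu = 1 rejects exactly when 1 falls outside the interval
null_value_in_interval = lower <= 1 <= upper
```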

 

Confidence Intervals for μ, cont.

We can say that 95% of intervals like 1.23 to 2.06 will include the true value of the μ of the ratio of estimated to actual time. The true value of this μ is unknown, but many intervals, each from a different sample, would cluster around the true mean.

 

 

 

 

 

 

 

 

Psychology 2113

Two-Sample Tests

    Two-Independent-Sample t-test

    Two-Dependent-Sample t-test

Two Samples

•      The one-sample t-test and test of correlation are realistic, useful statistical tests. The tests that we will learn next are even more so: they don’t need a known value of μ. They both use two samples.

•      You can evaluate research on two groups of people who saw a brief film of a car wreck. Is there any difference in estimates of speed between those who were asked, “How fast were the cars going when they hit into each other?” vs “How fast were the cars going when they smashed into each other?”

Two-Independent- Sample t-test

•              Situation/hypotheses

 

 

•              Test statistic

 

 

•              Distribution

 

 

•              Assumptions

Two-Independent- Sample t-test

•      We have independent samples whenever there is not any obvious dependency present. When we cover the two-dependent-sample t, we will see some of these obvious ways samples can be dependent.

•      Why is df=n1+n2-2? The denominator for the two-independent-sample t has both s²_1 and s²_2. In s²_1=Σ(X1 - X-bar_1)²/(n1-1), there are n1 independent X1 scores and one statistic, X-bar_1. So, for s²_1, the df equals n1-1. Similarly for s²_2, with df=n2-1. Adding the two df together gives df=n1-1+n2-1=n1+n2-2.

Two-Independent-Sample t-test: Example

•      One group (PC=perceived control) of 30 students thought items they submitted might be selected for the test. The other group (NC=no control) of 30 was told writing the items was a study aid. Students were randomly assigned to groups. Exam stress was measured by number of symptoms.

•      Results: for Group PC, X-bar_1=10, s²_1=8.8276. For Group NC, X-bar_2=12.5, s²_2=8.6034. df=n1+n2-2=30+30-2=58. The t table does not list df=58, so we use the critical values for df=55, which are ±2.004. The computed value of t=3.28, so we reject Ho:μPC=μNC because 3.28>2.004.

Two-Independent-Sample t-test: Example, cont.

•    The sampling distribution for t for the exam stress example is shown below. We reject Ho:μPC=μNC because t=3.28>2.004 or because p=.0018<α=.05. The two groups differ significantly in number of symptoms of stress.
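The t for the exam stress example can be recovered from the summary statistics. The test-statistic formula is not written out in this part of the slides, so the sketch below assumes the usual pooled-variance form of the two-independent-sample t; the function name is illustrative.

```python
from math import sqrt

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Two-independent-sample t with pooled variance (assumed standard form)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df

# Group PC: mean 10, s^2 = 8.8276, n = 30; Group NC: mean 12.5, s^2 = 8.6034, n = 30
t, df = pooled_t(10, 8.8276, 30, 12.5, 8.6034, 30)

t_crit = 2.004                 # tabled two-tailed value used in the slides (df = 55)
reject_ho = abs(t) > t_crit
```

Computed as PC minus NC the statistic is negative; its magnitude rounds to the 3.28 reported on the slide, and for a two-tailed test only the magnitude is compared with the critical value.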

Two-Independent-Sample t-test: Robustness

•      What happens to t when its assumptions are not met? Is t a good statistic? Is α still equal to .05? The topic of robustness of test statistics examines their quality or validity when an assumption is not met (when the assumption is violated). A statistic is robust to violation of an assumption if

–   Its sampling distribution is well-fit by its theoretical distribution.

–   α_true≈α_set. Note that α_true is from the sampling distribution, and α_set is from the theoretical distribution.

•      When α_set=.05, “approximately equals” is defined as .04 to .06.

–     We get this information from research on statistics.

Two-Independent-Sample t-test: Robustness, cont.

Two-Dependent- Sample t-test

•              Situation/hypotheses

 

 

•              Test statistic

 

 

•              Distribution

 

 

•              Assumptions

Two-Dependent- Sample t-test: X,X Pairs

•      We have dependent samples whenever we have X,X pairs of scores. Such pairs can happen in at least three different ways:

–   Researcher-produced pairs. If students in the exam stress study had been matched on GPA, the researcher would have produced the pairs. The X scores on number of symptoms in the PC group would be dependent on the X scores in the NC group.

–   Naturally occurring pairs. For example, husband-wife, siblings, etc.

–   Repeated measures. This could be the pre-  and post-test scores when people are measured before and after a treatment.

Two-Dependent-Sample t-test: Test Statistic

•      The t for this test is based on first getting difference scores, d=X1-X2. Then the statistics in t can be computed:

–   d-bar=Σd/N

–   s²_d=Σ(d - d-bar)²/(N-1)=[NΣd²-(Σd)²]/[N(N-1)]

•      Why is df=N-1? The denominator for the two-dependent-sample t has s²_d. In s²_d, there are N independent d’s and one statistic, d-bar. So, for s²_d and the two-dependent-sample t, the df equals N-1.
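The difference-score computations can be sketched as follows. The paired scores are hypothetical (they are not the course data), and the final step assumes the t has the same form as the one-sample t applied to the d's with Ho: μ_d=0, which the slides imply but do not write out here.

```python
from math import sqrt

# Hypothetical X1,X2 pairs for N = 5 people (illustration only)
x1 = [16, 15, 17, 21, 19]
x2 = [12, 15, 11, 18, 14]

d = [a - b for a, b in zip(x1, x2)]    # difference scores, d = X1 - X2
n = len(d)

d_bar = sum(d) / n

# Definitional formula for the variance of the d's
s2_d_definitional = sum((di - d_bar) ** 2 for di in d) / (n - 1)

# Computational formula: [N*sum(d^2) - (sum d)^2] / [N(N-1)]
s2_d_computational = (n * sum(di ** 2 for di in d) - sum(d) ** 2) / (n * (n - 1))

# Assumed form of the test statistic: one-sample t on the d's with Ho: mu_d = 0
t = d_bar / sqrt(s2_d_definitional / n)
df = n - 1
```

Both variance formulas give the same value, which is a useful check when computing by hand.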

Two-Dependent- Sample t-test: Example

•      Does intravenous injection of butyrate, a flavor enhancer, give increased fetal hemoglobin in 6 sickle-cell anemia patients?

•      Results: for pre-injection scores, X-bar=14.3, and for post-injection scores, X-bar=38.6.

•      Computations on d gave d-bar=24.3, s²_d=347.8667, and t=3.20. With N=6 patients, df=N-1=6-1=5. The one-tailed critical value for df=5 is 2.015. So we reject Ho:μPost≤μPre because 3.20>2.015.

•      SAS

Two-Dependent-Sample t-test: Example, cont.

•    The sampling distribution for t for the butyrate injection example is shown below. We reject Ho:μPost≤μPre because t=3.20>2.015, or, because ½p=.012<α=.05 and t is positive.

 

 

 

 

 

 

 

Psychology 2113

One-Way ANOVA

        Introduction

        Logic

        F-test

One-Way ANOVA: Introduction

•      Now we examine a test statistic that will let us test hypotheses about two or more means, so we can use two or more groups. The two-sample t-tests could work with only two groups; the one-way ANOVA uses two or more.

•      Does smoking impact your thinking? Non-smokers (NS), active smokers (AS, had just smoked), and deprived smokers (DS, not smoked for 3 hours) did several tasks, ranging from simple to complex. In complex tasks, there were significant differences between the groups, with the AS group doing the worst.

One-Way ANOVA: Logic

•      The ANOVA F-test uses a different logic than z_X-bar, r, or any of the t-tests. They were all based on a logic that looked for how far the test statistic was from a middle value of zero. If the statistic was far enough away from zero, and in agreement with H1, then you rejected Ho.

•      The ANOVA’s logic forms an F-ratio of two sample variances, one based on the group means (Between) and the other based on scores within groups (Within). If Ho of equal population means is true, both variances should be about equal and the average F will be about 1. If the population means aren’t equal, we expect Between>Within, average F>1, and we reject Ho if F>F_crit.

One-Way ANOVA: Logic, cont.

H1:any differences in μj’s

One-Way ANOVA: Logic, cont.

•      Notation: n=# obs. per group, J=# groups, N=nJ.

•      Two sample variances:

–   One based on group means (Between). Compute s²_X-bar, the variance of the J group means, and multiply it by n. This also is called MSB, so ns²_X-bar=MSB.

–   One based on scores within groups (Within). Compute s² of observations within each of the J groups, and the average of these J values of s² is s²_pooled, also called MSW.

•      Form the test statistic, F=(ns²_X-bar)/(s²_pooled)=MSB/MSW.

•      Hypotheses:

–   Ho:μ1=μ2=…=μJ

–   H1:any differences in μj’s
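The two variances can be computed directly from raw scores when the groups have equal n. The data below are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
from statistics import mean, variance

# Hypothetical scores: J = 3 groups with n = 4 observations each
groups = [
    [3, 5, 4, 6],
    [6, 8, 7, 9],
    [9, 11, 10, 12],
]
n = len(groups[0])   # observations per group
J = len(groups)      # number of groups

group_means = [mean(g) for g in groups]

# Between: n times the variance of the J group means (MSB)
ms_between = n * variance(group_means)

# Within: average of the J within-group variances (s^2 pooled, MSW)
ms_within = mean(variance(g) for g in groups)

F = ms_between / ms_within
```

Here the group means are far apart relative to the spread within groups, so F is well above 1.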

One-Way ANOVA: Logic, cont.

 

We reject Ho if F>Fcrit.

One-Way ANOVA: F-test

1. Situation/hypotheses

 

2. Test statistic

 

3. Distribution

 

4. Assumptions

One-Way ANOVA: Factors vs. Levels

•      The ANOVA is a general statistical tool, including the one-way ANOVA, the two-way ANOVA and beyond. The “one” in “one-way” refers to the number of factors (variables that classify the subjects into groups). A one-way layout looks like this:

 

 

 

 

One-Way ANOVA: Test Statistic

•      Hypotheses: if J=4

–   Ho:μ1=μ2=μ3=μ4

–   H1:any differences in μj’s

•      The test statistic is the F-ratio, F=MSB/MSW=(SSB/dfB)/(SSW/dfW), where dfB=J-1 and dfW=N-J (or, if you have equal n’s, J(n-1)).

•      Example: if SSB=410, SSW=630, n=30, and J=4, then dfB=J-1=4-1=3, and dfW=J(n-1)=4(29)=116, so F=(410/3)/(630/116)=136.67/5.43=25.16.
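The example arithmetic can be reproduced with a short function. This sketch assumes equal n per group, as in the example; the function and argument names are illustrative.

```python
def f_ratio(ss_between, ss_within, n_per_group, n_groups):
    """F = (SSB/dfB) / (SSW/dfW) for a one-way ANOVA with equal n's."""
    df_between = n_groups - 1
    df_within = n_groups * (n_per_group - 1)   # same as N - J when n's are equal
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within

# Example from the slide: SSB = 410, SSW = 630, n = 30, J = 4
F, df_between, df_within = f_ratio(410, 630, 30, 4)
```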

One-Way ANOVA: F Distribution

•      The F distribution is a positively skewed distribution with a minimum of zero.

•      It has two parameters, the df for the numerator variance and the df for the denominator variance. For the one-way ANOVA, the df for the numerator is dfB and the df for the denominator is dfW.

•      The F table of critical values is organized by dfB, dfW, and α (.05 and .01). Only upper-tail critical values are given because we expect the F only to get large if H1 is true.

One-Way ANOVA: F Distribution, cont.

•      Here is a picture of the F distribution with dfB=3 and dfW=60, with the critical value that cuts off α=.05 in the upper tail.

One-Way ANOVA: Assumptions

•      The one-way ANOVA F statistic is distributed as F with J-1 and N-J df only if all of the assumptions are met. If any of the assumptions are not met, then F only approximately has this distribution and we need to ask questions about robustness for each assumption.

–   Normality: like the two-independent-sample t, F is reasonably robust to non-normality, except for mixed distributions.

–   Equal variances: unlike the t, F is not robust to very unequal variances, even with large and equal n’s.

–   Independence: like the t, F is not robust to dependence in the data, but we typically meet this assumption.

One-Way ANOVA: Unequal Variances

•      Unlike the t, F is not robust to very unequal variances, even with large and equal n’s, if J>2.

•      For example, for J=4, n=50, if the population variances are in the ratio of 16:1:1:1, then the true α is .088 when α is set at .05. Note that .088 is larger than the .06 that we set as an upper boundary on α=.05. Also note that the n=50 per group is much larger than the n=15 that it took to make the t robust to any ratio in variances.

•      The F is robust to slightly unequal variances, but you don’t know the population variances.

•      This problem of the F’s lack of robustness to very unequal variances is resolved when we get to the next statistical procedures, MCPs.

One-Way ANOVA: Example, Liar Data

•      Which occupation should best be able to detect liars? Secret Service agents, judges, and psychiatrists were compared on percent correct in detecting which of ten people were lying (Liar data, p.471). Here X̄SS=64, X̄judges=56.57, X̄psy=57.71.

•      Hypotheses:

–   Ho: μSS=μJudges=μPsy

–   H1: any differences in μj’s

•      F=MSB/MSW=(SSB/dfB)/(SSW/dfW), where dfB=J-1 and dfW=J(n-1).

•      With SSB=1120, SSW=16,845.7143, n=35, and J=3, we have dfB=J-1=3-1=2 and dfW=J(n-1)=3(34)=102, so F=(1120/2)/(16845.7143/102)=560/165.1541=3.39.
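As a check on where SSB comes from, the sketch below rebuilds it from the three group means using SSB = n times the sum of squared deviations of the group means from the grand mean (valid with equal n). The helper name ss_between is hypothetical. Because the means are rounded to two decimals, the result lands near 1120 rather than exactly on it.

```python
def ss_between(means, n):
    """SSB for equal n: n * sum of (group mean - grand mean)^2."""
    grand = sum(means) / len(means)  # grand mean of the group means
    return n * sum((m - grand) ** 2 for m in means)

n, J = 35, 3
ssb = ss_between([64, 56.57, 57.71], n)  # rounded means from the Liar data
msb = ssb / (J - 1)
msw = 16845.7143 / (J * (n - 1))         # SSW / dfW from the example
f = msb / msw
print(round(ssb, 1), round(msw, 4), round(f, 2))
```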

One-Way ANOVA: Example, Liar Data

•      Next, get an Fcrit for dfB=2 and dfW=102. Using 2 and 100, with α=.05 we have Fcrit=3.09.

One-Way ANOVA: Example, Liar Data

•      The results of an ANOVA are often reported in an ANOVA summary table. Below is the summary table for the Liar Data. (SAS)

One-Way ANOVA: t²=F

•      The final topic for the ANOVA is to show the connection between the two-independent-sample t and the one-way ANOVA F when J=2.

•      The relationship is that t²=F. Let’s do an example: J=2, n=31.

–   Note the df for both tests: for t, df=n1+n2-2=31+31-2=60. For F, dfB=J-1=2-1=1, dfW=J(n-1)=2(30)=60.

–   Now, get tcrit and Fcrit: tcrit =±2.00 and Fcrit=4.00.

–   If you square –2, you get 4, and if you square 2, you get 4, so squaring the values of tcrit gives the value of Fcrit.

–     Note that the tail area cut off by the t of –2 is .025, and the tail area cut off by the t of 2 is also .025. Add them and you get .05, the tail area cut off by the F of 4.
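The same relationship holds for the test statistics themselves, not just the critical values. Here is a minimal Python sketch with made-up scores for two groups of equal size; the function names are hypothetical. It computes the pooled-variance t and the one-way F and shows that t squared equals F.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y):
    """Two-independent-sample t with a pooled variance."""
    n1, n2 = len(x), len(y)
    sp2 = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / sqrt(sp2 * (1 / n1 + 1 / n2))

def anova_f_two_groups(x, y):
    """One-way ANOVA F for J=2 groups with equal n."""
    n = len(x)
    msb = n * variance([mean(x), mean(y)])  # n times s^2 of the two group means
    msw = (variance(x) + variance(y)) / 2   # average within-group s^2
    return msb / msw

# Hypothetical scores, n = 5 per group
x = [12, 15, 11, 14, 13]
y = [9, 10, 12, 8, 11]
t = pooled_t(x, y)
f = anova_f_two_groups(x, y)
print(round(t, 4), round(t ** 2, 4), round(f, 4))
```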

One-Way ANOVA: t²=F

•      Here is a picture of what happens when t² with 60 df equals F with 1 and 60 df.

 

 

 

 

 

 

Psychology 2113

Multiple Comparison Procedures (MCPs)

        Introduction

        Tukey’s HSD

        Fisher-Hayter Test

Multiple Comparison Procedures: Introduction

•      If the ANOVA F rejects Ho, it is favoring H1; but H1 merely says “any difference in the μj’s”. So the F doesn’t tell you which groups have different means, it says “some difference, somewhere.” This fact, along with the lack of robustness to unequal variances, makes us not rely on the F as the most important statistic for a one-way design.

•      We need tests for the multiple differences that exist between the J means. For example, which of the groups is best at detecting liars: Secret Service agents, judges, or psychiatrists? The significant F merely says there is a difference.

Multiple Comparison Procedures: Introduction

•      Pairwise comparisons are differences in means taken two at a time. On J means, there are C=J(J-1)/2 pairwise comparisons.

•      The hypotheses for pairwise comparisons are Ho: μj=μj′ and H1: μj≠μj′, where j and j′ are two different groups.

Multiple Comparison Procedures: Introduction

•      Error rates:

–   p(at least one Type I error) ≤ 1-(1-α’)^C ≤ Cα’, where C is the number of pairwise comparisons you are doing and α’ is the alpha set for each comparison.

–   Error rate per comparison sets α’=.05 for each comparison, so p(at least one Type I error) ≤ 1-(1-.05)^C ≤ C(.05) could be large. For C=3, 1-(1-.05)^3=.143 ≤ 3(.05)=.15.

–   Error rate familywise controls p(at least one Type I error) at a maximum of .05 for a group of C comparisons by keeping α’ small.

Multiple Comparison Procedures: Introduction

•      For the liar data, J=3, so C=J(J-1)/2=3(2)/2=3. There are three pairwise comparisons: SS vs. judges, SS vs. psychiatrists, and judges vs. psychiatrists.

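–   Error rate per comparison sets α’=.05 for each comparison, so p(at least one Type I error) can be as large as .143, which is below the bound of .15.

–   Error rate familywise sets α’=.016952427, so p(at least one Type I error) is at most .05.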

Multiple Comparison Procedures: Introduction

•      If J=4, C=J(J-1)/2=4(3)/2=6. There are six pairwise comparisons: 1 vs. 2, 1 vs. 3, 1 vs. 4, 2 vs. 3, 2 vs. 4, and 3 vs. 4.

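–   Error rate per comparison sets α’=.05 for each comparison, so p(at least one Type I error) can be as large as .265, which is below the bound of .30.

–   Error rate familywise sets α’=.008512445, so p(at least one Type I error) is at most .05.

•      Tukey’s MCP automatically controls p(at least one Type I error) familywise.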
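The error-rate numbers for J=3 and J=4 can be reproduced with a short Python sketch. The function names are hypothetical. The per-comparison alpha is found by solving 1-(1-α’)^C=.05 for α’, which is the relationship the familywise values above appear to be based on.

```python
def n_pairwise(J):
    """Number of pairwise comparisons on J means: C = J(J-1)/2."""
    return J * (J - 1) // 2

def familywise_bound(alpha_pc, C):
    """Upper bound on p(at least one Type I error): 1 - (1 - alpha')^C."""
    return 1 - (1 - alpha_pc) ** C

def per_comparison_alpha(alpha_fw, C):
    """alpha' that makes 1 - (1 - alpha')^C equal to the familywise alpha."""
    return 1 - (1 - alpha_fw) ** (1 / C)

for J in (3, 4):
    C = n_pairwise(J)
    print(J, C,
          round(familywise_bound(0.05, C), 3),       # per-comparison error rate
          round(C * 0.05, 2),                        # looser bound, C * alpha'
          round(per_comparison_alpha(0.05, C), 9))   # familywise error rate
```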

Multiple Comparison Procedures: Tukey

•      We will use a t statistic for multiple comparisons, and the Studentized Range distribution that has a critical value of q/√2.

Multiple Comparison Procedures: Example

•      Tukey’s MCP on the liar data gives three t statistics:

–   SS vs. judges, t=(64-56.57)/√[2(165.1541)/35]=7.43/3.072=2.42

–   SS vs. psychiatrists, t=(64-57.71)/√[2(165.1541)/35]=6.29/3.072=2.05

–   Psychiatrists vs. judges, t=(57.71-56.57)/√[2(165.1541)/35]=1.14/3.072=.37

•      Critical value: q/√2

–   For J=3, dfW=102 (we’ll use 60), and α=.05, the value of q=3.40. So the t critical value is q/√2=3.40/√2=2.40.

–   Only the SS vs. judges t is significant (reject Ho because 2.42>2.40).

•      SS agents out-do judges, but not psychiatrists. Psychiatrists are not better than judges. (SAS)
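The three Tukey comparisons can be sketched in Python as follows. The q value of 3.40 is simply the tabled value quoted above (this sketch does not compute the Studentized Range distribution), and the helper name pairwise_t is hypothetical.

```python
from math import sqrt

def pairwise_t(mean_a, mean_b, msw, n):
    """t for one pairwise comparison with equal n: difference / sqrt(2*MSW/n)."""
    return (mean_a - mean_b) / sqrt(2 * msw / n)

means = {"SS": 64, "judges": 56.57, "psychiatrists": 57.71}
msw, n = 165.1541, 35
q = 3.40              # tabled q for J=3, dfW=60, alpha=.05 (taken from the text)
t_crit = q / sqrt(2)  # Tukey critical value for t

pairs = [("SS", "judges"), ("SS", "psychiatrists"), ("psychiatrists", "judges")]
results = {}
for a, b in pairs:
    t = pairwise_t(means[a], means[b], msw, n)
    results[(a, b)] = (round(t, 2), abs(t) > t_crit)
    print(a, "vs.", b, round(t, 2), "reject Ho" if abs(t) > t_crit else "retain Ho")
```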

Multiple Comparison Procedures: Fisher-Hayter

•      Like the Tukey except that the overall F must be significant and it uses the Studentized Range distribution with a critical value of q/√2 for J-1.

Multiple Comparison Procedures: Example

•      For Fisher-Hayter’s MCP on the liar data, F was significant, 3.39>3.09; the same three t statistics:

–   SS vs. judges, t=2.42

–   SS vs. psychiatrists, t=2.05

–   Psychiatrists vs. judges, t=.37

•      Critical value: q/√2, but uses J-1

–   For J-1=2, dfW=102 (we’ll use 60), and α=.05, the value of q=2.83. So the t critical value is q/√2=2.83/√2=2.00.

–   Now both SS vs. judges and SS vs. psychiatrists t’s are significant (reject Ho’s because 2.42>2.00 and 2.05>2.00). The Fisher-Hayter MCP has better power than Tukey’s MCP.

•      SS agents out-do judges and psychiatrists. Psychiatrists are not better than judges.
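The two-step Fisher-Hayter logic (overall F first, then t’s against q/√2 with q looked up for J-1) might be sketched like this. The function name and argument names are hypothetical; the F, Fcrit, q, and t values are the ones quoted in the example.

```python
from math import sqrt

def fisher_hayter(t_values, f_obs, f_crit, q_jm1):
    """Return reject/retain decisions; no comparison is tested unless F is significant."""
    if f_obs <= f_crit:
        return {pair: False for pair in t_values}
    t_crit = q_jm1 / sqrt(2)  # q is taken from the table for J-1 means
    return {pair: abs(t) > t_crit for pair, t in t_values.items()}

t_values = {
    "SS vs. judges": 2.42,
    "SS vs. psychiatrists": 2.05,
    "Psychiatrists vs. judges": 0.37,
}
decisions = fisher_hayter(t_values, f_obs=3.39, f_crit=3.09, q_jm1=2.83)
for pair, reject in decisions.items():
    print(pair, "reject Ho" if reject else "retain Ho")
```

With a nonsignificant overall F, every entry would come back as retain Ho, which is the gatekeeping step that separates this procedure from Tukey’s.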

 

 

Psychology 2113

Two-Way ANOVA

        Introduction

        Logic

        Interaction

        F-tests

Two-Way ANOVA: Introduction

•      The two-way ANOVA uses two factors, variables that combine to form the groups. The factors may or may not be independent variables.

•      The groups formed by combining levels/values of the factors are called cells, and the means of the observations in these cells are called cell means.

•      We have three F-tests in a two-way ANOVA, one for each of the two factors by themselves, and one for the interaction of the two factors.

Two-Way ANOVA: Introduction

•      Example: runners and cyclists randomly assigned to one of three amounts of time to hold a hamstring stretch, tested for flexibility after six weeks. Sport and time are the factors, we call this a 2X3 ANOVA, and there are 6 cells.

•      So we will have an F-ratio for sport, an F-ratio for time, and an F-ratio for the interaction of sport and time.

Two-Way ANOVA: Logic

•      The logic of the two-way ANOVA is the same as that for the one-way: for each of the three F-tests, form an F-ratio of two sample variances. For each F, if Ho is true, both variances should be equal and the average F will be about 1.

•      For each F, if H1 is true,

–   We expect numerator>denominator,

–   We expect average F>1,

–   And we reject Ho if F>Fcrit.

•      The difference is that the two-way ANOVA is more complex: there are three F’s. The effects of the factors are called main effects.

Two-Way ANOVA: Logic, cont.

•      Notation: n=# obs. per cell, J=# levels of A, K=# levels of B, N=nJK.

•      Each of the three F’s is formed as a ratio of two sample variances: the numerator will be the MS for the effect tested, the denominator will be MSW.

•      Hypotheses:

–   For A (e.g. Sport)

•    Ho: μ1=μ2=…=μJ

•    H1: any differences in μj’s

–   For B (e.g. Time)

•    Ho: μ1=μ2=…=μK

•    H1: any differences in μk’s

–   For interaction (not easily expressed in terms of μ’s),

•    Ho:no interaction effect

•    H1:some interaction effect

Two-Way ANOVA: Interaction

•      Interaction is a unique combination of the factors, a combined effect separate from the main effects; interaction can’t be explained by the factors alone.

•      Maybe 30s gives the best flexibility for runners, but cyclists need 60s. This is an example of interaction.

•      When cell means are plotted, interaction shows up as line segments that are not parallel.
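One way to see “not parallel” numerically is to compare the change from one level to the next for each line. The Python sketch below uses made-up cell means (they are not data from the course), assumes three stretch times labeled 15s, 30s, and 60s, and uses hypothetical function names. It only describes the sample means; whether an interaction is significant still depends on FAB.

```python
def segment_changes(cell_means):
    """Change in the cell mean from each level to the next (one value per line segment)."""
    return [b - a for a, b in zip(cell_means, cell_means[1:])]

def lines_parallel(row_a, row_b, tol=1e-9):
    """True when every segment changes by the same amount in both rows."""
    return all(abs(sa - sb) <= tol
               for sa, sb in zip(segment_changes(row_a), segment_changes(row_b)))

# Hypothetical flexibility cell means at 15s, 30s, 60s
runners = [10, 16, 15]   # best at 30s
cyclists = [9, 11, 17]   # best at 60s

print(segment_changes(runners), segment_changes(cyclists))
print(lines_parallel(runners, cyclists))           # segments differ: interaction pattern
print(lines_parallel([10, 12, 14], [7, 9, 11]))    # same changes: parallel lines
```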

 

Two-Way ANOVA: Interaction, cont.

•      Another example: Irritable bowel syndrome (IBS) involves abdominal pain, a sudden/urgent need to go to the bathroom, and frequent diarrhea or constipation. A new drug helps only women who are prone to diarrhea, not women with constipation nor men with either condition.

•      Using a rating of the drug that increases as effectiveness of the drug increases, these results would look like this:

 

Two-Way ANOVA: Interaction, cont.

•      Plots of cell means showing the three F-tests (assumes that MSW is small so any observed difference is significant).

Two-Way ANOVA: F-tests

1. Situation/hypotheses

 

 

2. Test statistic

 

 

3. Distribution

 

 

4. Assumptions

Two-Way ANOVA: Factors vs. Levels

•      Each two-way ANOVA always has two factors, but each of these factors can have different numbers of levels (levels are the values of the factors). Here are several different two-way layouts:

Two-Way ANOVA: Test Statistics

•      The test statistics are F-ratios:

FA=MSA/MSW=(SSA/dfA)/(SSW/dfW), where dfA=J-1 and dfW=JK(n-1)

FB=MSB/MSW=(SSB/dfB)/(SSW/dfW), where dfB=K-1 and dfW=JK(n-1)

FAB=MSAB/MSW=(SSAB/dfAB)/(SSW/dfW), where dfAB=(J-1)(K-1) and dfW=JK(n-1)

•      If n=20, J=3, and K=4, compute the df:

–   dfA=J-1=3-1=2

–   dfB=K-1=4-1=3

–   dfAB=(J-1)(K-1)=(3-1)(4-1)=(2)(3)=6

–   dfW=JK(n-1)=(3)(4)(20-1)=12(19)=228
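The df bookkeeping above is easy to sketch in Python. The function name two_way_df is hypothetical, and equal n per cell is assumed. The last line checks that the four df add up to N-1.

```python
def two_way_df(n, J, K):
    """Degrees of freedom for a two-way ANOVA with n observations per cell."""
    return {
        "A": J - 1,
        "B": K - 1,
        "AB": (J - 1) * (K - 1),
        "W": J * K * (n - 1),
    }

df = two_way_df(n=20, J=3, K=4)
print(df)
print(sum(df.values()) == 20 * 3 * 4 - 1)  # the pieces add up to N - 1
```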

Two-Way ANOVA: F Distributions

•      Each F-statistic in a two-way ANOVA has its own, possibly different, F distribution and F critical value.

•      FA is distributed as F with J-1 and JK(n-1) df.

•      FB is distributed as F with K-1 and JK(n-1) df.

•      FAB is distributed as F with (J-1)(K-1) and JK(n-1) df.

•      Find the three Fcrit values (α=.05) for dfA=2, dfB=3, dfAB=6, and dfW=228 (use dfW=200, see Table A.6).

–   For A, Fcrit=3.04.

–   For B, Fcrit=2.65.

–   For AB, Fcrit=2.14.

Two-Way ANOVA Example: Intro

•      Do study technique (four groups: no notes, student notes, outline framework, and complete outline) and cognitive style (FI=field independent=self-sufficient and provide own structure, FD=field dependent = need outside structure) impact scores on a 20-item multiple choice test? Within cognitive style, students were randomly assigned to study technique, 13 per cell, and all listened to the same taped lecture over the material on the quiz. Here’s the 2X4 layout.

Two-Way ANOVA Example: Results

ANOVA Summary Table (SAS)

Source             SS        df    MS         F       p
A=Cog. Styles      25.009     1    25.009     7.78    .0064
B=Study Tech.     320.182     3   106.727    33.22    .0001
AB=interaction     27.260     3     9.086     2.83    .0426
Within            308.462    96     3.213
Total             680.913   103

 

For example, FA=MSA/MSW=(SSA/dfA)/(SSW/dfW), where dfA=J-1 and dfW=JK(n-1). So FA=(25.009/1)/(308.462/96)=25.009/3.213=7.78.
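The rest of the summary table follows the same pattern as FA. A small Python sketch, using the SS and df values from the table and hypothetical variable names, recomputes each MS and F. The p-values come from SAS and are not recomputed here.

```python
ss = {"A": 25.009, "B": 320.182, "AB": 27.260, "Within": 308.462}
df = {"A": 1, "B": 3, "AB": 3, "Within": 96}

# Each mean square is SS/df; each F uses MSW as the denominator
ms = {source: ss[source] / df[source] for source in ss}
f = {source: ms[source] / ms["Within"] for source in ("A", "B", "AB")}

for source in ("A", "B", "AB"):
    print(source, round(ms[source], 3), round(f[source], 2))
print("Within", round(ms["Within"], 3))
print("Total", round(sum(ss.values()), 3), sum(df.values()))
```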

Two-Way ANOVA Example: FAB

•      For the study-techniques/cognitive-style study, how would the results for FAB be reported? Many journals would say, “For the test scores, the interaction of study technique and cognitive style was significant, F(3, 96)=2.83, p=.0426.”

•      Let’s see why we came to this conclusion, using both types of decision rules.

–   Critical value decision rule: for df of 3 and 96 (use 80), Fcrit is 2.72, so we reject Ho because 2.83>2.72.

–   p-value decision rule: the p-value from the ANOVA table was .0426, so we reject Ho because .0426<.05.

•      Now, how do we interpret these results?

Two-Way ANOVA Example: FAB

•    A significant interaction says the effects of one factor on the dependent variable depend on the level of the other factor. That is, the main effect results may not be consistent across levels of the other main effect.

•    Here, the effect (on test scores) of study technique depends on level of cognitive style: for FI students, any note-taking is better than no notes. For FD students, only the two outline methods are better than no notes.

Two-Way ANOVA Example: FAB

•    Or, interaction says that FI differs from FD (significantly) only at Student Notes. Both FI and FD do well with outlines and poorly with No Notes, but only FI students do well with their own notes.

•    Note that the significant main effect results are modified: FI is better than FD only for Student Notes, and Student Notes is better than No Notes only for FI students. Also, you need MCPs to test these means.

Two-Way ANOVA Example: FAB

•    The significant interaction is a red flag, warning that the main effect results may not be consistent across levels of the other main effect.

 

 

Psychology 2113

Nonparametric Methods

        Introduction

        Chi-Square Test

        Rank Tests

Nonparametric Methods: Introduction

•      All of the inferential statistics you have learned so far have tested hypotheses about parameters and made a normality assumption. These test statistics are called parametric methods. Now we will learn some new statistics called nonparametric methods.

Nonparametric Methods: Introduction

•      NP methods that are good for the nonparametric hypotheses detect any difference in the populations, such as middle, spread, skewness, kurtosis, or any combination. F-tests, t-tests, and the test of correlation all were able to detect only one specific parameter or difference in parameters.

•      Some NP methods are actually sensitive to parameter differences, even though they were designed to test the more general nonparametric hypotheses.

•      NP methods do not assume normality, but do make some assumption about independence. Also, they assume the underlying distribution of the data is continuous.

Nonparametric Methods: Introduction

•      Statisticians use qualitative and quantitative to describe data. Psychologists often use scales of measurement.

•      Scales of measurement:

–   Nominal: name only, e.g. gender. M and F can be coded with any two numbers that are different, such as M=0 and F=1, or M=3 and F=2.

–   Ordinal: name and rank, e.g. football rankings, 1 is higher ranked than 2, 2 than 3, etc.

–   Interval: name, rank, and equal intervals, e.g. temperature in °C, 20°C-10°C is 10°C, as is 40°C-30°C.

–   Ratio: name, rank, equal intervals, and a true zero point, e.g. height, because zero means absence of height.

Nonparametric Methods: Introduction

•      When do you use NP methods? Four issues have been raised in the literature on NP tests.

–   Hypotheses: Use NP methods (like the Chi-Square test) if you want to test hypotheses that are truly nonparametric, e.g. Ho:distribution1=distribution2.

–   Assumptions: Use NP methods if the normality assumption is known to be violated by having a mixed distribution, with 5-10% of the scores as outliers in one tail of the distribution of the population.

–   Scale of measurement: Use NP methods (like the Chi-Square test) if you have a nominal scale of measurement, or, perhaps one of the rank tests if you have ordinal data.

–   Sample size (N): Although N is often given as a consideration in deciding between NP and parametric tests, it is not an issue. Unequal-n problems plague the NP tests as well, and small N is no better with NP than with parametric tests.

Nonparametric Methods: χ² Test, Introduction

•      The χ² test for contingency tables is a NP method that is good for nonparametric hypotheses and can detect any difference in the populations, such as middle, spread, skewness, kurtosis, or any combination.

•      The χ² test for contingency tables does not assume normality.

•      It is used if the data are qualitative, if participants are placed into categories, if we have categorical variables, or if frequencies are involved. These typically go together.

Nonparametric Methods: χ² Test, Example

•      Who initiates touch in a public setting, M or F? Does this depend on whether the relationship is young or old? This is a problem for the χ² test for contingency tables.

Nonparametric Methods: χ² Test

1. Situation/hypotheses

 

 

 

2. Test statistic

 

 

3. Distribution

 

4. Assumptions

Nonparametric Methods: χ² Test, Hypotheses

•      The null hypothesis for the χ² test for contingency tables may be stated in one of two equivalent ways:

–   Independence of the two categorical variables, for example, Ho: who initiates touch is independent of age of the relationship.

–   Equality of distributions for levels of one of the categorical variables, e.g. Ho: distribution(young)=distribution(old).

Nonparametric Methods: χ² Test, E’s

•      E=(row total)(column total)/N, where E is the expected frequency for a cell.

•      For the <1 yr. and F cell, compute E. E=(146)(101)/219=67.33. Do the same for all other cells.

Nonparametric Methods: χ² Test Statistic

•      The χ²=Σ[(O-E)²/E], where O is the observed frequency for a cell and E is the expected frequency for a cell. Note that there is a separate (O-E)²/E for each cell, and then these are added together.

•      For the question of independence of age of relationship and who initiates touch, we have χ²=Σ[(O-E)²/E]=(60-67.33)²/67.33+(86-78.67)²/78.67+(41-33.67)²/33.67+(32-39.33)²/39.33=.7987+.6836+1.5974+1.3672=4.45.
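The expected frequencies and the χ² sum can be reproduced with a short Python sketch. The observed counts are the four O values used in the calculation above; their arrangement into rows and columns is inferred from the E arithmetic (first row is the <1 yr. group with total 146, first column is F with total 101), and the function name chi_square is hypothetical.

```python
def chi_square(observed):
    """Return (chi2, df, expected) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    N = sum(row_totals)
    chi2 = 0.0
    expected = []
    for i, row in enumerate(observed):
        exp_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / N  # E = (row total)(column total)/N
            exp_row.append(e)
            chi2 += (o - e) ** 2 / e               # one (O-E)^2/E term per cell
        expected.append(exp_row)
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df, expected

# Rows: <1 yr., longer relationship (row label assumed); columns: F, M
observed = [[60, 86],
            [41, 32]]
chi2, df, expected = chi_square(observed)
print([[round(e, 2) for e in row] for row in expected])
print(round(chi2, 2), df)
```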

Nonparametric Methods: χ² Test, Distribution

•      The χ² distribution is positively skewed, has a minimum of zero, and one parameter, df.

Nonparametric Methods: χ² Test, Distribution

•      Does who initiates touch in a public setting, M or F, depend on whether the relationship is young or old?

Nonparametric Methods: χ² Test, Distribution

•      χ²=4.45, so reject Ho: independence because 4.45>3.84, the critical value for df=1 at α=.05. We believe that who initiates touch depends on the length of the relationship.

•      Note that we have a one-tailed test even though the hypothesis is non-directional.

Nonparametric Methods: Rank Tests

•      Nonparametric statistics based on ranks have the following common characteristics:

–   They use the sum of the ranks

–   They have common assumptions of independence and a continuous underlying distribution, and the rank tests for two and J independent samples have an implicit assumption of equal variances.

–   Simplicity

–   Power: if the normality assumption is met, rank tests are up to 96% as powerful as their parametric counterpart, but potentially much more powerful if normality is not met.

–   They are sensitive to differences in the middle (specifically, medians) or to monotonic relationships.
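To make “sum of the ranks” concrete, here is a minimal Python sketch that ranks the combined scores from two independent samples and adds up the ranks in each sample. The scores are made up and the function names are hypothetical. Tied scores get the average of the ranks they occupy, which is a common convention and an assumption here, since the notes assume a continuous underlying distribution (no ties).

```python
def midranks(values):
    """Rank values from 1 to N, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 through j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sums(x, y):
    """Rank the combined samples, then sum the ranks within each sample."""
    ranks = midranks(list(x) + list(y))
    return sum(ranks[:len(x)]), sum(ranks[len(x):])

# Hypothetical scores for two independent samples
x = [3, 8, 5, 12]
y = [7, 1, 5, 2]
print(rank_sums(x, y))
```

The two sums always add to N(N+1)/2, so a large gap between them points to a difference in the middle of the two distributions.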

Nonparametric Methods: Rank Tests

•      The following table shows analogous parametric and nonparametric statistics based on ranks for several situations.

 
