
# Statistics

The following represents Arthur's opinions only and not necessarily those of Christie.


SAT: SUMX = 5846, SUMX^2 = 3428148, N = 10

A) Mean = 5846 / 10 = 584.6

B) S = ((SUMX^2-(((SUMX)^2)/N))/(N-1))^.5

S = ((3428148-(((5846)^2)/10))/(10-1))^.5

S = ((3428148-3417571.6)/9)^.5

S = 1175.16^.5

S = 34.28
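The computational formula in part B can be checked with a short script. This is a minimal sketch using only the sums quoted above; the function name `mean_and_sd_from_sums` is made up for illustration.

```python
import math

def mean_and_sd_from_sums(sum_x, sum_x2, n):
    """Return (mean, sample SD) from the sum and sum of squares of n scores."""
    mean = sum_x / n
    # Sum of squared deviations via the computational formula
    ss = sum_x2 - sum_x ** 2 / n
    variance = ss / (n - 1)
    return mean, math.sqrt(variance)

# Values from the SAT problem: SUMX = 5846, SUMX^2 = 3428148, N = 10
mean, sd = mean_and_sd_from_sums(5846, 3428148, 10)
print(round(mean, 1), round(sd, 2))  # prints 584.6 34.28
```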

C) 95% CI = XBAR +- t(.05) * S_XBAR, with df = N - 1 = 9

S_XBAR = S/(N^.5) = 34.28/3.162 = 10.84

95% CI = 584.6 +- (2.2622*10.84) = 584.6 +- 24.52

95% CI = 560.08 to 609.12, or whole scores of 560 to 609.

With a 95% confidence interval, we can be 95% confident that the population mean for students who take the new course lies between 560.08 and 609.12. Since the comparison mean of 542 falls below this range, we can conclude that the new course did improve the SAT verbal scores.
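As a rough check on the interval, the same arithmetic can be scripted. This sketch takes the critical value 2.2622 from the text rather than computing it, and the helper name `t_confidence_interval` is hypothetical.

```python
import math

def t_confidence_interval(mean, sd, n, t_crit):
    """Two-sided confidence interval for a mean, given a tabled t value."""
    se = sd / math.sqrt(n)      # standard error of the mean
    margin = t_crit * se
    return mean - margin, mean + margin

T_CRIT_95_DF9 = 2.2622          # critical value quoted in the text (df = 9)
low, high = t_confidence_interval(584.6, 34.28, 10, T_CRIT_95_DF9)
print(round(low, 2), round(high, 2))
print("542 inside interval:", low <= 542 <= high)
```

Carrying full precision through the standard error gives bounds of about 560.08 and 609.12; small differences in the last digit come from rounding the standard error before multiplying.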

By performing a one-tailed t test we can analyze this:

Set H0: u <= u0

Set H1: u > u0

In plain language: our null hypothesis is that the population mean for students who take the new course (u) is equal to or less than the comparison mean (u0 = 542). Our alternative hypothesis is that it is greater than 542.

t = (584.6-542)/(10.841) = 3.93

t(.01),df=9 = 2.82, p<.01

Rejection Rule:   For H1: u > u0, reject H0 if tcomp > tcrit

Since tcomp = 3.93 > tcrit = 2.82, we reject H0.

The new course did improve the SAT verbal scores.
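The one-tailed test can be reproduced in a few lines. This is only a sketch: the critical value 2.82 is the tabled value quoted above, and `one_sample_t` is an illustrative name.

```python
import math

def one_sample_t(sample_mean, mu0, sd, n):
    """t statistic for a one-sample test of the mean against mu0."""
    return (sample_mean - mu0) / (sd / math.sqrt(n))

t_comp = one_sample_t(584.6, 542, 34.28, 10)
T_CRIT_01_DF9 = 2.82            # one-tailed .01 critical value from the text
reject_h0 = t_comp > T_CRIT_01_DF9
print(round(t_comp, 2), reject_h0)  # prints 3.93 True
```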

Question #2

Descriptive Statistics

| | N | Mean | Std. Deviation | Variance |
|---|---|---|---|---|
| Yellow | 5 | 20.6000 | 4.39318 | 19.300 |
| Green | 4 | 29.2500 | 12.09339 | 146.250 |
| Valid N (listwise) | 4 | | | |

Our null hypothesis:  H0: u1 - u2 = 0 (there is no difference in the scores of the yellow group and green group)

Our alpha level is .05, we are looking for a 95% CI

Since N is different for each group, we use the formula:

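$$s_{\bar{X}_1-\bar{X}_2} = \sqrt{\left(\frac{(N_1-1)s_1^2 + (N_2-1)s_2^2}{N_1+N_2-2}\right)\left(\frac{1}{N_1}+\frac{1}{N_2}\right)}$$

OR:

$$s_{\bar{X}_1-\bar{X}_2} = \sqrt{s^2_{pooled}\left(\frac{1}{N_1}+\frac{1}{N_2}\right)}$$

$s_1^2 = 19.3$

$s_2^2 = 146.25$

$N_1 = 5$

$N_2 = 4$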

So we get: s2pooled = ((4*19.3)+(3*146.25))/7

s2pooled = 515.95/7 = 73.71

$s_{\bar{X}_1-\bar{X}_2} = \sqrt{73.71\,(1/5 + 1/4)} = 5.76$, the standard error of the difference

Now find $t = (\bar{X}_1 - \bar{X}_2)/s_{\bar{X}_1-\bar{X}_2}$

$= (20.60 - 29.25)/5.76 = -1.50$

Using table B, we find a critical value for T where df = N1+N2-2, or 7

Using the 5% level for t(.05) we get 2.36

Rejection Rule:  Reject H0 if |tcomp| >= tcrit

|tcomp| = 1.50, which is less than tcrit = 2.36; therefore we do not reject H0, meaning that we cannot conclude the yellow paper caused subjects to score lower.  The answer is no.
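The pooled-variance calculation above can be sketched in code from the summary statistics alone. The function name `pooled_t` is hypothetical, and the critical value 2.36 is the tabled value quoted in the text.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Independent-samples t using a pooled variance (equal variances assumed)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se_diff, se_diff, df

# Yellow group first, green group second
t_comp, se_diff, df = pooled_t(20.60, 19.3, 5, 29.25, 146.25, 4)
T_CRIT_05_DF7 = 2.36            # two-tailed .05 critical value from the text
print(round(se_diff, 2), round(t_comp, 2), df)
print("reject H0:", abs(t_comp) >= T_CRIT_05_DF7)
```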

T-Test, Question 2

Group Statistics

| | Color | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| Value | Yellow | 5 | 20.6000 | 4.39318 | 1.96469 |
| | Green | 4 | 29.2500 | 12.09339 | 6.04669 |

Independent Samples Test

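| Value (equal variances assumed) | Result |
|---|---|
| Levene's Test for Equality of Variances: F | 3.385 |
| Levene's Test for Equality of Variances: Sig. | .108 |
| t-test for Equality of Means: t | -1.502 |
| df | 7 |
| Sig. (2-tailed) | .177 |
| Mean Difference | -8.65000 |
| Std. Error Difference | 5.75919 |
| 95% Confidence Interval of the Difference: Lower | -22.26831 |
| 95% Confidence Interval of the Difference: Upper | 4.96831 |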

Question #3

Statistics

| | DietA | DietB | DietC |
|---|---|---|---|
| N | 8 | 10 | 10 |
| Mean | 8.1250 | 9.0000 | 10.7000 |
| Median | 8.0000 | 9.0000 | 11.0000 |
| Mode | 7.00(a) | 9.00 | 7.00(a) |
| Std. Deviation | 2.35660 | 2.30940 | 2.75076 |
| Variance | 5.554 | 5.333 | 7.567 |

a  Multiple modes exist. The smallest value is shown.

For Diet A, the most appropriate descriptive statistic is the mean.  Since multiple modes exist, the mode is not very useful.  The median (8.0) is slightly lower than the mean (8.125), indicating a slight positive skew; because the skew is only slight, the mean is the most useful.

For Diet B, the median, mean and mode are all equal, hence all useful.

For Diet C, there is a slight negative skew.  Multiple modes exist, so the mode is not useful.  Because the skew is only slight, the mean is the most useful.

Histograms are the best graphs for these diets.

[Graphs: Diet A, Diet B, Diet C]

#3, Oneway ANOVA

Build a matrix of values based on the data for Diet A, Diet B, Diet C

Ho: m1 = m2 = m3

Ha: At least one mean differs from the others.

Rejection Rule:  Reject Ho if p-value < .05

SPSS ANOVA

Weight_Loss

| Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 31.454 | 2 | 15.727 | 2.537 | .099 |
| Within Groups | 154.975 | 25 | 6.199 | | |
| Total | 186.429 | 27 | | | |

P value = .099 > .05 --> Fail to reject H0.

F(2,25) = 2.537, p = .099 (the critical value at the .05 level is 3.39)

There is no significant difference in mean weight loss among the different diets.
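The F ratio in the SPSS table follows directly from the sums of squares and degrees of freedom. The sketch below recomputes it; the function name `f_ratio` is illustrative, and the critical value 3.39 is an assumption read from a standard F table for (2, 25) df at the .05 level.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) for a one-way ANOVA table."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

ms_b, ms_w, f = f_ratio(31.454, 2, 154.975, 25)
F_CRIT_05 = 3.39                # assumed tabled value for F(2, 25), alpha = .05
print(round(ms_b, 3), round(ms_w, 3), round(f, 3))  # prints 15.727 6.199 2.537
print("reject H0:", f > F_CRIT_05)
```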

#3 b) Post Hoc Tests

Multiple Comparisons

Dependent Variable: Weight_Loss

LSD

| (I) Diet_Type | (J) Diet_Type | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|
| Diet A | Diet B | -.87500 | 1.18101 | .466 | -3.3073 | 1.5573 |
| | Diet C | -2.57500(*) | 1.18101 | .039 | -5.0073 | -.1427 |
| Diet B | Diet A | .87500 | 1.18101 | .466 | -1.5573 | 3.3073 |
| | Diet C | -1.70000 | 1.11346 | .139 | -3.9932 | .5932 |
| Diet C | Diet A | 2.57500(*) | 1.18101 | .039 | .1427 | 5.0073 |
| | Diet B | 1.70000 | 1.11346 | .139 | -.5932 | 3.9932 |

*  The mean difference is significant at the .05 level.

There was a significant difference between Diet A and Diet C
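The standard errors in the LSD table can be recovered from the within-groups mean square (6.199) and the group sizes. This is a sketch under that reading of the SPSS output; `lsd_t` is a made-up helper name, and the critical t of 2.06 for 25 df is an assumption taken from a standard t table.

```python
import math

def lsd_t(mean_diff, ms_within, n1, n2):
    """Standard error and t for one LSD pairwise comparison."""
    se = math.sqrt(ms_within * (1 / n1 + 1 / n2))
    return se, mean_diff / se

MS_WITHIN = 6.199               # from the one-way ANOVA table
T_CRIT_05_DF25 = 2.06           # assumed tabled two-tailed value, df = 25

se_ac, t_ac = lsd_t(8.125 - 10.7, MS_WITHIN, 8, 10)   # Diet A vs Diet C
se_bc, t_bc = lsd_t(9.0 - 10.7, MS_WITHIN, 10, 10)    # Diet B vs Diet C
print(round(se_ac, 3), abs(t_ac) > T_CRIT_05_DF25)
print(round(se_bc, 3), abs(t_bc) > T_CRIT_05_DF25)
```

Only the A versus C comparison exceeds the critical value, which agrees with the asterisks in the table.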

[#3 b) Graph of the means]

#4, Two Way ANOVA, Type I Sum of Squares

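| Gender | Quantity | A | B | C | Total |
|---|---|---|---|---|---|
| M | Sum of x (Sx1j; SA1) | 28 | 57 | 38 | 123 |
| M | Sum of x^2 | 210 | 547 | 490 | 1247 |
| M | Mean | 7 | 9.5 | 12.667 | 9.462 |
| M | n (n1j; nA1) | 4 | 6 | 3 | 13 |
| F | Sum of x (Sx2j; SA2) | 37 | 28 | 76 | 141 |
| F | Sum of x^2 | 357 | 206 | 870 | 1433 |
| F | Mean | 9.25 | 7 | 10.857 | 9.4 |
| F | n (n2j; nA2) | 4 | 4 | 7 | 15 |
| Total | Sum of x (SBj; SXT) | 65 | 85 | 114 | 264 |
| Total | Sum of x^2 (SX2T) | 567 | 753 | 1360 | 2680 |
| Total | Mean | 8.125 | 8.5 | 11.4 | 9.429 |
| Total | n (nBj; N) | 8 | 10 | 10 | 28 |

SSbet = (28^2/4)+(57^2/6)+(38^2/3)+(37^2/4)+(28^2/4)+(76^2/7)-(264^2/28)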

= 196+541.5+481.333+342.25+196+825.143-2489.143 = 93.083

SSa = (123^2/13)+(141^2/15)-(264^2/28)

= 1163.769+1325.4 - 2489.143 = 0.026

SSb = (65^2/8)+(85^2/10)+(114^2/10)-(264^2/28)

= 528.125+722.5+1299.6-2489.143 = 61.082

SSa*b = SSbet - SSa - SSb = 93.083-0.026-61.082 = 31.975

SSt = 2680 - (264^2)/28 = 2680 - 2489.143 = 190.857

SSw = SSt - SSbet

= 190.857 - 93.083 = 97.774
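The sums of squares above can be recomputed from the cell sums and cell sizes in the table. The sketch below follows the same manual method (unweighted by any SPSS adjustment); the dictionary layout and the helper names `marginal` and `sum_sq_over_n` are illustrative choices, not part of the original work.

```python
# (sum of scores, n) for each Gender x Diet cell, taken from the table above
cells = {
    ("M", "A"): (28, 4), ("M", "B"): (57, 6), ("M", "C"): (38, 3),
    ("F", "A"): (37, 4), ("F", "B"): (28, 4), ("F", "C"): (76, 7),
}
sum_x2_total = 2680             # grand sum of squared scores (SX2T)

def marginal(cells, axis):
    """Collapse cells over one factor: axis 0 keeps gender, axis 1 keeps diet."""
    totals = {}
    for key, (s, n) in cells.items():
        prev_s, prev_n = totals.get(key[axis], (0, 0))
        totals[key[axis]] = (prev_s + s, prev_n + n)
    return totals

def sum_sq_over_n(groups):
    """Sum of (group total squared / group n) over all groups."""
    return sum(s ** 2 / n for s, n in groups.values())

grand_sum = sum(s for s, _ in cells.values())
n_total = sum(n for _, n in cells.values())
correction = grand_sum ** 2 / n_total       # (264^2)/28

ss_bet = sum_sq_over_n(cells) - correction
ss_a = sum_sq_over_n(marginal(cells, 0)) - correction
ss_b = sum_sq_over_n(marginal(cells, 1)) - correction
ss_ab = ss_bet - ss_a - ss_b
ss_t = sum_x2_total - correction
ss_w = ss_t - ss_bet

for name, value in [("SSbet", ss_bet), ("SSa", ss_a), ("SSb", ss_b),
                    ("SSa*b", ss_ab), ("SSt", ss_t), ("SSw", ss_w)]:
    print(f"{name} = {value:.3f}")
```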

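dfbet = K - 1 = 6 - 1 = 5

K = total number of groups (cells)

dfa = A - 1 = 2 - 1 = 1

A = number of levels of factor A (Gender)

dfb = B - 1 = 3 - 1 = 2

B = number of levels of factor B (Diet)

dfa*b = dfa * dfb = 1*2 = 2

dfw = N - K = 28 - 6 = 22

dft = N - 1 = 28-1 = 27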

MSbet = SSbet/dfbet = 93.083/5 = 18.617

MSa = SSa/dfa = 0.026/1 = 0.026

MSb = SSb/dfb = 61.082/2 = 30.541

MSa*b = SSa*b/dfa*b = 31.975/2 = 15.988

MSw = SSw/dfw = 97.774/22 = 4.444

Fa = MSa/MSw = 0.026/4.444 = 0.006 (df=1,22)

Fb = MSb/MSw = 30.541/4.444 = 6.872 (df=2,22)

Fa*b = MSa*b/MSw = 15.988/4.444 = 3.598 (df=2,22)
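The mean squares and F ratios can be checked the same way. In this sketch the sums of squares and df come from the manual calculation above, while the .05 critical values (4.30 for 1 and 22 df, 3.44 for 2 and 22 df) are assumptions read from a standard F table.

```python
# (SS, df) for each effect, from the manual calculation above
effects = {"A": (0.026, 1), "B": (61.082, 2), "A*B": (31.975, 2)}
ss_within, df_within = 97.774, 22

# Assumption: .05-level critical values from a standard F table,
# keyed by numerator df (denominator df = 22)
F_CRIT_05 = {1: 4.30, 2: 3.44}

ms_within = ss_within / df_within
f_ratios = {}
for name, (ss, df) in effects.items():
    f = (ss / df) / ms_within
    f_ratios[name] = f
    verdict = "significant" if f > F_CRIT_05[df] else "not significant"
    print(f"{name}: F({df},{df_within}) = {f:.3f}, {verdict} at .05")
```

Because the script divides unrounded mean squares, the last decimal of an F ratio can differ slightly from the hand calculation, which rounds each mean square first.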

Two-Way ANOVA Summary Table

| Source | SS | df | MS | F | p |
|---|---|---|---|---|---|
| Between Groups (Cells) | 93.08 | 5 | 18.62 | | |
| A | 0.026 | 1 | 0.026 | 0.006 (df=1,22) | F(1,22)=0.006, p>.05 |
| B | 61.08 | 2 | 30.54 | 6.872 (df=2,22) | F(2,22)=6.872, p<.01 |
| A x B | 31.98 | 2 | 15.99 | 3.598 (df=2,22) | F(2,22)=3.598, p<.05 |
| Within Groups | 97.77 | 22 | 4.444 | | |
| Total | 190.9 | 27 | | | |

F(1,22)=0.006, p>.05

There is no significant population mean difference among Gender

F(2,22)=6.872, p<.01

There is a significant population mean difference in the diets

F(2,22)=3.598, p<.05

There is a significant interaction between diets and gender

Tests of Between-Subjects Effects

Dependent Variable: Weight_Loss

| Source | Type I Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Corrected Model | 93.083(a) | 5 | 18.617 | 4.189 | .008 |
| Intercept | 2489.143 | 1 | 2489.143 | 560.080 | .000 |
| Gender | .026 | 1 | .026 | .006 | .939 |
| Diet | 65.377 | 2 | 32.689 | 7.355 | .004 |
| Gender * Diet | 27.680 | 2 | 13.840 | 3.114 | .064 |
| Error | 97.774 | 22 | 4.444 | | |
| Total | 2680.000 | 28 | | | |
| Corrected Total | 190.857 | 27 | | | |

a  R Squared = .488 (Adjusted R Squared = .371)

Because problem #4 has a different n in each group, the Type III sum of squares that SPSS computes does not match the manual sweep-matrix calculation taught in class.  I therefore used the Type I sum of squares (shown above), which more closely resembles the manual calculations.

The SPSS Type III sum of squares results for this question are included below.  They give different answers from the same data, and neither set of SPSS results exactly matches the manual computations:

Tests of Between-Subjects Effects

Dependent Variable: Weight_Loss

| Source | Type III Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Corrected Model | 93.083(a) | 5 | 18.617 | 4.189 | .008 |
| Intercept | 2273.558 | 1 | 2273.558 | 511.571 | .000 |
| Gender | 3.045 | 1 | 3.045 | .685 | .417 |
| Diet | 72.486 | 2 | 36.243 | 8.155 | .002 |
| Gender * Diet | 27.680 | 2 | 13.840 | 3.114 | .064 |
| Error | 97.774 | 22 | 4.444 | | |
| Total | 2680.000 | 28 | | | |
| Corrected Total | 190.857 | 27 | | | |

a  R Squared = .488 (Adjusted R Squared = .371)

F(1,22)=0.685, p>.05

There is no significant population mean difference among Gender

F(2,22)=8.155, p<.01

There is a significant population mean difference in the diets

F(2,22)=3.114, p>.05

There is no significant interaction between diets and gender

LSD TEST, POST HOC

Pairwise Comparisons

Dependent Variable: Weight_Loss

| (I) Gender | (J) Gender | Mean Difference (I-J) | Std. Error | Sig.(a) | 95% CI for Difference(a): Lower Bound | Upper Bound |
|---|---|---|---|---|---|---|
| Male | Female | .687 | .829 | .417 | -1.033 | 2.406 |
| Female | Male | -.687 | .829 | .417 | -2.406 | 1.033 |

Based on estimated marginal means

a  Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).

There is no significant mean difference between Gender

Pairwise Comparisons

Dependent Variable: Weight_Loss

| (I) Diet | (J) Diet | Mean Difference (I-J) | Std. Error | Sig.(a) | 95% CI for Difference(a): Lower Bound | Upper Bound |
|---|---|---|---|---|---|---|
| Diet_A | Diet_B | -.125 | 1.009 | .903 | -2.218 | 1.968 |
| | Diet_C | -3.637(*) | 1.041 | .002 | -5.797 | -1.477 |
| Diet_B | Diet_A | .125 | 1.009 | .903 | -1.968 | 2.218 |
| | Diet_C | -3.512(*) | .996 | .002 | -5.577 | -1.446 |
| Diet_C | Diet_A | 3.637(*) | 1.041 | .002 | 1.477 | 5.797 |
| | Diet_B | 3.512(*) | .996 | .002 | 1.446 | 5.577 |

Based on estimated marginal means

*  The mean difference is significant at the .05 level.

a  Adjustment for multiple comparisons: Least Significant Difference (equivalent to no adjustments).

There is no significant difference between Diet A and Diet B

There is a significant difference between Diet A and Diet C

There is a significant difference between Diet C and Diet B

Two-Way ANOVA Summary Table

| Source | SS | df | MS | F | p |
|---|---|---|---|---|---|
| Between Groups (Cells) | 456.8 | 3 | 152.3 | | |
| A | 311.1 | 1 | 311.1 | 34.541 (df=1,10) | F(1,10)=34.541, p<.01 |
| B | 73.14 | 1 | 73.14 | 8.12 (df=1,10) | F(1,10)=8.12, p<.05 |
| A x B | 72.49 | 1 | 72.49 | 8.047 (df=1,10) | F(1,10)=8.047, p<.05 |
| Within Groups | 90.08 | 10 | 9.008 | | |
| Total | 546.9 | 13 | | | |

Dependent Variable: Time

| Source | Type I Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Corrected Model | 456.774(a) | 3 | 152.258 | 16.902 | .000 |
| Intercept | 2263.143 | 1 | 2263.143 | 251.228 | .000 |
| Gender | 73.143 | 1 | 73.143 | 8.119 | .017 |
| Dress | 275.149 | 1 | 275.149 | 30.544 | .000 |
| Gender * Dress | 108.482 | 1 | 108.482 | 12.042 | .006 |
| Error | 90.083 | 10 | 9.008 | | |
| Total | 2810.000 | 14 | | | |
| Corrected Total | 546.857 | 13 | | | |

a  R Squared = .835 (Adjusted R Squared = .786)