PolarSPARC

Introduction to Statistics - Part 7


Bhaskar S 09/17/2021


In Part 6 of the series, we explored Hypothesis Testing for proportions, the two independent samples z-test, and the two independent samples t-test.

In this part of the series, we will continue our journey with Hypothesis Testing, covering the two dependent samples t-test and the two independent samples z-test for the difference of proportions.

Dependent Samples - Two-Means t-Test

Many statistical applications collect data samples from either the same population or from two populations that have a natural pairing relationship. These samples are known as paired or matched samples. The use of data pairs occurs naturally in situations where the samples are measured both before and after some event. For example, one may want to make inferences about the mean weight loss for members of a health club after they have gone through a weight loss program for a certain period of time.

The two-means paired t-test method is used to compare the means of two dependent samples and determine whether there is a difference between these means. The following are some of the requirements for performing the two-means paired t-test for the dependent samples:

When the above conditions are satisfied, the following are the steps to perform the two-means dependent samples t-test:

Let us now solve a problem for the two-means paired t-test.


Example-1 A doctor wishes to see if a patient's cholesterol level will change after prescribing a certain medication. Six subjects are pretested and their readings are 210, 235, 208, 190, 172, and 244 respectively. They are prescribed the medication for a 6-week period, after which they are tested again and their readings are 190, 170, 210, 188, 173, and 228 respectively. Can it be concluded that the cholesterol level has changed at \(\alpha = 0.10\)?

Null hypothesis (the cholesterol medication has no effect): \(H_0: \mu_d = 0\)

Alternate hypothesis (the cholesterol medication has an effect): \(H_a: \mu_d \ne 0\)

Compute the differences (d = Before - After) between the paired entries as follows:

Before After d
210 190 20
235 170 65
208 210 -2
190 188 2
172 173 -1
244 228 16

Compute the mean of the paired differences \(\bar{d}\) = \(\Large{\frac{\Sigma{d}}{n}}\) = \(\Large{\frac{(20 + 65 - 2 + 2 - 1 + 16)}{6}}\) = \(\Large{\frac{100}{6}}\) \(\approx\) 16.7

Compute the standard deviation of the paired differences \(s_d = \Large{\sqrt{\frac{\Sigma{(d - \bar{d})^2}}{n - 1}}}\) using the values from the table below:

\(d - \bar{d}\) \((d - \bar{d})^2\)
20 - 16.7 = 3.3 \((3.3)^2\) = 10.89
65 - 16.7 = 48.3 \((48.3)^2\) = 2332.89
-2 - 16.7 = -18.7 \((-18.7)^2\) = 349.69
2 - 16.7 = -14.7 \((-14.7)^2\) = 216.09
-1 - 16.7 = -17.7 \((-17.7)^2\) = 313.29
16 - 16.7 = -0.7 \((-0.7)^2\) = 0.49

Substituting the squared deviations, \(s_d = \Large{\sqrt{\frac{\Sigma{(d - \bar{d})^2}}{n - 1}}}\) = \(\Large{\sqrt{\frac{(10.89 + 2332.89 + 349.69 + 216.09 + 313.29 + 0.49)}{(6 - 1)}}}\) = \(\Large{\sqrt{\frac{3223.34}{5}}}\) \(\approx\) 25.4
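The hand computation of the paired differences, their mean, and their sample standard deviation can be cross-checked with a few lines of Python. This is only a minimal sketch using the standard library; the variable names (before, after, d, d_bar, s_d) are chosen here for illustration and are not part of the original article.

```python
import statistics

# Cholesterol readings from Example-1 (before and after the medication)
before = [210, 235, 208, 190, 172, 244]
after = [190, 170, 210, 188, 173, 228]

# Paired differences d = before - after, one value per subject
d = [b - a for b, a in zip(before, after)]

# Mean of the paired differences
d_bar = statistics.mean(d)

# Sample standard deviation of the differences (divides by n - 1)
s_d = statistics.stdev(d)

print(d)                 # [20, 65, -2, 2, -1, 16]
print(round(d_bar, 1))   # 16.7
print(round(s_d, 1))     # 25.4
```

Note that statistics.stdev uses the n - 1 denominator, which matches the formula for \(s_d\) used in this example.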

  • Compute the test statistic \(t = \Large{\frac{\bar{d} - \mu_d}{s_d/\sqrt{n}}}\) = \(\Large{\frac{16.7 - 0}{25.4/\sqrt{6}}}\) \(\approx\) 1.610

  • In this situation, the hypothesis test is deciding if there is a difference in the mean. Hence, this is a two-tailed test.

    The significance level is \(\alpha = 0.10\) and the degrees of freedom are d.f. = n - 1 = 6 - 1 = 5.

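    For a two-tailed test at \(\alpha = 0.10\) (that is, 0.05 in each tail), the critical value from the t-table for d.f. = 5 is \(t_c \approx 2.015\), so the rejection regions are \(t \lt -2.015\) and \(t \gt 2.015\).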

    Since the computed standardized test statistic (t) lies between \(-t_c\) and \(t_c\), we FAIL to reject the null hypothesis \(H_0\).

    Therefore, at the 0.10 significance level, the sample data does not provide enough evidence to support the claim that the prescribed medication changes a patient's cholesterol level.
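The whole of Example-1 can be sketched in Python as a rough check of the test statistic and the decision. The helper name paired_t_statistic is hypothetical (introduced here for illustration), and the critical value 2.015 is simply copied from the t-table lookup above rather than computed, since the Python standard library has no t-distribution.

```python
import math
import statistics


def paired_t_statistic(before, after, mu_d=0.0):
    """Return (t, degrees of freedom) for a two-means paired t-test."""
    d = [b - a for b, a in zip(before, after)]   # paired differences
    n = len(d)
    d_bar = statistics.mean(d)                   # mean of the differences
    s_d = statistics.stdev(d)                    # sample std dev (n - 1)
    t = (d_bar - mu_d) / (s_d / math.sqrt(n))    # standardized test statistic
    return t, n - 1


before = [210, 235, 208, 190, 172, 244]
after = [190, 170, 210, 188, 173, 228]

t_stat, df = paired_t_statistic(before, after)

# Two-tailed critical value for alpha = 0.10 and d.f. = 5 (from the t-table)
t_crit = 2.015

# Two-tailed test: reject H0 only if |t| exceeds the critical value
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.3f}, d.f. = {df}, reject H0: {reject_h0}")
```

With unrounded intermediate values the statistic comes out at about 1.608 rather than 1.610; the small gap is due to rounding \(\bar{d}\) and \(s_d\) in the hand calculation, and the decision (fail to reject \(H_0\)) is the same.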


    Hypothesis Testing - Two Population Proportions

    Often, we conduct hypothesis tests to determine differences between two population proportions. In the following section(s), we will use the following notation:

    Independent Samples - Two-Proportion z-Test

    A two-sample z-test is used to test the difference between two population proportions using two independent samples. The following are some of the requirements for performing the two-proportion z-test:

    When the above conditions are satisfied, the following are the steps to perform the two-proportion z-test:

    Let us now solve a problem for the two-proportion z-test.


    Example-2 A researcher wants to estimate the difference between the percentages of users of two toothpastes who will never switch to another toothpaste. In a sample of 500 users of Toothpaste A, 100 said that they will never switch to another toothpaste. In another sample of 400 users of Toothpaste B, 68 said that they will never switch to another toothpaste. At a 1% significance level, can we conclude that the proportion of users of Toothpaste A who will never switch to another toothpaste is greater than the proportion of users of Toothpaste B who will never switch to another toothpaste?

    Given facts: \(n_1 = 500\), \(n_2 = 400\), \(x_1 = 100\), and \(x_2 = 68\).

    Sample proportion for Toothpaste A: \(\hat{p_1} = \Large{\frac{x_1}{n_1}}\) = \(\Large{\frac{100}{500}}\) = 0.20.

    Sample proportion for Toothpaste B: \(\hat{p_2} = \Large{\frac{x_2}{n_2}}\) = \(\Large{\frac{68}{400}}\) = 0.17.

    Null hypothesis: \(H_0: p_1 = p_2\) OR \(p_1 - p_2 = 0\).

    Alternate hypothesis: \(H_a: p_1 \gt p_2\) OR \(p_1 - p_2 \gt 0\).

    In this situation, the hypothesis test is deciding if the proportion for Toothpaste A is greater than the proportion for Toothpaste B. Hence, this is a right-tailed test.

    Given the significance level \(\alpha = 0.01\).

    Compute the weighted (pooled) estimate of the population proportions \(\bar{p} = \Large{\frac{x_1 + x_2}{n_1 + n_2}}\) = \(\Large{\frac{100 + 68}{500 + 400}}\) \(\approx\) 0.187. Also, \(\bar{q} = 1 - \bar{p}\) = 1 - 0.187 = 0.813.

    Compute the test statistic z = \(\Large{\frac{(\hat{p_1} - \hat{p_2}) - (p_1 - p_2)}{\sqrt{\bar{p}\bar{q}(1/n_1+1/n_2)}}}\) = \(\Large{\frac{(0.20 - 0.17) - (0)}{\sqrt{0.187 \times 0.813 \times (1/500+1/400)}}}\) \(\approx 1.15\).

    For \(\alpha = 0.01\), the critical value from the z-table is \(z_c \approx 2.33\).

    Since the computed standardized test statistic (z) is below the critical value \(z_c\), we FAIL to reject the null hypothesis \(H_0\).

    Therefore, at the 1% significance level, the sample data does not provide sufficient evidence to indicate that the proportion of users of Toothpaste A who will never switch to another toothpaste is greater than the proportion of users of Toothpaste B who will never switch to another toothpaste.
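The steps of Example-2 can be sketched in Python using only the standard library. The function name two_proportion_z is hypothetical (introduced here for illustration). The critical value is computed with statistics.NormalDist instead of being read from a z-table, and the p-value is an extra quantity not used in the worked example above; it is shown only as an alternative way to reach the same decision.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Return the z statistic for H0: p1 - p2 = 0 using the pooled proportion."""
    p_hat1 = x1 / n1                      # sample proportion, group 1
    p_hat2 = x2 / n2                      # sample proportion, group 2
    p_bar = (x1 + x2) / (n1 + n2)         # weighted (pooled) estimate
    q_bar = 1 - p_bar
    std_err = math.sqrt(p_bar * q_bar * (1 / n1 + 1 / n2))
    return (p_hat1 - p_hat2) / std_err


# Toothpaste A: 100 of 500; Toothpaste B: 68 of 400
z = two_proportion_z(100, 500, 68, 400)

alpha = 0.01
std_normal = NormalDist()                 # standard normal: mean 0, std dev 1

# Right-tailed test: critical value leaves alpha in the upper tail
z_crit = std_normal.inv_cdf(1 - alpha)

# Right-tailed p-value (an addition, not part of the worked example)
p_value = 1 - std_normal.cdf(z)

reject_h0 = z > z_crit

print(f"z = {z:.2f}, z_c = {z_crit:.2f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

Because the test is right-tailed, only large positive values of z count as evidence against \(H_0\); here z is about 1.15, well below the critical value of about 2.33.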


    References

    Introduction to Statistics - Part 6

    Introduction to Statistics - Part 5

    Introduction to Statistics - Part 4

    Introduction to Statistics - Part 3

    Introduction to Statistics - Part 2

    Introduction to Statistics - Part 1

    Introduction to Probability

    Introduction to Permutation & Combinations


    © PolarSPARC