## Hypothesis Testing

Today I decided to write about **hypothesis testing**, which is actually one of the most fundamental concepts in statistics. I learned it in my freshman year, but I have so often encountered hypothesis testing again in other classes, especially in my econometrics class. =) Now I am writing it from my memories (since I no longer have my statistics book with me) as well as doing a little research here and there…


In its simplest form, hypothesis testing means using a **sample** to decide whether a claim about a **parameter** of a population is plausible.

Since we can’t really collect data from the whole population, we often randomly collect a sample with at least 30 observations. Of course, the larger the sample is, the better our estimate will be. And don’t forget the *assumption* that our population is **normally distributed** (with a large enough sample, the Central Limit Theorem makes this assumption less critical).

First of all, we need to set up our **null hypothesis (H_{0})** and **alternative hypothesis (H_{1})**.

**H_{1} NEVER contains the equal sign.** It can contain the “not equal” sign, the “greater than” sign, or the “smaller than” sign. Normally H_{1} is what we are interested in testing.

**H_{0} ALWAYS contains the equal sign.** We often hope to reject H_{0}.

*For example*: we randomly collect the overall GPA of 1,000 junior business students and get a sample mean of 3.32. There is a claim that the average GPA of junior business students across the US is 3.48. Conduct a hypothesis test to see whether or not the claim is valid.

(I just made up the example)

*Our sample*: 1,000 junior business students

*Our population*: junior business students across the US

H_{0}: µ = 3.48

H_{1}: µ ≠ 3.48 (Here we use the “*not equal*” sign, since we are interested in whether the population GPA is *different* from the claimed 3.48. If we were interested in whether the population GPA is *greater* than 3.48, we would use the “>” sign; for *smaller*, the “<” sign.)

The **“not equal” sign indicates a two-tailed test**, while the **“smaller than” sign indicates a left-tailed test**, and the **“greater than” sign indicates a right-tailed test**.

α = .05. (α is called the **significance level**: the probability of rejecting the null hypothesis when it is in fact true, also known as a “Type I error.” *We would like α to be as small as possible*, and the most common choice is .05.) We then look up the t-table to find our critical value. Since our example is **a two-tailed test**, we actually look up **the critical value at α/2** = .025 in each tail.
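A quick sketch of this lookup in Python (standard library only; with n = 1,000 the t distribution is practically identical to the standard normal, so `NormalDist` stands in for a t-table here):

```python
from statistics import NormalDist

alpha = 0.05
# Two-tailed test: put alpha/2 = .025 in each tail and look up the
# corresponding critical value. With n = 1,000 the t distribution is
# essentially the standard normal, so NormalDist approximates a t-table.
critical = NormalDist().inv_cdf(1 - alpha / 2)
print(round(critical, 2))  # 1.96
```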

**Test statistic**: t = ( x-bar – µ ) / SE, where

x-bar: sample mean

μ: population mean

SE (**standard error**): the standard deviation of the *sample mean* (not of the sample itself). First we compute the **sample standard deviation**, s = sqrt [ Σ ( x_{i} – x-bar )^{2} / ( n – 1 ) ], from our data set: plug in each data point for x_{i}, subtract x-bar (the sample mean), square the result (we do that for every single data point), add all of them together (that’s what the Σ means), divide by n – 1 (n is the number of observations), and finally take the square root. Then **SE = s/sqrt(n)** (sqrt means square root).

**If we know the standard deviation of the population** (normally we don’t), we can use **SE = σ/sqrt(n)** instead (σ stands for the population standard deviation), and the test statistic follows the standard normal (z) distribution rather than the t distribution.

Okay, now we arrive at our **t-statistic**.
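Putting the pieces together in code (a hypothetical eight-observation GPA sample, purely for illustration, not the 1,000 observations from the example; the variable names mirror the symbols above):

```python
import math

# Hypothetical small GPA sample (illustrative only, not the 1,000
# observations from the example).
data = [3.1, 3.5, 3.3, 3.4, 3.2, 3.6, 3.0, 3.4]
n = len(data)
x_bar = sum(data) / n                      # sample mean

# Sample standard deviation: s = sqrt( sum of (x_i - x_bar)^2 / (n - 1) )
s = math.sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))

se = s / math.sqrt(n)                      # standard error of the mean
mu_0 = 3.48                                # value claimed under H0
t_stat = (x_bar - mu_0) / se               # the t-statistic
```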

There are **2 ways** to test the null hypothesis: we can either **compare the test statistic with the critical value**, or **compare the p-value with our α** (the *p-value* can be understood as the smallest α at which our t-statistic would lead to rejection).

–Method 1: **Compare t-statistic with the critical value**

Here is one useful website http://www.itl.nist.gov/div898/handbook/eda/section3/eda3672.htm

In short, we would **reject** the null hypothesis if our **t-statistic** in its **absolute value** is **GREATER than the critical value**. **Otherwise, **we **FAIL to reject the null hypothesis**.
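For instance (both numbers here are hypothetical; 1.96 is the two-tailed critical value at α = .05 for a large sample):

```python
t_stat = -2.33     # hypothetical t-statistic from our sample
critical = 1.96    # two-tailed critical value at alpha = .05, large n

# Reject H0 when the absolute value of the t-statistic exceeds the
# critical value; otherwise fail to reject.
reject = abs(t_stat) > critical
print(reject)  # True
```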

–Method 2: **Compare our p-value with α.**

If the **p-value** is **SMALLER than alpha**, we **reject the null hypothesis**. Most computer packages compute the p-value for us, assuming a two-sided test. If we really want a one-sided alternative, we just divide the two-sided p-value by 2 (provided the sample mean lies on the side that H_{1} predicts).

If the **p-value** is **GREATER than alpha**, we **FAIL to reject the null hypothesis**.
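A sketch of the p-value comparison (standard library only; `t_stat` is a hypothetical value, and the normal approximation is fine for a large sample):

```python
from statistics import NormalDist

t_stat = -2.33   # hypothetical t-statistic
alpha = 0.05

# Two-sided p-value: probability, under H0, of a statistic at least
# this extreme in either tail (normal approximation for large n).
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
reject = p_value < alpha
```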

***Note**: We would **NEVER **say that **we ACCEPT the null hypothesis**. We only say that **we DO NOT reject **(or **FAIL to reject**) **the null hypothesis**.

After that, there comes our conclusion. =)

Back to our example,

if we **reject the null hypothesis**, we will conclude that *“the average overall GPA of junior business students is statistically different from 3.48 at the .05 significance level,”* or, put another way, *“we have **enough evidence** to conclude that the average overall GPA of junior business students is **statistically different** from 3.48 at the .05 significance level.”*

if we **FAIL to reject the null hypothesis**, we will conclude that *“the average overall GPA of junior business students is not statistically different from 3.48 at the .05 significance level,”* or, put another way, *“we **do not have enough evidence** to conclude that the average overall GPA of junior business students is statistically different from 3.48 at the .05 significance level.”*

There are different types of hypothesis tests, but the steps are pretty much the same (the main differences are H_{0} and H_{1}, and the formula for the test statistic).

Step 1: set up the null hypothesis and alternative hypothesis

Step 2: find critical value from alpha (if not given, assume alpha to be 0.05)

Step 3: calculate the test statistic (in our example, it is the one-sample t-test for means)

Step 4: compare the test statistic to critical value, or p-value to alpha

Step 5: conclusion
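The five steps above can be sketched end-to-end with the GPA example (the sample standard deviation s = 0.45 is a made-up value, since the example never gives one; standard library only, with the normal approximation for the large sample):

```python
from math import sqrt
from statistics import NormalDist

# Step 1: H0: mu = 3.48  vs  H1: mu != 3.48 (two-tailed)
mu_0, x_bar, n, alpha = 3.48, 3.32, 1000, 0.05
s = 0.45                      # ASSUMED sample standard deviation

# Step 2: critical value at alpha/2 in each tail (n large, so z ~ t)
critical = NormalDist().inv_cdf(1 - alpha / 2)

# Step 3: standard error and test statistic
se = s / sqrt(n)
t_stat = (x_bar - mu_0) / se

# Step 4: either comparison gives the same decision
p_value = 2 * (1 - NormalDist().cdf(abs(t_stat)))
reject = abs(t_stat) > critical          # equivalently: p_value < alpha

# Step 5: conclusion
if reject:
    print("Reject H0: mean GPA is statistically different from 3.48")
else:
    print("Fail to reject H0")
```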

~~~~~~~

Here is a vocabulary list in hypothesis testing that I have found to be useful

**Vocabulary**

**Null hypothesis (H_{0}) –** A statement that declares the observed difference is due to “chance.” It is the hypothesis the researcher hopes to reject.

**Alternative hypothesis (H_{1}) –** The opposite of the null hypothesis; the hypothesis the researcher hopes to bolster.

**Alpha (α) –** The probability the researcher is willing to take of falsely rejecting a true null hypothesis.

**Test statistic –** A statistic used to test the null hypothesis.

**P-value –** A probability statement that answers the question “If the null hypothesis were true, what is the probability of observing the current data, or data more extreme than the current data?” It is the probability of the data conditional on the truth of H_{0}. It is NOT the probability that the null hypothesis is true.

**Type I error –** a rejection of a true null hypothesis; a “false alarm.”

**Type II error –** a retention of a false null hypothesis; a “failure to sound the alarm.”

**Confidence (1 – α) –** the complement of alpha.

**Beta (β) –** the probability of a Type II error; the probability of retaining a false null hypothesis.

**Power (1 – β) –** the complement of β; the probability of avoiding a Type II error; the probability of rejecting a false null hypothesis.

(Source: http://www.sjsu.edu/faculty/gerstman/StatPrimer/hyp-test.pdf)

A list of statistics formulas http://stattrek.com/Lesson1/Formulas.aspx?Tutorial=Stat