
In its simplest form, hypothesis testing means we want to use our sample to **test a claim** about a parameter (of a population).

Since we can’t realistically collect data from the whole population, we often randomly collect a sample with at least 30 observations. Of course, the larger the sample is, the better our inference will be. And don’t forget the *assumption* that our population is **normally distributed** (or that the sample is large enough for the Central Limit Theorem to kick in).

First of all, we need to set up our **null hypothesis (H_{0})** and **alternative hypothesis (H_{1})**.

**H_{1} NEVER contains the equal sign.** It can contain the “not equal” sign, the “greater than” sign, or the “smaller than” sign. Normally H_{1} is what we are interested in testing.

**H_{0} ALWAYS contains the equal sign.** We often hope to reject H_{0}.

For example: we randomly collect the overall GPAs of 1,000 junior business students, and the sample mean is 3.32. There is a claim that the average GPA of junior business students across the US is 3.48. Conduct a hypothesis test to see whether or not the claim is valid.

(I just made up this example.)

*Our sample*: 1,000 junior business students

*Our population*: junior business students across the US

H_{0}: µ = 3.48

H_{1}: µ ≠ 3.48 (Here we use the “*not equal*” sign, since we are interested in whether the population mean GPA is different from the claim, 3.48. If we were interested in whether the population mean GPA is greater than 3.48, we would use the “>” sign, and vice versa.)

The **“not equal” sign indicates a two-tailed test**, while the **“smaller than” sign indicates a left-tailed test**, and the **“greater than” sign indicates a right-tailed test**.

α = .05 (α is called the **significance level**, which represents the probability that we reject the null hypothesis when it is actually true, the so-called “Type I error”. *We would like our α to be small*, and the most common choice is .05). Then we look up the t-table in order to find our critical value. In our example, we are conducting **a two-tailed test**, so we actually have to look up **the critical value at α/2** = .025.

**Test statistic**: t = (x-bar – μ) / SE, where

x-bar: sample mean

μ: population mean

SE (standard error) of the sample mean is **SE = s/sqrt(n)**, where s is the **standard deviation of the sample**: s = sqrt [ Σ ( x_{i} – x-bar )^{2} / ( n – 1 ) ]. To compute s from our data set, we plug in every data point for x_{i}, subtract x-bar (the sample mean), and square the result (we do that for every single data point); then we add all of them together (that’s what the Σ means), divide that result by n – 1 (n is the number of observations), and finally take the square root. Dividing s by sqrt(n) gives us our SE.

**If we know the standard deviation of the population** (normally we don’t), we can use **SE = σ/sqrt(n)** instead (σ stands for the population standard deviation, sqrt means square root), and a z-test rather than a t-test.
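As a quick sketch with made-up numbers, the s formula above matches Python’s built-in sample standard deviation (the data set here is hypothetical):

```python
# Made-up mini data set: check the s formula against statistics.stdev,
# then compute the standard error of the mean, SE = s / sqrt(n).
from math import sqrt
from statistics import stdev

data = [3.1, 3.4, 3.5, 3.2, 3.8]
n = len(data)
xbar = sum(data) / n

# s = sqrt( sum((x_i - xbar)^2) / (n - 1) )
s_manual = sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))

se = s_manual / sqrt(n)
print(round(s_manual, 3), round(se, 3))
```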

Okay, now we arrive at our **t-statistic**.

There are **2 ways** to test the null hypothesis, we can either **compare the test statistic with the critical value**, or **compare the p-value with our α** (*p-value* can be understood as the *α level of the t-statistic*).

–Method 1: **Compare t-statistic with the critical value**

Here is one useful website http://www.itl.nist.gov/div898/handbook/eda/section3/eda3672.htm

In short, we would **reject** the null hypothesis if the **absolute value** of our **t-statistic** is **GREATER than the critical value**. **Otherwise**, we **FAIL to reject the null hypothesis**.

–Method 2: **Compare our p-value with α.**

If **p-value** is **SMALLER than alpha**, we would **reject the null hypothesis**. Most computer packages will compute the p-value for us, assuming a two-sided test. If we really want a one-sided alternative, we just need to divide the two-sided p-value by 2.

If **p-value** is **GREATER than alpha**, we would **FAIL to reject the null hypothesis**.

***Note**: We would **NEVER **say that **we ACCEPT the null hypothesis**. We only say that **we DO NOT reject **(or **FAIL to reject**) **the null hypothesis**.

After that, there comes our conclusion. =)

Back to our example,

if we **reject the null hypothesis**, we will conclude that *“the average overall GPA of junior business students is statistically different from 3.48 at the .05 significance level,”*

or

if we **FAIL to reject the null hypothesis**, we will conclude that *“the average overall GPA of junior business students is not statistically different from 3.48 at the .05 significance level.”*


There are different types of hypothesis tests, but the steps are pretty much the same (the main differences are H_{0} and H_{1}, and the formulas for the test statistic).

Step 1: set up the null hypothesis and alternative hypothesis

Step 2: find critical value from alpha (if not given, assume alpha to be 0.05)

Step 3: calculate the test statistic (in our example, it is the one-sample t-test for means)

Step 4: compare the test statistic to critical value, or p-value to alpha

Step 5: conclusion
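The five steps can be sketched in Python for our (made-up) GPA example. Note the sample standard deviation here (s = 0.45) is an assumed value, since the example only gives the sample mean; with n = 1,000, the t critical value is essentially the normal one, 1.96.

```python
# Hypothetical one-sample t-test for the GPA example above.
# s = 0.45 is an assumed sample standard deviation (the example gives only x-bar).
from math import sqrt
from statistics import NormalDist

n, xbar, mu0 = 1000, 3.32, 3.48
s = 0.45                      # assumed sample standard deviation
alpha = 0.05

se = s / sqrt(n)              # standard error of the mean
t = (xbar - mu0) / se         # test statistic

# With n = 1000 (df = 999), the t critical value ~ z critical value, 1.96
critical = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - NormalDist().cdf(abs(t)))   # two-sided p-value

reject = abs(t) > critical    # Method 1: compare |t| with the critical value
print(round(t, 2), reject)    # t is about -11.24, so we reject H0
```

Method 2 gives the same answer here: the p-value is far below α = .05.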

~~~~~~~

Here is a vocabulary list in hypothesis testing that I have found to be useful

(Source: http://www.sjsu.edu/faculty/gerstman/StatPrimer/hyp-test.pdf)

A list of statistics formulas http://stattrek.com/Lesson1/Formulas.aspx?Tutorial=Stat

The **F-test** is to test whether or not **a group of variables** has an effect on y, meaning we are to test if these variables are **jointly significant**.

Looking at the **t-ratios** for *“bavg,” “hrunsyr,”* and *“rbisyr,”* we can see that **none of them is individually statistically different from 0.** However, in this case, we are not interested in their individual significance; we are interested in whether they are **jointly** significant.

SSR_{UR }= 183.186327 (SSR of Unrestricted Model)

SSR_{R}=198.311477 (SSR of Restricted Model)

SSR stands for **Sum of Squares of Residuals**. *Residual is the difference between the actual y and the predicted y from the model.* Therefore, the smaller SSR is, the better the model is.

From the data above, we can see that *after we drop the group of variables* (*“bavg,” “hrunsyr,”* and *“rbisyr”*), *SSR increases from about 183 to about 198*, an increase of roughly 8.3%. If that increase is large enough relative to chance (which the F test below checks), we conclude that we should keep those 3 variables.

**q**: number of restrictions (the number of independent variables dropped). In this case, q = 3.

**k**: number of independent variables

q: numerator degrees of freedom

n-k-1: denominator degrees of freedom
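Plugging the numbers above into the F formula, F = [(SSR_{R} – SSR_{UR})/q] / [SSR_{UR}/(n – k – 1)], reproduces the statistic below. Note that k = 5 is an assumption here (the unrestricted model in this example appears to have five regressors), which gives n – k – 1 = 353 – 5 – 1 = 347:

```python
# F statistic from the restricted/unrestricted SSRs above.
# k = 5 regressors in the unrestricted model is an assumption.
ssr_ur = 183.186327   # SSR of the unrestricted model
ssr_r = 198.311477    # SSR of the restricted model (bavg, hrunsyr, rbisyr dropped)
n, k, q = 353, 5, 3

f_stat = ((ssr_r - ssr_ur) / q) / (ssr_ur / (n - k - 1))
print(round(f_stat, 2))  # about 9.55, matching the STATA output below
```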

In order to find Critical F, we can look up the F table. I also have found a convenient website for critical-F value http://www.danielsoper.com/statcalc/calc04.aspx.

We can calculate F in STATA by using the command

**test bavg hrunsyr rbisyr**

The output reports our F statistic: **9.55**.

**NOTE**: When we calculate the F test, **we need to make sure that our unrestricted and restricted models are estimated from the same set of observations**. We can check by looking at **the number of observations** in each model and making sure they are the same. Sometimes there are *missing values* in our data, so *there may be fewer observations in the unrestricted model* (since it uses more variables) *than in the restricted model* (which uses fewer variables).

In our example, our observations are **353** for both unrestricted and restricted models.

**If the number of observations differs, we have to re-estimate the restricted model** (the model after dropping some variables) using the same observations used to estimate the unrestricted model (the original model).

Back to our example, **if our observations were different in the two models**, we would re-estimate the restricted model using STATA’s `if` qualifier to keep only the observations where the dropped variables are non-missing.

**if bavg~=.** means if bavg is not missing,

**if bavg~=. & hrunsyr ~=. & rbisyr~=.** means if bavg, hrunsyr, and rbisyr are **ALL** not missing (notice the “**&**” sign). That means if the value of any one of these variables is missing, STATA will not use that observation when running the regression.

========================================

There is one special case of the F-test in which we test the **overall significance** of a model. In other words, we want to know **if the regression model is useful at all**, or whether we would need to throw it out and consider other variables. It is rare that we would have to throw out the whole model, though.

**~~~~~~~~~~~000~~~~~~~~~~~**

This was such a painful and lengthy post. It has so many formulas that I had to do it in Microsoft Word and then convert it into several pictures…. I hope I made sense, though. =)

Let’s just keep in mind that **the F test is for joint significance**. That means we want to see whether or not **a group of variables should be kept in the model. **

Also, unlike the t distribution (bell-shaped curve), the **F distribution is skewed to the right**, with a minimum value of 0. Therefore, we would **reject the null hypothesis** if the **F-statistic** (from the formula) **is greater than critical-F** (from the F table).

**H_{0}: β_{1} = β_{2}**

Or **H_{0}: β_{1} = 10β_{2}**

Our hypothesis can be pretty much anything, as long as β_{1} and β_{2} have a linear relationship.

**Note** that we are to test whether or not **the effects of the two x variables** **on y** have **a linear relationship**, **NOT** the **linear relationship** between **the two x variables** **on each other** (that is the case of *perfect multicollinearity*).

For example, we are interested in testing

**H_{0}: β_{1} = β_{2}**

**H_{1}: β_{1} ≠ β_{2}**

**FIRST METHOD**

Set **θ = β_{1} – β_{2}**, then we will have

**H_{0}: θ = 0**

**H_{1}: θ ≠ 0**

–**Set α** (if not given, assume it to be .05)

–**Find critical value**: df=n-k-1 (k is the number of x variables), then use the t-table to find critical value.

–**Calculate test statistic**:
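The test statistic here is the usual t-ratio for θ-hat. One standard way to write it (the covariance term comes from the estimated variance-covariance matrix of the coefficients, e.g. via `estat vce` in STATA):

```latex
t^{0} = \frac{\hat{\theta}}{\operatorname{se}(\hat{\theta})},
\qquad
\operatorname{se}(\hat{\theta})
  = \sqrt{\operatorname{se}(\hat{\beta}_{1})^{2}
        + \operatorname{se}(\hat{\beta}_{2})^{2}
        - 2\,\widehat{\operatorname{Cov}}(\hat{\beta}_{1},\hat{\beta}_{2})}
```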

*(That output above was an example from my class notes.)*

Therefore,

–**Decision**: to reject H_{0 }or not (by comparing t^{0}_{ }with the critical value)

–**Conclusion**:

If we **reject H_{0}**, we conclude that the effect of x_{1} on y is statistically different from the effect of x_{2} on y at our significance level.

If we **fail to reject H_{0}**, we conclude that the effect of x_{1} on y is not statistically different from the effect of x_{2} on y.

=========

Crazy enough, huh? There is another method that may look easier:

**SECOND METHOD**

-Set **θ = β_{1} – β_{2}**, then **β_{1} = θ + β_{2}**

-Substitute **β_{1}** in our original model:

y = β_{0} + **β_{1}**x_{1} + β_{2}x_{2} + β_{3}x_{3}

y = β_{0} + **(θ + β_{2})**x_{1} + β_{2}x_{2} + β_{3}x_{3}

y = β_{0} + θx_{1} + **β_{2}(x_{1} + x_{2})** + β_{3}x_{3}

Now our 3 variables in the model are **x_{1}**, **(x_{1} + x_{2})**, and **x_{3}**, and **the coefficient on x_{1} is θ**.

-Construct a new variable that is the sum of x_{1 }and x_{2 }(in STATA) by using the command

gen totx12 = x1 + x2

*“totx12” is just the name of the new variable.*

-Run the regression of y on x_{1}, totx12, and x_{3}

-Test:

**H_{0}: θ = 0**

**H_{1}: θ ≠ 0**

Now we can look at the t-ratio or p-value of the coefficient on x_{1} (**the coefficient on x_{1} is now θ**) and then make our decision whether or not to reject H_{0}.
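The reparameterization above can be sketched with made-up data (everything here is hypothetical; numpy’s least-squares solver stands in for the regression). Since the data is generated with β_{1} = β_{2}, the coefficient on x_{1} in the reparameterized regression, which is θ, should come out as essentially 0:

```python
# Sketch (hypothetical data): test H0: beta1 = beta2 by reparameterizing
# the regression, mirroring the STATA steps above.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x1, x2, x3 = rng.normal(size=(3, n))
# True model has beta1 = beta2 = 1.5, so theta = beta1 - beta2 = 0.
y = 2.0 + 1.5 * x1 + 1.5 * x2 + 0.8 * x3

totx12 = x1 + x2                      # same idea as: gen totx12 = x1 + x2
X = np.column_stack([np.ones(n), x1, totx12, x3])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

theta_hat = coef[1]                   # coefficient on x1 is now theta
print(theta_hat)                      # essentially 0, since beta1 = beta2
```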

**-Statistical Significance:** We will look at the **t-tests** or **p-values** to determine whether or not to reject the null hypothesis (which says that the parameter is equal to 0) at a certain level of significance.

+ Statistical significance can be driven by a large estimate or a small standard error (which may result from a large sample size or from more variation in the x variables)

+ A lack of statistical significance may be driven by a small sample size or by multicollinearity (meaning that there are correlations between the x variables)

**-Economic significance**: we will look at the **magnitude** and the **sign** of the estimated coefficient. If the number turns out to be very small, that x variable does not really have a practically meaningful effect on y.

Here is one example from my class notes:

*#cars*: number of cars owned

*inctotal*: total annual income (in dollars)

*familysize*: the size of family (number of people)

*age*: measured in years

In the example above, all the parameters are **statistically significant** (the **t-ratios** are 8.81, 9.91, 2.26, and 3.36, which makes it reasonable for us to *reject the null hypothesis* in a significance test).

However, the **magnitude** of “age” (measured in years) is really **small** (.0046911), which means a one-year increase in age will increase the number of cars owned by .0046911, on average, all else equal. In other words, all else equal, on average, in order for the number of cars owned to increase by 1, age must increase by over 200 years (=1/.0046911), which does not sound realistic!

In conclusion, **even though the estimated coefficient of “age” is statistically significant****, it is NOT economically significant****.**

**Note:** The **magnitude** of “inctotal” is a small number too (.0000691), but it makes some sense. That means, all else equal, a dollar increase in total income will increase the number of cars owned by .0000691, on average. That sounds possible, since *if you had an extra dollar per year, you would not think of buying another car*! If that still does not make good sense, let’s put it this way: all else equal, on average, in order for a person to own another car, his/her annual income needs to increase by about $14,471 (=1/.0000691). Sounds reasonable, right? Therefore, **the estimated coefficient of “inctotal” is both statistically and economically significant**.

We need to watch out for the **units** as well. A small number doesn’t necessarily imply an economic insignificance.
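A quick sanity check on the two “how much x for one more car” numbers above (they are just the reciprocals of the estimated coefficients):

```python
# Reciprocals of the estimated coefficients from the example above:
# years of age, and dollars of income, needed for one more car on average.
years_per_car = 1 / 0.0046911    # coefficient on age
dollars_per_car = 1 / 0.0000691  # coefficient on inctotal
print(round(years_per_car), round(dollars_per_car))
```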

So, what is statistics all about?

-A branch of applied mathematics concerned with the collection and interpretation of quantitative data and the use of probability theory to estimate population parameters (www.wordnetweb.princeton.edu/perl/webwn)

-The science of making effective use of numerical data relating to groups of individuals or experiments. (http://en.wikipedia.org/wiki/Statistics)

Statistics is math, yes, but it is much simpler math compared to calculus, differential equations, linear algebra, or discrete mathematics, which I have no idea what they are for and about. I am not saying statistics is a super easy subject, since I have only learned the most basic concepts of business statistics so far in my undergraduate studies.

For me, statistics is like playing with data, with numbers, in order to get some useful information. More often, it is hard to avoid bias in statistics or econometrics, but let’s not talk about that right now.

Looking into a statistics textbook, we would see more words than equations or formulas. Yes, doing a statistics problem is like doing a math word problem. The data contains numbers, yes, but most of the time we will have to interpret the results in words. Don’t be scared, though. It is not like we are all required to write an intensive report for every statistics problem. =P

Doesn’t sound quite interesting, huh? I hope my further posts in statistics from a student point of view would somehow make statistics sound less boring to you. =)
