## December 23, 2015

### MB0050 [Research Methodology] Set1 Q2

Q2. In the context of hypothesis testing, briefly explain the difference between
a) Null and alternative hypothesis
b) Type 1 and type 2 error
c) Two tailed and one tailed test
d) Parametric and non parametric tests

Ans:
a) The logic of traditional hypothesis testing requires that we set up two competing statements or hypotheses referred to as the null hypothesis and the alternative hypothesis. These hypotheses are mutually exclusive and exhaustive.

H0: The finding occurred by chance
H1: The finding did not occur by chance

The null hypothesis is then assumed to be true unless we find evidence to the contrary. If the evidence is just too unlikely given the null hypothesis, we conclude the alternative hypothesis is more likely to be correct. In "traditional statistics", a probability of less than .05 (= 5% = 1 chance in 20) is conventionally considered "unlikely".
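To make this logic concrete, here is a minimal sketch in standard-library Python (the coin-flip scenario and its numbers are illustrative, not part of the syllabus answer): we test H0 "the coin is fair" after observing 15 heads in 20 flips, using an exact one-sided binomial p-value and the conventional .05 cut-off.

```python
from math import comb

def binom_pvalue(heads, n, p=0.5):
    # One-sided p-value: the probability, under H0 (a fair coin, p = 0.5),
    # of observing `heads` or more successes in n flips purely by chance.
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(heads, n + 1))

# H0: the finding occurred by chance (the coin is fair).
# H1: the finding did not occur by chance (the coin is biased toward heads).
p_value = binom_pvalue(15, 20)
print(f"p = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

Here the p-value is roughly 0.02, below the conventional .05 threshold, so we would reject H0 and treat the finding as too "unlikely" to be chance.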

b) When an observer makes a Type I error in evaluating a sample against its parent population, he or she mistakenly concludes that a statistical difference exists when in truth there is none (or, to put it another way, the null hypothesis should not have been rejected but was). For example, imagine that a pregnancy test has produced a "positive" result (indicating that the woman taking the test is pregnant); if the woman is actually not pregnant, then we say the test produced a "false positive" (assuming the null hypothesis, H0, was that she is not pregnant). A Type II error, or "false negative", is the error of failing to reject a null hypothesis when the alternative hypothesis is the true state of nature. For example, a Type II error occurs if a pregnancy test reports "negative" when the woman is, in fact, pregnant.

From the Bayesian point of view, a Type I error occurs when information that should not substantially change one's prior estimate of probability nevertheless does; a Type II error occurs when information that should change one's estimate does not. (The null hypothesis is not quite the same thing as one's prior estimate; it is, rather, one's pro forma prior estimate.)

Rejecting a null hypothesis when it should not have been rejected creates a Type I error; failing to reject a null hypothesis when it should have been rejected creates a Type II error. (In either case, a wrong decision or error in judgment has occurred.) To be good, decision rules (or tests of hypotheses) must be designed to minimize errors of decision. This is not a simple matter: for any given sample size, the effort to reduce one type of error generally increases the other type. Depending on the real-life consequences of each error, one type may be more serious than the other. (In such cases, a compromise should be reached in favour of limiting the more serious type of error.) The only way to reduce both types of error at once is to increase the sample size, and this may or may not be feasible.
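The trade-off just described can be made concrete with a small standard-library Python sketch (the setting is a hypothetical one-sided z-test of H0: μ = 0 against H1: μ = 1 with known σ = 1): moving the rejection cut-off trades α against β, while a larger sample shrinks both.

```python
from math import erf, sqrt

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + erf(x / sqrt(2)))

def alpha_beta(c, n, mu1=1.0, sigma=1.0):
    # One-sided z-test of H0: mu = 0 vs H1: mu = mu1, with known sigma.
    # We reject H0 when the sample mean exceeds the cut-off c.
    se = sigma / sqrt(n)
    alpha = 1 - norm_cdf(c / se)      # P(reject H0 | H0 true) = Type I risk
    beta = norm_cdf((c - mu1) / se)   # P(keep H0   | H1 true) = Type II risk
    return alpha, beta

# For a fixed sample size, raising the cut-off lowers alpha but raises beta...
a_low, b_low = alpha_beta(c=0.4, n=9)
a_high, b_high = alpha_beta(c=0.6, n=9)

# ...while a larger sample reduces both errors at the same cut-off.
a_n9, b_n9 = alpha_beta(c=0.5, n=9)
a_n36, b_n36 = alpha_beta(c=0.5, n=36)
```

With n = 9, moving the cut-off from 0.4 to 0.6 pushes α down and β up by the same amount; quadrupling the sample to n = 36 at a fixed cut-off drives both below 1%.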

Hypothesis testing is the art of deciding whether a variation between two sample distributions can be explained by chance. In many practical applications Type I errors are considered more serious than Type II errors, so care is usually focused on minimizing their occurrence. If the probability of a Type I error is 1%, then there is a 1% chance that the observed variation is due to chance alone; this probability is called the level of significance. While 1% might be an acceptable level of significance for one application, a different application can require a very different level. For example, the standard goal of Six Sigma is to achieve precision to 4.5 standard deviations above or below the mean (after allowing for process drift), which means that only about 3.4 parts per million are allowed to be deficient in a normally distributed process. The probability of a Type I error is generally denoted by the Greek letter alpha, α.

To state it simply, a Type I error can usually be interpreted as a false alarm, a failure of specificity. A Type II error can be interpreted as an oversight or missed detection, a lapse in sensitivity. The probability of a Type II error is generally denoted by the Greek letter beta, β.

c) There are two different types of tests that can be performed. A one-tailed test looks for a change in one direction only (an increase or a decrease), whereas a two-tailed test looks for a change in either direction.

We can perform the test at any significance level (usually 1%, 5% or 10%). For example, performing the test at the 5% level means that there is a 5% chance of wrongly rejecting H0.

If we perform the test at the 5% level and decide to reject the null hypothesis, we say "there is significant evidence at the 5% level to suggest the null hypothesis is false".

### One-Tailed Test

We choose a critical region. In a one-tailed test, the critical region has just one part, lying entirely in one tail of the distribution. If our sample value lies in this region, we reject the null hypothesis in favour of the alternative.

Suppose we are looking for a definite decrease. Then the critical region will be in the left tail. Note, however, that in this one-tailed test even an arbitrarily high value of the statistic would not lead us to reject H0, since only the left tail is examined.

### Two-Tailed Test

In a two-tailed test, we are looking for either an increase or a decrease. So, for example, H0 might be that the mean is equal to 9 (as before). This time, however, H1 would be that the mean is not equal to 9. In this case, therefore, the critical region has two parts, one in each tail of the distribution.
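The contrast can be shown numerically with a hypothetical worked example in standard-library Python (a sample of n = 25 with known σ = 2 and sample mean 9.72, testing H0: μ = 9): the same z statistic gives a one-tailed p-value half the size of the two-tailed one, so a result can be significant at the 5% level one-tailed but not two-tailed.

```python
from math import erf, sqrt

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1 + erf(x / sqrt(2)))

# Hypothetical data: n = 25, known sigma = 2, sample mean 9.72, H0: mu = 9.
z = (9.72 - 9) / (2 / sqrt(25))              # z = 1.8

p_one_tailed = 1 - norm_cdf(z)               # H1: mu > 9  (upper tail only)
p_two_tailed = 2 * (1 - norm_cdf(abs(z)))    # H1: mu != 9 (both tails)
print(f"one-tailed p = {p_one_tailed:.4f}, two-tailed p = {p_two_tailed:.4f}")
```

Here the one-tailed p-value (about 0.036) falls below 0.05 while the two-tailed one (about 0.072) does not, which is exactly why the form of H1 must be fixed before looking at the data.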

d) There are two types of test data and, consequently, different types of analysis. As the table below shows, parametric data has an underlying normal distribution, which allows more conclusions to be drawn since the shape can be mathematically described. Anything else is non-parametric.
|                        | Parametric                | Non-parametric                         |
|------------------------|---------------------------|----------------------------------------|
| Assumed distribution   | Normal                    | Any                                    |
| Assumed variance       | Homogeneous               | Any                                    |
| Typical data           | Ratio or interval         | Ordinal or nominal                     |
| Data set relationships | Independent               | Any                                    |
| Usual central measure  | Mean                      | Median                                 |
| Benefits               | Can draw more conclusions | Simplicity; less affected by outliers  |

| Test                             | Choosing a parametric test          | Choosing a non-parametric test |
|----------------------------------|-------------------------------------|--------------------------------|
| Correlation test                 | Pearson                             | Spearman                       |
| Independent measures, 2 groups   | Independent-measures t-test         | Mann-Whitney test              |
| Independent measures, >2 groups  | One-way, independent-measures ANOVA | Kruskal-Wallis test            |
| Repeated measures, 2 conditions  | Matched-pair t-test                 | Wilcoxon test                  |
| Repeated measures, >2 conditions | One-way, repeated-measures ANOVA    | Friedman's test                |

As the table shows, there are different tests for parametric and non-parametric data.
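One reason the table pairs the mean with parametric tests and the median with non-parametric ones can be sketched quickly (the reaction-time data below are made up for illustration, using only standard-library Python): a single outlier drags the mean far more than the median.

```python
from statistics import mean, median

# Hypothetical reaction-time sample (seconds), then the same sample
# with one extreme outlier added.
clean = [1.1, 1.2, 1.3, 1.4, 1.5]
with_outlier = clean + [9.9]

print(mean(clean), median(clean))                 # both 1.3
print(mean(with_outlier), median(with_outlier))   # mean jumps; median barely moves
```

The outlier roughly doubles the mean but shifts the median only from 1.3 to 1.35, which is why the median (and the rank-based tests built on it) is the safer central measure when normality cannot be assumed.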