An F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled. Exact "F-tests" mainly arise when the models have been fitted to the data using least squares. The name was coined by George W. Snedecor, in honour of Sir Ronald A. Fisher. Fisher initially developed the statistic as the variance ratio in the 1920s.^{[1]}
Common examples
Common examples of the use of F-tests include the study of the following cases:
 The hypothesis that the means of a given set of normally distributed populations, all having the same standard deviation, are equal. This is perhaps the best-known F-test, and plays an important role in the analysis of variance (ANOVA).
 The hypothesis that a proposed regression model fits the data well. See Lack-of-fit sum of squares.
 The hypothesis that a data set in a regression analysis follows the simpler of two proposed linear models that are nested within each other.
In addition, some statistical procedures, such as Scheffé's method for multiple comparisons adjustment in linear models, also use F-tests.
F-test of the equality of two variances
The F-test is sensitive to non-normality.^{[2]}^{[3]} In the analysis of variance (ANOVA), alternative tests include Levene's test, Bartlett's test, and the Brown–Forsythe test. However, when any of these tests are conducted to test the underlying assumption of homoscedasticity (i.e. homogeneity of variance), as a preliminary step to testing for mean effects, there is an increase in the experiment-wise Type I error rate.^{[4]}
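As a rough illustration of the test named in this section's heading, the statistic is usually taken to be the ratio of the two unbiased sample variances, referred to an F-distribution with (n − 1, m − 1) degrees of freedom when both populations are normal. The sketch below assumes that setup. The function name `variance_ratio_f` and the data are hypothetical, and no p-value is computed.

```python
from statistics import variance


def variance_ratio_f(sample_x, sample_y):
    """Return (F, df_num, df_den) for the ratio of two sample variances."""
    s2_x = variance(sample_x)  # unbiased sample variance (divides by n - 1)
    s2_y = variance(sample_y)  # unbiased sample variance (divides by m - 1)
    return s2_x / s2_y, len(sample_x) - 1, len(sample_y) - 1


# Made-up measurements, for illustration only.
x = [2.0, 4.0, 6.0, 8.0]
y = [1.0, 2.0, 3.0, 4.0]

f_ratio, df_num, df_den = variance_ratio_f(x, y)
print(f_ratio, df_num, df_den)
```

Because the reference distribution depends on normality, a ratio computed this way is best read alongside one of the alternative tests listed above when the data may be non-normal.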
Formula and calculation
Most F-tests arise by considering a decomposition of the variability in a collection of data in terms of sums of squares. The test statistic in an F-test is the ratio of two scaled sums of squares reflecting different sources of variability. These sums of squares are constructed so that the statistic tends to be greater when the null hypothesis is not true. In order for the statistic to follow the F-distribution under the null hypothesis, the sums of squares should be statistically independent, and each should follow a scaled χ²-distribution. The latter condition is guaranteed if the data values are independent and normally distributed with a common variance.
Multiple-comparison ANOVA problems
The F-test in one-way analysis of variance is used to assess whether the expected values of a quantitative variable within several pre-defined groups differ from each other. For example, suppose that a medical trial compares four treatments. The ANOVA F-test can be used to assess whether any of the treatments is on average superior, or inferior, to the others versus the null hypothesis that all four treatments yield the same mean response. This is an example of an "omnibus" test, meaning that a single test is performed to detect any of several possible differences. Alternatively, we could carry out pairwise tests among the treatments (for instance, in the medical trial example with four treatments we could carry out six tests among pairs of treatments). The advantage of the ANOVA F-test is that we do not need to pre-specify which treatments are to be compared, and we do not need to adjust for making multiple comparisons. The disadvantage of the ANOVA F-test is that if we reject the null hypothesis, we do not know which treatments can be said to be significantly different from the others, nor, if the F-test is performed at level α, can we state that the treatment pair with the greatest mean difference is significantly different at level α.
The formula for the one-way ANOVA F-test statistic is
$$F = \frac{\text{explained variance}}{\text{unexplained variance}},$$
or
$$F = \frac{\text{between-group variability}}{\text{within-group variability}}.$$
The "explained variance", or "between-group variability" is
$$\sum_{i=1}^{K} n_i (\bar{Y}_{i\cdot} - \bar{Y})^2 / (K - 1),$$
where $\bar{Y}_{i\cdot}$ denotes the sample mean in the i-th group, $n_i$ is the number of observations in the i-th group, $\bar{Y}$ denotes the overall mean of the data, and $K$ denotes the number of groups.
The "unexplained variance", or "within-group variability" is
$$\sum_{i=1}^{K} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2 / (N - K),$$
where $Y_{ij}$ is the j^{th} observation in the i^{th} out of $K$ groups and $N$ is the overall sample size. This F-statistic follows the F-distribution with degrees of freedom $d_1 = K - 1$ and $d_2 = N - K$ under the null hypothesis. The statistic will be large if the between-group variability is large relative to the within-group variability, which is unlikely to happen if the population means of the groups all have the same value.
Note that when there are only two groups for the one-way ANOVA F-test, $F = t^2$, where $t$ is the Student's t statistic.
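The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names and the data are hypothetical, the between-group mean square divides by (number of groups − 1), and the within-group mean square divides by (overall sample size − number of groups). The tail-probability helper uses a closed form that is valid only when the numerator has exactly 2 degrees of freedom, which happens to be the case for three groups.

```python
from math import sqrt
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    k = len(groups)                          # number of groups
    n_total = sum(len(g) for g in groups)    # overall sample size
    grand_mean = mean([y for g in groups for y in g])
    # Between-group sum of squares, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares around each group's own mean.
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    df_between, df_within = k - 1, n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


def pooled_t(a, b):
    """Student's t statistic for two samples, assuming a common variance."""
    ss = sum((y - mean(a)) ** 2 for y in a) + sum((y - mean(b)) ** 2 for y in b)
    pooled_var = ss / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))


def f_upper_tail_two_numerator_df(f_value, df_within):
    """P(F > f_value) for an F-distribution with (2, df_within) degrees of freedom.

    Special case only: this closed form does not hold for other numerator df.
    """
    return (1.0 + 2.0 * f_value / df_within) ** (-df_within / 2.0)


# Made-up scores for three groups, for illustration only.
groups = [[3.0, 2.0, 1.0], [5.0, 3.0, 4.0], [5.0, 6.0, 7.0]]

f_stat, df1, df2 = one_way_anova_f(groups)
p_value = f_upper_tail_two_numerator_df(f_stat, df2)

# With only two groups, F should equal the square of the pooled t statistic.
f_two, _, _ = one_way_anova_f(groups[:2])
t_two = pooled_t(groups[0], groups[1])

print(f_stat, df1, df2, p_value)
print(f_two, t_two ** 2)
```

For these numbers the between-group and within-group sums of squares are 24 and 6, so F = 12 on (2, 6) degrees of freedom with an upper-tail probability of 0.008. The two-group call gives F = 6, which agrees with the squared t statistic up to rounding.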
Regression problems
Consider two models, 1 and 2, where model 1 is 'nested' within model 2. Model 1 is the restricted model, and model 2 is the unrestricted one. That is, model 1 has p_{1} parameters, and model 2 has p_{2} parameters, where p_{1} < p_{2}, and for any choice of parameters in model 1, the same regression curve can be achieved by some choice of the parameters of model 2.
One common context in this regard is that of deciding whether a model fits the data significantly better than does a naive model, in which the only explanatory term is the intercept term, so that all predicted values for the dependent variable are set equal to that variable's sample mean. The naive model is the restricted model, since the coefficients of all potential explanatory variables are restricted to equal zero.
Another common context is deciding whether there is a structural break in the data: here the restricted model uses all data in one regression, while the unrestricted model uses separate regressions for two different subsets of the data. This use of the F-test is known as the Chow test.
The model with more parameters will always be able to fit the data at least as well as the model with fewer parameters. Thus typically model 2 will give a better (i.e. lower error) fit to the data than model 1. But one often wants to determine whether model 2 gives a significantly better fit to the data. One approach to this problem is to use an F-test.
If there are n data points to estimate parameters of both models from, then one can calculate the F statistic, given by
$$F = \frac{(\mathrm{RSS}_{1} - \mathrm{RSS}_{2}) / (p_{2} - p_{1})}{\mathrm{RSS}_{2} / (n - p_{2})},$$
where RSS_{i} is the residual sum of squares of model i. If the regression model has been calculated with weights, then replace RSS_{i} with χ^{2}, the weighted sum of squared residuals. Under the null hypothesis that model 2 does not provide a significantly better fit than model 1, F will have an F-distribution, with (p_{2}−p_{1}, n−p_{2}) degrees of freedom. The null hypothesis is rejected if the F calculated from the data is greater than the critical value of the F-distribution for some desired false-rejection probability (e.g. 0.05). The F-test is a Wald test.
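A small sketch of the nested-model comparison, under stated assumptions: model 1 is the naive intercept-only model (p_{1} = 1), model 2 is a straight line fitted by ordinary least squares (p_{2} = 2), and the statistic is the drop in residual sum of squares per extra parameter divided by the residual mean square of model 2. The helper names and the data are hypothetical, and the comparison against a critical value is left to a table or a statistics library.

```python
def rss_intercept_only(y):
    """Residual sum of squares when every prediction is the sample mean of y."""
    y_bar = sum(y) / len(y)
    return sum((yi - y_bar) ** 2 for yi in y)


def rss_simple_linear(x, y):
    """Residual sum of squares of the least-squares line y = a + b * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


def nested_f(rss_1, rss_2, p_1, p_2, n):
    """F statistic for a restricted model (1) nested in an unrestricted model (2)."""
    numerator = (rss_1 - rss_2) / (p_2 - p_1)   # improvement per added parameter
    denominator = rss_2 / (n - p_2)             # residual mean square of model 2
    return numerator / denominator


# Made-up data, for illustration only.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 2.0, 4.0]

rss_1 = rss_intercept_only(y)
rss_2 = rss_simple_linear(x, y)
f_nested = nested_f(rss_1, rss_2, p_1=1, p_2=2, n=len(y))
df_nested = (2 - 1, len(y) - 2)
print(rss_1, rss_2, f_nested, df_nested)
```

Here the residual sums of squares are 5 and 1.8, so F = 3.2 / 0.9 on (1, 2) degrees of freedom. With so few points the larger model would need a far bigger F before the improvement could be called significant.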
References
 ^ Lomax, Richard G. (2007). Statistical Concepts: A Second Course. p. 10. ISBN 0805858504.
 ^ Box, G. E. P. (1953). "Non-Normality and Tests on Variances". Biometrika. 40 (3/4): 318–335. doi:10.1093/biomet/40.3-4.318. JSTOR 2333350.
 ^ Markowski, Carol A; Markowski, Edward P. (1990). "Conditions for the Effectiveness of a Preliminary Test of Variance". The American Statistician. 44 (4): 322–326. doi:10.2307/2684360. JSTOR 2684360.
 ^ Sawilowsky, S. (2002). "Fermat, Schubert, Einstein, and Behrens–Fisher: The Probable Difference Between Two Means When σ_{1}^{2} ≠ σ_{2}^{2}". Journal of Modern Applied Statistical Methods. 1 (2): 461–472. Archived from the original on 2015-04-03. Retrieved 2015-03-30.
Further reading
 Fox, Karl A. (1980). Intermediate Economic Statistics (Second ed.). New York: John Wiley & Sons. pp. 290–310. ISBN 0882755218.
 Johnston, John (1972). Econometric Methods (Second ed.). New York: McGraw-Hill. pp. 35–38.
 Kmenta, Jan (1986). Elements of Econometrics (Second ed.). New York: Macmillan. pp. 147–148. ISBN 0023650702.
 Maddala, G. S.; Lahiri, Kajal (2009). Introduction to Econometrics (Fourth ed.). Chichester: Wiley. pp. 155–160. ISBN 9780470015124.
External links
 Testing utility of model – F-test
 Table of F-test critical values
 Free calculator for F-testing
 The F-test for Linear Regression
 The F distribution and the basic principle behind ANOVAs
 Econometrics lecture (topic: hypothesis testing) on YouTube by Mark Thoma