
From Wikipedia, the free encyclopedia

An F-test is any statistical test in which the test statistic has an F-distribution under the null hypothesis. It is most often used when comparing statistical models that have been fitted to a data set, in order to identify the model that best fits the population from which the data were sampled. Exact "F-tests" mainly arise when the models have been fitted to the data using least squares. The name was coined by George W. Snedecor, in honour of Sir Ronald A. Fisher. Fisher initially developed the statistic as the variance ratio in the 1920s.[1]

Video transcript: ANOVA 3: Hypothesis test with F-statistic | Probability and Statistics | Khan Academy

In the last couple of videos we first figured out the TOTAL variation in these 9 data points right here and we got 30, that's our Total Sum of Squares. Then we asked ourselves, how much of that variation is due to variation WITHIN each of these groups, versus variation BETWEEN the groups themselves? So, for the variation within the groups we have our Sum of Squares within. And there we got 6. And then the balance of this, 30, the balance of this variation, came from variation between the groups, and we calculated it, we got 24. What I want to do in this video, is actually use this type of information, essentially these statistics we've calculated, to do some inferential statistics, to come to some type of conclusion, or maybe not to come to some type of conclusion. What I want to do is to put some context around these groups. We've been dealing with them abstractly right now, but you can imagine these are the results of some type of experiment. Let's say that I gave 3 different types of pills or 3 different types of food to people taking a test. And these are the scores on the test. So this is food 1, food 2, and then this over here is food 3. And I want to figure out if the type of food people take going into the test really affects their scores. If you look at these means, it looks like they performed better in group 3 than in group 2 or 1. But is that difference purely random? Random chance? Or can I be pretty confident that it's due to actual differences in the population means, of all of the people who would ever take food 3 vs food 2 vs food 1? So, my question here is, are the means and the true population means the same? This is a sample mean based on 3 samples. But if I knew the true population means-- So my question is: Is the mean of the population of people taking Food 1 equal to the mean of Food 2? Obviously I'll never be able to give that food to every human being that could ever live and then make them all take an exam. 
But there is some true mean there, it's just not really measurable. So my question is "this" equal to "this" equal to the mean 3, the true population of mean 3. And my question is, are these equal? Because if they're not equal, that means that the type of food given does have some type of impact on how people perform on a test. So let's do a little hypothesis test here. Let's say that my null hypothesis is that the means are the same. Food doesn't make a difference. "food doesn't make a difference" and that my Alternate hypothesis is that it does. "It does." and the way of thinking about this quantitatively is that if it doesn't make a difference, the true population means of the groups will be the same. The true population mean of the group that took food 1 will be the same as the group that took food 2, which will be the same as the group that took food 3. If our alternate hypothesis is correct, then these means will not be all the same. How can we test this hypothesis? So we're going to assume the null hypothesis, which is what we always do when we are hypothesis testing, we're going to assume our null hypothesis. And then essentially figure out, what are the chances of getting a certain statistic this extreme? And I haven't even defined what that statistic is. So we're going to define--we're going to assume our null hypothesis, and then we're going to come up with a statistic called the F statistic. So our F statistic which has an F distribution--and we won't go real deep into the details of the F distribution. But you can already start to think of it as the ratio of two Chi-squared distributions that may or may not have different degrees of freedom. 
Our F statistic is going to be the ratio of our Sum of Squares between the samples-- Sum of Squares between divided by our degrees of freedom between, and this is sometimes called the mean squares between, MSB, that, divided by the Sum of Squares within, so that's what I had done up here, the SSW in blue, divided by the degrees of freedom of the SSwithin, and that was m(n-1). Now let's just think about what this is doing right here. If this number, the numerator, is much larger than the denominator, then that tells us that the variation in this data is due mostly to the differences between the actual means and it's due less to the variation within the means. That's if this numerator is much bigger than this denominator over here. So that should make us believe that there is a difference in the true population mean. So if this number is really big, it should tell us that there is a lower probability that our null hypothesis is correct. If this number is really small and our denominator is larger, that means that our variation within each sample makes up more of the total variation than our variation between the samples. So that means that our variation within each of these samples is a bigger percentage of the total variation versus the variation between the samples. So that would make us believe that "hey! ya know... any difference we see between the means is probably just random." And that would make it a little harder to reject the null. So let's actually calculate it. So in this case, our SSbetween, we calculated over here, was 24, and we had 2 degrees of freedom. And our SSwithin was 6 and we had how many degrees of freedom? Also, 6. 6 degrees of freedom. So this is going to be 24/2 which is 12, divided by 1. Our F statistic that we've calculated is going to be 12. F stands for Fisher, who is the biologist and statistician who came up with this. So our F statistic is going to be 12. We're going to see that this is a pretty high number. 
Now, one thing I forgot to mention, with any hypothesis test, we're going to need some type of significance level. So let's say the significance level that we care about, for our hypothesis test, is 10%. 0.10 -- which means that if we assume the null hypothesis, there is less than a 10% chance of getting the result we got, of getting this F statistic, then we will reject the null hypothesis. So what we want to do is figure out a critical F statistic value, that getting that extreme of a value or greater, is 10% and if this is bigger than our critical F statistic value, then we're going to reject the null hypothesis, if it's less, we can't reject the null. So I'm not going to go into a lot of the guts of the F statistic, but we can already appreciate that each of these Sum of squares has a Chi-squared distribution. "This" has a Chi-squared distribution, and "this" has a different Chi-squared distribution This is a Chi-squared distribution with 2 degrees of freedom, this is a Chi-squared distribution with--And we haven't normalized it and all of that-- but roughly a Chi squared distribution with 6 degrees of freedom. So the F distribution is actually the ratio of two Chi-squared distributions And I got this--this is a screenshot from a professor's course at UCLA, I hope they don't mind, I need to find us an F table for us to look into. But this is what an F distribution looks like. And obviously it's going to look different depending on the df of the numerator and the denominator. There's two df to think about, the numerator degrees of freedom and the denominator degrees of freedom With that said, let's calculate the critical F statistic, for alpha is equal to 0.10, and you're actually going to see different F tables for each different alpha, where our numerator df is 2, and our denominator df is 6. So this table that I got, this whole table is for an alpha of 10% or 0.10, and our numerator df was 2 and our denominator was 6. So our critical F value is 3.46. 
So our critical F value is 3.46--this value right over here is 3.46 The value that we got based on our data is much larger than this, WAY above it. It's going to have a very, very small p value. The probability of getting something this extreme, just by chance, assuming the null hypothesis, is very low. It's way bigger than our critical F statistic with a 10% significance level. So because of that we can reject the null hypothesis. Which leads us to believe, "you know what, there probably IS a difference in the population means." Which tells us there probably is a difference in performance on an exam if you give them the different foods.
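
The arithmetic in the transcript can be checked with a short script. The sketch below is only an illustration: the function names are ours, and the closed-form tail probability it uses is valid only when the numerator has exactly 2 degrees of freedom, as in this example. It is not a general F-distribution routine.

```python
# Check of the transcript's numbers using only the standard library.
# Assumption: the closed forms below apply only to an F distribution whose
# numerator has exactly 2 degrees of freedom (the case in the transcript).

def f_statistic(ss_between, df_between, ss_within, df_within):
    """Ratio of mean squares: (SSB / df_between) / (SSW / df_within)."""
    return (ss_between / df_between) / (ss_within / df_within)

def f_tail_two_numerator_df(x, df_within):
    """P(F > x) for an F(2, df_within) variable."""
    return (1.0 + 2.0 * x / df_within) ** (-df_within / 2.0)

def f_critical_two_numerator_df(alpha, df_within):
    """Value x such that P(F > x) = alpha for an F(2, df_within) variable."""
    return (df_within / 2.0) * (alpha ** (-2.0 / df_within) - 1.0)

# Sums of squares and degrees of freedom quoted in the transcript.
f_stat = f_statistic(24, 2, 6, 6)
critical = f_critical_two_numerator_df(0.10, 6)
p_value = f_tail_two_numerator_df(f_stat, 6)
reject_null = f_stat > critical

print(f"F = {f_stat:.2f}, critical value = {critical:.2f}, p = {p_value:.4f}")
```

The computed critical value agrees with the tabulated 3.46 to two decimal places, and the tail probability for F = 12 falls well below the 10% significance level, which matches the transcript's decision to reject the null hypothesis.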


Common examples

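Common examples of the use of F-tests include testing whether the means of several groups are all equal, as in the analysis of variance (ANOVA), and testing whether a data set in a regression analysis follows the simpler of two nested models.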

In addition, some statistical procedures, such as Scheffé's method for multiple comparisons adjustment in linear models, also use F-tests.

F-test of the equality of two variances

The F-test is sensitive to non-normality.[2][3] In the analysis of variance (ANOVA), alternative tests include Levene's test, Bartlett's test, and the Brown–Forsythe test. However, when any of these tests are conducted to test the underlying assumption of homoscedasticity (i.e. homogeneity of variance), as a preliminary step to testing for mean effects, there is an increase in the experiment-wise Type I error rate.[4]
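
The paragraph above does not spell out the statistic, so the following is a minimal sketch based on the usual textbook form of the test, which is an assumption on our part: the ratio of the two sample variances, referred to an F distribution with (n_a − 1, n_b − 1) degrees of freedom when both samples are normally distributed. The data and the function name are hypothetical.

```python
from statistics import variance

def variance_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) with F = s_a^2 / s_b^2.

    statistics.variance uses the n - 1 denominator, i.e. the sample variance.
    """
    f = variance(sample_a) / variance(sample_b)
    return f, len(sample_a) - 1, len(sample_b) - 1

# Hypothetical measurements, chosen only to make the arithmetic easy to follow.
sample_a = [2.0, 4.0, 6.0, 8.0]
sample_b = [4.0, 5.0, 6.0, 7.0]

f_ratio, df_num, df_den = variance_ratio(sample_a, sample_b)
print(f"F = {f_ratio:.3f} on ({df_num}, {df_den}) degrees of freedom")
```

Because the statistic is sensitive to non-normality, a large ratio from this sketch should not be read as evidence of unequal variances unless the normality assumption is credible.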

Formula and calculation

Most F-tests arise by considering a decomposition of the variability in a collection of data in terms of sums of squares. The test statistic in an F-test is the ratio of two scaled sums of squares reflecting different sources of variability. These sums of squares are constructed so that the statistic tends to be greater when the null hypothesis is not true. In order for the statistic to follow the F-distribution under the null hypothesis, the sums of squares should be statistically independent, and each should follow a scaled χ²-distribution. The latter condition is guaranteed if the data values are independent and normally distributed with a common variance.

Multiple-comparison ANOVA problems

The F-test in one-way analysis of variance is used to assess whether the expected values of a quantitative variable within several pre-defined groups differ from each other. For example, suppose that a medical trial compares four treatments. The ANOVA F-test can be used to assess whether any of the treatments is on average superior, or inferior, to the others versus the null hypothesis that all four treatments yield the same mean response. This is an example of an "omnibus" test, meaning that a single test is performed to detect any of several possible differences. Alternatively, we could carry out pairwise tests among the treatments (for instance, in the medical trial example with four treatments we could carry out six tests among pairs of treatments). The advantage of the ANOVA F-test is that we do not need to pre-specify which treatments are to be compared, and we do not need to adjust for making multiple comparisons. The disadvantage of the ANOVA F-test is that if we reject the null hypothesis, we do not know which treatments can be said to be significantly different from the others, nor, if the F-test is performed at level α, can we state that the treatment pair with the greatest mean difference is significantly different at level α.
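
The count of six pairwise tests for four treatments, and the reason unadjusted pairwise testing is a concern, can be illustrated in a few lines. This is a sketch under stated assumptions: the level 0.05 is chosen only for illustration, and the error-rate formula treats the tests as independent, which pairwise comparisons on shared data generally are not.

```python
from math import comb

def pairwise_test_count(n_treatments):
    """Number of distinct treatment pairs."""
    return comb(n_treatments, 2)

def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false rejection, ASSUMING independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

n_tests = pairwise_test_count(4)              # four treatments, as in the example
fwer = familywise_error_rate(0.05, n_tests)   # 0.05 is an illustrative level

print(f"{n_tests} pairwise tests, familywise error rate about {fwer:.3f}")
```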

The formula for the one-way ANOVA F-test statistic is

$$F = \frac{\text{explained variance}}{\text{unexplained variance}},$$

or

$$F = \frac{\text{between-group variability}}{\text{within-group variability}}.$$

The "explained variance", or "between-group variability" is

$$\sum_{i=1}^{K} n_i (\bar{Y}_{i\cdot} - \bar{Y})^2 / (K - 1),$$

where $\bar{Y}_{i\cdot}$ denotes the sample mean in the i-th group, $n_i$ is the number of observations in the i-th group, $\bar{Y}$ denotes the overall mean of the data, and $K$ denotes the number of groups.

The "unexplained variance", or "within-group variability" is

$$\sum_{i=1}^{K} \sum_{j=1}^{n_i} (Y_{ij} - \bar{Y}_{i\cdot})^2 / (N - K),$$

where $Y_{ij}$ is the j-th observation in the i-th out of $K$ groups and $N$ is the overall sample size. This F-statistic follows the F-distribution with degrees of freedom $d_1 = K - 1$ and $d_2 = N - K$ under the null hypothesis. The statistic will be large if the between-group variability is large relative to the within-group variability, which is unlikely to happen if the population means of the groups all have the same value.

Note that when there are only two groups for the one-way ANOVA F-test, $F = t^2$, where $t$ is the Student's t statistic.
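
As a rough illustration of the calculation described above, the following sketch computes the between-group and within-group mean squares directly and then checks the two-group relationship between F and Student's t. The data are hypothetical, chosen by us so that the sums of squares come out to 24 and 6, and the function names are ours rather than part of any library.

```python
from statistics import mean, variance

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)                               # number of groups
    n_total = sum(len(g) for g in groups)         # overall sample size
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, group_means) for y in g)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

def pooled_t(a, b):
    """Student's t statistic for two samples, using a pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5

# Hypothetical scores for three groups of three observations each.
groups = [[3, 2, 1], [5, 3, 4], [5, 6, 7]]

f_three, df1, df2 = one_way_anova_f(groups)
f_two, _, _ = one_way_anova_f(groups[:2])
t_two = pooled_t(groups[0], groups[1])

print(f"three groups: F = {f_three:.2f} on ({df1}, {df2}) degrees of freedom")
print(f"two groups:   F = {f_two:.4f}, t^2 = {t_two ** 2:.4f}")
```

With only the first two groups, the F statistic and the square of the pooled-variance t statistic agree up to floating-point rounding.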

Regression problems

Consider two models, 1 and 2, where model 1 is 'nested' within model 2. Model 1 is the restricted model, and model 2 is the unrestricted one. That is, model 1 has p1 parameters, and model 2 has p2 parameters, where p1 < p2, and for any choice of parameters in model 1, the same regression curve can be achieved by some choice of the parameters of model 2.

One common context in this regard is that of deciding whether a model fits the data significantly better than does a naive model, in which the only explanatory term is the intercept term, so that all predicted values for the dependent variable are set equal to that variable's sample mean. The naive model is the restricted model, since the coefficients of all potential explanatory variables are restricted to equal zero.

Another common context is deciding whether there is a structural break in the data: here the restricted model uses all data in one regression, while the unrestricted model uses separate regressions for two different subsets of the data. This use of the F-test is known as the Chow test.

The model with more parameters will always be able to fit the data at least as well as the model with fewer parameters. Thus typically model 2 will give a better (i.e. lower error) fit to the data than model 1. But one often wants to determine whether model 2 gives a significantly better fit to the data. One approach to this problem is to use an F-test.

If there are n data points to estimate parameters of both models from, then one can calculate the F statistic, given by

$$F = \frac{(\text{RSS}_1 - \text{RSS}_2) / (p_2 - p_1)}{\text{RSS}_2 / (n - p_2)},$$

where $\text{RSS}_i$ is the residual sum of squares of model i. If the regression model has been calculated with weights, then replace $\text{RSS}_i$ with χ², the weighted sum of squared residuals. Under the null hypothesis that model 2 does not provide a significantly better fit than model 1, F will have an F distribution, with $(p_2 - p_1, n - p_2)$ degrees of freedom. The null hypothesis is rejected if the F calculated from the data is greater than the critical value of the F-distribution for some desired false-rejection probability (e.g. 0.05). The F-test is a Wald test.
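
A small sketch may make the nested-model comparison concrete. It takes the naive intercept-only model as model 1 (p1 = 1) and a straight line as model 2 (p2 = 2), and uses the standard form of the statistic, F = ((RSS1 − RSS2)/(p2 − p1)) / (RSS2/(n − p2)). The data and all function names are hypothetical, and the closed-form fit applies only to simple linear regression.

```python
def nested_f_statistic(rss1, rss2, p1, p2, n):
    """F = ((RSS1 - RSS2) / (p2 - p1)) / (RSS2 / (n - p2))."""
    return ((rss1 - rss2) / (p2 - p1)) / (rss2 / (n - p2))

def rss_intercept_only(y):
    """Residual sum of squares when every prediction is the sample mean."""
    y_bar = sum(y) / len(y)
    return sum((yi - y_bar) ** 2 for yi in y)

def rss_straight_line(x, y):
    """Residual sum of squares of the least-squares line y = a + b * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

# Hypothetical data, chosen only for illustration.
x = [1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 4]

rss1 = rss_intercept_only(y)      # restricted model, p1 = 1
rss2 = rss_straight_line(x, y)    # unrestricted model, p2 = 2
f_stat = nested_f_statistic(rss1, rss2, p1=1, p2=2, n=len(x))

print(f"RSS1 = {rss1:.2f}, RSS2 = {rss2:.2f}, F = {f_stat:.3f} on (1, 3) df")
```

The resulting value would then be compared with the critical value of the F-distribution with (1, 3) degrees of freedom at the chosen false-rejection probability.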


  1. Lomax, Richard G. (2007). Statistical Concepts: A Second Course. p. 10. ISBN 0-8058-5850-4.
  2. Box, G. E. P. (1953). "Non-Normality and Tests on Variances". Biometrika. 40 (3/4): 318–335. doi:10.1093/biomet/40.3-4.318. JSTOR 2333350.
  3. Markowski, Carol A.; Markowski, Edward P. (1990). "Conditions for the Effectiveness of a Preliminary Test of Variance". The American Statistician. 44 (4): 322–326. doi:10.2307/2684360. JSTOR 2684360.
  4. Sawilowsky, S. (2002). "Fermat, Schubert, Einstein, and Behrens–Fisher: The Probable Difference Between Two Means When σ₁² ≠ σ₂²". Journal of Modern Applied Statistical Methods. 1 (2): 461–472. Archived from the original on 2015-04-03. Retrieved 2015-03-30.

This page was last edited on 31 May 2019, at 19:30
Basis of this page is in Wikipedia. Text is available under the CC BY-SA 3.0 Unported License. Non-text media are available under their specified licenses. Wikipedia® is a registered trademark of the Wikimedia Foundation, Inc. WIKI 2 is an independent company and has no affiliation with Wikimedia Foundation.