Nonparametric statistics is the branch of statistics that is not based solely on parametrized families of probability distributions (common examples of parameters are the mean and variance). Nonparametric statistics is based on either being distribution-free or having a specified distribution but with the distribution's parameters unspecified. Nonparametric statistics includes both descriptive statistics and statistical inference.
Definitions
The term "nonparametric statistics" has been imprecisely defined in the following two ways, among others.
 The first meaning of nonparametric covers techniques that do not rely on data belonging to any particular parametric family of probability distributions.
These include, among others:
 distribution-free methods, which do not rely on assumptions that the data are drawn from a given parametric family of probability distributions. As such, they are the opposite of parametric statistics.
 nonparametric statistics (a statistic is defined to be a function on a sample; no dependency on a parameter).
Order statistics, which are based on the ranks of observations, are one example of such statistics.
The following discussion is taken from Kendall's Advanced Theory of Statistics.^{[1]}
Statistical hypotheses concern the behavior of observable random variables.... For example, the hypothesis (a) that a normal distribution has a specified mean and variance is statistical; so is the hypothesis (b) that it has a given mean but unspecified variance; so is the hypothesis (c) that a distribution is of normal form with both mean and variance unspecified; finally, so is the hypothesis (d) that two unspecified continuous distributions are identical.
It will have been noticed that in the examples (a) and (b) the distribution underlying the observations was taken to be of a certain form (the normal) and the hypothesis was concerned entirely with the value of one or both of its parameters. Such a hypothesis, for obvious reasons, is called parametric.
Hypothesis (c) was of a different nature, as no parameter values are specified in the statement of the hypothesis; we might reasonably call such a hypothesis nonparametric. Hypothesis (d) is also nonparametric but, in addition, it does not even specify the underlying form of the distribution and may now be reasonably termed distribution-free. Notwithstanding these distinctions, the statistical literature now commonly applies the label "nonparametric" to test procedures that we have just termed "distribution-free", thereby losing a useful classification.
 The second meaning of nonparametric covers techniques that do not assume that the structure of a model is fixed. Typically, the model grows in size to accommodate the complexity of the data. In these techniques, individual variables are typically assumed to belong to parametric distributions, and assumptions about the types of connections among variables are also made. These techniques include, among others:
 nonparametric regression, which is modeling whereby the structure of the relationship between variables is treated nonparametrically, but where nevertheless there may be parametric assumptions about the distribution of model residuals.
 nonparametric hierarchical Bayesian models, such as models based on the Dirichlet process, which allow the number of latent variables to grow as necessary to fit the data, but where individual variables still follow parametric distributions and even the process controlling the rate of growth of latent variables follows a parametric distribution.
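The way such a model "grows in size" can be illustrated with the Chinese restaurant process, one standard description of the clustering induced by a Dirichlet process. The following is a minimal sketch, not a full Bayesian model: the function name, the concentration parameter name alpha, and the fixed seed are illustrative assumptions. It only shows that the number of clusters (tables) is not fixed in advance and tends to increase as more observations arrive.

```python
import random

def chinese_restaurant_process(n, alpha, seed=0):
    """Seat n customers one at a time; return the list of table sizes.

    alpha is the concentration parameter (hypothetical name): larger
    values make new tables, i.e. new clusters, more likely.
    """
    rng = random.Random(seed)
    tables = []  # tables[k] is the number of customers at table k
    for i in range(n):
        # Customer i joins table k with probability size_k / (i + alpha)
        # or opens a new table with probability alpha / (i + alpha).
        r = rng.uniform(0.0, i + alpha)
        acc = 0.0
        for k, size in enumerate(tables):
            acc += size
            if r < acc:
                tables[k] += 1
                break
        else:
            # r fell in the final alpha-wide slice: open a new table.
            tables.append(1)
    return tables

for n in (10, 100, 1000):
    sizes = chinese_restaurant_process(n, alpha=1.0)
    print(n, "observations ->", len(sizes), "clusters")
```

Each seating decision still follows a simple parametric rule governed by alpha, which matches the point above that the growth process itself is parametric even though the number of clusters is not fixed.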
Applications and purpose
Nonparametric methods are widely used for studying populations that take on a ranked order (such as movie reviews receiving one to four stars). The use of nonparametric methods may be necessary when data have a ranking but no clear numerical interpretation, such as when assessing preferences. In terms of levels of measurement, nonparametric methods result in ordinal data.
As nonparametric methods make fewer assumptions, their applicability is much wider than the corresponding parametric methods. In particular, they may be applied in situations where less is known about the application in question. Also, due to the reliance on fewer assumptions, nonparametric methods are more robust.
Another justification for the use of nonparametric methods is simplicity. In certain cases, even when the use of parametric methods is justified, nonparametric methods may be easier to use. Due both to this simplicity and to their greater robustness, nonparametric methods are seen by some statisticians as leaving less room for improper use and misunderstanding.
The wider applicability and increased robustness of nonparametric tests come at a cost: in cases where a parametric test would be appropriate, nonparametric tests have less power. In other words, a larger sample size can be required to draw conclusions with the same degree of confidence.
Nonparametric models
Nonparametric models differ from parametric models in that the model structure is not specified a priori but is instead determined from data. The term nonparametric is not meant to imply that such models completely lack parameters but that the number and nature of the parameters are flexible and not fixed in advance.
 A histogram is a simple nonparametric estimate of a probability distribution.
 Kernel density estimation provides better estimates of the density than histograms.
 Nonparametric regression and semiparametric regression methods have been developed based on kernels, splines, and wavelets.
 Data envelopment analysis provides efficiency coefficients similar to those obtained by multivariate analysis without any distributional assumption.
 The k-nearest neighbors (KNN) algorithm classifies an unseen instance based on the K points in the training set that are nearest to it.
 A support vector machine (with a Gaussian kernel) is a nonparametric large-margin classifier.
 The method of moments with polynomial probability distributions.
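The first two items in the list can be sketched in a few lines. The code below is an illustrative sketch using only the Python standard library; the function names, the sample values, the bin range, and the bandwidth of 0.5 are all assumptions chosen for the example, not recommended defaults. It builds a histogram density estimate and a Gaussian kernel density estimate from the same data.

```python
import math

def histogram_density(data, bins, lo, hi):
    """Piecewise-constant density estimate on [lo, hi) with equal-width bins."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        if lo <= x < hi:
            # min() guards against floating-point rounding at the upper edge.
            counts[min(int((x - lo) / width), bins - 1)] += 1
    n = len(data)
    # Dividing by n * width makes the bars integrate to 1
    # when every data point falls inside [lo, hi).
    return [c / (n * width) for c in counts]

def gaussian_kde(data, bandwidth):
    """Return a function x -> kernel density estimate with a Gaussian kernel."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of one Gaussian bump per observation, centred on that observation.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density

sample = [1.2, 1.9, 2.1, 2.4, 3.0, 3.3, 4.8]  # made-up data
hist = histogram_density(sample, bins=4, lo=1.0, hi=5.0)
kde = gaussian_kde(sample, bandwidth=0.5)

print("histogram heights:", hist)
print("KDE at 2.5:", kde(2.5))
```

Neither estimate assumes a parametric family for the data. The only tuning choices are the bin width and the bandwidth, and the effective complexity of both estimates grows with the sample.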
Methods
Nonparametric (or distribution-free) inferential statistical methods are mathematical procedures for statistical hypothesis testing which, unlike parametric statistics, make no assumptions about the probability distributions of the variables being assessed. The most frequently used tests include:
 Analysis of similarities
 Anderson–Darling test: tests whether a sample is drawn from a given distribution
 Statistical bootstrap methods: estimate the accuracy/sampling distribution of a statistic
 Cochran's Q: tests whether k treatments in randomized block designs with 0/1 outcomes have identical effects
 Cohen's kappa: measures inter-rater agreement for categorical items
 Friedman two-way analysis of variance by ranks: tests whether k treatments in randomized block designs have identical effects
 Kaplan–Meier: estimates the survival function from lifetime data, modeling censoring
 Kendall's tau: measures statistical dependence between two variables
 Kendall's W: a measure between 0 and 1 of inter-rater agreement
 Kolmogorov–Smirnov test: tests whether a sample is drawn from a given distribution, or whether two samples are drawn from the same distribution
 Kruskal–Wallis one-way analysis of variance by ranks: tests whether more than two independent samples are drawn from the same distribution
 Kuiper's test: tests whether a sample is drawn from a given distribution, sensitive to cyclic variations such as day of the week
 Logrank test: compares survival distributions of two right-skewed, censored samples
 Mann–Whitney U or Wilcoxon rank sum test: tests whether two samples are drawn from the same distribution, as compared to a given alternative hypothesis.
 McNemar's test: tests whether, in 2 × 2 contingency tables with a dichotomous trait and matched pairs of subjects, row and column marginal frequencies are equal
 Median test: tests whether two samples are drawn from distributions with equal medians
 Pitman's permutation test: a statistical significance test that yields exact p values by examining all possible rearrangements of labels
 Rank products: detects differentially expressed genes in replicated microarray experiments
 Siegel–Tukey test: tests for differences in scale between two groups
 Sign test: tests whether matched pair samples are drawn from distributions with equal medians
 Spearman's rank correlation coefficient: measures statistical dependence between two variables using a monotonic function
 Squared ranks test: tests equality of variances in two or more samples
 Tukey–Duckworth test: tests equality of two distributions by using ranks
 Wald–Wolfowitz runs test: tests whether the elements of a sequence are mutually independent/random
 Wilcoxon signed-rank test: tests whether matched pair samples are drawn from populations with different mean ranks
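Two of the tests listed above, the sign test and the permutation test, are simple enough to sketch directly. The code below is an illustrative sketch using only the Python standard library. The function names and the data are made up for the example, the permutation test uses the difference of means as its statistic (one common choice, assumed here), and full enumeration is only practical for small samples.

```python
from itertools import combinations
from math import comb

def sign_test(before, after):
    """Two-sided exact sign test for matched pairs; tied pairs are dropped."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    k = sum(d > 0 for d in diffs)
    # Under the null hypothesis the number of positive differences
    # is Binomial(n, 1/2); double the smaller tail probability.
    tail = sum(comb(n, i) for i in range(min(k, n - k) + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)

def permutation_test(x, y):
    """Two-sided exact permutation test on the difference of sample means."""
    pooled = list(x) + list(y)
    n, m = len(pooled), len(x)
    total = sum(pooled)
    observed = abs(sum(x) / m - sum(y) / (n - m))
    extreme = count = 0
    # Enumerate every way of relabelling m of the n pooled values as "x".
    for idx in combinations(range(n), m):
        sx = sum(pooled[i] for i in idx)
        diff = abs(sx / m - (total - sx) / (n - m))
        # Small tolerance so that ties with the observed value count as extreme.
        extreme += diff >= observed - 1e-12
        count += 1
    return extreme / count

# Made-up matched-pair data: 7 of the 8 differences are positive.
before = [10, 12, 9, 14, 11, 13, 8, 15]
after = [12, 15, 11, 13, 14, 16, 10, 18]
print(sign_test(before, after))                 # 18/256 = 0.0703125

# Made-up two-sample data: only 2 of the 20 relabellings are as extreme.
print(permutation_test([1, 2, 3], [4, 5, 6]))   # 2/20 = 0.1
```

Both p-values are exact: they come from counting outcomes under the null hypothesis rather than from an assumed distribution for the data.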
History
Early nonparametric statistics include the median (13th century or earlier, use in estimation by Edward Wright, 1599; see Median § History) and the sign test by John Arbuthnot (1710) in analyzing the human sex ratio at birth (see Sign test § History).^{[2]}^{[3]}
See also
 CDF-based nonparametric confidence interval
 Parametric statistics
 Resampling (statistics)
 Semiparametric model
Notes
 ^ Stuart A., Ord J.K, Arnold S. (1999), Kendall's Advanced Theory of Statistics: Volume 2A—Classical Inference and the Linear Model, sixth edition, §20.2–20.3 (Arnold).
 ^ Conover, W.J. (1999), "Chapter 3.4: The Sign Test", Practical Nonparametric Statistics (Third ed.), Wiley, pp. 157–176, ISBN 0471160687
 ^ Sprent, P. (1989), Applied Nonparametric Statistical Methods (Second ed.), Chapman & Hall, ISBN 0412449803
General references
 Bagdonavicius, V., Kruopis, J., Nikulin, M.S. (2011). "Nonparametric tests for complete data", ISTE & WILEY: London & Hoboken. ISBN 9781848212695.
 Corder, G. W.; Foreman, D. I. (2014). Nonparametric Statistics: A Step-by-Step Approach. Wiley. ISBN 9781118840313.
 Gibbons, Jean Dickinson; Chakraborti, Subhabrata (2003). Nonparametric Statistical Inference, 4th Ed. CRC Press. ISBN 0824740521.
 Hettmansperger, T. P.; McKean, J. W. (1998). Robust Nonparametric Statistical Methods. Kendall's Library of Statistics. 5 (First ed.). London: Edward Arnold. New York: John Wiley & Sons. ISBN 0340549378. MR 1604954. also ISBN 0471194794.
 Hollander M., Wolfe D.A., Chicken E. (2014). Nonparametric Statistical Methods, John Wiley & Sons.
 Sheskin, David J. (2003) Handbook of Parametric and Nonparametric Statistical Procedures. CRC Press. ISBN 1584884401
 Wasserman, Larry (2007). All of Nonparametric Statistics, Springer. ISBN 0387251456.