Table 1: Selected Nonparametric Tests and Techniques

(1) Nonparametric tests make less stringent demands of the data. For standard parametric procedures to be valid, certain underlying conditions or assumptions must be met, particularly for smaller sample sizes. The one-sample t test, for example, requires that the observations be drawn from a normally distributed population. For two independent samples, the t test has the additional requirement that the population standard deviations be equal. If these assumptions are violated, the resulting P-values and confidence intervals may not be trustworthy [3]. However, normality is not required for the Wilcoxon signed rank or rank sum tests to produce valid inferences about whether the median of a symmetric population is 0 or whether two samples are drawn from the same population.
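To make the signed rank idea concrete, here is a minimal sketch of an exact two-sided Wilcoxon signed rank test written in plain Python. It is an illustration under stated assumptions, not a reference implementation: zero differences are dropped, ties in the absolute differences receive average ranks, and the null distribution is built by enumerating every sign pattern, which is practical only for small samples. The function names are hypothetical.

```python
from itertools import product

def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def signed_rank_exact(diffs):
    """Exact two-sided Wilcoxon signed rank test of 'median difference = 0'.

    Zero differences are dropped (a common convention, assumed here).
    Returns (w_plus, p_value).
    """
    d = [x for x in diffs if x != 0]
    n = len(d)
    ranks = midranks([abs(x) for x in d])
    w_plus = sum(r for r, x in zip(ranks, d) if x > 0)
    center = sum(ranks) / 2  # mean of W+ under the null hypothesis
    observed = abs(w_plus - center)
    # Under the null, each rank is equally likely to carry a + or - sign,
    # so enumerate all 2**n sign patterns (small n only).
    extreme = 0
    for signs in product((0, 1), repeat=n):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - center) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** n

# Made-up paired differences, all positive.
w_plus, p_value = signed_rank_exact([1.2, 0.8, 2.5, 1.9, 0.4, 3.1])
print(w_plus, p_value)
```

Nothing in the calculation refers to a normal distribution; the P-value comes only from the ranks and the symmetry of the sign patterns.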


That's what tests of statistical significance are all about.


Fisherian parametric tests are classified by data miners Nisbet, Elder, and Miner (2009) as first-generation statistical methods. Parametric tests handle relatively small experimental data sets in academic settings efficiently, but business and industry work with huge data sets, and there "analysts could bring computers to their 'knees' with the processing of classical statistical analyses" (Nisbet, Elder, & Miner, 2009, p. 30). As a remedy, a new approach to decision making was created based on artificial intelligence (AI), which is modeled on the human brain rather than on Fisher's parametric approach. As a result, a new set of non-parametric tools, including neural nets, classification trees, and multivariate adaptive regression splines (MARS), was developed for analyzing huge data sets. This cluster of tools is called data mining. Unlike conventional parametric tests, which emphasize theoretical explanation, data mining is primarily used by business for prediction. This paradigm shift was reflected in the renaming of SPSS in 2009. After IBM acquired SPSS, SPSS became Predictive Analytics Software (PASW) because data mining and text mining tools had been tightly integrated into what were formerly SPSS's parametric procedures. IBM later reverted the name to SPSS.

Parametric and Nonparametric Methods in Statistics

In the social sciences, the assumption of independence, which is required by ANOVA and many other parametric procedures, is always violated to some degree. Take the Trends in International Mathematics and Science Study (TIMSS) as an example. The TIMSS sample design is a two-stage stratified cluster sampling scheme. In the first stage, schools are sampled with probability proportional to size. In the second stage, one or more intact classes of students from the target grades are drawn (Joncas, 2008). Parametric ordinary least squares (OLS) regression models are valid if and only if the residuals are normally distributed and independent, with a mean of zero and a constant variance. However, TIMSS data are collected using a complex sampling method in which data at one level are nested within another level (i.e., students are nested within classes, classes within schools, and schools within nations), and thus it is unlikely that the residuals are independent of each other. If OLS regression is employed to estimate relationships on nested data, the estimated standard errors will be negatively biased (too small), resulting in an overestimation of the statistical significance of regression coefficients. In this case, hierarchical linear modeling (HLM) (Raudenbush & Bryk, 2002) should be employed to specifically tackle the nested data structure. More specifically, instead of fitting one overall model, HLM takes the nesting into account by constructing models at different levels; for this reason HLM is also called multilevel modeling.
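The effect of nesting on standard errors can be seen in a small simulation. The sketch below is not HLM and does not reproduce the TIMSS design; it only contrasts a naive standard error of an overall mean, which treats every student as independent, with one that treats schools as the independent units. The sample sizes, score scale, and variance components are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the illustration is reproducible

N_SCHOOLS, N_STUDENTS = 20, 30  # hypothetical sizes, not the TIMSS design

scores, school_means = [], []
for _ in range(N_SCHOOLS):
    school_effect = random.gauss(0, 1.0)  # shared by every student in the school
    students = [500 + 50 * (school_effect + random.gauss(0, 1.0))
                for _ in range(N_STUDENTS)]
    scores.extend(students)
    school_means.append(statistics.mean(students))

# Naive standard error: treats all 600 students as independent draws.
naive_se = statistics.stdev(scores) / len(scores) ** 0.5

# Cluster-aware standard error: treats the 20 school means as the independent units.
cluster_se = statistics.stdev(school_means) / N_SCHOOLS ** 0.5

print(f"naive SE   = {naive_se:.2f}")
print(f"cluster SE = {cluster_se:.2f}")
```

Because students in the same school share a school effect, the naive figure comes out well below the cluster-aware one, which is the negative bias described above.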

The merit of HLM does not end here. For analyzing longitudinal data, HLM is considered superior to repeated measures ANOVA because the latter must assume compound symmetry, whereas HLM allows the analyst to specify many different forms of covariance structure (Littell & Milliken, 2006). Readers are encouraged to read Shin's (2009) concise comparison of repeated measures ANOVA and HLM.

The difference between the predicted y and the observed y is called a residual, or an error term.
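As a small worked example of that definition, the sketch below fits a least-squares line to made-up data and computes the residuals. It follows the usual sign convention of observed minus predicted, which is an assumption here since the sentence above does not state a direction. The helper name fit_line is hypothetical.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a simple linear regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

x = [1, 2, 3, 4, 5]               # made-up data for illustration
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope = fit_line(x, y)
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]  # observed minus predicted
print(residuals)
```

With an intercept in the model, least-squares residuals sum to zero up to rounding error, which gives a quick sanity check on a fit.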


The purpose of the Mann-Kendall (MK) test is to statistically assess whether there is a monotonic upward or downward trend in the variable of interest over time. A monotonic upward (downward) trend means that the variable consistently increases (decreases) through time, but the trend may or may not be linear. The MK test can be used in place of a parametric linear regression analysis, which tests whether the slope of the estimated regression line is different from zero. The regression analysis requires that the residuals from the fitted regression line be normally distributed, an assumption not required by the MK test; that is, the MK test is a non-parametric (distribution-free) test.
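The MK test can be sketched in a few lines of plain Python. The version below uses the standard S statistic with the normal approximation and the usual tie correction to the variance. It assumes the observations are listed in time order and are serially independent, and the approximation is generally considered adequate only for series that are not very short. The function name is hypothetical.

```python
import math
from collections import Counter

def mann_kendall(series):
    """Return (S, z, two-sided p) for the Mann-Kendall trend test.

    Uses the normal approximation with a tie correction; assumes the
    observations are in time order and serially independent.
    """
    n = len(series)
    # S counts later-minus-earlier pairs that increase, minus those that decrease.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null, reduced for groups of tied values.
    tie_term = sum(t * (t - 1) * (2 * t + 5) for t in Counter(series).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18
    # Continuity correction: shrink |S| by 1 before standardizing.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return s, z, p

# Made-up series with an upward but nonlinear trend.
print(mann_kendall([1, 2, 4, 8, 16, 32, 64, 128]))
```

Only the signs of the pairwise differences enter the statistic, so a linear series and an exponential one with the same ordering give identical results.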


Two of the simplest nonparametric procedures are the sign test and the median test. The sign test can be used with paired data to test the hypothesis that differences are equally likely to be positive or negative (or, equivalently, that the median difference is 0). For small samples, an exact test of whether the proportion of positives is 0.5 can be obtained by using a binomial distribution. For large samples, the test statistic is based on the normal approximation to the binomial distribution.
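A minimal sketch of the sign test for paired differences follows. The exact P-value uses the Binomial(n, 0.5) distribution described above. The large-sample statistic shown is the standard textbook z form, which is an assumption here: it may not be written the same way as the formula the text had in mind. Zero differences are dropped, and the function name is hypothetical.

```python
import math

def sign_test(diffs):
    """Two-sided sign test on paired differences; zeros are dropped.

    Returns (n_pos, n_neg, exact_p, z), where z is the large-sample statistic.
    """
    n_pos = sum(d > 0 for d in diffs)
    n_neg = sum(d < 0 for d in diffs)
    n = n_pos + n_neg
    # Exact p: probability under Binomial(n, 0.5) of a split at least this uneven.
    k = min(n_pos, n_neg)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    exact_p = min(1.0, 2 * tail)
    # Normal approximation: under H0, n_pos has mean n/2 and sd sqrt(n)/2.
    z = (n_pos - n / 2) / (math.sqrt(n) / 2) if n else 0.0
    return n_pos, n_neg, exact_p, z

# Made-up differences: nine positive, one negative.
print(sign_test([0.5, 1.1, 0.3, 2.0, 0.7, 1.4, 0.9, 0.2, 1.8, -0.6]))
```

The magnitudes of the differences never enter the calculation, only their signs, which is why the test needs so few assumptions and also why it tends to be less powerful than the signed rank test.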


(3) Nonparametric methods provide an air of objectivity when there is no reliable (universally recognized) underlying scale for the original data and there is some concern that the results of standard parametric techniques would be criticized for their dependence on an artificial metric. For example, patients might be asked whether they feel / / / / . What scores should be assigned to the comfort categories and how do we know whether the outcome would change dramatically with a slight change in scoring? Some of these concerns are blunted when the data are converted to ranks [4].
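The point about scoring can be illustrated with a short sketch. The responses below are hypothetical five-level ordinal codes (1 = lowest category), and both scoring schemes are invented for the example; none of this comes from real patient data. Any scoring that preserves the order of the categories yields the same ranks, even though the means differ widely.

```python
def midranks(values):
    """Average ranks: ties share the mean of the ranks they would occupy."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]

# Hypothetical five-level ordinal responses, coded only by their order.
responses = [1, 3, 3, 5, 2, 4, 3, 1]

# Two arbitrary scorings of the same five categories (both are assumptions).
equal_steps = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
stretched = {1: 0, 2: 1, 3: 5, 4: 20, 5: 100}

scores_a = [equal_steps[r] for r in responses]
scores_b = [stretched[r] for r in responses]

mean_a = sum(scores_a) / len(scores_a)
mean_b = sum(scores_b) / len(scores_b)   # means depend on the scoring
ranks_a = midranks(scores_a)
ranks_b = midranks(scores_b)             # ranks do not

print(mean_a, mean_b)
print(ranks_a == ranks_b)
```

A rank-based procedure applied to either scoring therefore reaches the same conclusion, which is the sense in which ranks blunt the concern about an artificial metric.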