Tuesday, June 24, 2014

Wilcoxon signed-rank test

The Wilcoxon signed-rank test is a non-parametric statistical hypothesis test used to compare two related samples, matched samples, or repeated measurements on a single sample, to assess whether their population mean ranks differ (i.e. it is a paired difference test).
It can be used as an alternative to the paired Student's t-test, the t-test for matched pairs, or the t-test for dependent samples when the population cannot be assumed to be normally distributed.
The Wilcoxon signed-rank test is not the same as the Wilcoxon rank-sum test, although both are nonparametric and involve summation of ranks.
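As a rough illustration of what "summation of ranks" means for paired data, here is a minimal sketch in plain Python. The function name `signed_rank_sums` and the before/after numbers are assumptions made for this example, not part of the source. The sketch follows the common convention of dropping zero differences and giving tied absolute differences their average rank; it only computes the rank sums, not a p-value.

```python
def signed_rank_sums(first, second):
    """Return (w_plus, w_minus, n_used) for two paired samples.

    Hypothetical helper for illustration: zero differences are dropped,
    tied absolute differences share their average rank.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")

    # Paired differences, ignoring pairs with no change.
    diffs = [b - a for a, b in zip(first, second) if b != a]

    # Positions of the differences, ordered by absolute size.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))

    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus, len(diffs)


# Illustrative paired measurements (not from the source).
before = [125, 115, 130, 140, 140, 115, 140, 125, 140, 135]
after = [110, 122, 125, 120, 140, 124, 123, 137, 135, 145]

w_plus, w_minus, n_used = signed_rank_sums(before, after)
print(w_plus, w_minus, n_used)
```

The two sums always add up to n_used * (n_used + 1) / 2. Which quantity is reported as the test statistic (the smaller sum, the positive sum, or the signed total) varies between textbooks and software.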

The test is named for Frank Wilcoxon (1892–1965) who, in a single paper, proposed both it and the rank-sum test for two independent samples (Wilcoxon, 1945). The test was popularized by Siegel (1956) in his influential textbook on non-parametric statistics. Siegel used the symbol T for a value related to, but not the same as, W. In consequence, the test is sometimes referred to as the Wilcoxon T test, and the test statistic is reported as a value of T.

Assumptions
  1. Data are paired and come from the same population.
  2. Each pair is chosen randomly and independently.
  3. The data are measured at least on an ordinal scale, but need not be normal.

Reference:
http://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test

Last observation carried forward (LOCF)


This method is specific to longitudinal data problems. For each individual, missing values are replaced by the last observed value of that variable.


For example, the three missing values for unit 1, at times 4, 5 and 6, are replaced by the value at time 3, namely 2.0. Likewise, the two missing values for unit 3, at times 5 and 6, are replaced by the value at time 4, which is 3.5. Once the data set has been completed in this way under LOCF, it is analysed as if it were fully observed.
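The filling rule can be sketched in a few lines of plain Python. Only the 2.0 at time 3 for unit 1 and the 3.5 at time 4 for unit 3 come from the example above; every other number, the dictionary layout, and the function name `locf` are assumptions made up for illustration.

```python
def locf(values):
    """Replace each None with the most recent observed value.

    Missing values before the first observation stay None, because
    there is nothing to carry forward yet.
    """
    filled = []
    last_observed = None
    for value in values:
        if value is not None:
            last_observed = value
        filled.append(last_observed)
    return filled


# One list per unit, times 1 to 6; None marks a missing value.
# Hypothetical numbers, except unit 1 at time 3 and unit 3 at time 4.
observed = {
    1: [1.4, 1.7, 2.0, None, None, None],
    2: [2.1, 2.4, 2.6, 2.9, 3.1, 3.3],
    3: [2.8, 3.0, 3.2, 3.5, None, None],
}

completed = {unit: locf(values) for unit, values in observed.items()}
for unit, values in completed.items():
    print(unit, values)
```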


For full longitudinal data analyses this is clearly disastrous: means and covariance structure are seriously distorted.

For single time point analyses the means are still likely to be distorted, and measures of precision are wrong, so inferences are wrong too. Note that this holds even if the mechanism that causes the data to be missing is completely random.
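A small simulation can make the distortion of the mean concrete. This is only a sketch under strong assumptions that are not in the source: every unit's outcome rises by exactly 1 per time point, and the last observed time point is drawn uniformly at random, independently of the outcome values (so the dropout is completely random). All names and numbers are made up for illustration.

```python
import random
from statistics import mean

random.seed(0)

N_UNITS = 200
N_TIMES = 6

true_final = []  # value each unit really has at the last time point
locf_final = []  # value LOCF would put at the last time point

for _ in range(N_UNITS):
    baseline = random.gauss(10.0, 1.0)
    # Assumed trajectory: rises by 1 at every time point.
    trajectory = [baseline + 1.0 * t for t in range(N_TIMES)]

    # Dropout unrelated to the values: index of the last observed time point.
    # N_TIMES - 1 means the unit was observed to the end.
    last_seen = random.randrange(N_TIMES)

    true_final.append(trajectory[-1])
    locf_final.append(trajectory[last_seen])  # carried forward to the end

true_mean = mean(true_final)
locf_mean = mean(locf_final)
print(f"true mean at final time: {true_mean:.2f}")
print(f"LOCF mean at final time: {locf_mean:.2f}")
```

Under these assumptions the LOCF mean at the final time point falls below the true mean, because every carried-forward value is frozen at an earlier, lower point of a rising trajectory. With a different trend the bias could go the other way.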


Reference: www.missingdata.org.uk