Discussion:
Minimum data set size required for Kruskal-Wallis test?
Kate J.
2013-05-23 01:11:22 UTC
I'm attempting to perform the nonparametric Kruskal-Wallis test (kruskalwallis() function) on 3 sets of data generated from 3 different testing conditions. In the past, I've successfully performed this test on large data sets from a different project (with 100+ members in each set). Currently, each of my sets only has 3 to 5 values. (I'm always comparing sets of equal size.)

The problem: despite my use of previous code that successfully performed Kruskal-Wallis analysis on larger data sets, when I try to perform the same analysis on my current, much smaller data sets, I'm receiving error messages. I'm wondering: is there a minimum set size required to perform Kruskal-Wallis analysis?

Here is my code:

dataSetA = [21.4 27.2 31.8];
dataSetB = [54.0 57.0 59.4];
dataSetC = [30.6 48.2 35.2];

myData = [dataSetA dataSetB dataSetC];
[p,table,stats] = kruskalwallis(myData)
c1 = multcompare(stats)

The generated plot contains only a single boxplot instead of 3 (I know that a boxplot for only 3 values is dicey...), and here is the MATLAB screen output:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
p = 1

table =
'Source' 'SS' 'df' 'MS' 'Chi-sq' 'Prob>Chi-sq'
'Columns' [ 0] [14] [ 0] [ 0] [ 1]
'Error' [279] [ 0] [NaN] [] []
'Total' [279] [14] [] [] []

stats =
gnames: '1'
n: [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
source: 'kruskalwallis'
meanranks: 8
sumt: 12

Note: Intervals can be used for testing but are not simultaneous confidence intervals.
??? Subscripted assignment dimension mismatch.

Error in ==> multcompare>makeM at 564
MM(:,2) = sqrt(diag(gcov));

Error in ==> multcompare at 475
[M,MM,hh] = makeM(gmeans, gcov, crit, gnames, mname, dodisp);

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Since collecting this particular data is time-consuming, I'd like a rough idea of how large my data sets need to be for this type of analysis to work (it *is* possible for me to collect more, if necessary); otherwise, I should consider other forms of statistical analysis.

Thanks in advance for your insights!
Tom Lane
2013-05-24 12:45:53 UTC
Post by Kate J.
dataSetA = [21.4 27.2 31.8];
dataSetB = [54.0 57.0 59.4];
dataSetC = [30.6 48.2 35.2];
myData = [dataSetA dataSetB dataSetC];
[p,table,stats] = kruskalwallis(myData)

You've concatenated three row vectors into a longer row vector. Try this instead:

myData = [dataSetA' dataSetB' dataSetC'];
Post by Kate J.
c1 = multcompare(stats)
...
Post by Kate J.
??? Subscripted assignment dimension mismatch.

I don't see this when I try it. Maybe it will go away if you make sure to have three columns in your input matrix.

-- Tom
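
For reference, the suggestion above can be written out as a runnable script. This is only a minimal sketch, assuming the Statistics Toolbox is installed. The 'off' display arguments, the name tbl (used instead of table) and the grouping-variable variant with the hypothetical names allValues and groupLabels are additions for illustration and do not come from the original posts. MATLAB is case-sensitive, so the matrix is spelled myData throughout.

```matlab
% Three conditions, three measurements each (values from the original post).
dataSetA = [21.4 27.2 31.8];
dataSetB = [54.0 57.0 59.4];
dataSetC = [30.6 48.2 35.2];

% kruskalwallis treats each COLUMN of a matrix as one group, so transpose
% the row vectors before concatenating. The result is a 3-by-3 matrix.
myData = [dataSetA' dataSetB' dataSetC'];

% 'off' suppresses the ANOVA table and boxplot figures (illustrative choice).
% tbl is used instead of table to avoid shadowing the table data type that
% exists in newer MATLAB releases.
[p, tbl, stats] = kruskalwallis(myData, [], 'off');

% Pairwise comparisons of the mean ranks, again without opening a figure.
c1 = multcompare(stats, 'display', 'off');

% Equivalent call: one long vector plus a grouping variable.
% allValues and groupLabels are hypothetical names for this sketch.
allValues   = [dataSetA dataSetB dataSetC];
groupLabels = [repmat({'A'},1,3) repmat({'B'},1,3) repmat({'C'},1,3)];
p2 = kruskalwallis(allValues, groupLabels, 'off');
```

Both calls describe the same three groups, so p and p2 should agree. The grouping-variable form is also the one to use if the sets ever differ in size, since a matrix needs columns of equal length.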
Kate J.
2013-05-25 00:18:11 UTC
Post by Tom Lane
You've concatenated three row vectors into a longer row vector. Try this
myData = [dataSetA' dataSetB' dataSetC'];

Adding the single quotes (the transpose operator) so that my data sets become the columns of the matrix did indeed work, Tom; no more error messages. Thanks!!
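
As a sanity check that three values per group are enough for the function to run, the statistic can be recomputed by hand from the pooled ranks, H = 12/(N(N+1)) * sum(R_j^2/n_j) - 3(N+1). The following is a minimal sketch, assuming the Statistics Toolbox (tiedrank, chi2cdf, kruskalwallis) and no tied values, so no tie correction is applied; all variable names are illustrative. With so few observations the chi-square approximation behind the p-value is rough, so the result is better read as indicative than exact.

```matlab
% One column per testing condition (values from the original post).
X = [21.4 27.2 31.8; 54.0 57.0 59.4; 30.6 48.2 35.2]';

[nPerGroup, k] = size(X);   % 3 values per group, 3 groups
N = numel(X);               % 9 observations in total

% Rank all observations together, then restore the 3-by-3 layout.
R = reshape(tiedrank(X(:)), size(X));
rankSums = sum(R, 1);       % one rank sum per group (column)

% Kruskal-Wallis statistic for equal group sizes, without tie correction
% (valid here because the nine values are all distinct).
H = 12 / (N * (N + 1)) * sum(rankSums.^2 / nPerGroup) - 3 * (N + 1);

% p-value from the chi-square approximation with k-1 degrees of freedom.
pApprox = 1 - chi2cdf(H, k - 1);

% The toolbox function should give the same p-value for this data.
pToolbox = kruskalwallis(X, [], 'off');
```

For this data the rank sums are 7, 24 and 14, which gives H of about 6.49 and a p-value of about 0.039 under the chi-square approximation.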