Say I have two vectors, A and B. If I get a low p-value, I would like to check whether it reflects a higher median of B compared to A, rather than just different medians (that is, either the median of B is higher than the median of A, or vice versa).
For example:
A = [120 10 201 20 30 12 30 10 2 2 3 5 1];
B = [140 400 120 2000 30 40 2000 1000 1000];
I get a p-value of 7.2251e-004.
But if
B=[1 0 0 0 0 0 0 0 0 0 0 0 0]
I also get a low p-value (6.4360e-006).
I would like to get low p-values only when the median of B is higher than the median of A. Since I have many calculations, I need this to happen automatically in the code, rather than by checking every pair of vectors. Do you have any idea how to do that?
Thanks, Michal
Both ‘one-sided’ (that one median is greater than or less than the other) and ‘two-sided’ (that the medians are different) options are possible. See Item #4 under Assumptions and formal statement of hypotheses in the Wikipedia article on the Mann–Whitney U. There is also an excellent discussion of this on page 3 of The Wilcoxon Rank-Sum Test.
According to the documentation, ranksum returns the two-sided p-value, so make the appropriate calculation to get the one-sided p-value.
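One way to make that calculation, as a rough sketch rather than a definitive recipe: halve the two-sided p-value when the data point in the direction you care about, and otherwise return a large value. The direction check below compares the rank sum of B with its expected value under the null hypothesis, nB*(nA+nB+1)/2. The variable names (pTwo, pOne, wB, wB0) are only illustrative, and tiedrank is assumed to be available from the Statistics Toolbox. The 1 - pTwo/2 branch is an approximation, since the rank-sum distribution is discrete.

```matlab
% Example data from the question
A = [120 10 201 20 30 12 30 10 2 2 3 5 1];
B = [140 400 120 2000 30 40 2000 1000 1000];

pTwo = ranksum(A, B);              % two-sided p-value

% Rank the pooled data (tiedrank averages tied ranks) and sum the ranks of B
nA = numel(A);
nB = numel(B);
r  = tiedrank([A(:); B(:)]);
wB = sum(r(nA+1:end));

% Expected rank sum of B if both samples come from the same distribution
wB0 = nB * (nA + nB + 1) / 2;

if wB > wB0
    pOne = pTwo / 2;               % B tends to be larger: halve the two-sided p
else
    pOne = 1 - pTwo / 2;           % B is not larger: large p-value (approximate)
end
```

With this, pOne is small only when B tends to be larger than A, which also gives a natural threshold for the rank sum: its expected value under the null.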
The Wikipedia article didn't exactly help me, but the second reference was very helpful.
If I understood correctly, when I use the command [p,h,stats] = ranksum(A,B), the "ranksum" value that I get (the second field in the stats structure) can help me in the following way:
High ranksum value = H1 : A > B
Low ranksum value = H1 : A < B
Is that correct? If it is correct, then what should be the threshold for differentiating between the two options?
Many thanks!
My pleasure!
When I experimented with this a bit, I discovered that the z-statistic (z-value), the first ‘stats’ field, may be the answer. When A > B, the z-statistic is positive, and when A < B, the z-statistic is negative.
According to the documentation, ‘ranksum’ only computes the z-value for large samples, so if your samples aren't large enough, I suggest simply comparing the medians.
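A minimal sketch of the median comparison, assuming a significance level of 0.05 (my choice for illustration, not something from the question) and illustrative variable names:

```matlab
% Example data from the question
A = [120 10 201 20 30 12 30 10 2 2 3 5 1];
B = [140 400 120 2000 30 40 2000 1000 1000];

alpha = 0.05;                      % assumed significance level

p = ranksum(A, B);                 % two-sided p-value

% Flag the pair only if the test is significant AND the median of B is higher
isBHigher = (p < alpha) && (median(B) > median(A));
```

For these vectors the medians are 10 for A and 400 for B, so the pair is flagged. With B = [1 0 0 0 0 0 0 0 0 0 0 0 0] the median of B is 0, so the pair is not flagged even though the p-value is low.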
Use the 'tail' option. A careful read of
>> help ranksum
will explain how.
(In the first draft of my answer, I pointed to "doc ranksum" rather than "help ranksum", but it seems that that documentation doesn't list the 'tail' option. Weird.)
Ah. I am using the prerelease of R2012b. It seems that the 'tail' option is new.
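For releases where ranksum accepts the 'tail' option, a one-sided test could look roughly like the sketch below. Here 'right' tests the alternative that the median of the first argument is greater than the median of the second, so B is passed first. The variable name pRight is just illustrative.

```matlab
% Example data from the question
A = [120 10 201 20 30 12 30 10 2 2 3 5 1];
B = [140 400 120 2000 30 40 2000 1000 1000];

% H1: the median of B is greater than the median of A
pRight = ranksum(B, A, 'tail', 'right');
```

Equivalently, ranksum(A, B, 'tail', 'left') tests that the median of A is less than the median of B. Either way, the p-value is low only when B tends to be higher than A.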