
FR: Mann-Whitney-U test #182


Open
wants to merge 3 commits into developer

Conversation

@omaus (Contributor) commented Jan 28, 2022

Please reference the issue(s) this PR is related to

Tackles #117

Please list the changes introduced in this PR

  • U test added

Description

The Mann-Whitney U test has been added to the Testing portfolio, along with a corresponding unit test.

  • The project builds without problems on your machine
  • Added unit tests covering the added features

open FSharp.Stats
open FSharp.Stats.Testing

// TO DO: Bergmann et al. (2000) showed that there are different implementations of this test that lead to different results.
Member

Please check that your computations are correct before the PR can be merged. Additionally, check your results against established packages in R, SPSS, or publications rather than volatile Wikipedia articles.
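For a cross-check along those lines, a hand-rolled computation of the U statistics (sketch only, tie-free data, names not part of this PR) can be compared against the W reported by R's wilcox.test; conventions differ on which of U1/U2 the R statistic corresponds to, so both are returned here.

// Illustrative helper, independent of this PR: U statistics for two tie-free samples.
let uStatistics (xs: float []) (ys: float []) =
    let n1, n2 = float xs.Length, float ys.Length
    let merged =
        Array.append (xs |> Array.map (fun v -> v, 0)) (ys |> Array.map (fun v -> v, 1))
        |> Array.sortBy fst
    // with no ties, the i-th smallest value simply gets rank i + 1
    let rankSum1 =
        merged
        |> Array.mapi (fun i (_, group) -> if group = 0 then float (i + 1) else 0.)
        |> Array.sum
    let u1 = n1 * n2 + n1 * (n1 + 1.) / 2. - rankSum1
    let u2 = n1 * n2 - u1
    u1, u2

// Reference values can then be taken e.g. from R: wilcox.test(x, y, exact = FALSE, correct = FALSE)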

Comment on lines +18 to +25
// let abundance = // method for equal ranks instead of mean ranks when identical values occur.
// sortedMerge
// |> Array.map (
// fun v -> Array.filter (fun v2 -> v2 = v) sortedMerge
// >> Array.length
// )
// let myMap = sortedMerge |> Array.mapi (fun i x -> x, i + 2 - Array.item i abundance) |> Map // wrong: must return mean of ranksums with equal ranks, not always the same rank!
// let rankedMerge = sortedMerge |> Array.map (fun (v,group) -> float myMap.[(v,group)],v,group)
Member

remove comments
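For context (independent of this PR's code), the commented-out block was apparently aiming at assigning average ranks to equal values; a minimal sketch of that tie handling, assuming plain float arrays:

// Assigns average ("mid") ranks: equal values all receive the mean of the
// positional ranks they would occupy. Illustrative only.
let averageRanks (values: float []) =
    let sortedIdx = values |> Array.mapi (fun i v -> i, v) |> Array.sortBy snd
    let ranks : float [] = Array.zeroCreate values.Length
    let mutable i = 0
    while i < sortedIdx.Length do
        // find the block of equal values starting at sorted position i
        let mutable j = i
        while j < sortedIdx.Length && snd sortedIdx.[j] = snd sortedIdx.[i] do
            j <- j + 1
        // sorted positions i .. j-1 correspond to ranks i+1 .. j; use their mean
        let meanRank = float (i + 1 + j) / 2.
        for k in i .. j - 1 do
            ranks.[fst sortedIdx.[k]] <- meanRank
        i <- j
    ranks

// averageRanks [|1.; 2.; 2.; 3.|] = [|1.; 2.5; 2.5; 4.|]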

let uTestTests =
    // taken from https://de.wikipedia.org/wiki/Wilcoxon-Mann-Whitney-Test#Beispiel
    let testList1 =
        ([0;400;500;550;600;650;750;800;900;950;1000;1100;1200;1500;1600;1800;1900;2000;2200;3500],["M";"W";"M";"W";"M";"W";"M";"M";"W";"W";"M";"M";"W";"M";"W";"M";"M";"M";"M";"M"])
Member

add duplicates to check ranking procedure
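A tie-containing entry along these lines (values chosen purely for illustration) would exercise the average-rank path:

// with average ranking, the three 500s should share rank (2 + 3 + 4) / 3 = 3
let testListWithTies =
    ([400; 500; 500; 500; 600; 700], ["M"; "W"; "M"; "W"; "M"; "W"])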

Comment on lines +41 to +42
let u1 = seq1Length * seq2Length + (seq1Length * (seq1Length + 1.) / 2.) - rankSumSeq1
let u2 = seq1Length * seq2Length + (seq2Length * (seq2Length + 1.) / 2.) - rankSumSeq2
Member

temporary results should not be calculated twice
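One way to avoid recomputing the shared products (an illustrative rewrite of the quoted lines, reusing the diff's names seq1Length, seq2Length, rankSumSeq1): compute n1*n2 once and derive u2 from the identity u1 + u2 = n1*n2, which also holds with average-rank tie handling.

let nProduct = seq1Length * seq2Length
let u1 = nProduct + seq1Length * (seq1Length + 1.) / 2. - rankSumSeq1
let u2 = nProduct - u1 // since u1 + u2 = n1 * n2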

Comment on lines +35 to +36
|> Array.filter (fun (rank,v,group') -> group' = group)
|> Array.fold (fun state (rank,v,group') -> state + rank) 0.
Member

replace with countBy
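A single-pass alternative for the per-group rank sum (sketch only; the suggested countBy refactor may look different), reusing the diff's rank/value/group tuple shape:

let rankSumOf group rankedMerge =
    rankedMerge
    |> Array.sumBy (fun (rank, _, group') -> if group' = group then rank else 0.)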


let sortedMerge =
    (seq1 |> Seq.map (fun v -> float v, 0), seq2 |> Seq.map (fun v -> float v, 1)) // 0 = first group; 1 = second group
    ||> Seq.append
    |> Seq.sortByDescending (fun (v,groupIndex) -> v)
Member

Is it necessary that the sequences are sorted? The ranking is order-independent and, as far as I can see, there is no need for sorting.

@bvenn (Member) left a comment

Please check the comments I left above and correct the code accordingly.

}
let createUTest statistic : UTestTestStatistics =
    let cdf = Distributions.Continuous.Normal.CDF 0. 1. statistic
Member

The approximation via the z-Distribution is only valid for large samples.
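For reference, the usual large-sample standardization (sketch only, no tie or continuity correction) is z = (U - n1*n2/2) / sqrt(n1*n2*(n1 + n2 + 1)/12), with the two-sided p-value taken from the standard normal CDF:

open FSharp.Stats

// Large-sample normal approximation of the U statistic under H0.
let uToZ (u: float) (n1: float) (n2: float) =
    let mu = n1 * n2 / 2.
    let sigma = sqrt (n1 * n2 * (n1 + n2 + 1.) / 12.)
    (u - mu) / sigma

// two-sided p-value from the standard normal distribution
let twoSidedP z =
    2. * (1. - Distributions.Continuous.Normal.CDF 0. 1. (abs z))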

Contributor Author

Since the distribution is discrete, I cannot find any other (continuous) approximation than the one via normal distribution.
https://stats.stackexchange.com/a/251734

So, what do you suggest? Delete the whole test because the normal approximation might be too inaccurate at low sample sizes?

Member

We should aim to implement the exact distribution for n+m<100 and the approximation for larger values

see #213 (comment)
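For the small-sample branch, the exact null distribution (no ties) can be built from the classical recurrence N(u; m, n) = N(u - n; m - 1, n) + N(u; m, n - 1); a memoized sketch, not tied to this PR's types:

open System.Collections.Generic

// Number of rank arrangements of m and n observations yielding the U value u
// (classical Mann-Whitney recurrence; valid only without ties).
let uCountCache = Dictionary<int * int * int, float>()

let rec exactUCount (u: int) (m: int) (n: int) =
    if u < 0 then 0.
    elif m = 0 || n = 0 then (if u = 0 then 1. else 0.)
    else
        match uCountCache.TryGetValue((u, m, n)) with
        | true, v -> v
        | _ ->
            let v = exactUCount (u - n) (m - 1) n + exactUCount u m (n - 1)
            uCountCache.[(u, m, n)] <- v
            v

// P(U <= uObs) under H0: cumulative counts over binomial(m + n, m) arrangements.
let exactUCdf (uObs: int) (m: int) (n: int) =
    let total = List.fold (fun acc i -> acc * float (m + n - i + 1) / float i) 1. [1 .. m]
    let cumulative = [0 .. uObs] |> List.sumBy (fun u -> exactUCount u m n)
    cumulative / total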

@daz10000 commented Apr 2, 2025

I have been dealing with some issues in a different Mann-Whitney implementation and I fear that version isn't maintained. Is there anything I can do to help get this across the finish line? It might still be helpful to compare results with that implementation, since it has otherwise been fairly robust. Happy to help with testing and potentially any missing implementation details.

@omaus (Contributor Author) commented Apr 3, 2025

@daz10000 Yeah, I haven't worked on this for a long time. I'm also not really happy with the implementation, since there are major issues with it: d25e88a#r908169551 and d25e88a#r795088327.
Otherwise, the PR was/is okay, I guess...

Links seem to not work properly (at least on my machine), therefore screenshots:

[screenshots of the two referenced review comments]

@daz10000

I suspect half the work for these things is in the last 10%: getting a basic version working vs. getting something robust. I consider the R versions the gold standard, for better or worse. I've also come to the conclusion that I might be abusing the test itself with the data sets I'm using, which probably stresses implementations severely.
