The plots above show that the diabetes values in the inactivity dataset are not normally distributed. To check whether the two groups differ significantly, we can use a nonparametric test, the Mann-Whitney U test (also known as the Wilcoxon rank-sum test).
Wilcoxon Rank-Sum Test:
- The Mann-Whitney U test does not assume a specific distribution for the data, so it can compare samples that are not normally distributed.
- The test statistic is the U statistic, which is derived from the ranks of the pooled samples: it is the rank sum of one group minus the smallest value that rank sum could take. Equivalently, U counts the pairs in which a value from one group exceeds a value from the other (ties count as one half), so it reflects whether one sample tends to have larger values than the other.
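To make the U statistic concrete, here is a minimal sketch that computes it by hand from the rank-sum formula U1 = R1 - n1(n1 + 1)/2, where R1 is the rank sum of the first group and n1 its size. The two small samples and the helper names `rank_with_ties` and `mann_whitney_u` are made up for illustration and are not part of the original analysis.

```python
def rank_with_ties(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Find the end of the run of values tied with values[order[i]].
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U1, U2) for two samples, computed from the pooled ranks."""
    n1, n2 = len(x), len(y)
    ranks = rank_with_ties(list(x) + list(y))
    r1 = sum(ranks[:n1])            # rank sum of the first sample
    u1 = r1 - n1 * (n1 + 1) / 2     # subtract the smallest possible rank sum
    u2 = n1 * n2 - u1               # U1 + U2 always equals n1 * n2
    return u1, u2


# Hypothetical toy data, chosen only to keep the arithmetic easy to follow.
x = [1.1, 2.5, 3.0, 4.2]
y = [2.0, 3.5, 5.0, 6.1, 7.3]
u1, u2 = mann_whitney_u(x, y)
print(u1, u2)  # → 4.0 16.0
```

Here the values of `x` occupy ranks 1, 3, 4 and 6 in the pooled sample, so R1 = 14 and U1 = 14 - 10 = 4, which matches the four (x, y) pairs in which the `x` value is larger.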
Hypotheses:
- H0 (Null Hypothesis): There is no difference between the distributions of the two groups.
- H1 (Alternative Hypothesis): There is a difference between the distributions of the two groups.
Results:
- The Mann-Whitney U statistic is 174721.0. Its size alone does not show that one group ranks higher: U ranges from 0 to n1 × n2 and is expected to be about n1 × n2 / 2 when the groups do not differ, so it has to be read against the two sample sizes.
- The p-value (≈ 2.54e-18) is far below the 0.05 significance level, so we reject H0 and conclude that there is a statistically significant difference between the two groups.
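The decision rule used here can be sketched as follows, assuming the analysis is run in Python with SciPy (the original does not say which tool produced the numbers). The data are synthetic, skewed samples generated only for illustration, and the names `group_a`, `group_b` and `alpha` are hypothetical. The sketch also reports U divided by the number of pairs, which is one way to read the statistic against the sample sizes: a value near 0.5 means neither group tends to rank higher.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Synthetic, right-skewed samples (NOT the inactivity dataset).
rng = np.random.default_rng(0)
group_a = rng.lognormal(mean=0.0, sigma=1.0, size=200)
group_b = rng.lognormal(mean=1.0, sigma=1.0, size=200)

alpha = 0.05  # significance level, as in the text

# Two-sided test: H1 is that the two distributions differ in either direction.
result = mannwhitneyu(group_a, group_b, alternative="two-sided")

# U lies between 0 and n1 * n2; dividing by n1 * n2 estimates the share of
# (a, b) pairs in which the value from group_a is the larger one.
n_pairs = len(group_a) * len(group_b)
share_a_larger = result.statistic / n_pairs

reject_h0 = result.pvalue < alpha

print(f"U = {result.statistic}, expected under H0 = {n_pairs / 2}")
print(f"share of pairs with group_a larger = {share_a_larger:.3f}")
print(f"p-value = {result.pvalue:.3g}, reject H0: {reject_h0}")
```

With the real data, the two arrays would be replaced by the diabetes values of each group.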