What scipy statistical test do I use to compare sample means?
Assuming sample sizes are not equal, what test do I use to compare sample means under the following circumstances (please correct if any of the following are incorrect):
Normal Distribution = True and Homogeneity of Variance = True
scipy.stats.ttest_ind(sample_1, sample_2)
Normal Distribution = True and Homogeneity of Variance = False
scipy.stats.ttest_ind(sample_1, sample_2, equal_var = False)
Normal Distribution = False and Homogeneity of Variance = True
scipy.stats.mannwhitneyu(sample_1, sample_2)
Normal Distribution = False and Homogeneity of Variance = False
???
Fast answer:
Normal Distribution = False and Homogeneity of Variance = False and sample sizes > 30-50
scipy.stats.ttest_ind(sample1, sample2, equal_var=False)
Good answer:
If you check the Central limit theorem, it says (from Wikipedia): "In probability theory, the central limit theorem (CLT) states that, given certain conditions, the arithmetic mean of a sufficiently large number of iterates of independent random variables, each with a well-defined (finite) expected value and finite variance, will be approximately normally distributed, regardless of the underlying distribution"
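To see what the quoted statement means in practice, here is a minimal simulation sketch. The exponential population, the sample sizes 5 and 50, the number of repetitions, and the helper name sample_means are all illustrative assumptions, not part of the question. It draws many samples from a strongly skewed population and measures how skewed the resulting sample means are.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # fixed seed, for reproducibility only


def sample_means(n, reps=5000):
    """Means of `reps` independent samples of size `n` from a skewed
    (exponential) population. Hypothetical helper for illustration."""
    return rng.exponential(scale=1.0, size=(reps, n)).mean(axis=1)


# Skewness of the distribution of the sample mean; 0 would be symmetric
skew_small = stats.skew(sample_means(5))
skew_large = stats.skew(sample_means(50))

print(f"skewness of sample means, n=5:  {skew_small:.3f}")
print(f"skewness of sample means, n=50: {skew_large:.3f}")
```

The skewness for n=50 should come out noticeably closer to zero than for n=5, which is the sense in which the mean becomes "approximately normal" as the sample grows. How large n needs to be depends on how skewed or heavy-tailed the population is.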
So, even if your population is not normally distributed, if each sample is big enough (a common rule of thumb is more than 30 to 50 observations), the sample mean will be approximately normally distributed. So, you can use:
scipy.stats.ttest_ind(sample1, sample2, equal_var=False)
This is a two-sided test for the null hypothesis that two independent samples have identical average (expected) values. With the option equal_var=False it performs Welch's t-test, which does not assume equal population variances.
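As a rough illustration, the call can be tried on made-up data. The exponential samples, their sizes (80 and 120), and the scale values below are assumptions chosen only to give two skewed samples of unequal size with different variances.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)  # fixed seed, for reproducibility only

# Hypothetical data: two skewed samples of unequal size and unequal variance
sample1 = rng.exponential(scale=2.0, size=80)
sample2 = rng.exponential(scale=3.0, size=120)

# Welch's t-test: does not assume equal population variances
t_stat, p_value = stats.ttest_ind(sample1, sample2, equal_var=False)

# Student's t-test with pooled variance, shown only for comparison
t_pooled, p_pooled = stats.ttest_ind(sample1, sample2, equal_var=True)

print(f"Welch:  t = {t_stat:.3f}, p = {p_value:.4f}")
print(f"Pooled: t = {t_pooled:.3f}, p = {p_pooled:.4f}")
```

A small p-value is evidence against the null hypothesis of equal means. The two variants can give different results when the variances and sample sizes both differ, which is why equal_var=False is the safer choice here.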