What scipy statistical test do I use to compare sample means?

Assuming sample sizes are not equal, what test do I use to compare sample means under the following circumstances (please correct if any of the following are incorrect):

Normal Distribution = True and Homogeneity of Variance = True

scipy.stats.ttest_ind(sample_1, sample_2)

Normal Distribution = True and Homogeneity of Variance = False

scipy.stats.ttest_ind(sample_1, sample_2, equal_var = False)

Normal Distribution = False and Homogeneity of Variance = True

scipy.stats.mannwhitneyu(sample_1, sample_2)

Normal Distribution = False and Homogeneity of Variance = False

???
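The two conditions in the cases above are usually judged from the data itself. A rough sketch of how that could look, assuming Shapiro-Wilk for normality, Levene's test for equal variances and a 0.05 cutoff (none of these choices come from the question, and check_assumptions is a hypothetical helper name):

```python
import numpy as np
from scipy import stats


def check_assumptions(sample_1, sample_2, alpha=0.05):
    """Return (normal, equal_var) flags for two samples.

    Hypothetical helper: the tests and the alpha cutoff are assumptions.
    """
    # Shapiro-Wilk: the null hypothesis is that the sample is normally distributed,
    # so a small p-value is evidence against normality.
    normal = all(stats.shapiro(s)[1] > alpha for s in (sample_1, sample_2))
    # Levene: the null hypothesis is that both samples have equal variances.
    equal_var = bool(stats.levene(sample_1, sample_2)[1] > alpha)
    return normal, equal_var


# Synthetic data with unequal sample sizes, for illustration only.
rng = np.random.default_rng(0)
sample_1 = rng.normal(loc=0.0, scale=1.0, size=60)
sample_2 = rng.exponential(scale=3.0, size=90)

print(check_assumptions(sample_1, sample_2))
```

A failed rejection is not proof that an assumption holds, so treat the flags as a hint rather than a decision rule.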

Fast answer:

Normal Distribution = False and Homogeneity of Variance = False and sample sizes > 30-50

scipy.stats.ttest_ind(sample1, sample2, equal_var=False)

Good answer:

If you check the Central limit theorem, it says (from Wikipedia): "In probability theory, the central limit theorem (CLT) states that, given certain conditions, the arithmetic mean of a sufficiently large number of iterates of independent random variables, each with a well-defined (finite) expected value and finite variance, will be approximately normally distributed, regardless of the underlying distribution"
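The quoted statement can be checked with a small simulation. This is only a sketch: the exponential distribution, the sample sizes and the number of repeats are arbitrary choices made for illustration, not something taken from the question.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)


def sample_means(n, repeats=20000):
    """Means of `repeats` independent samples of size n from a skewed population."""
    draws = rng.exponential(scale=1.0, size=(repeats, n))
    return draws.mean(axis=1)


# Skewness of the distribution of the sample mean; 0 would be perfectly symmetric.
skew_by_n = {n: float(stats.skew(sample_means(n))) for n in (2, 10, 50)}

for n, skew in skew_by_n.items():
    print(f"n = {n:3d}  skewness of sample means = {skew:.3f}")
```

The skewness shrinks as n grows, but it does not vanish at n = 50, so for strongly skewed data the "30 to 50" rule is an approximation rather than a guarantee.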

So, although your population is not normally distributed, if each sample is big enough (a common rule of thumb is more than 30 to 50 observations), then the sample means will be approximately normally distributed. So, you can use:

scipy.stats.ttest_ind(sample1, sample2, equal_var=False)

This is a two-sided test for the null hypothesis that two independent samples have identical average (expected) values. With the option equal_var=False it performs Welch's t-test, which does not assume equal population variances.
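To make clear what equal_var=False changes, here is a minimal sketch that runs the call on synthetic skewed data with unequal sizes and variances, then recomputes the Welch statistic and the Welch-Satterthwaite degrees of freedom by hand. The data and the variable names other than sample1 and sample2 are made up for the example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
# Non-normal (exponential) samples with different sizes and variances.
sample1 = rng.exponential(scale=1.0, size=80)
sample2 = rng.exponential(scale=2.0, size=120)

t_stat, p_value = stats.ttest_ind(sample1, sample2, equal_var=False)

# Manual Welch's t-test for comparison.
n1, n2 = len(sample1), len(sample2)
v1 = sample1.var(ddof=1) / n1  # squared standard error of mean 1
v2 = sample2.var(ddof=1) / n2  # squared standard error of mean 2
t_manual = (sample1.mean() - sample2.mean()) / np.sqrt(v1 + v2)
# Welch-Satterthwaite approximation of the degrees of freedom.
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
p_manual = 2 * stats.t.sf(abs(t_manual), df)  # two-sided p-value

print(f"scipy : t = {t_stat:.4f}, p = {p_value:.4g}")
print(f"manual: t = {t_manual:.4f}, p = {p_manual:.4g}, df = {df:.1f}")
```

Both lines should agree, which shows that the per-sample variances are never pooled.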
