I recently created a template for performing a two-sample t-test to determine whether changes in a process are statistically significant. Along the way, I check whether the normality and equal-variance assumptions are valid. I also provide three different t-tests, depending on the type of data you have.
The data are student test scores: 20 samples taken before the change and 20 taken after.
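As a rough illustration of what three t-test variants can look like in SciPy, here is a minimal sketch. It assumes the three are Student's t-test, Welch's t-test, and a paired t-test; that is a common trio, but the text above does not confirm it. The scores are randomly generated placeholders, not the notebook's data, and the names `before` and `after` are hypothetical.

```python
import numpy as np
from scipy import stats as st

# Made-up scores for illustration only; not the data from the notebook.
rng = np.random.default_rng(seed=0)
before = rng.normal(loc=70, scale=8, size=20)
after = rng.normal(loc=75, scale=8, size=20)

# 1. Student's t-test: independent samples, equal variances assumed.
student = st.ttest_ind(before, after, equal_var=True)

# 2. Welch's t-test: independent samples, variances not assumed equal.
welch = st.ttest_ind(before, after, equal_var=False)

# 3. Paired t-test: the same subjects measured twice (before and after).
paired = st.ttest_rel(before, after)

for name, result in [("Student", student), ("Welch", welch), ("Paired", paired)]:
    print(f"{name}: t = {result.statistic:.3f}, p = {result.pvalue:.4f}")
```

Which variant fits depends on the data: the paired test only makes sense when each "before" value belongs to the same subject as the matching "after" value, and Welch's test is the safer choice when the equal-variance check fails.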
You can find the complete Jupyter notebook here.
Null Hypothesis (H0): The mean scores of the two samples are the same (first 20 samples = last 20 samples).
Alternative Hypothesis (H1): The mean scores of the two samples are different.
The first step after importing the data is to create some visuals that help show the data.
This last chart makes it easier to visualize the two samples over time, in a control-chart-style format.
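Before running the test, the normality and equal-variance assumptions need a check. The sketch below shows one way to do that, assuming the Shapiro-Wilk test for normality and Levene's test for equal variance; the post does not say which tests the notebook uses, and `check_assumptions` is a hypothetical helper name.

```python
from scipy import stats as st


def check_assumptions(sample_a, sample_b, alpha=0.05):
    """Return whether normality and equal variance look plausible.

    Hypothetical helper for illustration; not taken from the notebook.
    """
    # Shapiro-Wilk: H0 is that the sample comes from a normal distribution.
    _, p_norm_a = st.shapiro(sample_a)
    _, p_norm_b = st.shapiro(sample_b)

    # Levene: H0 is that both samples have equal variances.
    _, p_var = st.levene(sample_a, sample_b)

    # A p-value above alpha means we fail to reject H0, so the
    # assumption is treated as plausible (not proven).
    return {
        "normal_a": bool(p_norm_a > alpha),
        "normal_b": bool(p_norm_b > alpha),
        "equal_variance": bool(p_var > alpha),
    }
```

With the notebook's columns, a call would look like `check_assumptions(df_new['first_twenty'], df_new['last_twenty'])`. If `equal_variance` comes back `False`, passing `equal_var=False` to `st.ttest_ind` switches to Welch's test.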
After checking for normality and equal variance, I perform the hypothesis test and pass the result to a function that checks the p-value and returns an interpretation.
# Perform t-test
import scipy.stats as st

t_score, p_value = st.ttest_ind(a=df_new['first_twenty'],
                                b=df_new['last_twenty'],
                                alternative='two-sided')  # use 'greater' or 'less' for a one-sided hypothesis
print(f'T-score: {t_score}')
p_value_reader(p_value, alpha=0.05)
Conclusion: Reject the null hypothesis; there is a difference in the means.
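The `p_value_reader` helper called above is defined in the notebook and not shown in this post. A minimal sketch of what such a function might look like, matching only the call signature used above (the message wording and the return value are assumptions):

```python
def p_value_reader(p_value, alpha=0.05):
    """Print and return a plain-language reading of a p-value.

    Hypothetical reconstruction; the notebook's version may differ.
    """
    if p_value < alpha:
        # Evidence against H0 at the chosen significance level.
        message = (f"p-value {p_value:.4f} < alpha {alpha}: "
                   "reject the null hypothesis.")
    else:
        # Not enough evidence; this does not prove H0 is true.
        message = (f"p-value {p_value:.4f} >= alpha {alpha}: "
                   "fail to reject the null hypothesis.")
    print(message)
    return message
```

Keeping the comparison in one place means the same `alpha` is applied consistently across the assumption checks and the final test.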