# Hypothesis Testing (2 Sample T-test)

I recently created a template for performing a 2 sample T-test to determine whether changes in a process are statistically significant. Along the way, I check whether the normality and equal variance assumptions are valid. I also provide 3 different T-tests, depending on the type of data you have.

The data consists of student test scores: 20 samples taken before the change and 20 taken after.

You can find the complete Jupyter notebook [here](https://dtucker.xyz/projects/T-test_hypothesis.html).

---

**Null Hypothesis (H0)**: The mean score is the same in both samples (first 20 samples = last 20 samples).  
**Alternative Hypothesis (H1)**: The mean scores of the two samples are different.

---

The first step after importing the data is to create some visuals that make the data easier to see.

[Interactive plotly point plot](https://dtucker.xyz/projects/hypot_point.html)

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1717337594620/922f7f81-a6d2-414d-aa05-76dca62197b5.png align="center")

[Plotly line chart](https://dtucker.xyz/projects/hypot_line.html)

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1717337602570/b7c906c0-4e5f-4958-91c1-d4842a8509d0.png align="center")

[Plotly boxplot](https://dtucker.xyz/projects/hypot_box.html)

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1717337606314/da805653-77ee-4cc2-bdfa-7abdd3d2369b.png align="center")

[Plotly histogram](https://dtucker.xyz/projects/hypot_histogram.html)

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1717337609052/7447aaaf-0c2c-4740-ad2d-38286339f571.png align="center")

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1717337617885/fced55c5-6c3c-4755-a370-4f8ce5bb951a.png align="center")

This last chart makes it easier to visualize the 2 samples over time, in a control chart type format.

After checking for normality and equal variance, I perform the hypothesis test, then pass the p-value to a function that checks it against alpha and returns an interpretation.
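The assumption checks themselves live in the notebook and are not reproduced here. As a rough sketch of how they could be done, the snippet below uses SciPy's Shapiro-Wilk test for normality and Levene's test for equal variance, then picks between Student's and Welch's T-test. The scores are made up for illustration, and the helper name `check_assumptions` is hypothetical; the notebook may use different tests or thresholds.

```python
import scipy.stats as st

# Illustrative scores only (NOT the data used in the notebook).
first_twenty = [72, 68, 75, 70, 66, 74, 71, 69, 73, 67,
                70, 72, 68, 76, 65, 71, 69, 74, 70, 73]
last_twenty = [78, 74, 80, 77, 73, 81, 76, 75, 79, 72,
               77, 78, 74, 82, 71, 76, 75, 80, 77, 79]


def check_assumptions(a, b, alpha=0.05):
    """Hypothetical helper: return (normal_ok, equal_var_ok) for two samples."""
    # Shapiro-Wilk: H0 is that the sample was drawn from a normal distribution.
    _, p_norm_a = st.shapiro(a)
    _, p_norm_b = st.shapiro(b)
    normal_ok = bool(p_norm_a > alpha and p_norm_b > alpha)

    # Levene: H0 is that both samples have the same variance.
    _, p_var = st.levene(a, b)
    equal_var_ok = bool(p_var > alpha)
    return normal_ok, equal_var_ok


normal_ok, equal_var_ok = check_assumptions(first_twenty, last_twenty)
print(f'Normality plausible: {normal_ok}')
print(f'Equal variance plausible: {equal_var_ok}')

# equal_var=True runs Student's T-test; equal_var=False runs Welch's T-test.
t_score, p_value = st.ttest_ind(first_twenty, last_twenty,
                                equal_var=equal_var_ok)
print(f'T-score: {t_score:.3f}, p-value: {p_value:.5f}')
```

If the before and after scores came from the same students in the same order, a paired test (`st.ttest_rel`) would likely be the better fit than either independent variant.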

```python
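import scipy.stats as st  # assumed alias; df_new and p_value_reader come from earlier notebook cells

# Perform an independent 2 sample t-test
t_score, p_value = st.ttest_ind(a=df_new['first_twenty'],
                                b=df_new['last_twenty'],
                                alternative='two-sided')  # use 'less' or 'greater' for a one-sided hypothesis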

print(f'T-score: {t_score}')
p_value_reader(p_value, alpha=0.05)
```

![](https://cdn.hashnode.com/res/hashnode/image/upload/v1717338143485/c3fdb076-f541-4117-9f28-e0a172b88fe7.png align="center")
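The `p_value_reader` helper called above is defined in the notebook and is not shown in this post. A minimal sketch of what such a helper might look like, assuming it simply compares the p-value to alpha and prints a plain-language reading (the message wording and return value here are my own guesses, not the notebook's):

```python
def p_value_reader(p_value, alpha=0.05):
    """Hypothetical stand-in: print and return a reading of the p-value."""
    if p_value < alpha:
        message = (f'p-value {p_value:.4f} < alpha {alpha}: '
                   'reject the null hypothesis.')
    else:
        message = (f'p-value {p_value:.4f} >= alpha {alpha}: '
                   'fail to reject the null hypothesis.')
    print(message)
    return message


# Example calls with arbitrary p-values.
p_value_reader(0.01)
p_value_reader(0.20)
```

Returning the message as well as printing it keeps the helper easy to reuse in a report or a unit test.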

**Conclusion:** Reject the null hypothesis; there is a statistically significant difference in the means.
