Spaceship Titanic classification model
Random Forest and XGBoost classification models in Python, predicting which passengers will be transported to an alternate dimension.

I am an Industrial Engineer using Python to gain deeper insights from data.
I am currently learning deep learning with TensorFlow.
Using the Kaggle Spaceship Titanic dataset, I built machine learning models in Python to predict which passengers would be transported to an alternate dimension.
Check out the Kaggle notebook
Data Exploration
The first step was to load the data and explore it.
A pair plot is a helpful visual to spot correlations.
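
As a numeric companion to the pair plot, a correlation matrix gives the same pairwise view as numbers. The sketch below is not taken from the notebook: it uses synthetic data with hypothetical column names, only to show how `DataFrame.corr()` surfaces a relationship that a pair plot would show as a diagonal band.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; the column names and values are hypothetical.
rng = np.random.default_rng(0)
age = rng.integers(1, 80, size=200)
df_demo = pd.DataFrame({
    "Age": age,
    "Spend": age * 10 + rng.normal(0, 50, size=200),  # built to track Age
    "Noise": rng.normal(0, 1, size=200),              # unrelated to Age
})

# Pearson correlation between every pair of numeric columns.
corr = df_demo.corr()
print(corr.round(2))
```

Values near 1 or -1 mark column pairs worth a closer look in the plot; values near 0 suggest no linear relationship.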

Here is a visual of the age of those transported

During the data exploration I used the pandas get_dummies function to one-hot encode the 'HomePlanet' and 'Destination' columns. I also applied other steps to clean and transform the data.
df_data_final = pd.get_dummies(df_data_final, columns=['HomePlanet', 'Destination'], drop_first=False)
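
To make the effect of that call concrete, here is a minimal sketch on a three-row toy frame. The rows are made up for illustration and `df_toy` is a hypothetical name; only the `get_dummies` arguments mirror the line above.

```python
import pandas as pd

# Toy rows; the values are made up for illustration.
df_toy = pd.DataFrame({
    "HomePlanet": ["Earth", "Europa", "Mars"],
    "Destination": ["TRAPPIST-1e", "55 Cancri e", "TRAPPIST-1e"],
    "Age": [27, 41, 19],
})

# Each category becomes its own indicator column.
# drop_first=False keeps every category instead of dropping one as a baseline.
df_encoded = pd.get_dummies(
    df_toy, columns=["HomePlanet", "Destination"], drop_first=False
)
print(df_encoded.columns.tolist())
```

The original 'HomePlanet' and 'Destination' columns are replaced by indicator columns named with the pattern `<column>_<category>`, while untouched columns such as 'Age' pass through unchanged.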
I created a Tableau dashboard to help visualize the data. This wasn't strictly necessary, but it helped me see and interact with the data.

Creating the model
The target variable is the 'Transported' column; the remaining columns are the features.
# Define the y (target) variable.
y = df_data_final['Transported']
# Define the X (predictor) variables.
X = df_data_final.drop(['Transported'], axis=1)
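
From a y and X defined this way, one common next step is to hold out part of the rows, fit a classifier, and score it on the held-out part. The sketch below shows that pattern with scikit-learn's RandomForestClassifier on synthetic data. The split ratio, hyperparameters, feature names, and the `df_demo` frame are all assumptions for illustration, not the notebook's actual settings.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for df_data_final; the feature names are hypothetical.
rng = np.random.default_rng(42)
n = 500
df_demo = pd.DataFrame({
    "feature_a": rng.normal(size=n),
    "feature_b": rng.normal(size=n),
})
df_demo["Transported"] = df_demo["feature_a"] + df_demo["feature_b"] > 0

y = df_demo["Transported"]
X = df_demo.drop(columns=["Transported"])

# Hold out 20% of the rows for evaluation (the ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Fit on the training rows only (hyperparameters are assumptions).
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Accuracy is the fraction of held-out rows predicted correctly.
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Held-out accuracy: {test_accuracy:.3f}")
```

XGBoost's XGBClassifier follows the same fit/predict pattern, so it can be swapped in for the model line when comparing the two.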
Results
Random Forest Model Accuracy: 0.781048758049678
XGBoost Model Accuracy: 0.7828886844526219
The XGBoost model performed slightly better on the training data, by about 0.002 in accuracy (roughly 0.2 percentage points).
If you are interested, the Kaggle notebook can be found here



