Catapult Neural Network

Predicting the probability of hitting a target

Project Description

Over several years, mechanical engineering students have collected and recorded data related to design parameters and device performance for a catapult. The task is to create a robust neural network model in Python to predict the likelihood of hitting the target for various input variables.

The following input parameters were recorded for each catapult design:

  • Arm length

  • Ball weight

  • Ball radius

  • Temperature

  • Elastic constant of spring

  • Weight of device

The performance was recorded as either hitting (1) or missing (0) the target. Two datasets were investigated since the catapult design specification was changed at one point.

The complexity of the relationship between the input parameters and the likelihood of hitting the target is illustrated by the pair plot and correlation heatmap for Dataset1 below.
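The plotting code is not part of the snippet below; as a rough sketch, plots of this kind can be produced with seaborn once the dataset has been loaded into a dataframe df (as in the snippet that follows):

import seaborn as sns
import matplotlib.pyplot as plt

sns.pairplot(df, hue='Target hit')                    #Pairwise scatter plots of all parameters, coloured by hit/miss.
plt.show()

sns.heatmap(df.corr(), annot=True, cmap='coolwarm')   #Correlation heatmap of all columns.
plt.show()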

Below is a snippet of the Python code used for Dataset1 (a similar structure was used for Dataset2). Several libraries were used: Pandas for data ingestion and manipulation, NumPy for numerical operations and array handling, scikit-learn for machine-learning utilities, and Keras for building the neural network.

Building and training the neural network:

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from keras.models import Sequential 
from keras.layers import Dense
from keras.utils import to_categorical   #Explicit import instead of a wildcard; only to_categorical is needed.
from google.colab import files

uploaded=files.upload()            #upload dataset to colab environment
df = pd.read_csv('dataset1.csv')   #Put the dataset into a dataframe variable

#Input Parameters (unscaled)
armLength=np.array(df['Arm length (m)'][:])
ballWeight=np.array(df['Ball weight (kg)'][:])
ballRadius=np.array(df['Ball radius (mm)'][:])
airTemp=np.array(df['Air temperature (deg C)'][:])
springConst=np.array(df['Spring constant (N per m)'][:])
devWeight=np.array(df['Device weight (kg)'][:])

#Output
output=np.array(df['Target hit'][:])

#These two lines will be useful at the very end for applying the model to the whole dataset.
merged_inputs_all=np.column_stack([armLength, ballWeight, ballRadius, airTemp, springConst, devWeight])
output_all_binary=to_categorical(output)

#Split all the data into training and testing data
armLength_tr, armLength_te, ballWeight_tr, ballWeight_te, ballRadius_tr, ballRadius_te, \
airTemp_tr, airTemp_te, springConst_tr, springConst_te, devWeight_tr, devWeight_te, \
output_tr, output_te = train_test_split(armLength, ballWeight, ballRadius, airTemp,
                                        springConst, devWeight, output, train_size=0.8)
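#(An equivalent, more compact alternative - a sketch, not the code that was used:
# stack the six inputs into one array first, then call train_test_split once.)
#inputs_all = np.column_stack([armLength, ballWeight, ballRadius, airTemp, springConst, devWeight])
#inputs_tr, inputs_te, output_tr, output_te = train_test_split(inputs_all, output, train_size=0.8)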

#Scale each input training parameter (scale using (x-mu)/sigma)
scales=np.ones((2,6))   #array to store scaling parameters. First row has mu, second row has sigma

scales[0,0]=np.mean(armLength_tr)
scales[1,0]=np.std(armLength_tr)
armLength_tr=(armLength_tr-scales[0,0])/scales[1,0]

#Repeat for all input parameters
#.
#.
#.
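#(A compact way to handle the remaining five parameters in one loop - a sketch, not the
# original code, which repeated the three lines above for each parameter.)
remaining = [ballWeight_tr, ballRadius_tr, airTemp_tr, springConst_tr, devWeight_tr]
for col, x in enumerate(remaining, start=1):     #Columns 1-5 of 'scales' store each parameter's mu and sigma.
    scales[0, col] = np.mean(x)
    scales[1, col] = np.std(x)
    remaining[col - 1] = (x - scales[0, col]) / scales[1, col]
ballWeight_tr, ballRadius_tr, airTemp_tr, springConst_tr, devWeight_tr = remaining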

#Start building the neural network
model = Sequential()

#Layers
model.add(Dense(units=6, activation='relu', input_dim=6)) 
model.add(Dense(units=24, activation='relu')) 
model.add(Dense(units=2, activation='softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])

#Fit the model to the training data
output_tr_binary=to_categorical(output_tr)    #Splits the output into two columns, first column=missing, 2nd column=hitting.
merged_inputs_tr=np.column_stack([armLength_tr, ballWeight_tr, ballRadius_tr, airTemp_tr, springConst_tr, devWeight_tr]) #Merge all training inputs into one array.

model.fit(merged_inputs_tr, output_tr_binary, epochs=250, batch_size=32)
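
#(Optional sketch, not the original code: validation_split makes the seen/unseen gap
# visible per epoch, which helps spot the overfitting discussed in the results below.)
#model.fit(merged_inputs_tr, output_tr_binary, epochs=250, batch_size=32, validation_split=0.2)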

Testing the neural network on unseen data:

#Apply model on testing data
#First scale the testing data using the scaling parameters calculated from the training data.
scaledarmLength_te=(armLength_te-scales[0,0])/scales[1,0]
scaledballWeight_te=(ballWeight_te-scales[0,1])/scales[1,1]
scaledballRadius_te=(ballRadius_te-scales[0,2])/scales[1,2]
scaledairTemp_te=(airTemp_te-scales[0,3])/scales[1,3]
scaledspringConst_te=(springConst_te-scales[0,4])/scales[1,4]
scaleddevWeight_te=(devWeight_te-scales[0,5])/scales[1,5]

merged_inputs_te=np.column_stack([scaledarmLength_te, scaledballWeight_te, scaledballRadius_te, scaledairTemp_te, scaledspringConst_te, scaleddevWeight_te]) #Merge all testing inputs into one array.

output_predicted=model.predict(merged_inputs_te)

#Compare predicted output to real recorded output
output_te_binary=to_categorical(output_te)    #Convert the real testing output to two columns so it can be compared to the predicted output which is also two columns.

number_correct = 0
#Need to round because output_predicted is in probabilities but we just want 1s or 0s.
for i in range(len(output_te_binary)):
    if np.round(output_te_binary[i, 0]) == np.round(output_predicted[i, 0]):
        number_correct=number_correct+1

percent_correct = 100 * number_correct / len(output_predicted)
print('The percentage of correctly predicted outputs (unseen) was: {:.1f}%'.format(percent_correct))
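
#(Equivalent vectorized check - a sketch, not the original code: the predicted class
# is the index of the larger of the two probabilities in each row.)
#percent_correct = 100 * np.mean(np.argmax(output_predicted, axis=1) == np.argmax(output_te_binary, axis=1))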

#Calculate performance on seen data too
output_predicted_seen=model.predict(merged_inputs_tr)

number_correct_seen=0
for j in range(len(output_tr_binary)):
    if np.round(output_tr_binary[j, 0]) == np.round(output_predicted_seen[j, 0]):
        number_correct_seen=number_correct_seen+1

percent_correct_seen = 100 * number_correct_seen / len(output_predicted_seen)
print('The percentage of correctly predicted outputs (seen) was: {:.1f}%'.format(percent_correct_seen))

#Calculate performance on all data (compare to 'checking' script)
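#NumPy broadcasting applies each column's mu (row 0) and sigma (row 1) across every row of the merged array.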
scaled_merged_inputs_all = (merged_inputs_all-scales[0,:])/scales[1,:]
output_predicted_all=model.predict(scaled_merged_inputs_all)

number_correct_all=0
for k in range(len(output_all_binary)):
    if np.round(output_all_binary[k, 0]) == np.round(output_predicted_all[k, 0]):
        number_correct_all=number_correct_all+1

percent_correct_all = 100 * number_correct_all / len(output_predicted_all)
print('The percentage of correctly predicted outputs (all) was: {:.1f}%'.format(percent_correct_all))

#Save the model.
model.save('firgau-carlos-1.h5')        #This saves the model to colab.
files.download('firgau-carlos-1.h5')    #This downloads the model to your personal computer.

#Save the scaling parameters
np.savetxt('firgau-carlos-1.txt', scales)              #This saves the scaling parameters to Colab.
files.download('firgau-carlos-1.txt')                  #This downloads to your personal computer.
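
Reloading the model later (a minimal sketch, not part of the original script; raw_inputs stands for a hypothetical array of new design parameters, one row per design, columns in the order listed above):

from keras.models import load_model

model = load_model('firgau-carlos-1.h5')       #Reload the trained model.
scales = np.loadtxt('firgau-carlos-1.txt')     #Reload mu (first row) and sigma (second row).

#raw_inputs is a hypothetical (n, 6) array of new design parameters (an assumption for illustration).
scaled_inputs = (raw_inputs - scales[0, :]) / scales[1, :]    #Scale with the training statistics.
hit_probability = model.predict(scaled_inputs)[:, 1]          #Second column = probability of hitting.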

Results:

  • Dataset1

The input parameters were scaled and 80% of the data was assigned to a training set, with the remaining 20% held out for testing, a compromise between sufficient training data and a representative final test. Two hidden ReLU layers were used, the first with 6 units (matching the number of input parameters, which produced better results) and the second with 24 units, followed by a final softmax output layer. 250 epochs with a batch size of 32 were sufficient; more epochs improved performance on seen data but not on unseen data, i.e. caused overfitting. 94.3% of the unseen data was correctly classified.

  • Dataset2

The model from Dataset1 was adapted for Dataset2. The number of epochs was increased to 1000 (relatively high, since this data was harder to fit), but this alone only improved performance on seen data, not on unseen data, so more layers were added: four hidden ReLU layers with 6, 18, 36 and 18 units respectively, followed by a final softmax output layer. Extensive testing showed that this increasing-then-decreasing unit count in the hidden layers produced the best results, correctly classifying 86.7% of the unseen data, a lower success rate than for Dataset1 despite the more complex network.
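
For reference, below is a sketch of the Dataset2 architecture described above (an assumption: the compile settings and batch size are taken to be the same as for Dataset1; this is not the verbatim project code):

model2 = Sequential()
model2.add(Dense(units=6, activation='relu', input_dim=6))
model2.add(Dense(units=18, activation='relu'))
model2.add(Dense(units=36, activation='relu'))
model2.add(Dense(units=18, activation='relu'))
model2.add(Dense(units=2, activation='softmax'))

model2.compile(loss='categorical_crossentropy',
               optimizer='sgd',
               metrics=['accuracy'])
model2.fit(merged_inputs_tr, output_tr_binary, epochs=1000, batch_size=32)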
