From 08813f9b9fc355fa647a63c16151d177b8679e39 Mon Sep 17 00:00:00 2001
From: "Jip J. Dekker"
Date: Fri, 25 May 2018 14:57:37 +1000
Subject: [PATCH] Move overfitting footnote to its first occurrence

---
 mini_proj/report/waldo.tex | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/mini_proj/report/waldo.tex b/mini_proj/report/waldo.tex
index 4c1881c..36e76bb 100644
--- a/mini_proj/report/waldo.tex
+++ b/mini_proj/report/waldo.tex
@@ -149,9 +149,11 @@
   (binary) tree. Each non-leaf node contains a selection criterion for its
   branches. Every leaf node contains the class that will be assigned to the
   instance if the node is reached. In other training methods, decision trees
-  have the tendency to overfit, but in a random forest a multitude of decision
-  trees is trained with a degree of randomness and the mean of these trees is
-  used, which avoids this problem.
+  have the tendency to overfit\footnote{Overfitting occurs when a model learns
+  from the data too specifically, and loses its ability to generalise its
+  predictions for new data (resulting in a loss of prediction accuracy).}, but
+  in a random forest a multitude of decision trees is trained with a degree of
+  randomness and the mean of these trees is used, which avoids this problem.
 
 \subsection{Neural Network Architectures}
 
@@ -233,17 +235,14 @@
   models, requiring training on a dataset of typical images. Each network was
   trained using the preprocessed training dataset and labels, for 25 epochs
   (one forward and backward pass of all data) in batches of 150. The number of
-  epochs was chosen to maximise training time and prevent
-  overfitting\footnote{Overfitting occurs when a model learns from the data
-  too specifically, and loses its ability to generalise its predictions for
-  new data (resulting in loss of prediction accuracy)} of the training data,
-  given current model parameters. The batch size is the number of images sent
-  through each pass of the network. Using the entire dataset would train the
-  network quickly, but decrease the network's ability to learn unique features
-  from the data. Passing one image at a time may allow the model to learn more
-  about each image, however it would also increase the training time and risk
-  of overfitting the data. Therefore the batch size was chosen to maintain
-  training accuracy while minimising training time.
+  epochs was chosen to maximise training time and prevent overfitting of the
+  training data, given the current model parameters. The batch size is the
+  number of images sent through each pass of the network. Using the entire
+  dataset would train the network quickly, but decrease the network's ability
+  to learn unique features from the data. Passing one image at a time may allow
+  the model to learn more about each image; however, it would also increase the
+  training time and the risk of overfitting the data. Therefore, the batch size
+  was chosen to maintain training accuracy while minimising training time.
 
 \subsection{Neural Network Testing}\label{nnTesting}
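
The mini-batch regime the revised paragraph describes (25 epochs, batches of 150 images) can be sketched in plain Python. This is an illustrative sketch only: `n_images` is a hypothetical dataset size, since the patched text does not state one.

```python
# Sketch of the mini-batch scheme described above: 25 epochs, batches of 150.
# `n_images` is a hypothetical dataset size (not taken from the report).
n_images, batch_size, epochs = 6000, 150, 25

def batches(n, size):
    """Yield (start, end) index ranges covering one epoch in mini-batches."""
    for start in range(0, n, size):
        yield start, min(start + size, n)

# One forward and backward pass per batch; one epoch covers the whole dataset.
steps_per_epoch = sum(1 for _ in batches(n_images, batch_size))
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 40 batches per epoch (6000 / 150)
print(total_updates)    # 1000 weight updates over 25 epochs
```

Larger batches mean fewer, smoother weight updates per epoch; a batch size of one maximises updates but, as the text notes, increases training time and the risk of overfitting.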