Merge branch 'master' of github.com:Dekker1/ResearchMethods
commit 80125706f4
@ -3,3 +3,6 @@ svm,7.871559143066406,0.8446601941747572
tree,0.25446152687072754,0.7087378640776699
naive_bayes,0.12949371337890625,0.8252427184466019
forest,0.2792677879333496,0.9514563106796117
lenet,58.12968325614929,0.8980582524271845
cnn,113.81168508529663,0.9563106796116505
fcn,117.69003772735596,0.9466019417475728
BIN mini_proj/report/LeNet.jpg (new file, 38 KiB)
Binary file not shown.
@ -137,3 +137,8 @@ month={Nov},}
 pages={2825--2830},
 year={2011}
 }
+@misc{kaggle,
+  title = {Kaggle: The Home of Data Science \& Machine Learning},
+  howpublished = {\url{https://www.kaggle.com/}},
+  note = {Accessed: 2018-05-25}
+}
@ -50,7 +50,7 @@
 Almost every child around the world knows about ``Where's Waldo?'', also
 known as ``Where's Wally?'' in some countries. This famous puzzle book has
 spread its way across the world and is published in more than 25 different
-languages. The idea behind the books is to find the character ``Waldo'',
+languages. The idea behind the books is to find the character Waldo,
 shown in \Cref{fig:waldo}, in the different pictures in the book. This is,
 however, not as easy as it sounds. Every picture in the book is full of tiny
 details and Waldo is only one out of many. The puzzle is made even harder by
@ -64,7 +64,7 @@
 \includegraphics[scale=0.35]{waldo.png}
 \centering
 \caption{
-    A headshot of the character ``Waldo'', or ``Wally''. Pictures of Waldo
+    A headshot of the character Waldo, or Wally. Pictures of Waldo are
 copyrighted by Martin Handford and are used under the fair-use policy.
 }
 \label{fig:waldo}
@ -82,7 +82,7 @@
 setting that is humanly tangible. In this report we will try to identify
 Waldo in the puzzle images using different classification methods. Every
 image will be split into different segments and every segment will have to
-be classified as either being ``Waldo'' or ``not Waldo''. We will compare
+be classified as either being Waldo or not Waldo. We will compare
 various classification methods, from more classical machine
 learning, like naive Bayes classifiers, to the current state of the art,
 Neural Networks. In \Cref{sec:background} we will introduce the different
@ -158,16 +158,30 @@
 of randomness and the mean of these trees is used, which avoids this problem.

 \subsection{Neural Network Architectures}
-\tab There are many well established architectures for Neural Networks depending on the task being performed.
-In this paper, the focus is placed on convolution neural networks, which have been proven to effectively classify images \cite{NIPS2012_4824}.
-One of the pioneering works in the field, the LeNet \cite{726791}architecture, will be implemented to compare against two rudimentary networks with more depth.
-These networks have been constructed to improve on the LeNet architecture by extracting more features, condensing image information, and allowing for more parameters in the network.
-The difference between the two network use of convolutional and dense layers.
-The convolutional neural network contains dense layers in the final stages of the network.
-The Fully Convolutional Network (FCN) contains only one dense layer for the final binary classification step.
-The FCN instead consists of an extra convolutional layer, resulting in an increased ability for the network to abstract the input data relative to the other two configurations.
-\\
-\todo{Insert image of LeNet from slides if time}
+There are many well-established architectures for Neural Networks depending
+on the task being performed. In this paper, the focus is placed on
+convolutional neural networks, which have been proven to effectively
+classify images~\cite{NIPS2012_4824}. One of the pioneering works in the
+field, the LeNet~\cite{726791} architecture, will be implemented to compare
+against two rudimentary networks with more depth. These networks have been
+constructed to improve on the LeNet architecture by extracting more
+features, condensing image information, and allowing for more parameters in
+the network. The difference between the two networks lies in their use of
+convolutional and dense layers. The convolutional neural network contains
+dense layers in the final stages of the network. The Fully Convolutional
+Network (FCN) contains only one dense layer, for the final binary
+classification step. The FCN instead contains an extra convolutional layer,
+resulting in an increased ability for the network to abstract the input
+data relative to the other two configurations. \\
+
+\begin{figure}[H]
+    \includegraphics[scale=0.50]{LeNet}
+    \centering
+    \captionsetup{width=0.90\textwidth}
+    \caption{Representation of the LeNet Neural Network model architecture,
+    including convolutional layers and pooling (subsampling)
+    layers~\cite{726791}}
+    \label{fig:LeNet}
+\end{figure}
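The claim that the deeper networks "allow for more parameters" can be made concrete with a small sketch of how trainable-parameter counts are computed per layer. The layer sizes below follow the classic LeNet-5 paper (32$\times$32 grayscale input, ten classes); they are illustrative, not the exact configuration trained on the 64$\times$64 RGB patches in this report:

```python
# Sketch: counting trainable parameters of LeNet-style layers.
# Sizes follow the classic LeNet-5 paper (fully connected C3 variant);
# the 64x64 RGB input used in this report would change the flattened size.

def conv_params(kh, kw, c_in, c_out):
    """One (kh x kw x c_in) kernel plus a bias per output channel."""
    return (kh * kw * c_in + 1) * c_out

def dense_params(n_in, n_out):
    """A weight per input plus one bias, per output unit."""
    return (n_in + 1) * n_out

c1 = conv_params(5, 5, 1, 6)        # first conv layer: 156 parameters
c3 = conv_params(5, 5, 6, 16)       # second conv layer: 2416 parameters
f5 = dense_params(5 * 5 * 16, 120)  # flattened 400 inputs -> 48120
f6 = dense_params(120, 84)          # -> 10164
out = dense_params(84, 10)          # output layer -> 850

total = c1 + c3 + f5 + f6 + out
print(total)  # 61706
```

Note how the dense layers dominate the total: swapping a dense layer for an extra convolutional layer, as the FCN does, trades parameter count for spatial abstraction.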

\section{Method} \label{sec:method}

@ -178,11 +192,11 @@
 agreement intended to allow users to freely share, modify, and use [a]
 Database while maintaining [the] same freedom for
 others''~\cite{openData}} hosted on the predictive modeling and analytics
-competition framework, Kaggle. The distinction between images containing
-Waldo, and those that do not, was provided by the separation of the images
-in different sub-directories. It was therefore necessary to preprocess these
-images before they could be utilized by the proposed machine learning
-algorithms.
+competition framework, Kaggle~\cite{kaggle}. The distinction between images
+containing Waldo, and those that do not, was provided by the separation of
+the images in different sub-directories. It was therefore necessary to
+preprocess these images before they could be utilized by the proposed
+machine learning algorithms.

 \subsection{Image Processing} \label{imageProcessing}

@ -197,15 +211,15 @@
 containing the most individual images of the three size groups. \\

 Each of the 64$\times$64 pixel images was inserted into a
-Numpy~\cite{numpy} array of images, and a binary value was inserted into a
+NumPy~\cite{numpy} array of images, and a binary value was inserted into a
 separate list at the same index. These binary values form the labels for
-each image (``Waldo'' or ``not Waldo''). Color normalization was performed
+each image (Waldo or not Waldo). Color normalization was performed
 on each so that artifacts in an image's color profile correspond to
 meaningful features of the image (rather than photographic method).\\
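The report does not specify which color normalization was applied; one common choice is per-channel standardization, sketched here with NumPy as an illustration:

```python
import numpy as np

def normalize_colors(image):
    """Standardize each color channel to zero mean and unit variance,
    so channel statistics reflect image content rather than exposure.
    This is one common normalization, assumed here for illustration."""
    image = image.astype(np.float64)
    mean = image.mean(axis=(0, 1), keepdims=True)  # per-channel mean
    std = image.std(axis=(0, 1), keepdims=True)    # per-channel std
    return (image - mean) / np.maximum(std, 1e-8)  # guard against zero std

# A random 64x64 RGB patch as a stand-in for one dataset image:
patch = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
normalized = normalize_colors(patch)
print(normalized.shape)  # (64, 64, 3)
```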

 Each original puzzle is broken down into many images, and only contains one
 Waldo. Although Waldo might span multiple 64$\times$64 pixel squares, this
-means that the ``non-Waldo'' data far outnumbers the ``Waldo'' data. To
+means that the non-Waldo data far outnumbers the Waldo data. To
 combat the bias introduced by the skewed data, all Waldo images were
 artificially augmented by performing random rotations, reflections, and
 introducing random noise in the image to produce new images. In this way,
@ -215,10 +229,10 @@
 robust methods by exposing each technique to variations of the image during
 the training phase. \\
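A minimal NumPy sketch of such augmentation. The report does not state the rotation angles or noise distribution, so rotations by multiples of 90 degrees and a small Gaussian noise level are assumed here for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image, rng):
    """Produce one augmented variant of a Waldo image: a random rotation
    (multiples of 90 degrees assumed), a random horizontal reflection,
    and additive Gaussian noise (sigma=5 is an illustrative choice)."""
    out = np.rot90(image, k=int(rng.integers(0, 4)))
    if rng.random() < 0.5:
        out = np.fliplr(out)
    noise = rng.normal(0.0, 5.0, size=out.shape)
    return np.clip(out.astype(np.float64) + noise, 0, 255).astype(np.uint8)

# Generate ten new training images from one 64x64 RGB patch:
image = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
variants = [augment(image, rng) for _ in range(10)]
print(len(variants), variants[0].shape)
```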

-Despite the additional data, there were still ten times more ``non-Waldo''
+Despite the additional data, there were still ten times more non-Waldo
 images than Waldo images. Therefore, it was necessary to cull the
-``non-Waldo'' data, so that there was an even split of ``Waldo'' and
-``non-Waldo'' images, improving the representation of true positives in the
+non-Waldo data, so that there was an even split of Waldo and
+non-Waldo images, improving the representation of true positives in the
 image data set. Following preprocessing, the images (and associated labels)
 were divided into a training and a test set with a 3:1 split. \\
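The culling and 3:1 split can be sketched with NumPy. The array sizes and the sampling strategy (uniform sampling without replacement) are illustrative assumptions, not taken from the report:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the preprocessed data: 100 Waldo, 1000 non-Waldo patches.
waldo = rng.random((100, 64, 64, 3))
non_waldo = rng.random((1000, 64, 64, 3))

# Cull the non-Waldo data to an even split by sampling without replacement.
keep = rng.choice(len(non_waldo), size=len(waldo), replace=False)
images = np.concatenate([waldo, non_waldo[keep]])
labels = np.concatenate([np.ones(len(waldo)), np.zeros(len(keep))])

# Shuffle, then divide into training and test sets with a 3:1 split.
order = rng.permutation(len(images))
images, labels = images[order], labels[order]
split = (3 * len(images)) // 4
x_train, x_test = images[:split], images[split:]
y_train, y_test = labels[:split], labels[split:]
print(x_train.shape[0], x_test.shape[0])  # 150 50
```

Shuffling before the split matters: without it, the test set would be drawn entirely from the end of the concatenated array.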

@ -254,7 +268,7 @@
 To evaluate the performance of the models, we record the time taken by
 each model to train on the training data, and the accuracy with which
 the model makes predictions. We calculate accuracy as
-\(a = \frac{|correct\ predictions|}{|predictions|} = \frac{tp + tn}{tp + tn + fp + fn}\)
+\[a = \frac{|correct\ predictions|}{|predictions|} = \frac{tp + tn}{tp + tn + fp + fn}\]
 where \(tp\) is the number of true positives, \(tn\) is the number of true
 negatives, \(fp\) is the number of false positives, and \(fn\) is the number
 of false negatives.
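A minimal worked example of this accuracy calculation; the confusion-matrix counts below are made up for illustration, not results from the report:

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of correct predictions among all predictions:
    a = (tp + tn) / (tp + tn + fp + fn)."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical counts: 40 true positives, 50 true negatives,
# 6 false positives, 4 false negatives over 100 predictions.
print(accuracy(40, 50, 6, 4))  # 0.9
```

Because the data was balanced to an even Waldo/non-Waldo split beforehand, accuracy is a meaningful summary here; on the original skewed data it would be dominated by true negatives.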