\documentclass[a4paper]{article}

% To compile PDF run: latexmk -pdf {filename}.tex

\usepackage{graphicx} % Used to insert images into the paper
\usepackage{float}
\usepackage[justification=centering]{caption} % Used for captions
\captionsetup[figure]{font=small} % Makes captions small
\newcommand\tab[1][0.5cm]{\hspace*{#1}} % Defines a new command to use 'tab' in text

% Math package
\usepackage{amsmath}

% Make the parameters of \cref{}, \ref{}, \cite{}, ... into links so that a reader can click on the number and jump to the target in the document
\usepackage{hyperref}

% Enable \cref{...} and \Cref{...} instead of \ref: type of reference included in the link
\usepackage[capitalise,nameinlink]{cleveref}

% Font and input encoding
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc} % Support umlauts in the input

% Easier compilation
\usepackage{bookmark}
\usepackage{natbib}

\begin{document}

\title{Waldo Discovery Using Neural Networks}
\author{Kelvin Davis \and Jip J. Dekker \and Anthony Silvestere}
\maketitle

\begin{abstract}

\end{abstract}

\section{Introduction}

\section{Background}

This paper builds on standard supervised machine learning classification techniques \cite{Kotsiantis2007}.

\section{Methods}

% Kelvin Start
\subsection{Benchmarking}\label{benchmarking}

In order to benchmark the Neural Networks, the performance of these
algorithms is evaluated against other Machine Learning algorithms. We
use Support Vector Machines, K-Nearest Neighbours (\(K=5\)), Gaussian
Naive Bayes, and Random Forest classifiers, as provided in Scikit-Learn.
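
As an illustrative sketch (not the exact experimental code), this
benchmark loop could be set up in Scikit-Learn as follows; here
\texttt{X} and \texttt{y} are synthetic stand-ins for the flattened
Waldo image patches and their labels:

\begin{verbatim}
# Sketch of the benchmark loop; X and y are synthetic stand-ins
# for the flattened Waldo image patches and their labels.
import time
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier

# Heavily imbalanced two-class problem (assumed to mirror the
# Waldo data, where most patches do not contain Waldo).
X, y = make_classification(n_samples=1000, n_features=64,
                           weights=[0.95], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

models = {
    "SVM": SVC(),
    "KNN (K=5)": KNeighborsClassifier(n_neighbors=5),
    "Gaussian Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(),
}

for name, model in models.items():
    start = time.time()
    model.fit(X_train, y_train)    # training time is recorded
    elapsed = time.time() - start
    score = model.score(X_test, y_test)
    print(name, elapsed, score)
\end{verbatim}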
\subsection{Performance Metrics}\label{performance-metrics}

To evaluate the performance of the models, we record the time taken by
each model to train on the training data, as well as statistics about
the predictions the models make on the test data. These prediction
statistics include the following (a worked sketch of their computation
follows the list):

\begin{itemize}
\item
  \textbf{Accuracy:}
  \[a = \dfrac{|\text{correct predictions}|}{|\text{predictions}|} = \dfrac{tp + tn}{tp + tn + fp + fn}\]
\item
  \textbf{Precision:}
  \[p = \dfrac{|\text{Waldo predicted as Waldo}|}{|\text{predicted as Waldo}|} = \dfrac{tp}{tp + fp}\]
\item
  \textbf{Recall:}
  \[r = \dfrac{|\text{Waldo predicted as Waldo}|}{|\text{actually Waldo}|} = \dfrac{tp}{tp + fn}\]
\item
  \textbf{F1 Measure:}
  \[f1 = \dfrac{2pr}{p + r}\]
  where \(tp\) is the number of true positives, \(tn\) is the number of
  true negatives, \(fp\) is the number of false positives, and \(fn\) is
  the number of false negatives.
\end{itemize}
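
As referenced above, the following minimal sketch computes these four
metrics directly from hypothetical confusion-matrix counts, using only
the definitions given in the list:

\begin{verbatim}
# Hypothetical confusion-matrix counts (not real results).
tp, tn, fp, fn = 8, 180, 4, 8

accuracy  = (tp + tn) / (tp + tn + fp + fn)        # a  = 0.94
precision = tp / (tp + fp)                         # p  = 0.667
recall    = tp / (tp + fn)                         # r  = 0.5
f1 = 2 * precision * recall / (precision + recall) # f1 = 0.571

print(accuracy, precision, recall, f1)
\end{verbatim}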

Accuracy is a common performance metric in Machine Learning; however,
in classification problems where the training data is heavily biased
toward one category, a model will sometimes learn to optimize its
accuracy by classifying all instances as that category. That is, the
classifier will classify all images that do not contain Waldo as not
containing Waldo, but will also classify all images containing Waldo as
not containing Waldo. Thus we use other metrics to measure performance
as well.
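
As a concrete (hypothetical) illustration: on a test set of 95 images
without Waldo and 5 images with Waldo, a classifier that labels every
image as not containing Waldo (so \(tp = 0\), \(tn = 95\), \(fp = 0\),
\(fn = 5\)) achieves
\[a = \dfrac{0 + 95}{0 + 95 + 0 + 5} = 0.95 \qquad \text{but} \qquad r = \dfrac{0}{0 + 5} = 0.\]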

\emph{Precision} gives the percentage of images classified as Waldo
that actually contain Waldo. \emph{Recall} gives the percentage of
images that actually contain Waldo and are predicted as Waldo. In the
case of a classifier that classifies all images as not containing
Waldo, the recall would be 0. The \emph{F1-Measure} combines precision
and recall in a way that heavily penalises classifiers that perform
poorly in either precision or recall.
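
For instance, with the (hypothetical) values \(p = 0.9\) and
\(r = 0.1\),
\[f1 = \dfrac{2 \times 0.9 \times 0.1}{0.9 + 0.1} = 0.18,\]
far below the arithmetic mean of \(0.5\).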

% Kelvin End

\section{Results}

\section{Discussion and Conclusion}


\bibliographystyle{humannat}
\bibliography{references}


\end{document}