\documentclass[a4paper]{article}
% To compile PDF run: latexmk -pdf {filename}.tex

\usepackage{graphicx} % Used to insert images into the paper
\usepackage{float}
\usepackage[justification=centering]{caption} % Used for captions
\captionsetup[figure]{font=small} % Makes captions small
\newcommand\tab[1][0.5cm]{\hspace*{#1}} % Defines a new command to use 'tab' in text
% Math package
\usepackage{amsmath}
% Link the arguments of \cref{}, \ref{}, \cite{}, ... so that a reader can click on the number and jump to the target in the document
\usepackage{hyperref}
% Enable \cref{...} and \Cref{...} instead of \ref: Type of reference included in the link
\usepackage[capitalise,nameinlink]{cleveref}
% UTF-8 encoding
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc} % Support umlauts in the input
% Easier compilation
\usepackage{bookmark}
\usepackage{natbib}

\usepackage{xcolor}
\newcommand{\todo}[1]{\marginpar{{\textsf{TODO}}}{\textbf{\color{red}[#1]}}}

\begin{document}
\title{What is Waldo?}
\author{Kelvin Davis \and Jip J. Dekker \and Anthony Silvestere}
\maketitle

\begin{abstract}

\end{abstract}

\section{Introduction}

Almost every child around the world knows about ``Where's Waldo?'', also
known as ``Where's Wally?'' in some countries. This famous puzzle book has
spread across the world and is published in more than 25 different
languages. The idea behind the books is to find the character ``Waldo'',
shown in \Cref{fig:waldo}, in each of the pictures in the book. This is,
however, not as easy as it sounds. Every picture in the book is full of tiny
details and Waldo is only one out of many. The puzzle is made even harder by
the fact that Waldo is not always fully depicted; sometimes only his
head or his torso pops out from behind something else. Lastly, the reason
that even adults have trouble spotting Waldo is that the
pictures are full of ``red herrings'': things that look like (or are colored
like) Waldo, but are not actually Waldo.

\begin{figure}[ht]
\centering
\includegraphics[scale=0.35]{waldo}
\caption{
A headshot of the character ``Waldo'', or ``Wally''. Pictures of Waldo
are copyrighted by Martin Handford and are used under the fair-use policy.
}
\label{fig:waldo}
\end{figure}

The task of finding Waldo relates to many real-life
image recognition tasks. Fields like mining, astronomy, surveillance,
radiology, and microbiology often have to analyse images (or scans) to find
the tiniest details, sometimes undetectable by the human eye. These tasks
are especially hard when the objects being sought look similar to the
rest of the image. They are therefore generally performed using computers
to identify possible matches.

``Where's Waldo?'' offers us a great tool to study this kind of problem in a
setting that is humanly tangible. In this report we will try to identify
Waldo in the puzzle images using different classification methods. Every
image will be split into segments and every segment will have to
be classified as either ``Waldo'' or ``not Waldo''. We will compare
various classification methods, from more classical machine
learning, like naive Bayes classifiers, to the current state of the art,
neural networks. In \Cref{sec:background} we will introduce the different
classification methods, \Cref{sec:methods} will explain the way in which
these methods are trained and how they will be evaluated,
\Cref{sec:results} will discuss the results, and \Cref{sec:conclusion} will
offer our final conclusions.
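% Illustrative formalisation only: the symbols W, H, s, x_i, N and f are assumptions, not fixed by this report.

To make the segment-wise formulation above concrete, one possible
formalisation is sketched here. It assumes non-overlapping square segments
of side $s$ pixels cut from an image of width $W$ and height $H$; the
symbols $W$, $H$, $s$, $x_i$, $N$, and $f$ are introduced purely for
illustration and may differ from the segmentation actually used. Under
these assumptions, the number of segments $N$ and the classifier $f$ can be
written as
\begin{equation*}
N = \left\lfloor \frac{W}{s} \right\rfloor \left\lfloor \frac{H}{s} \right\rfloor,
\qquad
f(x_i) \in \{0, 1\} \quad \text{for } i = 1, \dots, N,
\end{equation*}
where $f(x_i) = 1$ labels segment $x_i$ as ``Waldo'' and $f(x_i) = 0$ labels
it as ``not Waldo''. As a hypothetical example, a $1024 \times 768$ image
with $s = 64$ would give $N = 16 \cdot 12 = 192$ segments, of which
typically at most a few contain Waldo.
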
\section{Background} \label{sec:background}

The classification methods used can be separated into two groups:
classical machine learning methods and neural network architectures. Many of
the classical machine learning algorithms have variations and improvements
for various purposes; however, for this report we will be using only
their basic versions. In contrast, we will use different neural network
architectures, as this method is currently the most used for image
classification.

\subsection{Classical Machine Learning Methods}

\paragraph{Naive Bayes Classifier}

\cite{naivebayes}

\paragraph{$k$-Nearest Neighbors}

($k$-NN) \cite{knn}

\paragraph{Support Vector Machine}

\cite{svm}

\paragraph{Random Forest}

\cite{randomforest}

\subsection{Neural Network Architectures}
\todo{Did we only do the three in the end? (Alexnet?)}

\paragraph{Convolutional Neural Networks}

\paragraph{LeNet}

\paragraph{Fully Convolutional Neural Networks}

\section{Methods} \label{sec:methods}

\section{Results and Discussion} \label{sec:results}

\section{Conclusion} \label{sec:conclusion}

\bibliographystyle{alpha}
\bibliography{references}

\end{document}