\subsection{Performance Metrics}\label{performance-metrics}

To evaluate the performance of the models, we record the time each model
takes to train on the training data, together with statistics about the
predictions it makes on the test data. These prediction statistics include:

\begin{itemize}
\item
  \textbf{Accuracy:}
  \[a = \dfrac{|correct\ predictions|}{|predictions|} = \dfrac{tp + tn}{tp + tn + fp + fn}\]
\item
  \textbf{Precision:}
  \[p = \dfrac{|Waldo\ predicted\ as\ Waldo|}{|predicted\ as\ Waldo|} = \dfrac{tp}{tp + fp}\]
\item
  \textbf{Recall:}
  \[r = \dfrac{|Waldo\ predicted\ as\ Waldo|}{|actually\ Waldo|} = \dfrac{tp}{tp + fn}\]
\item
  \textbf{F1 Measure:} \[f1 = \dfrac{2pr}{p + r}\] where \(tp\) is the
  number of true positives, \(tn\) is the number of true negatives,
  \(fp\) is the number of false positives, and \(fn\) is the number of
  false negatives.
\end{itemize}
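
As an illustration of how these statistics are computed, the following
Python sketch (not part of our experimental code; the function name and the
example counts are hypothetical) derives all four metrics from raw
confusion-matrix counts:

\begin{verbatim}
# Hypothetical sketch: derive the four metrics from confusion-matrix
# counts; zero denominators fall back to 0.0 by convention.
def prediction_metrics(tp, tn, fp, fn):
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) > 0 else 0.0)
    return accuracy, precision, recall, f1

# Example: 8 Waldos found, 2 missed, 990 true rejections, 10 false alarms.
a, p, r, f1 = prediction_metrics(tp=8, tn=990, fp=10, fn=2)
print(a, p, r, f1)  # 0.9881..., 0.4444..., 0.8, 0.5714...
\end{verbatim}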

\emph{Accuracy} is a common performance metric used in Machine Learning.
However, in classification problems where the training data is heavily biased
toward one category, a model will sometimes learn to optimize its accuracy
by classifying all instances as that category: the classifier will correctly
classify all images that do not contain Waldo as not containing Waldo, but
will also classify all images containing Waldo as not containing Waldo. Thus
we use other metrics to measure performance as well. \\
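
As a concrete illustration (the counts are hypothetical): on a test set of
1000 images of which only 10 contain Waldo, a classifier that labels every
image as not containing Waldo achieves
\[a = \dfrac{990}{1000} = 0.99,\]
yet it finds no Waldos at all, giving \(r = \frac{0}{10} = 0\); its
precision is undefined, since it never predicts Waldo (\(tp + fp = 0\)).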

\emph{Precision} gives the percentage of images classified as Waldo that
actually contain Waldo. \emph{Recall} gives the percentage of actual Waldos
that were predicted as Waldo. In the case of a classifier that classifies all
images as not containing Waldo, the recall would be 0. The \emph{F1-Measure}
combines precision and recall in a way that heavily penalizes classifiers that
perform poorly in either precision or recall.
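
To see this penalization concretely, take a hypothetical classifier with
\(p = 0.9\) but \(r = 0.1\). The arithmetic mean of the two would be
\(0.5\), yet
\[f1 = \dfrac{2 \times 0.9 \times 0.1}{0.9 + 0.1} = 0.18,\]
so the weak recall dominates the score.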
\section{Results} \label{sec:results}
network and traditional machine learning technique}
\label{tab:results}
\end{table}

We can see by the results that Deep Neural Networks outperform our benchmark
|
||||
classification models, although the time required to train these networks is
|
||||
significantly greater.
|
||||
|
||||
\section{Conclusion} \label{sec:conclusion}

Images from the ``Where's Waldo?'' puzzle books are ideal images to test