diff --git a/mini_proj/report/waldo.tex b/mini_proj/report/waldo.tex index 379f94a..c543e58 100644 --- a/mini_proj/report/waldo.tex +++ b/mini_proj/report/waldo.tex @@ -252,41 +252,12 @@ \subsection{Performance Metrics}\label{performance-metrics} To evaluate the performance of the models, we record the time taken by - each model to train, based on the training data and statistics about the - predictions the models make on the test data. These prediction - statistics include: - - \begin{itemize} - \item - \textbf{Accuracy:} - \[a = \dfrac{|correct\ predictions|}{|predictions|} = \dfrac{tp + tn}{tp + tn + fp + fn}\] - \item - \textbf{Precision:} - \[p = \dfrac{|Waldo\ predicted\ as\ Waldo|}{|predicted\ as\ Waldo|} = \dfrac{tp}{tp + fp}\] - \item - \textbf{Recall:} - \[r = \dfrac{|Waldo\ predicted\ as\ Waldo|}{|actually\ Waldo|} = \dfrac{tp}{tp + fn}\] - \item - \textbf{F1 Measure:} \[f1 = \dfrac{2pr}{p + r}\] where \(tp\) is the - number of true positives, \(tn\) is the number of true negatives, - \(fp\) is the number of false positives, and \(tp\) is the number of - false negatives. - \end{itemize} - - \emph{Accuracy} is a common performance metric used in Machine Learning, - however in classification problems where the training data is heavily biased - toward one category, sometimes a model will learn to optimize its accuracy - by classifying all instances as one category. I.e. the classifier will - classify all images that do not contain Waldo as not containing Waldo, but - will also classify all images containing Waldo as not containing Waldo. Thus - we use, other metrics to measure performance as well. \\ - - \emph{Precision} returns the percentage of classifications of Waldo that are - actually Waldo. \emph{Recall} returns the percentage of Waldos that were - actually predicted as Waldo. In the case of a classifier that classifies all - things as Waldo, the recall would be 0. 
\emph{F1-Measure} returns a - combination of precision and recall that heavily penalizes classifiers that - perform poorly in either precision or recall. + each model to train, based on the training data and the accuracy with which + the model makes predictions. We calculate accuracy as + \(a = \frac{|correct\ predictions|}{|predictions|} = \frac{tp + tn}{tp + tn + fp + fn}\) + where \(tp\) is the number of true positives, \(tn\) is the number of true + negatives, \(fp\) is the number of false positives, and \(fn\) is the number + of false negatives. \section{Results} \label{sec:results} @@ -322,7 +293,11 @@ network and traditional machine learning technique} \label{tab:results} \end{table} - + + We can see from the results that Deep Neural Networks outperform our benchmark + classification models, although the time required to train these networks is + significantly greater. + \section{Conclusion} \label{sec:conclusion} Image from the ``Where's Waldo?'' puzzle books are ideal images to test
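As a sanity check on the accuracy formula kept in the revised paragraph, here is a minimal Python sketch (illustrative only; the function name and the confusion counts below are hypothetical and not taken from the report):

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct:
    (tp + tn) / (tp + tn + fp + fn)."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical confusion counts: 5 Waldos found, 90 non-Waldos
# correctly rejected, 3 false alarms, 2 missed Waldos.
print(accuracy(tp=5, tn=90, fp=3, fn=2))  # → 0.95
```

Note that a degenerate classifier predicting "not Waldo" for every image would still score `accuracy(0, 95, 0, 5) == 0.95` on this split, which is why the removed text also considered precision, recall, and F1.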