week 10 pretty much done
parent
f6a0fa74bf
commit
8df54fcbf8
@ -27,18 +27,13 @@
\maketitle
\section{Introduction}
\tab
Much research involves comparing algorithms: why is one algorithm better than another? Programming courses generally teach that an algorithm is better if it executes faster, but this is not always true. The behaviour of an algorithm must be studied in relation to its input, and the analysis becomes more complicated when random values are used. In this assignment we compare two algorithms for the ``Dawkins' weasel'' problem. Both are based on randomisation: the first is a simple hill climbing algorithm and the second is a genetic algorithm.
\\
\par
In the hill climbing algorithm, the letters in a fixed-length string of characters are randomly generated, and any letters in the correct position are fixed in place for subsequent iterations (until the complete word is found). The genetic algorithm finds a character string by treating the alphabet of characters as the population, and choosing which parts of the string to propagate based on the fitness of the ``parent'' strings.
\\
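The hill climbing procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the experiment: the function name \texttt{hill\_climb}, the alphabet (uppercase letters plus space), and the choice to count one iteration per round of re-drawn letters are all assumptions.

```python
import random
import string

# Assumed alphabet: 26 uppercase letters plus the space character.
ALPHABET = string.ascii_uppercase + " "


def hill_climb(target, rng=None):
    """Return the number of re-draw rounds needed to match `target`.

    The initial random guess is not counted as an iteration.
    """
    if any(ch not in ALPHABET for ch in target):
        raise ValueError("target contains characters outside ALPHABET")
    rng = rng or random.Random()
    guess = [rng.choice(ALPHABET) for _ in target]
    iterations = 0
    while "".join(guess) != target:
        # Re-draw only the positions that are still wrong;
        # letters that already match stay fixed in place.
        for i, wanted in enumerate(target):
            if guess[i] != wanted:
                guess[i] = rng.choice(ALPHABET)
        iterations += 1
    return iterations


if __name__ == "__main__":
    print(hill_climb("METHINKS IT IS LIKE A WEASEL"))
```

Passing a seeded \texttt{random.Random} instance as \texttt{rng} makes a run repeatable, which is convenient when comparing measurements.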
\section{Hill Climbing and Genetic Algorithms}
% Describe methods
\tab
The experiment compared the ability of two algorithms to generate words from scratch. The first, the hill climbing approach, randomly ``guesses'' each character of the required word and fixes the correctly guessed characters in place. The second approach, however, uses a genetic algorithm to generate the words by ``breeding'' the most correct words at each iteration.
\\
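To make the ``breeding'' step concrete, the following Python sketch shows one common form of genetic algorithm for this problem. It is an illustration under stated assumptions, not the implementation measured in this report: the population of candidate strings, the population size, the mutation rate, single-point crossover, keeping the fitter half as parents, and carrying the best candidate over unchanged are all hypothetical choices.

```python
import random
import string

# Assumed alphabet: 26 uppercase letters plus the space character.
ALPHABET = string.ascii_uppercase + " "


def fitness(candidate, target):
    """Fraction of positions at which `candidate` matches `target`."""
    return sum(a == b for a, b in zip(candidate, target)) / len(target)


def genetic_search(target, pop_size=100, mutation_rate=0.05,
                   max_generations=10_000, rng=None):
    """Return the generation at which `target` is found, or None."""
    if not target or pop_size < 4:
        raise ValueError("need a non-empty target and pop_size >= 4")
    rng = rng or random.Random()
    n = len(target)
    population = ["".join(rng.choice(ALPHABET) for _ in range(n))
                  for _ in range(pop_size)]
    for generation in range(max_generations):
        # Rank candidates from most to least correct.
        population.sort(key=lambda c: fitness(c, target), reverse=True)
        if population[0] == target:
            return generation
        parents = population[: pop_size // 2]  # fitter half breeds
        children = [population[0]]             # keep the best unchanged
        while len(children) < pop_size:
            mum, dad = rng.sample(parents, 2)
            cut = rng.randrange(n + 1)          # single-point crossover
            child = mum[:cut] + dad[cut:]
            # Each letter mutates independently with a small probability.
            child = "".join(
                rng.choice(ALPHABET) if rng.random() < mutation_rate else ch
                for ch in child
            )
            children.append(child)
        population = children
    return None


if __name__ == "__main__":
    print(genetic_search("WEASEL"))
```

Note that one generation here evaluates the whole population, so a generation count is not directly comparable to a single-string iteration count unless the two are normalised in some way.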
@ -55,17 +50,8 @@
\par
Both algorithms appear to scale linearly with word length (beyond words of length 2), and the genetic algorithm consistently requires many more time steps (approximately an order of magnitude) than the hill climbing algorithm to find the words. The reason for the spike in the genetic algorithm for words of length 2, as well as the overall relative performance of the algorithm, may be one of the central tenets of the genetic algorithm: to replicate and propagate the correct/desirable features of solutions at each step to solutions in the succeeding steps. For each iteration of solutions, this means favouring the reoccurrence of correct letters in the following iteration's solutions. This becomes problematic for generating words, as they are typically short (compared to the alphabet size) and do not often contain a high number of repeating letters.
\\
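Because both methods are randomised, a single run says little about typical behaviour. The Python sketch below shows one way the spread between repeated runs could be summarised. It is hypothetical: \texttt{steps\_to\_match} is a compact stand-in for a hill climbing search, and the 26-letter alphabet, the default of five trials, and the function names are assumptions rather than details of the experiment.

```python
import random
import statistics
import string

# Assumed alphabet: 26 uppercase letters.
ALPHABET = string.ascii_uppercase


def steps_to_match(target, rng):
    """Stand-in search: re-draw wrong letters until every letter matches."""
    guess = [rng.choice(ALPHABET) for _ in target]
    steps = 0
    while any(g != t for g, t in zip(guess, target)):
        guess = [g if g == t else rng.choice(ALPHABET)
                 for g, t in zip(guess, target)]
        steps += 1
    return steps


def summarise(target, trials=5, seed=0):
    """Run `trials` independent searches (at least 2) and report the spread."""
    rng = random.Random(seed)
    counts = [steps_to_match(target, rng) for _ in range(trials)]
    return {
        "min": min(counts),
        "max": max(counts),
        "mean": statistics.mean(counts),
        "stdev": statistics.stdev(counts),
    }


if __name__ == "__main__":
    for word in ["AT", "CAT", "WORD", "WORDS"]:
        print(len(word), summarise(word))
```

Reporting the minimum, maximum, and standard deviation alongside the mean gives a direct picture of the run-to-run variation at each word length.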
\textbf{*** How great is the range of variation in the time taken to reach a perfect match? ***}
\par
\textbf{*** Matching a given string is an artificial problem (we already know the answer). Based on your tests,
what can you say about the ability of the two approaches for solving real problems? ***}
% explores a larger solution space
% answer is not already known in real world mostly
% there are repeating patterns in nature
% different alphabet
In order to assess the \textit{rate} at which each method finds a word, the fitness (percentage of correct letters) of the hill climbing algorithm (Figure \ref{fig:fitness1}) and the genetic algorithm (Figure \ref{fig:fitness2}) was recorded at every iteration for a four-letter word. These plots indicate that, for a four-letter word, the fitness increases much faster (takes fewer iterations) for the hill climbing algorithm than for the genetic algorithm. Figure \ref{fig:fitness2} exhibits linear growth, while Figure \ref{fig:fitness1} presents much steeper linear (near superlinear) growth before finding the correct word. This may also be due to the aforementioned property of the genetic algorithm disagreeing with the formulation of many words: the encouragement of repetitive patterns/letters.
\\
\begin{figure}[H]
\includegraphics[scale=0.55]{chart-1}
@ -82,9 +68,11 @@
\caption{Repeated measurements (five) of the fitness of the genetic algorithm (as a percentage of the word) against the number of iterations taken}
\label{fig:fitness2}
\end{figure}
\section{Conclusion}
\textbf{*** Which algorithm performs better on this task? What is your evidence? What do you think makes
it perform better? ***}
% Make sure Qs are answered
This report compares the hill climbing algorithm and the genetic algorithm for the task of generating words from scratch. The number of time steps required to generate words of varying length was measured repeatedly (to gain a more precise result) and graphed. The hill climbing approach was found to be more suitable to the task: the number of time steps required to match a string scaled linearly, at a much lower rate than for the genetic algorithm.
\\
\par
The encouragement of letter repetition seemed to be a limiting factor for the genetic algorithm. However, this can be interpreted as a result of the artificial conditions imposed during this investigation: that each letter in the constructed strings must be drawn from the complete English alphabet, and that the correct string is known. In the more general case of string-finding problems, neither of these conditions need hold. For example, in DNA sequencing for biological problems, the alphabet that comprises a DNA sequence consists of only four letters (G, T, C, and A), and letters (as well as sequences of letters) often repeat. For this purpose, the genetic algorithm may outperform the hill climbing approach.
\\
\end{document}