
\chapter{Using the Framework}\label{usingtfw}

In this chapter I am going to go into further detail on how client code is supposed
to use the framework, some of the design decisions behind this and how everything
is integrated into the \code{solvable} Docker image.

To use the framework one has to do several things to get started.
The main points include:
\begin{itemize}
    \item Setting up a development environment
    \item Defining an FSM to describe the flow of the tutorial and implementing proper callbacks
          for this machine, such as ones that display messages to the user
    \item Implementing the required event handlers, which may trigger state transitions in the FSM,
          interact with non-TFW code and do various things that might be needed during an exercise,
          such as compiling code written by the user or running unit tests
    \item Defining what processes should run inside the container besides the things TFW
          starts automatically
    \item Setting up reverse proxying for any user-facing network application such as web servers
\end{itemize}
At first all these tasks can seem quite overwhelming.
Remember that \emph{witchcraft} is what we practice here after all.
To overcome the high initial learning curve of getting familiar with the framework
I have created a repository called \emph{test-tutorial-framework} with the purpose of
providing a project template for developers looking to create challenges using the
framework.
This repository is a really simple client codebase that is suitable for
developing TFW itself as well (a good place to host tests for the framework).

It also provides an ``industry standard'' \code{hack} directory
containing bash scripts that make the development of tutorials and TFW itself very convenient.
These scripts span from bootstrapping a complete development environment in one command,
to building and running challenges based on the framework. 
Let us take a quick look at the \emph{test-tutorial-framework} project to acquire a greater
understanding of how the framework interacts with client code.

\section{Project Structure}

\begin{lstlisting}[
    caption={The project structure of test-tutorial-framework},
    captionpos=b]
.
|--config.yml
|
|--hack/
|   |--tfw.sh
|   |--...
|
|--controller/
|   |--Dockerfile
|   |--...
|
|--solvable/
    |--Dockerfile
    |--...
\end{lstlisting}

\subsection{Avatao Specific Files}
The \code{config.yml} file is an Avatao challenge configuration file,
which is used to describe which Docker containers implement a challenge,
which ports they expose and over which protocols, the name of the
exercise, its difficulty, and so on.
Every Avatao challenge must provide such a file.
Not shown in the structure above is the \code{metadata}
directory, which contains the short and long descriptions of challenges in
Markdown format.

The Tutorial Framework does not use these files in any way;
they are only required to make the tutorial function on the Avatao platform.

\subsection{Controller Image}
It was previously mentioned that the \code{controller} Docker image is responsible
for the solution checking of challenges (whether the user has completed the exercise or not).
Currently this image is maintained in the test-tutorial-framework repository.
It is a really simple Python server which functions as a TFW event handler as well.
It subscribes to the FSM update messages
broadcast by the \code{FSMManagingEventHandler} discussed previously.
This way it is capable of keeping track of the state of the tutorial,
which allows it to detect when the final state of the FSM is reached.

\subsection{Solvable Image}
Currently the Tutorial Framework is maintained in three git repositories:
\begin{description}
      \item[baseimage-tutorial-framework:] Docker baseimage (contains all backend logic)
      \item[frontend-tutorial-framework:] Angular frontend
      \item[test-tutorial-framework:] An example tutorial built using baseimage and frontend
\end{description}
Every tutorial based on the framework must use the TFW baseimage as the parent of
its own \code{solvable} image, using the \code{FROM}%
\footnote{\href{https://docs.docker.com/engine/reference/builder/\#from}
{https://docs.docker.com/engine/reference/builder/\#from}}
Dockerfile command.
Being an example project of the framework this is the case with
test-tutorial-framework as well.

\section{Details of the Solvable Image}
Let us dive into greater detail on how the \code{solvable} Docker image of the
test-tutorial-framework operates.
The directory structure is as follows:
\begin{lstlisting}
solvable/
|--Dockerfile
|--frontend/
|--supervisor/
|--nginx/
|--src/
\end{lstlisting}
I am going to discuss these one by one.

\subsection{Dockerfile}
Since this is a Docker image it must define a \code{Dockerfile}.
This image always uses the baseimage of the framework as a parent image.
Besides this, developers can treat it as a regular \code{Dockerfile} and work with it
in any way they see fit to implement their tutorial.
This means that developers looking to create content on Avatao, be that
with the Tutorial Framework or without it, must be familiar with Docker,
as they will have to set everything up to work inside a container.

\subsection{Frontend}
This directory is designed to contain a clone of the frontend repository.
By default it is empty and its contents will be put in place during the
setup of the development environment.

\subsection{Supervisor}
As previously mentioned, the framework uses supervisor to run several processes
inside a Docker container.
Usually Docker containers only run a single process and developers simply start
more containers instead of processes if required (and use tools such as docker-compose%
\footnote{\href{https://docs.docker.com/compose/}{https://docs.docker.com/compose/}}
or kubernetes%
\footnote{\href{https://kubernetes.io}{https://kubernetes.io}}
to orchestrate their containers).
This approach is not suitable for TFW, as it would require the framework to orchestrate
Docker containers from inside a container managed by the same Docker daemon, which is
feasible in theory but very hard to do and to maintain in practice.
This would require doing something like mounting the UNIX domain socket used
to manage the Docker daemon inside a running container managed by that daemon,
which is a fun thing to
play around with in my free time but not something suitable for running in production,
not to mention the severe security implications of doing something like that.

Supervisor is a process control system designed to be able to work with
processes on UNIX-like operating systems.
When a tutorial built on TFW is started, a Docker container starts with supervisor running as
PID\footnote{Process ID. On UNIX-like systems the \code{init} program is the first
process started, and it traditionally gets PID 1.} 1, which in turn starts all the
programs defined in the \code{solvable/supervisor} directory.
Content creators can use supervisor configuration files to define these programs.
For example, a developer would write a file similar to this one and place it into the
\code{solvable/supervisor} directory to run a web server written in Python:
\begin{lstlisting}
[program:yourprogram]
user=user
directory=/home/user/example/
command=python3 server.py
autostart=true
\end{lstlisting}
As mentioned earlier in~\ref{processmanagement}, any program that is started this way
can be managed by the framework using API messages.
All this is possible through using the xmlrpc%
\footnote{\href{https://docs.python.org/3/library/xmlrpc.html}
{https://docs.python.org/3/library/xmlrpc.html}}
API exposed by supervisor, which allows the framework to interact with it to control processes.
This API is quite flexible and can be used to achieve a number of things which would be
clumsy to do without it (supervisor also has a command line utility called
\code{supervisorctl} that exposes similar functionality to the xmlrpc bindings,
but it is better to communicate with the supervisor daemon directly than to
invoke its command line utility in a separate process when you need something done).
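
To illustrate the kind of interaction described above, the following is a minimal sketch of restarting a supervised program over this API using the \code{xmlrpc.client} module of the Python standard library. The endpoint URL and the helper names \code{restart_program} and \code{connect} are assumptions made for this illustration, and the way the framework itself connects to supervisor may well differ. The calls \code{getProcessInfo}, \code{stopProcess} and \code{startProcess} belong to the RPC namespace documented by supervisor.

```python
import xmlrpc.client


def restart_program(rpc, name):
    """Restart a supervised program and return its state name afterwards.

    `rpc` is anything exposing supervisor's RPC namespace, normally an
    xmlrpc.client.ServerProxy. This is a hypothetical helper, not TFW code.
    """
    info = rpc.supervisor.getProcessInfo(name)
    # Stopping a program that is not running raises a fault, so check first.
    if info["statename"] == "RUNNING":
        rpc.supervisor.stopProcess(name)
    rpc.supervisor.startProcess(name)
    return rpc.supervisor.getProcessInfo(name)["statename"]


def connect(url="http://127.0.0.1:9001/RPC2"):
    # Assumed endpoint: it requires an [inet_http_server] section listening
    # on port 9001 in the supervisor configuration.
    return xmlrpc.client.ServerProxy(url)
```

Under these assumptions, \code{restart_program(connect(), 'yourprogram')} would restart the program defined in the configuration file shown earlier, without spawning a separate \code{supervisorctl} process.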

\subsection{Nginx}
For simplicity, exercises based on the framework only expose a single port from the
\code{solvable} container.
This port is required to serve the frontend of the framework.
If this is the case, how do we run additional web applications that showcase
vulnerabilities during a tutorial?
Since one port can only be bound by one process at a time, we will need to
run a reverse-proxy%
\footnote{\href{https://www.nginx.com/resources/glossary/reverse-proxy-server/}
{https://www.nginx.com/resources/glossary/reverse-proxy-server/}} server inside the
container to
bind the exposed port and forward traffic to other web servers bound to non-exposed ports.

To support this, TFW automatically starts an nginx web server. It uses this nginx
instance to serve the framework frontend as well.
It is possible to supply additional configurations to this server in a convenient manner:
any configuration files placed into the \code{solvable/nginx} directory will be
interpreted by nginx once the container has started.
To set up the reverse-proxying of a web server running on port 3333,
one would write a configuration file similar to this one:
\begin{lstlisting}
location /yoururl {
    proxy_pass http://127.0.0.1:3333;
}
\end{lstlisting}
Now the content served by this web server on port 3333
will be available on the URL \code{<challenge-url>/yoururl}, even though port 3333
does not accept connections from outside the container, as it is not exposed.
It is very important to understand that developers
have to make sure that their web application \emph{behaves well} behind a reverse proxy.
What this means is that it is going to be served from a ``subdirectory'' of the top
level URL\@:
for example \code{/register} will be served under \code{/yoururl/register}.
This means that all links in the final HTML must refer to the proxied URLs, e.g.\
\code{/yoururl/login}, and server-side redirects must point to these correct hrefs as well.
Idiomatically this is usually implemented by supplying a \code{BASEURL}
to the application through an environment variable, so that it is able to set
itself up correctly.
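
As an illustration of the \code{BASEURL} idiom, the sketch below shows how an application might build its links so that they remain valid behind the proxy. The helper name \code{proxied_url} is hypothetical, and treating an unset variable as ``served from the top level'' is an assumption of this example rather than a framework convention.

```python
import os


def proxied_url(path, base_url=None):
    """Prefix an application path with the base URL it is proxied under."""
    if base_url is None:
        # BASEURL is the variable name mentioned in the text; when it is
        # unset, the application is assumed to run at the top level.
        base_url = os.environ.get("BASEURL", "")
    # Normalise slashes so that exactly one separates prefix and path.
    return base_url.rstrip("/") + "/" + path.lstrip("/")
```

With \code{BASEURL=/yoururl} in the environment, \code{proxied_url('/login')} yields \code{/yoururl/login}, which is the form that links and server-side redirects need to use.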

\subsection{Copying Configuration Files}
Behind the curtains, the Tutorial Framework uses some Dockerfile trickery to
facilitate the copying of supervisor and nginx configuration files to their correct
locations.
Normally when one uses the \code{COPY}%
\footnote{\href{https://docs.docker.com/engine/reference/builder/\#copy}
{https://docs.docker.com/engine/reference/builder/\#copy}}
command to create a layer%
\footnote{\href{https://docs.docker.com/storage/storagedriver/}
{https://docs.docker.com/storage/storagedriver/}} in a Docker image,
this action takes place on building that image (i.e.\ in the \emph{build context}
of that image).
This is not good for this use case: when building the framework baseimage,
these configuration files that will be written by content developers using TFW in
the future do not even exist yet.
How could we copy files into an image layer that will be created in the future?

It is possible to use a command called \code{ONBUILD}%
\footnote{\href{https://docs.docker.com/engine/reference/builder/\#onbuild}
{https://docs.docker.com/engine/reference/builder/\#onbuild}}
in the Dockerfile of a baseimage to delay another command
to the point in time where other images will use the baseimage
as a parent with the \code{FROM} command. This makes it possible to execute
commands in the build context of the descendant image.
This is great, because the config files we need \emph{will} exist in the build
context of the \code{solvable} image of test-tutorial-framework.
In practice this looks something like this in the baseimage \code{Dockerfile}:
\begin{lstlisting}
ONBUILD COPY ${BUILD_CONTEXT}/nginx/ ${TFW_NGINX_COMPONENTS}
ONBUILD COPY ${BUILD_CONTEXT}/supervisor/ ${TFW_SUPERVISORD_COMPONENTS}
\end{lstlisting}
It is important to keep in mind however, that the layers created by these
\code{ONBUILD} commands will only be available \emph{after} the \code{FROM}
command is executed when building the child image \emph{in the future}.
This means that if you want to
do something with these files in the baseimage build after they have
been copied, those things must be done in \code{ONBUILD} commands as well.

\subsection{Source Directory}
The \code{src} directory usually holds tutorial-specific code, such as
the implementations of event handlers, the framework FSM, additional web applications
served by the exercise and generally anything that won't fit in the other,
framework-specific directories.
The use of this directory is not mandatory, only a good practice, as developers
are free to implement the non-TFW parts of their exercises as they see fit
(the copying of these files into image layers using \code{solvable/Dockerfile}
is their responsibility as well).

\section{Configuring Built-in Components}

The configuration of built-ins is generally done in two different ways.
For the frontend (Angular) side, developers can edit a \code{config.ts} file,
which is full of key-value pairs of configurable frontend functionality.
These pairs are generally pretty self-documenting:
\lstinputlisting[
    caption={Example of the frontend \code{config.ts} file (shortened to save space)},
    captionpos=b
]{listings/config.ts}
Configuring built-in event handlers is possible by editing the Python file they are
initialized in, which exposes several configuration options through the
\code{__init__()} methods of these event handlers:
\lstinputlisting[
    language=python,
    caption={Example of initializing some event handlers},
    captionpos=b
]{listings/event_handler_main.py}

\section{Implementing a Finite State Machine}

The Tutorial Framework allows developers to define state machines in two ways,
as discussed before.
I am going to display the implementation of the same FSM using these methods
to showcase the capabilities of the framework.

\subsection{YAML based FSM}
YAML\footnote{YAML Ain't Markup Language: \href{http://yaml.org}{http://yaml.org}}
is a human friendly data serialization standard and a superset of JSON\@.
It is possible to use this format to define a state machine like so:
\lstinputlisting[
    caption={A Finite State Machine implemented in YAML},
    captionpos=b
]{listings/test_fsm.yml}
This state machine has two states, state \code{0} and \code{1}.
It defines a single transition between them, \code{step_1}.
On entering state \code{1} the FSM will write a message to the frontend messaging component
by invoking TFW library code using Python.
The transition can only occur if the file \code{allow_step_1} exists.

YAML based state machine implementations also allow the usage of the Jinja2%
\footnote{\href{http://jinja.pocoo.org/docs/2.10/}{http://jinja.pocoo.org/docs/2.10/}}
templating language to substitute variables into the YAML file.
These substitutions are really powerful, as one could even iterate through arrays
or invoke functions that produce strings to be inserted using this method.
This is very similar to how Ansible uses%
\footnote{\href{https://docs.ansible.com/ansible/2.6/user_guide/playbooks_templating.html}
{https://docs.ansible.com/ansible/2.6/user\_guide/playbooks\_templating.html}}
Jinja2, and I was certainly inspired by this
when coming up with this idea.
For example, if we had an FSM with five states, we could use the following
Jinja2 code to generate a transition called \code{step_next} between each state
in a \code{for} loop:
\begin{lstlisting}
{% for i in range(4) %}
-   trigger: 'step_next'
    source: '{{i}}'
    dest: '{{i+1}}'
{% endfor %}
\end{lstlisting}
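
The expansion performed by such a template can be mirrored in plain Python, which may help when reasoning about what the rendered YAML contains. The sketch below assumes that states are named by consecutive integers starting at 0. The helper \code{chain_transitions} is hypothetical and not part of the TFW library.

```python
def chain_transitions(state_count, trigger="step_next"):
    """Return the transitions linking states 0..state_count-1 in order."""
    # A chain of n states needs n-1 transitions: 0->1, 1->2, and so on.
    return [
        {"trigger": trigger, "source": str(i), "dest": str(i + 1)}
        for i in range(state_count - 1)
    ]
```

For a machine with five states numbered 0 to 4, this produces four transitions, the last of which leads from state \code{3} to state \code{4}.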

\subsection{Python based FSM}
Optionally, the same state machine can be implemented like this in Python using
TFW library code:
\lstinputlisting[
    language=python,
    caption={A Finite State Machine implemented in Python},
    captionpos=b
]{listings/test_fsm.py}

As you can see, both implementations are pretty clean and easy.
The advantage of YAML is that it makes it possible to define callbacks using virtually any
command that is available in the container, which means any
programming language is usable to implement said callbacks.
The advantage of the Python version is that since the framework is being developed in
Python as well, it is going to be easier to interface with library code.

\section{Implementing Event Handlers}

In this section I am going to showcase how implementing event handlers is possible
when using the framework.
I am going to use the Python programming language, but it isn't hard
to create event handlers in other languages, as the only thing
they have to be capable of is communicating with the TFW server using
ZeroMQ sockets, as previously discussed.
The library provided by the framework abstracts low-level socket logic
away by implementing easy to use base classes.
Creating such base classes in a given language shouldn't take longer
than a few hours for an experienced developer.
Our challenge creators have already implemented similar libraries for
Java, JavaScript and C++ as well.
\lstinputlisting[
    language=python,
    caption={A very simple event handler implemented in Python},
    captionpos=b
]{listings/event_handler_example.py}
This simple event handler subscribes to the \code{fsm_update} messages
and writes every message it receives
to the messages component on the frontend.
When using the TFW library in Python, all classes inheriting from
\code{EventHandlerBase} must implement the \code{handle_event()}
abstract method, which is used to, well, handle events.

\section{Setting Up a Developer Environment}\label{devenv}

To make getting started as smooth as possible I have created
a ``bootstrap'' script which is capable of creating a development environment from
scratch.
This script is distributed as the following bash one-liner:
\begin{lstlisting}[language=bash]
bash -c "$(curl -fsSL https://git.io/vxBfj)"
\end{lstlisting}
This command downloads the script using \code{curl}%
\footnote{\href{https://curl.haxx.se}{https://curl.haxx.se}}, then executes it in bash.
In the open source community it is quite common to distribute installers this way%
\footnote{A good example of this is oh-my-zsh:
\href{https://github.com/robbyrussell/oh-my-zsh}{https://github.com/robbyrussell/oh-my-zsh}},
which might seem a little scary at first, but is no less safe than
downloading and executing a binary installer from a website with a valid TLS certificate, as
\code{curl} will fail with an error message if the certificate is invalid.
This is because both methods place their trust in the PKI\footnote{Public Key Infrastructure}
to defend against man-in-the-middle%
\footnote{\href{https://www.owasp.org/index.php/Man-in-the-middle_attack}
{https://www.owasp.org/index.php/Man-in-the-middle\_attack}} attacks.
Debating the security of this infrastructure is certainly something that we
as an industry should constantly do, but it is out of the scope of this paper.

Nevertheless I have also created a version of this command that
checks the SHA256 checksum of the bootstrap script before executing it
(I have placed it on several lines to enhance visibility):
\begin{lstlisting}[language=bash]
URL=https://git.io/vxBfj                                             \
SHA=d81057610588e16666251a4167f05841fc8b66ccd6988490c1a2d2deb6de8ffa \
bash -c 'cmd="$(curl -fsSL $URL)" &&                                 \
         [ $(echo "$cmd" | sha256sum | cut -d " " -f1) == $SHA ] &&  \
         echo "$cmd" | bash || echo Checksum mismatch!'
\end{lstlisting}
This essentially downloads the script, places it inside a variable as a string,
then pipes it into a bash interpreter \emph{only if} the checksum
of the downloaded string matches the one provided, otherwise it displays
an error message.
Software projects distributing their product as binary installers often
display such checksums on their download pages with the purpose of potentially
mitigating MITM attacks.
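
The same check can be expressed in a few lines of Python, shown here only to make the logic of the shell command explicit. The function name \code{matches_sha256} is hypothetical, and this sketch is not part of the bootstrap tooling.

```python
import hashlib
import hmac


def matches_sha256(payload, expected_hex):
    """Return True if the SHA-256 digest of `payload` equals `expected_hex`."""
    actual_hex = hashlib.sha256(payload).hexdigest()
    # compare_digest avoids leaking where the two strings first differ.
    return hmac.compare_digest(actual_hex, expected_hex.lower())
```

Note that the shell version hashes the output of \code{echo}, so its digest covers the script text followed by exactly one trailing newline. A Python caller comparing against the same checksum would have to hash the same bytes.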

The bootstrap script clones the three TFW repositories and performs several steps
to create a working environment, based on test-tutorial-framework,
in a single directory:
\begin{itemize}
      \item It builds the newest version of the TFW baseimage locally
      \item It pins the version tag of this image in \code{solvable/Dockerfile},
            so that this newly-built version will be used by the tutorial
      \item It places the latest frontend in \code{solvable/frontend} with
            dependencies installed
\end{itemize}
It is important to note that this script \emph{does not} install anything system-wide;
it only works in the directory it is being executed from.
This is a good practice, as many users --- including me --- find scripts that
write files all around the system intrusive if they could provide the same functionality
while working in a single directory.

It is also worth mentioning that it would have been a lot easier to simply use Docker Hub%
\footnote{\href{https://hub.docker.com}{https://hub.docker.com}},
but since the code base is currently proprietary we cannot distribute
it using a public medium, and we use our own image registry to store private Docker
images.

\section{Building and Running a Tutorial}

After the environment has been created using the script described in~\ref{devenv},
it is possible to simply use standard Docker commands to build and run the tutorial.
Yet the \code{hack} directory of test-TFW also provides a script called
\code{tfw.sh} that developers prefer to use for building and running their
exercises.
Why is this the case?

\subsection{The Frontend Issue}

To be able to understand this, we will have to gain some understanding of how the
build process of Angular projects works.

When frontend developers work on Angular projects, they usually use the built-in
developer tool of the Angular-CLI%
\footnote{\href{https://cli.angular.io}{https://cli.angular.io}},
\code{ng serve} to build and serve their applications.
The advantage of this tool is that it automatically reloads the frontend
when the code on the disk is changed, and that it is generally very easy to work with.
On the other hand, a disadvantage is that a \code{node_modules} directory
containing all the npm%
\footnote{\href{https://www.npmjs.com}{https://www.npmjs.com}}
dependencies of the project must be present while doing so.
The problem with this is that because the JavaScript ecosystem is a \emph{huge}
mess\cite{NodeModules}, these dependencies can easily get up to
\emph{several hundreds of megabytes} in size.

To solve this issue, when creating production builds,
Angular uses various optimizations such as tree shaking%
\footnote{\href{https://webpack.js.org/guides/tree-shaking/}
{https://webpack.js.org/guides/tree-shaking/}}
to remove all the dependencies that won't be used when running the application%
\footnote{Otherwise it won't be possible to serve these applications efficiently
over the internet.}.
The problem is that these optimizations can take a \emph{really} long time.
This is why today frontend builds usually take a lot longer than building anything
not involving JavaScript (such as C++, C\# or any other compiled programming language).

This mess presents its own challenges for the Tutorial Framework as well.
Since hundreds of megabytes of npm dependencies have no place inside Docker images%
\footnote{Or it may take tens of seconds just to send the build context to
the Docker daemon, which means waiting even before the build begins.},
by default the framework will only copy the results of a frontend production build
of \code{solvable/frontend} into the image layers.
This slows down the build of TFW based challenges so much that instead of about
30 seconds, it can often take 5 to 10 minutes depending on the hardware
used.

\subsection{The Solution Offered by the Framework}

To circumvent this, it is possible to entirely exclude the Angular frontend from a TFW
build, using build time arguments%
\footnote{In practice this is done by supplying the option
\code{--build-arg NOFRONTEND=1} to Docker.}.
But when doing so, developers would have to run the frontend locally with
the whole \code{node_modules} directory present.
The bootstrap script takes care of putting these dependencies there,
while the \code{tfw.sh} script is capable of starting a development server
to serve the frontend locally using \code{ng serve} besides starting
the Docker container without the frontend.
As if this was not complicated enough, since Docker binds the port
the container is going to use, \code{tfw.sh} has to run the Angular dev server on
another port, then use the proxying features of Angular-CLI to forward requests
from this port to the running Docker container when requesting resources
other than the entrypoint to the Angular application.

This is the reason why the frontend is accessible through port \code{4200} (default
port for \code{ng serve}) when using \code{tfw.sh} to start a tutorial, but when running
a self-contained container built with the frontend included it is accessible on port \code{8888}
(the default port TFW uses).

While it also provides lots of other functionality, this is one of the reasons why
the \code{tfw.sh} script is a bash script several hundred lines long.
Making the frontend toggleable during Docker builds relies on
the \code{ONBUILD} technique discussed earlier:
\begin{lstlisting}[language=bash]
ONBUILD RUN test -z "${NOFRONTEND}"                                &&\
            cd /data && yarn install --frozen-lockfile || :

ONBUILD RUN test -z "${NOFRONTEND}"                                &&\
            cd /data && yarn build --no-progress || :

ONBUILD RUN test -z "${NOFRONTEND}"                                &&\
            mv /data/dist ${TFW_FRONTEND_DIR} && rm -rf /data || :
\end{lstlisting}
Remember that \code{ONBUILD} commands run in the build context of the child image.
These commands check whether the \code{NOFRONTEND} build argument
is defined, and only deal with the frontend if it is not.
The \code{|| :} notation in bash basically means ``or true'', which is required
to avoid aborting the build due to the non-zero return code produced
by the \code{test} command if the build arg is defined.

\section{Versioning and Releases}

Currently I use git tags%
\footnote{\href{https://git-scm.com/docs/git-tag}{https://git-scm.com/docs/git-tag}}
to manage releases of the TFW baseimage.
Each new release is a tag digitally signed by me using GnuPG so that
everyone is able to verify that the release is authentic.
The tags are named according to the versioning scheme I have adopted for the project.
I explain this versioning system extensively in a blog post\cite{SemancatVersioning}.

The short version is that we use a solution that is a mix between
semantic versioning%
\footnote{\href{https://semver.org}{https://semver.org}}
and calendar versioning%
\footnote{\href{https://calver.org}{https://calver.org}}.
Our release tags look similar to this one: \code{mainecoon-20180712}
(this was the actual tag of an older release).
The part before the ``\code{-}'' is the major version, which is always named after a breed
of cat, for the additional fun factor, and because we love cats.
The part after that is a timestamp of the day the release was made on.
I only change major versions when I introduce backwards incompatible changes in the
API of the framework; this way developers can trust that releases
with the same major version are compatible with each other with regard to client code.
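
The compatibility rule above can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that the timestamp part always has the form \code{YYYYMMDD}, as in the example tag.

```python
from datetime import datetime


def parse_release_tag(tag):
    """Split a tag such as 'mainecoon-20180712' into (major, release_date)."""
    major, _, stamp = tag.rpartition("-")
    if not major:
        raise ValueError(f"not a release tag: {tag!r}")
    # strptime raises ValueError if the stamp is not a valid YYYYMMDD date.
    return major, datetime.strptime(stamp, "%Y%m%d").date()


def is_compatible(tag_a, tag_b):
    """Releases sharing a major version are compatible for client code."""
    return parse_release_tag(tag_a)[0] == parse_release_tag(tag_b)[0]
```

For instance, \code{parse_release_tag('mainecoon-20180712')} yields the major version \code{mainecoon} together with the date 12 July 2018, and any two tags sharing that major version would be reported as compatible.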

The \code{master} branches of the frontend-TFW and test-TFW repositories are always
kept compatible with the newest release tag of the baseimage.
This is one of the ways the bootstrap script can operate safely: when cloning the three
repositories, it checks out the newest tag of the baseimage before building it,
ensuring that the newest released version is used for the newly created dev environment.
-												Start reading through text to fix errors

											
										
										
											2018-12-02 23:24:06 +00:00
+								\chapter{Using the Framework}\label{usingtfw}
-												Add 'Using the Framework' chapter skeleton

											
										
										
											2018-12-01 15:57:15 +00:00
 								In this section I am going to dive into further detail on how client code is supposed
-												Continue writing thesis

											
										
										
											2018-12-01 23:43:34 +00:00
+								to use the framework, some of the design decisions behind this and how everything is
-												Use lstinline to display inline code instead of texttt

											
										
										
											2018-12-02 15:44:31 +00:00
+								is integrated into the \code{solvable} Docker image.
-												Add 'Using the Framework' chapter skeleton

											
										
										
											2018-12-01 15:57:15 +00:00
 								To use the framework one has to do several things to get started.
-												Continue writig thesis

											
										
										
											2018-12-02 15:02:56 +00:00
+								The main points include:
-												Add 'Using the Framework' chapter skeleton

											
										
										
											2018-12-01 15:57:15 +00:00
+								\begin{itemize}
-												Continue writig thesis

											
										
										
											2018-12-02 15:02:56 +00:00
+								    \item Setting up a development environment
 								    \item Defining an FSM to describe the flow of the tutorial and implementing proper callbacks
 								          for this machine, such as ones that display messages to the user
 								    \item Implementing the required event handlers, which may trigger state transitions in the FSM,
-												Mostly finish writing thesis

											
										
										
											2018-12-03 15:38:22 +00:00
+								          interact with non-TFW code and do various things that might be needed during an exercise,
-												Fix errors shown by grammar and LaTeX checkers

											
										
										
											2018-12-03 18:14:02 +00:00
+								          such as compiling code written by the user or running unit tests
-												Continue writig thesis

											
										
										
											2018-12-02 15:02:56 +00:00
+								    \item Defining what processes should run inside the container besides the things TFW
 								          starts automatically
-												Fix errors shown by grammar and LaTeX checkers

											
										
										
											2018-12-03 18:14:02 +00:00
+								    \item Setting up reverse proxying for any user-facing network application such as web servers
-												Add 'Using the Framework' chapter skeleton

											
										
										
											2018-12-01 15:57:15 +00:00
+								\end{itemize}
-												Continue writing thesis

											
										
										
At first all these tasks can seem quite overwhelming.
Remember that \emph{witchcraft} is what we practice here after all.
To flatten the steep initial learning curve of the framework
I have created a repository called \emph{test-tutorial-framework} with the purpose of
providing a project template for developers looking to create challenges using the
framework.
This repository is a very simple client codebase that is also suitable for
developing TFW itself (it is a good place to host tests for the framework).

It also provides an ``industry standard'' \code{hack} directory
containing bash scripts that make the development of tutorials and TFW itself very convenient.
These scripts range from bootstrapping a complete development environment in one command
to building and running challenges based on the framework.
Let us take a quick look at the \emph{test-tutorial-framework} project to acquire a greater
understanding of how the framework interacts with client code.
\section{Project Structure}
\begin{lstlisting}[
    caption={The project structure of test-tutorial-framework},
    captionpos=b]
.
|--config.yml
|
|--hack/
|   |--tfw.sh
|   |--...
|
|--controller/
|   |--Dockerfile
|   |--...
|
|--solvable/
    |--Dockerfile
    |--...
\end{lstlisting}
\subsection{Avatao Specific Files}
The \code{config.yml} file is an Avatao challenge configuration file,
which is used to describe what kind of Docker containers implement a challenge,
what ports they expose and over what protocols, the name of the
exercise, its difficulty, and so on.
Every Avatao challenge must provide such a file.
Another thing that is not even indicated in the structure above is the \code{metadata}
directory, which contains the short and long descriptions of challenges in
Markdown format.
The Tutorial Framework does not use these files in any way whatsoever;
they are only required to make the tutorial function on the Avatao platform.
\subsection{Controller Image}
It was previously mentioned that the \code{controller} Docker image is responsible
for the solution checking of challenges (whether the user has completed the exercise or not).
Currently this image is maintained in the test-tutorial-framework repository.
It is a very simple Python server which functions as a TFW event handler as well.
It subscribes to the FSM update messages
broadcast by the \code{FSMManagingEventHandler} we have discussed previously.
This way it is capable of keeping track of the state of the tutorial,
which allows it to detect when the final state of the FSM is reached.
\subsection{Solvable Image}
Currently the Tutorial Framework is maintained in three git repositories:
\begin{description}
      \item[baseimage-tutorial-framework:] Docker baseimage (contains all backend logic)
      \item[frontend-tutorial-framework:] Angular frontend
      \item[test-tutorial-framework:] An example tutorial built using baseimage and frontend
\end{description}
Every tutorial based on the framework must use the TFW baseimage as the parent of
its own \code{solvable} image, using the \code{FROM}%
\footnote{\href{https://docs.docker.com/engine/reference/builder/\#from}
{https://docs.docker.com/engine/reference/builder/\#from}}
Dockerfile command.
Being an example project of the framework, this is the case with
test-tutorial-framework as well.
\section{Details of the Solvable Image}
Let us dive into greater detail on how the \code{solvable} Docker image of the
test-tutorial-framework operates.
The directory structure is as follows:
\begin{lstlisting}
solvable/
|--Dockerfile
|--frontend/
|--supervisor/
|--nginx/
|--src/
\end{lstlisting}
I am going to discuss these one by one.
\subsection{Dockerfile}
Since this is a Docker image it must define a \code{Dockerfile}.
This image always uses the baseimage of the framework as a parent image.
Besides this, developers can treat it as a regular \code{Dockerfile} and work with it
in any way they see fit to implement their tutorial.
This means that developers looking to create content on Avatao, be that
with the Tutorial Framework or without it, must be familiar with Docker,
as they will have to set everything up to work inside a container.
\subsection{Frontend}
This directory is designed to contain a clone of the frontend repository.
By default it is empty and its contents will be put in place during the
setup of the development environment.
\subsection{Supervisor}
As previously mentioned, the framework uses supervisor to run several processes
inside a Docker container.
Usually Docker containers only run a single process and developers simply start
more containers instead of processes if required (and use tools such as docker-compose%
\footnote{\href{https://docs.docker.com/compose/}{https://docs.docker.com/compose/}}
or Kubernetes%
\footnote{\href{https://kubernetes.io}{https://kubernetes.io}}
to orchestrate their containers).
This approach is not suitable for TFW, as it would require the framework to orchestrate
Docker containers from inside a container managed by the same Docker daemon, which is
feasible in theory but very hard to do and to maintain in practice.
This would require doing something like mounting the UNIX domain socket used
to manage the Docker daemon inside a running container managed by that daemon,
which is a fun thing to
play around with in my free time but not something suitable for running in production,
not to mention the severe security implications of doing something like that.
Supervisor is a process control system designed to be able to work with
processes on UNIX-like operating systems.
When a tutorial built on TFW is started, a Docker container starts with supervisor running as
PID\footnote{Process ID. On UNIX-like systems the \code{init} program is the first
process started, and it traditionally gets PID 1.} 1, which in turn starts all the
programs defined in the \code{solvable/supervisor} directory.
Content creators can use supervisor configuration files to define these programs.
For example, a developer would write a file similar to this one and place it into the
\code{solvable/supervisor} directory to run a web server written in Python:
\begin{lstlisting}
[program:yourprogram]
user=user
directory=/home/user/example/
command=python3 server.py
autostart=true
\end{lstlisting}
As mentioned earlier in~\ref{processmanagement}, any program that is started this way
can be managed by the framework using API messages.
All this is possible through the XML-RPC%
\footnote{\href{https://docs.python.org/3/library/xmlrpc.html}
{https://docs.python.org/3/library/xmlrpc.html}}
API exposed by supervisor, which allows the framework to interact with it to control processes.
This API is quite flexible and can be used to achieve a number of things which would be
clumsy to do without it (for instance, supervisor has a command line utility called
\code{supervisorctl} that exposes similar functionality to the XML-RPC bindings,
but it is better to communicate with the supervisor daemon directly than to
invoke its command line utility in a separate process when you need something done).
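As a rough illustration of why talking to the daemon directly is convenient, the following
sketch restarts a supervised program through XML-RPC using only the Python standard library.
It is not the actual code of the framework: the endpoint \code{http://127.0.0.1:9001/RPC2}
assumes that the \code{inet_http_server} of supervisor is enabled on that port, and
\code{restart_process} and \code{connect} are helper names made up for this example.
The \code{getProcessInfo}, \code{stopProcess} and \code{startProcess} calls belong to the
\code{supervisor} namespace of the XML-RPC interface.
\begin{lstlisting}[language=python]
from xmlrpc.client import ServerProxy

# Assumption: supervisor exposes its XML-RPC API over HTTP on this address
# (requires an [inet_http_server] section in the supervisor configuration).
SUPERVISOR_RPC_URL = 'http://127.0.0.1:9001/RPC2'


def restart_process(rpc, name):
    """Stop the named program if it is running, then start it again.

    `rpc` is the `supervisor` namespace of an XML-RPC proxy.
    Returns the state name reported after the restart.
    """
    info = rpc.getProcessInfo(name)
    if info['statename'] in ('RUNNING', 'STARTING'):
        rpc.stopProcess(name)
    rpc.startProcess(name)
    return rpc.getProcessInfo(name)['statename']


def connect(url=SUPERVISOR_RPC_URL):
    """Create a proxy; no network traffic happens until a method is called."""
    return ServerProxy(url).supervisor
\end{lstlisting}
With a supervisor daemon reachable at the assumed address, calling
\code{restart_process(connect(), 'yourprogram')} would restart the program defined in the
configuration file shown earlier.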
\subsection{Nginx}
For simplicity, exercises based on the framework only expose a single port from the
\code{solvable} container.
This port is required to serve the frontend of the framework.
If this is the case, how do we run additional web applications on which to showcase
vulnerabilities during a tutorial?
Since one port can only be bound by one process at a time, we will need to
run a reverse-proxy%
\footnote{\href{https://www.nginx.com/resources/glossary/reverse-proxy-server/}
{https://www.nginx.com/resources/glossary/reverse-proxy-server/}} server inside the
container to
bind the exposed port and forward traffic to other web servers bound to non-exposed ports.
To support this, TFW automatically starts an nginx web server. It uses this nginx
instance to serve the framework frontend as well.
It is possible to supply additional configuration to this server in a convenient manner:
any configuration files placed into the \code{solvable/nginx} directory will be
loaded by nginx once the container has started.
To set up the reverse-proxying of a web server running on port 3333,
one would write a configuration file similar to this one:
\begin{lstlisting}
location /yoururl {
    proxy_pass http://127.0.0.1:3333;
}
\end{lstlisting}
Now the content served by this web server on port 3333
will be available on the URL \code{<challenge-url>/yoururl} even though port 3333
does not accept connections from outside the container, as it is not exposed.
It is very important to understand that developers
have to make sure that their web application \emph{behaves well} behind a reverse proxy.
What this means is that it is going to be served from a ``subdirectory'' of the top
level URL\@:
for example \code{/register} will be served under \code{/yoururl/register}.
This means that all links in the final HTML must refer to the proxied URLs, e.g.\
\code{/yoururl/login}, and server-side redirects must point to the proxied URLs as well.
Idiomatically this is usually implemented by supplying a \code{BASEURL}
to the application through an environment variable, so that it is able to set
itself up correctly.
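A minimal sketch of this idiom, using only the Python standard library, could look like
the following.
The helper names \code{base_url} and \code{url_for} are made up for this example, and real
web frameworks usually offer their own mechanism for the same purpose.
\begin{lstlisting}[language=python]
import os


def base_url(environ=os.environ):
    """Return the URL prefix the application is served under, e.g. '/yoururl'.

    An unset or empty BASEURL means the application lives at the top level.
    """
    prefix = environ.get('BASEURL', '').strip('/')
    return '/' + prefix if prefix else ''


def url_for(path, environ=os.environ):
    """Build a link that stays valid behind the reverse proxy."""
    return base_url(environ) + '/' + path.lstrip('/')
\end{lstlisting}
With \code{BASEURL=/yoururl} set in the environment of the process,
\code{url_for('/login')} yields \code{/yoururl/login}, and without the variable it falls
back to \code{/login}.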
\subsection{Copying Configuration Files}
Behind the curtains, the Tutorial Framework uses some Dockerfile trickery to
facilitate the copying of supervisor and nginx configuration files to their correct
locations.
Normally when one uses the \code{COPY}%
\footnote{\href{https://docs.docker.com/engine/reference/builder/\#copy}
{https://docs.docker.com/engine/reference/builder/\#copy}}
command to create a layer%
\footnote{\href{https://docs.docker.com/storage/storagedriver/}
{https://docs.docker.com/storage/storagedriver/}} in a Docker image,
this action takes place when building that image (i.e.\ in the \emph{build context}
of that image).
This is not good for this use case: when building the framework baseimage,
the configuration files that will be written by content developers using TFW in
the future do not even exist yet.
How could we copy files into an image layer that will be created in the future?
It is possible to use a command called \code{ONBUILD}%
\footnote{\href{https://docs.docker.com/engine/reference/builder/\#onbuild}
{https://docs.docker.com/engine/reference/builder/\#onbuild}}
in the Dockerfile of a baseimage to delay another command
to the point in time when other images use the baseimage
as a parent with the \code{FROM} command. This makes it possible to execute
commands in the build context of the descendant image.
This is great, because the config files we need \emph{will} exist in the build
context of the \code{solvable} image of test-tutorial-framework.
In practice this looks something like this in the baseimage \code{Dockerfile}:
\begin{lstlisting}
ONBUILD COPY ${BUILD_CONTEXT}/nginx/ ${TFW_NGINX_COMPONENTS}
ONBUILD COPY ${BUILD_CONTEXT}/supervisor/ ${TFW_SUPERVISORD_COMPONENTS}
\end{lstlisting}
It is important to keep in mind, however, that the layers created by these
\code{ONBUILD} commands will only be available \emph{after} the \code{FROM}
command is executed when building the child image \emph{in the future}.
This means that if you want to
do something with these files in the baseimage build after they have
been copied, those things must be done in \code{ONBUILD} commands as well.
\subsection{Source Directory}
The \code{src} directory usually holds tutorial-specific code, such as
the implementations of event handlers, the framework FSM, additional web applications
served by the exercise and generally anything that won't fit in the other,
framework-specific directories.
The use of this directory is not mandatory, only a good practice, as developers
are free to implement the non-TFW parts of their exercises as they see fit
(the copying of these files into image layers using \code{solvable/Dockerfile}
is their responsibility as well).
\section{Configuring Built-in Components}
The configuration of built-ins is generally done in two different ways.
For the frontend (Angular) side, developers can edit a \code{config.ts} file,
which is full of key-value pairs of configurable frontend functionality.
These pairs are generally pretty self-documenting:
\lstinputlisting[
    caption={Example of the frontend \code{config.ts} file (shortened to save space)},
    captionpos=b
]{listings/config.ts}
Configuring built-in event handlers is possible by editing the Python file they are
initialized in, which exposes several descriptive options through the
\code{__init__()} methods of these event handlers:
\lstinputlisting[
    language=python,
    caption={Example of initializing some event handlers},
    captionpos=b
]{listings/event_handler_main.py}
\section{Implementing a Finite State Machine}

The Tutorial Framework allows developers to define state machines in two ways,
as discussed before.
I am going to show the implementation of the same FSM using both methods
to showcase the capabilities of the framework.
\subsection{YAML based FSM}
YAML\footnote{YAML Ain't Markup Language: \href{http://yaml.org}{http://yaml.org}}
is a human-friendly data serialization standard and a superset of JSON\@.
It is possible to use this format to define a state machine like so:
\lstinputlisting[
    caption={A Finite State Machine implemented in YAML},
    captionpos=b
]{listings/test_fsm.yml}
This state machine has two states, \code{0} and \code{1}.
It defines a single transition between them, \code{step_1}.
On entering state \code{1} the FSM will write a message to the frontend messaging component
by invoking TFW library code using Python.
The transition can only occur if the file \code{allow_step_1} exists.
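The idea of a transition guarded by the existence of a file can be illustrated in a few
lines of plain Python.
This is only a sketch of the concept and not the TFW API: the class \code{TinyFSM}, its
method names and the way the guard is expressed are all invented for this example.
\begin{lstlisting}[language=python]
import os


class TinyFSM:
    """Two states ('0' and '1') and one guarded transition, 'step_1'."""

    def __init__(self, guard_file, on_enter=print):
        self.state = '0'
        self.guard_file = guard_file  # transition is allowed only if this exists
        self.on_enter = on_enter      # callback invoked when state '1' is entered

    def step_1(self):
        """Try to move from '0' to '1'; return True if the transition happened."""
        if self.state != '0' or not os.path.exists(self.guard_file):
            return False
        self.state = '1'
        self.on_enter('entered state 1')
        return True
\end{lstlisting}
Calling \code{step_1()} before the guard file exists leaves the machine in state
\code{0}; once the file has been created, the same call moves it to state \code{1} and
fires the callback.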
YAML based state machine implementations also allow the usage of the Jinja2%
\footnote{\href{http://jinja.pocoo.org/docs/2.10/}{http://jinja.pocoo.org/docs/2.10/}}
templating language to substitute variables into the YAML file.
These substitutions are really powerful, as one could even iterate through arrays
or invoke functions that produce strings to be inserted using this method.
This is very similar to how Ansible uses%
\footnote{\href{https://docs.ansible.com/ansible/2.6/user_guide/playbooks_templating.html}
{https://docs.ansible.com/ansible/2.6/user\_guide/playbooks\_templating.html}}
Jinja2, and I was certainly inspired by this
when coming up with this idea.
For example, if we had an FSM with five states (\code{0} through \code{4}), we could use
the following Jinja2 code to generate a transition called \code{step_next} between each
pair of consecutive states in a \code{for} loop:
\begin{lstlisting}
{% for i in range(4) %}
-   trigger: 'step_next'
    source: '{{i}}'
    dest: '{{i+1}}'
{% endfor %}
\end{lstlisting}
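To make the result of such a template concrete, the following sketch builds the same kind
of transition list in plain Python, without depending on Jinja2.
The function name \code{chain_transitions} is made up for this example, while the
dictionary keys mirror the ones used in the template.
Note that a chain over $n$ states needs $n-1$ transitions.
\begin{lstlisting}[language=python]
def chain_transitions(n_states, trigger='step_next'):
    """Return one transition between each pair of consecutive states."""
    return [
        {'trigger': trigger, 'source': str(i), 'dest': str(i + 1)}
        for i in range(n_states - 1)
    ]


# Print the transitions of a machine with five states ('0' through '4').
for transition in chain_transitions(5):
    print(transition)
\end{lstlisting}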
\subsection{Python based FSM}
Alternatively, the same state machine can be implemented like this in Python using
TFW library code:
\lstinputlisting[
    language=python,
    caption={A Finite State Machine implemented in Python},
    captionpos=b
]{listings/test_fsm.py}
As you can see, both implementations are pretty clean and easy.
The advantage of YAML is that it makes it possible to define callbacks using virtually any
command that is available in the container, which means any
programming language is usable to implement said callbacks.
The advantage of the Python version is that since the framework is being developed in
Python as well, it is going to be easier to interface with library code.
\section{Implementing Event Handlers}
In this section I am going to showcase how event handlers can be implemented
when using the framework.
I am going to use the Python programming language, but it isn't hard
to create event handlers in other languages, as the only thing
they have to be capable of is communicating with the TFW server using
ZeroMQ sockets, as previously discussed.
The library provided by the framework abstracts low-level socket logic
away by implementing easy-to-use base classes.
Creating such base classes in a given language shouldn't take longer
than a few hours for an experienced developer.
Our challenge creators have already implemented similar libraries for
Java, JavaScript and C++ as well.
\lstinputlisting[
    language=python,
    caption={A very simple event handler implemented in Python},
    captionpos=b
]{listings/event_handler_example.py}
This simple event handler subscribes to the \code{fsm_update} messages,
and the only thing it does is write the
messages received to the messages component on the frontend.
When using the TFW library in Python, all classes inheriting from
\code{EventHandlerBase} must implement the \code{handle_event()}
abstract method, which is used to, well, handle events.
-												Continue writing thesis

											
										
										
											2018-12-02 17:02:48 +00:00
+								\section{Setting Up a Developer Environment}\label{devenv}
-												Continue writig thesis

											
										
										
											2018-12-02 15:02:56 +00:00
To make getting started as smooth as possible I have created
a ``bootstrap'' script which is capable of creating a development environment from
scratch.
This script is distributed as the following bash one-liner:
\begin{lstlisting}[language=bash]
 								bash -c "$(curl -fsSL https://git.io/vxBfj)"
\end{lstlisting}
This command downloads the script using \code{curl}%
\footnote{\href{https://curl.haxx.se}{https://curl.haxx.se}}, then executes it in bash.
In the open source community it is quite common to distribute installers this way%
\footnote{A good example of this is oh-my-zsh:
\href{https://github.com/robbyrussell/oh-my-zsh}{https://github.com/robbyrussell/oh-my-zsh}},
which might seem a little scary at first, but is no less safe than
downloading and executing a binary installer from a website with a valid TLS certificate, as
\code{curl} will fail with an error message if the certificate is invalid.
This is because both methods place their trust in the PKI\footnote{Public Key Infrastructure}
 								to defend against man-in-the-middle%
 								\footnote{\href{https://www.owasp.org/index.php/Man-in-the-middle_attack}
 								{https://www.owasp.org/index.php/Man-in-the-middle\_attack}} attacks.
Debating the security of this infrastructure is certainly something that we
as an industry should constantly do, but it is out of the scope of this paper.

 								Nevertheless I have also created a version of this command that
 								checks the SHA256 checksum of the bootstrap script before executing it
 								(I have placed it on several lines to enhance visibility):
 								\begin{lstlisting}[language=bash]
 								URL=https://git.io/vxBfj                                             \
 								SHA=d81057610588e16666251a4167f05841fc8b66ccd6988490c1a2d2deb6de8ffa \
 								bash -c 'cmd="$(curl -fsSL $URL)" &&                                 \
 								         [ $(echo "$cmd" | sha256sum | cut -d " " -f1) == $SHA ] &&  \
 								         echo "$cmd" | bash || echo Checksum mismatch!'
 								\end{lstlisting}
 								This essentially downloads the script, places it inside a variable as a string,
 								then pipes it into a bash interpreter \emph{only if} the checksum
 								of the downloaded string matches the one provided, otherwise it displays
 								an error message.
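The same check can also be written out as a small shell function, which may be easier to audit than the one-liner.
The following is only a sketch: the function name \code{run_if_checksum_matches} and the demo script are made up for illustration,
and everything runs locally without downloading anything.
\begin{lstlisting}[language=bash]
# Hypothetical helper, not part of TFW: run a script only if its SHA256 matches.
# $1 = script text, $2 = expected hex digest
run_if_checksum_matches() {
    # printf '%s\n' mirrors `echo "$cmd"`: the text plus one trailing newline
    actual="$(printf '%s\n' "$1" | sha256sum | cut -d ' ' -f1)"
    if [ "$actual" = "$2" ]; then
        printf '%s\n' "$1" | bash
    else
        echo 'Checksum mismatch!' >&2
        return 1
    fi
}

# Local demonstration: compute the digest of a trusted copy, then verify it
demo='echo hello from the verified script'
good="$(printf '%s\n' "$demo" | sha256sum | cut -d ' ' -f1)"
run_if_checksum_matches "$demo" "$good"
run_if_checksum_matches "$demo" deadbeef || echo 'refused to run'
\end{lstlisting}
Unlike the \code{&&}/\code{||} chain of the one-liner, the explicit \code{if} reports a mismatch only
when the digests actually differ, not when the download fails or the executed script itself exits with an error.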
Software projects distributing their products as binary installers often
display such checksums on their download pages with the purpose of potentially
mitigating MITM attacks.

The bootstrap script clones the three TFW repositories and performs several steps
to create a working environment in a single directory, based on
test-tutorial-framework:
\begin{itemize}
    \item It builds the newest version of the TFW baseimage locally
    \item It pins the version tag of this image in \code{solvable/Dockerfile},
          so that this newly-built version will be used by the tutorial
    \item It places the latest frontend in \code{solvable/frontend} with
          dependencies installed
\end{itemize}
It is important to note that this script \emph{does not} install anything system-wide;
it only works in the directory it is being executed from.
This is a good practice, as many users, including me, find scripts that
write files all around the system intrusive if they could provide the same functionality
while working in a single directory.

It is also worth mentioning that it would have been a lot easier to simply use Docker Hub%
\footnote{\href{https://hub.docker.com}{https://hub.docker.com}},
but since the code base is currently proprietary we cannot distribute
it using a public medium, and we use our own image registry to store private Docker
images.
 								\section{Building and Running a Tutorial}
 								After the environment has been created using the script described in~\ref{devenv},
 								it is possible to simply use standard Docker commands to build and run the tutorial.
 								Yet the \code{hack} directory of test-TFW also provides a script called
 								\code{tfw.sh} that developers prefer to use for building and running their
 								exercises.
 								Why is this the case?
\subsection{The Frontend Issue}
To understand this, we first have to gain some understanding of how the
build process of Angular projects works.

When frontend developers work on Angular projects, they usually use the built-in
developer tool of the Angular-CLI%
\footnote{\href{https://cli.angular.io}{https://cli.angular.io}},
\code{ng serve} to build and serve their applications.
The advantage of this tool is that it automatically reloads the frontend
when the code on the disk is changed, and that it is generally very easy to work with.
On the other hand, a disadvantage is that a \code{node_modules} directory
 								containing all the npm%
 								\footnote{\href{https://www.npmjs.com}{https://www.npmjs.com}}
 								dependencies of the project must be present while doing so.
 								The problem with this is that because the JavaScript ecosystem is a \emph{huge}
 								mess\cite{NodeModules}, these dependencies can easily get up to
 								\emph{several hundreds of megabytes} in size.
 								To solve this issue, when creating production builds,
 								Angular uses various optimizations such as tree shaking%
 								\footnote{\href{https://webpack.js.org/guides/tree-shaking/}
 								{https://webpack.js.org/guides/tree-shaking/}}
 								to remove all the dependencies that won't be used when running the application%
\footnote{Otherwise it won't be possible to serve these applications efficiently
over the internet.}.
The problem is that these optimizations can take a \emph{really} long time.
This is why today frontend builds usually take a lot longer than building anything
not involving JavaScript (such as C++, C\# or any other compiled programming language).
This mess presents its own challenges for the Tutorial Framework as well.
Since hundreds of megabytes of npm dependencies have no place inside Docker images%
\footnote{Otherwise it may take tens of seconds just to send the build context to
the Docker daemon, which means waiting even before the build begins.},
by default the framework will only copy the results of a frontend production build
of \code{solvable/frontend} into the image layers.
Performing this production build slows down TFW based challenges so much that instead of
seconds, their builds can often take 5 to 10 minutes depending on what hardware
you use.

 								\subsection{The Solution Offered by the Framework}
 								To circumvent this, it is possible to entirely exclude the Angular frontend from a TFW
 								build, using build time arguments%
\footnote{In practice this is done by supplying the option
\code{--build-arg NOFRONTEND=1} to Docker.}.
But when doing so, developers would have to run the frontend locally with
the whole \code{node_modules} directory present.
 								The bootstrap script takes care of putting these dependencies there,
 								while the \code{tfw.sh} script is capable of starting a development server
 								to serve the frontend locally using \code{ng serve} besides starting
 								the Docker container without the frontend.
As if this whole thing wasn't complicated enough, since Docker binds the port
the container is going to use, \code{tfw.sh} has to run the Angular dev server on
another port, then use the proxying features of Angular-CLI to forward requests
from this port to the running Docker container when requesting resources
other than the entrypoint to the Angular application.

 								This is the reason why the frontend is accessible through port \code{4200} (default
 								port for \code{ng serve}) when using \code{tfw.sh} to start a tutorial, but when running
 								a self-contained container built with the frontend included it is accessible on port \code{8888}
 								(the default port TFW uses).
While it also provides lots of other functionality, this is one of the reasons why
the \code{tfw.sh} script is a bash script several hundred lines long.
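The two modes described above can be summarized with a small dry-run helper that only prints the commands involved.
This is a sketch under assumptions: the image name \code{my-tutorial}, the build context \code{solvable/}
and the published port mapping are illustrative guesses, not what \code{tfw.sh} actually executes.
\begin{lstlisting}[language=bash]
# Dry run: print (do not execute) the commands for each mode.
# Image name, build context and port mapping are assumptions for illustration.
print_workflow() {
    image="$1"; nofrontend="$2"
    if [ -n "$nofrontend" ]; then
        # Frontend excluded from the image and served locally by ng serve
        echo "docker build --build-arg NOFRONTEND=1 -t $image solvable/"
        echo "docker run --rm -p 8888:8888 $image"
        echo "ng serve --port 4200"
    else
        # Self-contained image with the production frontend build included
        echo "docker build -t $image solvable/"
        echo "docker run --rm -p 8888:8888 $image"
    fi
}

print_workflow my-tutorial 1
print_workflow my-tutorial ''
\end{lstlisting}
Forwarding requests from the dev server to the container would additionally need a proxy configuration file,
which \code{ng serve} accepts through its \code{--proxy-config} option; its contents are left out here because they depend on the project.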
Making the frontend toggleable during Docker builds requires some
of the \code{ONBUILD} instructions we've discussed earlier:
 								\begin{lstlisting}[language=bash]
 								ONBUILD RUN test -z "${NOFRONTEND}"                                &&\
 								            cd /data && yarn install --frozen-lockfile || :
 								ONBUILD RUN test -z "${NOFRONTEND}"                                &&\
 								            cd /data && yarn build --no-progress || :
 								ONBUILD RUN test -z "${NOFRONTEND}"                                &&\
 								            mv /data/dist ${TFW_FRONTEND_DIR} && rm -rf /data || :
 								\end{lstlisting}
 								Remember that \code{ONBUILD} commands run in the build context of the child image.
These commands check whether the \code{NOFRONTEND} build argument
is present, and only deal with the frontend if this argument is not defined.
The \code{|| :} notation in bash basically means ``or true'', which is required
to avoid aborting the build due to the non-zero return code produced
by the \code{test} command if the build arg is defined.

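The effect of the \code{|| :} guard can be reproduced in a plain shell.
In the sketch below the hypothetical \code{frontend_step} function stands in for one of the \code{ONBUILD RUN} lines,
with \code{echo} replacing the real \code{yarn} command.
\begin{lstlisting}[language=bash]
# Stand-in for one ONBUILD RUN line; echo replaces the real yarn command.
frontend_step() (
    NOFRONTEND="$1"
    test -z "${NOFRONTEND}" && echo 'building frontend' || :
)

frontend_step ''; echo "status: $?"   # message printed, status 0
frontend_step 1;  echo "status: $?"   # step skipped, status still 0

# Without the guard a skipped step reports failure, which would abort the build
status=0
( NOFRONTEND=1; test -z "${NOFRONTEND}" && echo 'building frontend' ) || status=$?
echo "status without the guard: $status"   # 1
\end{lstlisting}
One thing to keep in mind is that in such a chain the guard also hides a failure of the guarded command itself,
so a failing \code{yarn} invocation would not stop the build at that step; an explicit \code{if} would keep the two cases apart.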
 								\section{Versioning and Releases}
 								Currently I use git tags%
 								\footnote{\href{https://git-scm.com/docs/git-tag}{https://git-scm.com/docs/git-tag}}
 								to manage releases of the TFW baseimage.
 								Each new release is a tag digitally signed by me using GnuPG so that
 								everyone is able to verify that the release is authentic.
 								The tags are named according to the versioning scheme I have adopted for the project.
 								I explain this versioning system extensively in a blog post\cite{SemancatVersioning}.
 								The short version is that we use a solution that is a mix between
 								semantic versioning%
 								\footnote{\href{https://semver.org}{https://semver.org}}
 								and calendar versioning%
 								\footnote{\href{https://calver.org}{https://calver.org}}.
 								Our release tags look similar to this one: \code{mainecoon-20180712}
 								(this was the actual tag of an older release).
 								The part before the ``\code{-}'' is the major version, which is always named after a breed
 								of cat, for the additional fun factor, and because we love cats.
 								The part after that is a timestamp of the day the release was made on.
I only change major versions when I introduce backwards incompatible changes in the
API of the framework. This way developers can trust that releases
with the same major version are compatible with each other with regard to client code.

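The tag format lends itself to simple shell parameter expansion.
The helper names below are hypothetical, and \code{mainecoon-20180901} and \code{bombay-20181201} are invented tags
used only to show a matching and a differing major version.
\begin{lstlisting}[language=bash]
# Split a release tag of the form <major>-<YYYYMMDD> into its two parts.
tag_major() { printf '%s\n' "${1%-*}"; }    # everything before the last "-"
tag_date()  { printf '%s\n' "${1##*-}"; }   # everything after the last "-"

# Two releases are treated as compatible when their major names match.
same_major() { [ "$(tag_major "$1")" = "$(tag_major "$2")" ]; }

tag_major mainecoon-20180712    # mainecoon
tag_date  mainecoon-20180712    # 20180712
same_major mainecoon-20180712 mainecoon-20180901 && echo 'compatible'
same_major mainecoon-20180712 bombay-20181201 || echo 'breaking changes possible'
\end{lstlisting}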
 								The \code{master} branches of the frontend-TFW and test-TFW repositories are always
 								kept compatible with the newest release tag of the baseimage.
 								This is one of the ways the bootstrap script can operate safely: when cloning the three
 								repositories, it checks out the newest tag of the baseimage before building it,
 								ensuring that the newest released version is used for the newly created dev environment.
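
How the bootstrap script determines the newest tag is not shown here.
One possible approach, assuming every tag follows the \code{major-YYYYMMDD} scheme, is to order the tags by their date suffix.
The function name and the sample tags other than \code{mainecoon-20180712} are assumptions made for this sketch.
\begin{lstlisting}[language=bash]
# Read tags (one per line) on stdin and print the one with the latest date.
# Assumes every tag ends in -YYYYMMDD, so lexical order equals date order.
newest_release_tag() {
    awk -F- '{ print $NF, $0 }' | sort -k1,1 | tail -n 1 | cut -d ' ' -f2-
}

printf '%s\n' mainecoon-20180712 mainecoon-20180530 bombay-20181201 |
    newest_release_tag
# Inside a clone this could be combined with git, for example:
#   git checkout "$(git tag --list | newest_release_tag)"
\end{lstlisting}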