\chapter{Framework Architecture}\label{architecture}
This chapter discusses the design of the framework and the technological details
behind its architecture.
First, I am going to explain the purpose of this architecture and the
functions it aims to provide,
then I will clarify which technologies are used, and how, to implement said
functionality.
\section{Core Technology}
It is important to understand that the Tutorial Framework is currently implemented as
two Docker images:
\begin{itemize}
    \item the \code{solvable} image is responsible for running the framework and the client
    code depending on it
    \item the \code{controller} image is responsible for solution checking: figuring out
    whether the user has successfully completed the tutorial or not
\end{itemize}
For most of this chapter I am going to discuss the \code{solvable} Docker image,
with the exception of Section~\ref{solutioncheck}, where I will dive into how the
\code{controller} image is implemented.

The most important feature of the framework is its messaging system.
What we need is a system in which processes running inside a Docker container
can communicate with each other.
This task on its own is easy to solve, and there are many possible solutions
(named pipes, sockets or shared memory, to name a few).
The hard part is that frontend components running inside a web browser (which could
potentially be located on the other side of the planet%
\footnote{Potentially introducing all sorts of issues regarding latency.}) also need
to take part in said communication.
So what we need to create is a hybrid between an IPC%
\footnote{Interprocess communication} system and something
that can communicate with JavaScript running in a browser connected to it.
The solution the framework uses is a proxy server, which connects to frontend components
on one side and handles interprocess communication on the other side.
This way the server is capable of proxying messages between the two sides, enabling
communication between them.
Notice that what we have this way is essentially an IPC system in which a web application
can ``act like'' it was running on the backend: it can easily
communicate with processes running there, while in reality the web application
is running in the browser of the user on a completely different machine, and its
messages are routed through the public internet to achieve this effect.

\begin{note}
The core idea and initial implementation of this server come from Bálint Bokros.
The server was later redesigned and fully rewritten by me to allow for greater flexibility
(such as connecting to more than a single browser at a time, different messaging modes,
message authentication, restoration of frontend state, a complete overhaul of the
state tracking system and the possibility of solution checking, among other things).
If you are interested in the differences between the original POC implementation
(which is out of scope for this thesis due to length constraints) and the current
framework, please consult Bálint's excellent paper and Bachelor's thesis on it\cite{BokaThesis}.
\end{note}

Now let us take a closer look at the technology used to implement such a server and
some of the design decisions behind it:
\subsection{Connecting to the Frontend}
The traditional way of creating dynamic web pages was AJAX%
\footnote{AJAX stands for Asynchronous JavaScript And XML, despite usually not having
anything to do with XML in practice.}
polling, which basically means sending
HTTP requests to a server at regular intervals from JavaScript to update the contents
of a website (and as such paying the cost of a full HTTP request-response cycle,
and often that of a new TCP handshake, on each update).
Since around 2011 this has been superseded by WebSockets, which provide a full-duplex
communication channel over TCP between the browser and the server.
The channel is set up by a protocol handshake that uses the \code{Connection: Upgrade}
HTTP header, after which a persistent socket connection remains open between the browser
and the server.
This allows for communication with lower overhead and latency, facilitating efficient
real-time applications, which were not always possible to create before due to
the overhead%
\footnote{In some applications, such as signaling, this overhead could be bigger than
the actual data sent.} introduced by AJAX polling.

The Tutorial Framework uses WebSockets to connect to its web frontend.
The TFW proxy server is capable of connecting to an arbitrary number of WebSockets,
which allows the framework to simultaneously connect to components running in
separate browser windows and tabs, or even
in different browsers altogether (such as opening a terminal in Chrome and an IDE in Firefox).
\subsection{Interprocess Communication}
To handle communication with processes running inside the container, TFW utilizes
the asynchronous distributed messaging library called ZeroMQ%
\footnote{\href{http://zeromq.org}{http://zeromq.org}}, or ZMQ for short.
The rationale behind this is that unlike other messaging systems such as
RabbitMQ%
\footnote{\href{https://www.rabbitmq.com}{https://www.rabbitmq.com}} or Redis%
\footnote{\href{https://redis.io}{https://redis.io}},
ZMQ does not require a message broker daemon to be running in the background at all times
and as such has a much lower memory footprint, while still providing various messaging
patterns and bindings for almost any widely used programming language.
Another, as yet unutilized, capability of this solution is that since ZMQ is capable
of using simple TCP sockets, we could even communicate with processes running on remote
hosts using the current architecture of the framework.

There are various lower-level and higher-level alternatives to
ZMQ for IPC, which were also considered during the design process of the framework.
A few examples of top contenders and the reasons for not using them in the end:
\begin{itemize}
    \item Handling raw TCP sockets would involve lots of boilerplate logic that
    already has quality implementations in messaging libraries: e.g.\ making sure that
    all bytes are sent or received requires constantly checking the return values of the
    libc \code{send()} and \code{recv()} calls%
    \footnote{Developers forget this very often, resulting in almost untraceable bugs
    that seem to occur randomly.},
    while ZMQ takes care of this
    extra logic and even provides higher-level messaging patterns such as
    publish-subscribe, which would need to be reimplemented on top of raw sockets.
    \item Using something like gRPC\footnote{\href{https://grpc.io}{https://grpc.io}}
    or plain HTTP (both of which
    are considered to be higher-level than ZMQ sockets) would require
    all processes taking part in the communication to be HTTP servers themselves,
    which would make the framework
    less lightweight and flexible: socket communication with or without ZMQ does not
    force you to write synchronous or asynchronous code, whereas common HTTP servers
    are either async%
    \footnote{Async servers use the \code{select} or \code{epoll} system calls among others
    to avoid blocking on IO and handle concurrent requests.} or thread-per-connection%
    \footnote{Thread-per-connection servers spawn multiple processes or threads to handle requests
    simultaneously.} in nature, which imposes certain design choices on code
    built on them%
    \footnote{Writing async code forces you to avoid blocking calls, whereas with threads you have to
    watch out for common multithreading problems, like race conditions or deadlocks.}.
\end{itemize}
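To illustrate the boilerplate mentioned in the first item, here is a minimal Python sketch of what ``making sure that all bytes are sent or received'' looks like on a raw socket. The helper names \code{send\_all} and \code{recv\_exactly} are hypothetical and are not part of the framework; the sketch only shows the kind of logic a messaging library such as ZMQ handles internally.

```python
import socket


def send_all(sock: socket.socket, payload: bytes) -> None:
    """Call send() repeatedly until every byte has been handed to the kernel."""
    view = memoryview(payload)
    sent_total = 0
    while sent_total < len(view):
        # send() may accept fewer bytes than requested, so check its return value
        sent = sock.send(view[sent_total:])
        if sent == 0:
            raise ConnectionError("socket connection broken")
        sent_total += sent


def recv_exactly(sock: socket.socket, length: int) -> bytes:
    """Call recv() repeatedly until exactly `length` bytes have arrived."""
    chunks = []
    remaining = length
    while remaining > 0:
        # recv() may return fewer bytes than requested
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Python's standard library already offers \code{socket.sendall()} for the sending half; the explicit loop is spelled out here only to show what forgetting the return value check would skip.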
\section{Architectural Overview}
Now that we are familiar with the technological basis of the framework, we can
discuss it in more detail.

\pic{figures/tfw_architecture.png}{An overview of the Tutorial Framework}

Architecturally TFW consists of four main components:
\begin{itemize}
    \item \textbf{Event handlers}: processes running in a Docker container
    \item \textbf{Frontend}: web application running in the browser of the user
    \item \textbf{TFW (proxy) server}: responsible for message routing/proxying
    between the frontend and event handlers
    \item \textbf{TFW FSM}: a finite state machine responsible for tracking user progress,
    which is implemented as an event handler called \code{FSMManagingEventHandler}
\end{itemize}
Keep in mind that, as mentioned previously,
the TFW server and event handlers reside in the \code{solvable} Docker container.
They all run in separate processes and only communicate with each other using ZeroMQ sockets.

In the following sections I am going to explain each of the main components in
greater detail, as well as how they interact with each other,
their respective responsibilities
and some of the design choices behind them.
\subsection{TFW Message Format}
All components in the Tutorial Framework use JSON%
\footnote{JavaScript Object Notation: \href{https://www.json.org}{https://www.json.org}}
messages to communicate with each other.
These messages must also comply with some simple rules specific to the framework.
Let's inspect further what a valid TFW message might look like:

\begin{lstlisting}[captionpos=b,caption={The TFW JSON message format}]
{
    "key": ...an identifier used for addressing...,
    "data":
    {
        ...
        optional JSON object carrying arbitrary data
        ...
    },
    "trigger": ...optional state change action...,
    "signature": ...optional HMAC signature for authenticated messages...,
    "seq": ...sequence number that is automatically inserted by the TFW server...
}
\end{lstlisting}

All valid messages \emph{must} include a \code{key} field as this is used by the
framework for addressing: event handlers and frontend components subscribe to one
or more of these \code{key}s and only receive%
\footnote{In reality they do receive all messages, just like how network interfaces receive all
Ethernet frames; they just choose to ignore the ones not concerning them.}
messages with \code{key}s that they have
subscribed to.
It is possible to send a message with an empty key, however these messages will not
be forwarded by the TFW server (but will reach it, so in case the target of a message
is the TFW server exclusively, an empty \code{key} may be used).

The rest of the fields are optional, but most messages will carry something
in their \code{data} field.
The purpose of the \code{trigger} and \code{signature} fields will be detailed
at a later point in this thesis.
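The rules above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function names \code{parse\_tfw\_message} and \code{is\_subscribed} are hypothetical, and I assume that \code{key} must be a string and that \code{data}, when present, must be a JSON object.

```python
import json
from typing import Optional, Set


def parse_tfw_message(raw: str) -> Optional[dict]:
    """Decode a message; return None if it breaks the rules described above."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError:
        return None  # not valid JSON
    if not isinstance(message, dict):
        return None
    if not isinstance(message.get("key"), str):
        return None  # the "key" field is mandatory (assumed to be a string)
    if "data" in message and not isinstance(message["data"], dict):
        return None  # assumption: "data" is a JSON object when present
    return message


def is_subscribed(message: dict, subscriptions: Set[str]) -> bool:
    """A component only acts on messages whose key it has subscribed to."""
    return message["key"] in subscriptions
```

Note that an empty \code{key} passes this check, which matches the behavior described above: such a message is valid, it is just not forwarded by the server.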
\subsection{Networking Details}
The default behavior of the TFW server is that it forwards all messages coming from
the frontend to the event handlers and vice versa.
So messages coming from the WebSockets of the frontend are forwarded to event handlers
via ZMQ and messages received on ZMQ from event handlers are forwarded to
the frontend via WebSockets.

The TFW server is also capable of ``reflecting'' messages back to the side they were
received from (to facilitate event handler to event handler communication for instance),
or of broadcasting messages to all components.
This is possible by embedding a whole TFW message in the \code{data} field of
an outer wrapper message with a special \code{key} that signals to the TFW server that
this message requires special attention.
An example of this would be:
\begin{lstlisting}[captionpos=b,caption={Broadcasting and mirroring TFW messages}]
{
    "key": "broadcast", // or "mirror"
    "data":
    {
        ...
        The whole message you want to broadcast or mirror
        (with its own "key" and "data" fields)
        ...
    }
}
\end{lstlisting}
Any invalid messages (not valid JSON or no \code{key} field present) are ignored
by the TFW server.
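The routing rules of this section could be summarized by the following Python sketch. It is not the actual server code: the function \code{route} and the constants \code{FRONTEND} and \code{EVENT\_HANDLERS} are hypothetical, and I assume that the server unwraps the embedded message before mirroring or broadcasting it.

```python
from typing import List, Tuple

FRONTEND = "frontend"              # hypothetical name for the WebSocket side
EVENT_HANDLERS = "event_handlers"  # hypothetical name for the ZMQ side


def _is_valid(message: object) -> bool:
    """A message needs to be a JSON object with a string "key" field."""
    return isinstance(message, dict) and isinstance(message.get("key"), str)


def route(message: object, source: str) -> List[Tuple[str, dict]]:
    """Return the (destination, message) pairs for one incoming message."""
    if not _is_valid(message):
        return []  # invalid messages are ignored
    key = message["key"]
    if key in ("mirror", "broadcast"):
        inner = message.get("data")
        if not _is_valid(inner):
            return []
        if key == "mirror":
            return [(source, inner)]  # reflect back to the sending side
        return [(FRONTEND, inner), (EVENT_HANDLERS, inner)]
    if key == "":
        return []  # reaches the server, but is not forwarded
    destination = EVENT_HANDLERS if source == FRONTEND else FRONTEND
    return [(destination, message)]
```

For example, calling \code{route} with a \code{mirror} wrapper received from the event handler side returns the embedded message addressed to that same side, which is what makes event handler to event handler communication possible.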
\subsection{Event Handlers}
Event handlers are processes running in the \code{solvable} Docker container
that subscribe to specific message types using ZeroMQ sockets.
As discussed earlier, using ZeroMQ allows developers to implement event handlers
in a wide variety of programming languages.
This is very important for the framework, as content creators often create
challenges that are very specific to a language, for example showcasing
a security vulnerability in an older version of the Java standard library.

These event handlers are where developers write most of the code they wish to
integrate with the framework.
Let's say that a tutorial asks the user to fix some piece of legacy C code.
In this case, a challenge developer could implement an event handler that runs
some unit tests to determine whether the user was successful in fixing
the code or not, then advances the tutorial or invokes other event handlers
based on this knowledge.
An event handler such as this could be invoked by sending a message to it
whenever the tests need to be run.

An interesting thing to mention is that there \emph{could} be event handlers which
broadcast messages with a \code{key} that they are also subscribed to.
This can disrupt their behavior in unexpected ways if they are not prepared to
deal with their own ``echoes''.
The framework offers a solution for this by providing a special
event handler type, which is capable of filtering out its own broadcasts.
Such event handlers cache the checksum of every message they broadcast
and ignore the first message that comes back with the same checksum.
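The echo filtering idea can be sketched as follows. The class name \code{EchoFilter}, the choice of SHA-256 and the decision to leave the server-inserted \code{seq} field out of the checksum are all my assumptions for the sake of the example, not a description of the actual implementation.

```python
import hashlib
import json
from collections import Counter


class EchoFilter:
    """Remember checksums of broadcast messages and drop their first echo."""

    def __init__(self) -> None:
        self._pending = Counter()

    @staticmethod
    def _checksum(message: dict) -> str:
        # Assumption: "seq" is added by the server after sending,
        # so it must not influence the checksum.
        stable = {k: v for k, v in message.items() if k != "seq"}
        canonical = json.dumps(stable, sort_keys=True).encode("utf-8")
        return hashlib.sha256(canonical).hexdigest()

    def remember(self, message: dict) -> None:
        """Call this right before broadcasting a message."""
        self._pending[self._checksum(message)] += 1

    def is_own_echo(self, message: dict) -> bool:
        """Return True once for every remembered broadcast that comes back."""
        digest = self._checksum(message)
        if self._pending[digest] > 0:
            self._pending[digest] -= 1
            if self._pending[digest] == 0:
                del self._pending[digest]
            return True
        return False
```

Using a counter instead of a plain set means that only the first returning copy of a broadcast is dropped: an identical message arriving later from another component is still processed.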
\subsection{Frontend}
The frontend is a web application that runs in the browser of the user and uses
multiple WebSockets to connect to the TFW server.
Due to rapidly increasing complexity, the original implementation (written in
plain JavaScript with jQuery%
\footnote{\href{https://jquery.com}{https://jquery.com}} and Bootstrap%
\footnote{\href{https://getbootstrap.com}{https://getbootstrap.com}}) was becoming
unmaintainable and the usage of a frontend framework became justified.

Several choices were considered, with the main contenders being:
\begin{itemize}
    \item Angular\footnote{\href{https://angular.io}{https://angular.io}}
    \item React\footnote{\href{https://reactjs.org}{https://reactjs.org}}
    \item Vue.js\footnote{\href{https://vuejs.org}{https://vuejs.org}}
\end{itemize}
After comparing the above frameworks we decided to work with Angular for
several reasons.
One is that Angular is essentially a complete platform that is very well
suited for building a complex architecture into a single page application.
Another is that the frontend of the Avatao platform is also written
in Angular, so the company already has team members experienced with it.
Finally, Angular forces you to use TypeScript%
\footnote{\href{https://www.typescriptlang.org}{https://www.typescriptlang.org}},
which tries to remedy some of the issues\cite{JavaScript}
with JavaScript by being a language that transpiles to JavaScript while
strongly encouraging things like static typing or object-oriented principles.

\pic{figures/tfw_frontend.png}{The Current Angular Frontend of the Tutorial Framework}
\subsection{Built-in Event Handlers and Frontend Components}
A good part of the framework codebase consists of pre-made, built-in components
that implement commonly required functionality for developers to use.
These components usually involve an event handler and an Angular component
communicating with each other to realize some sort of functionality.
An example would be the built-in code editor of the framework
(visible on the right side of Figure~\ref{figures/tfw_frontend.png}).
This code editor is essentially a Monaco editor%
\footnote{\href{https://microsoft.github.io/monaco-editor/}
{https://microsoft.github.io/monaco-editor/}}
instance integrated into Angular and upgraded with the capability to
exchange messages with an event handler to save, read and edit files
that reside in the writable file system of the \code{solvable}
Docker container.

All of the built-ins come with full API documentation explaining what they do
on receiving specific messages, and what kind of messages they may emit on different events.
This greatly expands the capabilities of the framework, since it allows
developers to do things including, but not limited to:
\begin{itemize}
    \item make the code editor automatically appear in sections
    of the tutorial where the user needs to use it, then disappear
    when it is no longer needed to conserve space
    \item inject commands into the user's terminal
    \item hook callbacks to run code on messages emitted from components to
    detect events, such as
    the user clicking a button or executing a command
    in the terminal
    \item monitor the logs (stdout or stderr) of a given process in real time
\end{itemize}
Every pre-made component is designed to allow flexible
and creative usage by developers, with the added possibility of future extensions.
Often when developers require certain new features, they open an issue on
the git repository of the framework for me to review and possibly implement
later.

One example would be when a developer wanted to automatically advance the tutorial
when the user has entered a specific string into a file.
This one did not even require a new feature: I recommended implementing an event
handler that listens to the messages of the built-in file editor, filters the messages
which contain file content that is being sent to be written to disk, and simply
searches these messages for the given string.

The exact capabilities of these built-in components will be explained in greater
detail in Chapter~\ref{atouroftfw}.
Developers who are well aware of these capabilities are able to use the framework in extremely
creative ways allowing for very interesting functionality, such as the above example.
The components of TFW can often be combined to work together in unexpected, yet useful
ways, similarly to how command-line utilities on UNIX-like systems do.
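To make the file editor example above more tangible, here is a rough Python sketch of the filtering logic such an event handler could contain. The message layout of the built-in editor is not described in this chapter, so the \code{key} value \code{"ide"}, the \code{command} and \code{content} fields and the function name \code{contains\_target} are purely hypothetical placeholders.

```python
EDITOR_KEY = "ide"       # hypothetical key used by the built-in editor
WRITE_COMMAND = "write"  # hypothetical command name for saving a file


def contains_target(message: dict, target: str) -> bool:
    """Return True if a file-write message carries content including `target`."""
    if message.get("key") != EDITOR_KEY:
        return False  # not an editor message
    data = message.get("data")
    if not isinstance(data, dict) or data.get("command") != WRITE_COMMAND:
        return False  # not a message that writes file content to disk
    content = data.get("content")
    return isinstance(content, str) and target in content
```

An event handler subscribed to the editor's messages could call this function on every message it receives and advance the tutorial once it returns \code{True}.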
\subsection{TFW Finite State Machine}
An important requirement we have specified in~\ref{requirements} was that
the framework must be capable of tracking user progress.
TFW allows developers to define a \emph{finite state machine}
which is capable of describing the desired ``story'' of a learning exercise.
The states of the machine could be certain points in time during the completion of the
envisioned tutorial and transitions could be events that influence the
state, such as the editing of files, execution of commands and so on.

Take the fixing of a SQL Injection%
\footnote{\href{https://www.owasp.org/index.php/SQL_Injection}
{https://www.owasp.org/index.php/SQL\_Injection}}
vulnerability as an example.
Let's assume that the source code is vulnerable to a SQL injection attack
because it tries to compose a query with string concatenation instead of
using a prepared statement provided by the database library.
A challenge developer could implement an FSM in the framework that looks like this:

\pic[width=.6\textwidth]{figures/tfw_fsm.png}{An example for a finite state machine in TFW}

In case the source file has been edited, the unit test cases designed to detect
whether the code is vulnerable or not are invoked.
Depending on the results three cases are possible:

\begin{description}
    \item[All test cases have succeeded:] If all the test cases have run successfully,
    then the user has managed
    to fix the code properly and we can display a congratulating message accordingly.
    \item[All test cases have failed:] In this case the submitted solution is incorrect
    and we should offer some hints, so that the user can try again more effectively,
    optionally displaying more and more hints with each successive failure.
    \item[Some test cases have succeeded:] It is possible that, based on the test cases
    that have succeeded and failed, we can determine that the user has tried to blacklist
    certain SQL keywords. This is a common, but incorrect ``solution'' for fixing a SQL
    injection vulnerability. Now we can explain to users why their solution is wrong,
    and give them helpful tips.
\end{description}
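As a minimal sketch, the three cases above could be mapped to FSM triggers like this. The function \code{choose\_trigger} and the trigger names are hypothetical and only serve to illustrate the branching; a real challenge would use the names defined in its own state machine.

```python
from typing import Sequence


def choose_trigger(test_results: Sequence[bool]) -> str:
    """Map the outcome of the unit tests to one of three (made-up) triggers."""
    if not test_results:
        raise ValueError("at least one test result is required")
    if all(test_results):
        return "all_tests_passed"   # the code is fixed properly
    if not any(test_results):
        return "all_tests_failed"   # offer hints and let the user retry
    return "some_tests_passed"      # e.g. keyword blacklisting was attempted
```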
This example shows how content creators can create tutorials that behave
in many different ways based on what the user does.
In high-quality challenges developers can implement several ``paths'' to
a successful completion.
This is a very engaging feature that offers an immersive learning experience for
users, something that many solutions for distance education lack.

Developers can use a YAML file or write Python code to implement finite
state machines in TFW\@. This is going to be further detailed in
Chapter~\ref{usingtfw}.
In the implementation of state machines it is also possible to subscribe callbacks to be
invoked on certain events regarding the machine, such as before and after
state transitions, or on entering and exiting a state.
It is \emph{very} important to be aware of these callbacks, as much of the
actual tutorial logic is often going to be implemented in these.

Architecturally, a built-in event handler called \code{FSMManagingEventHandler}
is responsible for managing the FSM defined by clients.
The responsibilities of said event handler include:
\begin{itemize}
    \item Attempting to step the state machine (one can write preconditions that are
    required to succeed for the transition to take place)
    \item Broadcasting FSM update messages on events involving the state machine,
    such as successful transitions
\end{itemize}
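The interplay of triggers, preconditions and callbacks can be illustrated with a deliberately tiny state machine. This is a sketch written for this explanation only: the class \code{TinyFSM} and its methods are hypothetical and do not mirror the API of the framework, which is covered in Chapter~\ref{usingtfw}.

```python
from typing import Callable, Dict, Iterable, List, Tuple


class TinyFSM:
    """A toy state machine with preconditions and on-enter callbacks."""

    def __init__(self, initial: str) -> None:
        self.state = initial
        # (source state, trigger) -> (destination state, preconditions)
        self._transitions: Dict[Tuple[str, str], Tuple[str, List[Callable[[], bool]]]] = {}
        self._on_enter: Dict[str, List[Callable[[], None]]] = {}

    def add_transition(self, trigger: str, source: str, dest: str,
                       preconditions: Iterable[Callable[[], bool]] = ()) -> None:
        self._transitions[(source, trigger)] = (dest, list(preconditions))

    def on_enter(self, state: str, callback: Callable[[], None]) -> None:
        self._on_enter.setdefault(state, []).append(callback)

    def step(self, trigger: str) -> bool:
        """Attempt a transition; return True only if it actually took place."""
        entry = self._transitions.get((self.state, trigger))
        if entry is None:
            return False  # trigger is not valid in the current state
        dest, preconditions = entry
        if not all(check() for check in preconditions):
            return False  # a precondition failed, so the state is unchanged
        self.state = dest
        for callback in self._on_enter.get(dest, []):
            callback()  # this is where tutorial logic would typically live
        return True
```

In this sketch a failed precondition leaves the machine in its current state, which corresponds to an unsuccessful attempt to step the FSM.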
The \code{trigger} field of a message can be used to step the framework FSM
if all preconditions are met.
The way this works is that if the TFW server encounters a message with a
\code{trigger} defined, it notifies the event handler managing
the state machine so it can attempt to activate said \code{trigger}.

Since messages in the system can come from unauthenticated sources (such as the frontend),
it is possible to
enforce the authentication of privileged messages, such as messages containing a \code{trigger}.
The framework allows trusted code to access a cryptographic key stored on the file system
with proper permissions, which
can be used to sign messages (this is what the \code{signature} message
field is designed for) using HMAC%
\footnote{Hash-based message authentication code}.
In this case the TFW server will only forward the privileged messages that
have a valid signature, and the event handler managing the state machine
will also validate the signature of messages it receives
(and sign the updates it broadcasts as well, so that other components can verify that
they come from a trusted source).
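Signing and verifying a message with HMAC could look something like the Python sketch below. The serialization used for signing, the SHA-256 hash function, the Base64 encoding and the exclusion of the server-inserted \code{seq} field are all assumptions made for this example; the function names \code{sign\_message} and \code{verify\_message} are hypothetical as well.

```python
import base64
import hashlib
import hmac
import json


def _signed_bytes(message: dict) -> bytes:
    """Serialize every field except the signature (and the assumed-unsigned seq)."""
    unsigned = {k: v for k, v in message.items() if k not in ("signature", "seq")}
    return json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_message(message: dict, key: bytes) -> dict:
    """Return a copy of the message with its "signature" field filled in."""
    digest = hmac.new(key, _signed_bytes(message), hashlib.sha256).digest()
    signed = dict(message)
    signed["signature"] = base64.b64encode(digest).decode("ascii")
    return signed


def verify_message(message: dict, key: bytes) -> bool:
    """Check the signature in constant time; unsigned messages are rejected."""
    signature = message.get("signature")
    if not isinstance(signature, str):
        return False
    expected = sign_message(message, key)["signature"]
    return hmac.compare_digest(signature.encode("utf-8"), expected.encode("ascii"))
```

Only code that can read the key from the file system is able to produce a valid \code{signature}, so a message forged by the frontend with a \code{trigger} field would fail verification.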
\subsection{Solution checking}\label{solutioncheck}
Traditionally most challenges on the Avatao platform implement a Docker image called
\code{controller}, which is responsible for detecting the successful
solution of a challenge.
When using the Tutorial Framework, a pre-implemented \code{controller}
image is available, which listens to messages emitted by the
framework FSM, and detects if the final state defined by developers is reached.
This means that if content creators implement a proper FSM, solution checking
does not require any more effort on their part and will work automatically.

It is also worth noting that the authentication of privileged messages
makes the Tutorial Framework suitable for implementing
traditional hacking challenges, such as exercises developed for CTF%
\footnote{A ``capture the flag'' game is a competition designed for professionals
(or just people interested in the field) to sharpen their skills in IT security.
Avatao often organizes similar events.}
events, as the controller image is also capable of verifying the authenticity of
FSM update messages by inspecting their signatures.