Fix errors shown by grammar and LaTeX checkers

This commit is contained in:
Kristóf Tóth 2018-12-03 19:14:02 +01:00
parent 40aa0d9f2f
commit 31803cd421
6 changed files with 141 additions and 141 deletions


@ -9,7 +9,7 @@ based on the TFW architecture themselves.
A \emph{very important} point to keep in mind is that most of this
exercise-specific logic will be implemented in \textbf{FSM callbacks} and
custom \textbf{event handlers}.
The whole framework is built in a way to faciliate this process and developers
The whole framework is built in a way to facilitate this process and developers
who understand this mindset almost always find it a breeze to create great
content using TFW\@.
@ -23,7 +23,7 @@ for instance the built-in code editor requires a frontend component and an event
handler to function properly, while the component responsible for
drawing out and managing frontend components implements no
event handler, so it purely exists in Angular.
An other example of a purely frontend component would be the messages component,
Another example of a purely frontend component would be the messages component,
which is used to display messages to the user.
In the Tutorial Framework most of the built-ins define APIs, which are TFW messages
@ -61,7 +61,7 @@ Some components emit or broadcast messages on specific events, for instance the
"from_state": ...string...,
"to_state": ...string...,
"trigger": ...string...,
"timestamp": ...unix timestamp...
"timestamp": ...UNIX timestamp...
}
...
}
@ -81,7 +81,7 @@ The framework must allow content creators to communicate with the user,
and provide some mechanism to enable ``talking'' to them.
This is the responsibility of the \emph{messages} component, which
provides a chatbox-like element on the frontend.
The simplest form of communication it accomodates it the insertion of text
The simplest form of communication it accommodates is the insertion of text
into the chatbox through API messages.
This component always expects messages it receives to be in Markdown%
\footnote{\href{https://daringfireball.net/projects/markdown/}
@ -111,7 +111,7 @@ Similarly to a real chat application, some
The timing of pauses and messages is based on the \emph{WPM} --- or Words Per Minute ---
set by developers according to their specific requirements.
This creates an experience similar to chatting with someone in real time, as the time
it takes for each message to be displayed depends on the lenght of the previous message.
it takes for each message to be displayed depends on the length of the previous message.
This illusion is made possible through appropriate \code{setTimeout()} calls in
TypeScript and some elementary math to calculate the proper delays in milliseconds based on
message lengths:
@ -121,10 +121,10 @@ message lengths:
\[ timeoutSeconds = lastMessageLength / charactersPerSeconds \]
\[ timeoutMilliseconds = timeoutSeconds * 1000 \]
The value 5 comes from the fact that on average english words are 5
The value 5 comes from the fact that on average English words are 5
characters long according to some studies.
This value could be made configurable in the future, but currently there
are no plans to make non-english challenges.
are no plans to make non-English challenges.
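To make the arithmetic concrete, here is a minimal sketch of the delay calculation
(the actual implementation lives in the TypeScript frontend; deriving the
characters-per-second rate from the WPM value as below is my assumption based on the
description above):
\begin{lstlisting}[language=python,captionpos=b,
caption={A sketch of the WPM-based message delay calculation}]
def message_delay_millis(last_message: str, wpm: int) -> float:
    # Assuming an average English word length of 5 characters,
    # wpm * 5 characters are "typed" per minute.
    characters_per_second = wpm * 5 / 60
    timeout_seconds = len(last_message) / characters_per_second
    return timeout_seconds * 1000

# e.g. at 100 WPM a 40 character message delays the next one by ~4800 ms
delay = message_delay_millis('x' * 40, wpm=100)
\end{lstlisting}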
\section{IDE Component}\label{idecomponent}
@ -139,27 +139,27 @@ To implement this IDE%
I have integrated the open source Monaco editor developed by Microsoft into the
Angular web application TFW uses as a frontend, and added functionality to it
that allows the editor to integrate with the framework.
This involves commnication with an event handler dedicated to this feature,
This involves communication with an event handler dedicated to this feature,
which is capable of reading and writing files to disk, while sending and receiving
editor content from the frontend component.
The interaction of this event handler and the Monaco editor provides a seamless
editing experience, featuring autosave at configurable intervals, code completion,
editing experience, featuring auto-save at configurable intervals, code completion,
automatic code coloring for several programming languages and more.
Perhaps the most ``magical'' feature of this editor is that if any process
in the Docker container writes a file that is being displayed in the editor,
the contents of that file are automatially refreshed without any user
the contents of that file are automatically refreshed without any user
interaction whatsoever.
Besides that, if a file is created in the directory the editor is configured
to display, that file is automatially displayed on a new tab in the IDE.
to display, that file is automatically displayed on a new tab in the IDE\@.
This allows for really interesting demo opportunities.
Lets say I create a file using the terminal on the frontend by executing the
Let's say I create a file using the terminal on the frontend by executing the
command \code{touch file.txt}. A new tab on the editor automatically
appears. If I select it I can confirm that I have successfully created an
empty file.
After this let's run a \code{while} loop in the command line which
peroadically appends some text to \code{file.txt}:
periodically appends some text to \code{file.txt}:
\begin{lstlisting}[captionpos=b,caption={Bash while loop writing to a file periodically},
language=bash]
while true
@ -170,7 +170,7 @@ done
\end{lstlisting}
The results speak for themselves:
\pic{figures/ide_demo.png}{The editor demo involving automatic file refreshing}
As you can see, the file contents are automatially updated as the bash script appends
As you can see, the file contents are automatically updated as the bash script appends
to the file.
This feature is implemented by using the inotify API%
\footnote{\href{http://man7.org/linux/man-pages/man7/inotify.7.html}
@ -217,7 +217,7 @@ terminal emulator to do so.
This component has a tiny server process which is managed by a TFW event handler.
This small server is responsible for spawning bash sessions and
unix pseudoterminals (or \code{pty}s) in the \code{solvable} Docker
UNIX pseudoterminals (or \code{pty}s) in the \code{solvable} Docker
container.
It is also responsible for connecting the master end of the \code{pty} to the
emulator running in the browser and the slave end to the bash session it has
@ -225,8 +225,8 @@ spawned.
This way users are able to work in the shell displayed on the frontend just like
they would on their home machines, which allows for great tutorials
explaining topics that involve the usage of a shell.
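As a rough standalone illustration of the master/slave mechanics described above (this is
not the actual TFW terminal server, which also has to stream the master end over
WebSockets), spawning a bash session on a pseudoterminal from Python could look
something like this:
\begin{lstlisting}[language=python,captionpos=b,
caption={A minimal sketch of spawning bash on a UNIX pseudoterminal}]
import os
import pty
import subprocess

# Create a pty pair: the slave end is handed to bash, the master end
# is what a terminal emulator reads from and writes to.
master_fd, slave_fd = pty.openpty()

bash = subprocess.Popen(
    ['bash', '-i'],
    stdin=slave_fd,
    stdout=slave_fd,
    stderr=slave_fd,
    start_new_session=True,  # let bash own its session
)

os.write(master_fd, b'echo hello from the pty\n')  # "type" into the shell
print(os.read(master_fd, 1024).decode(errors='replace'))
\end{lstlisting}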
Note that this allows us to cover an extremely wide variety of topics using TFW:
from compiling shared libraries for development, to using cryptographic FUSE filesystems
Note that this allows us to cover an extremely wide variety of topics using TFW\@:
from compiling shared libraries for development, to using cryptographic FUSE file systems
for enhanced privacy%
\footnote{\href{https://github.com/rfjakob/gocryptfs}{https://github.com/rfjakob/gocryptfs}},
or automating cloud infrastructure using Ansible%
@ -248,7 +248,7 @@ container using an interactive bash session.
This is not an easy thing to accomplish without relying on some sort of heavyweight
monitoring solution such as Sysdig%
\footnote{\href{https://sysdig.com}{https://sysdig.com}}.
I deemed most simiar systems a huge overkill to implement this functionality, and their
I deemed most similar systems huge overkill for implementing this functionality, and their
memory footprints are not something we could afford here%
\footnote{These containers will be spawned on a per-user basis, so we must be as
conservative with memory as possible.}.
@ -273,8 +273,8 @@ but that should not be an issue as this is not a feature that is intended to be
used in competitive environments (and if the users of a tutorial intentionally
break the system out from under themselves, well, good for them).
An other advantage of this method is that this can be applied to any interactive
application that supports logging commands executed in them in some way or an other.
Another advantage of this method is that it can be applied to any interactive
application that supports logging the commands executed in it in some way or another.
A good example would be GDB%
\footnote{\href{https://www.gnu.org/software/gdb/}{https://www.gnu.org/software/gdb/}},
which supports an option called \code{set trace-commands on}. This option flushes
@ -293,7 +293,7 @@ The console has no event handler: it is a purely frontend component which expose
API through TFW messages to write and read its contents.
It works great when combined with the process management capabilities of the framework:
if configured to do so it can display the output of processes like webservers in real time.
if configured to do so it can display the output of processes like web servers in real time.
When using this next to the TFW frontend editor, it allows for a development
experience similar to working in an IDE on your laptop.
@ -308,10 +308,10 @@ as well as switching between them mid-tutorial using API messages.
The framework includes an event handler capable of managing processes running inside
the \code{solvable} Docker container.
The capabilities of this componenet include the starting, stopping and restarting of processes,
The capabilities of this component include the starting, stopping and restarting of processes,
as well as emitting the standard out or standard error logs belonging to them, even
in real-time (by broadcasting TFW messages).
This logging feature allows for interesting possiblities such as the handling
This logging feature allows for interesting possibilities such as the handling
of live process output, or just requesting the logs belonging to a certain application when
some sort of event has occurred (such as on errors).
This component can also be interacted with using TFW API messages.
@ -353,7 +353,7 @@ location \code{/tmp/}.
All logs coming from the container itself were also logged to this location.
This had caused an infinite recursion: when a process would write to \code{/tmp/}
inotify would invoke a process that would also log to that location causing the kernel to
emit more inotify events, which in turn would cause more and more new proesses to spawn
emit more inotify events, which in turn would cause more and more new processes to spawn
and write to \code{/tmp/}, causing the whole procedure to repeat again and again.
This continued until my machine would start to run out of memory and begin swapping
pages to disk%
@ -369,7 +369,7 @@ fascinating phenomenon, but those were \emph{very} fun hours at least.
\section{FSM Management}
I have already mentioned the event handler called \code{FSMManagingEventHandler},
which is responsible for managing the framework FSM.
which is responsible for managing the framework FSM\@.
For completeness I chose to include it in this chapter as well.
The API it exposes through TFW messages allows client code to attempt stepping the
state machine.
@ -406,7 +406,7 @@ thing work with the Same Origin Policy%
\footnote{\href{https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin_policy}
{https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin\_policy}}
being in effect?
The answer is that developers must use a \emph{relative url}, that is an URL relative
The answer is that developers must use a \emph{relative URL}, that is, a URL relative
to the entry point of the TFW frontend itself.
To allow serving several web applications from a single port the framework
supports optional reverse-proxy configurations through the nginx%
@ -416,8 +416,8 @@ More on this in Chapter~\ref{usingtfw}.
\section{Various Frontend Features}
The Angular frontend of the framework features several different layouts.
These layouts are useful to accomodate different workflows for users,
such as the previous exampe of editig code and being able to view the
These layouts are useful to accommodate different workflows for users,
such as the previous example of editing code and being able to view the
result of said code in real time next to the editor.
Another example would be editing Ansible playbooks in the file editor,
and then trying to run them in the terminal.
@ -426,11 +426,11 @@ to be used like that, for instance the code editor can be used to conveniently
edit larger files this way.
The frontend was designed in a way to be fully responsive in window sizes
that still keep the whole thing usable (i.e.\ it would not be practial to start
that still keep the whole thing usable (i.e.\ it would not be practical to start
solving TFW tutorials on a smart phone, simply because of size limits, so such small screens are
not supported, but the frontend still behaves as expected on small laptops or bigger tablets).
This is not an easy thing to impelent and maintain due to the lots of small
incompatibilites between browsers given the complexity of the frontend.
This is not an easy thing to implement and maintain due to the many small
incompatibilities between browsers, given the complexity of the frontend.
Just remember that a few years ago the clearfix%
\footnote{\href{https://stackoverflow.com/questions/8554043/what-is-a-clearfix}
@ -438,7 +438,7 @@ Just remember that a few years ago the clearfix%
hack was the industry standard in creating CSS layouts.
The situation has improved \emph{a lot} since then with flexboxes
and grid layouts despite the sheer chaos that is generally involved in web
standardization efforts, but CSS espacially%
standardization efforts, but CSS especially%
\footnote{\href{https://developer.mozilla.org/en-US/docs/Web/CSS/CSS3}
{https://developer.mozilla.org/en-US/docs/Web/CSS/CSS3}}.
@ -460,10 +460,10 @@ keep your software up to date.
There are several additional APIs exposed by the frontend,
which include the changing of layouts, selecting the terminal or console
component to be displayed, the possibility of dynamically modifying
frontend configuration values (such as the frequency of autosaving the files in the editor)
frontend configuration values (such as the frequency of auto-saving the files in the editor)
and more.
To accomodate communication with the TFW server, the frontend of the framework
To accommodate communication with the TFW server, the frontend of the framework
comes with some library code which can be used to send and receive TFW messages.
This code is mostly WebSockets combined with RxJS%
\footnote{\href{https://rxjs-dev.firebaseapp.com}{https://rxjs-dev.firebaseapp.com}},


@ -1,11 +1,11 @@
\chapter*{Acknowledgements}
\addcontentsline{toc}{chapter}{Acknowledgements}
\chapter*{Acknowledgments}
\addcontentsline{toc}{chapter}{Acknowledgments}
The creation of this framework would not have been possible alone.
In this chapter I would like to express my gratitude towards great people who have
helped me in some way or an other along the way.
helped me in some way or another along the way.
First of all I would like to thank Bálint Bokros, my good friend and colleauge for
First of all I would like to thank Bálint Bokros, my good friend and colleague, for
his awesome work regarding TFW and for always
being open to provide useful input.
He has also earned my gratitude by always being there to lift my spirits, be that
@ -22,7 +22,7 @@ I can't thank my consultant, Levente Buttyán enough for enduring my general
inability to deal with deadlines and administration.
I also appreciate the great morale my colleagues and friends provided in the office,
by always being there, be that for work or fun. This project couldn't have been realised
by always being there, be that for work or fun. This project couldn't have been realized
sitting in a depressing cube among 200 people hating their jobs. They also have my gratitude
for direct contributions to the framework, be that with ideas, assistance,
or actual code.


@ -13,9 +13,9 @@ During most of this chapter I am going to be discussing the \code{solvable} Dock
with the exception of Section~\ref{solutioncheck}, where I will dive into how the
\code{controller} image is implemented.
The most important feature of the framework is it's messaging system.
The most important feature of the framework is its messaging system.
Basically what we need is a system where processes running inside a Docker container
would be allowed to communicate with eachother.
would be allowed to communicate with each other.
This task is very easy to solve, with lots of possible solutions
(named pipes, sockets or shared memory to name a few).
The hard part is that frontend components running inside a web browser --- which could
@ -27,7 +27,7 @@ that can communicate with JavaScript running in a browser connected to it.
The solution the framework uses is a proxy server, which connects to frontend components
on one side and handles interprocess communication on the other side.
This way the server is capable of proxying messages between the two sides, enabling
communitaion between them.
communication between them.
Notice that this way what we have is essentially an IPC%
\footnote{Interprocess communication} system in which a web application
can ``act like'' it was running on the backend in a sense: it is easily able to
@ -43,7 +43,7 @@ which was later redesigned and fully rewritten by me to allow for greater flexib
message authentication, restoration of frontend state, a complete overhaul of the
state tracking system and the possibility for solution checking among other things).
If you are explicitly interested in the differences between the original POC implementation
(which is out of scope for this thesis due to lenght constraints) and the current
(which is out of scope for this thesis due to length constraints) and the current
framework, please consult Bálint's excellent paper and Bachelor's thesis on it\cite{BokaThesis}.
\end{note}
@ -52,7 +52,7 @@ some of the design decisions behind this:
\subsection{Connecting to the Frontend}
The old way of creating dynamic webpages was AJAX%
The old way of creating dynamic web pages was AJAX%
\footnote{AJAX stands for Asynchronous JavaScript And XML, despite usually not having
anything to do with XML in practice.}
polling, which is basically sending
@ -62,16 +62,16 @@ HTTP request-response on each update).
This has been superseded by WebSockets around 2011, which provide a full-duplex
communication channel over TCP between your browser and the server.
This is done by initiating a protocol handshake using the \code{Connection: Upgrade}
HTTP header, which establishes a premanent socket connection between the browser
HTTP header, which establishes a permanent socket connection between the browser
and the server.
This allows for communication with lower overhead and latency facilitating efficient
real-time applications, which were not always possible to create before due to
the overheads%
\footnote{In some applications this overhead could be bigger than the actual data sent,
such as singaling.} introduced by AJAX polling.
such as signaling.} introduced by AJAX polling.
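For reference, the handshake itself is plain HTTP; the listing below uses the sample key
and accept values from RFC 6455, and the path and host are made up rather than anything
TFW-specific:
\begin{lstlisting}[captionpos=b,
caption={The WebSocket upgrade handshake (request and response)}]
GET /ws HTTP/1.1
Host: tutorial.example.com
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==
Sec-WebSocket-Version: 13

HTTP/1.1 101 Switching Protocols
Connection: Upgrade
Upgrade: websocket
Sec-WebSocket-Accept: s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
\end{lstlisting}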
The Tutorial Framework uses WebSockets to connect to it's web frontend.
The TFW proxy server is capable to connecting to an arbirary number of WebSockets,
The Tutorial Framework uses WebSockets to connect to its web frontend.
The TFW proxy server is capable of connecting to an arbitrary number of WebSockets,
which allows the framework to simultaneously connect to components running in
separate browser windows and tabs, or even
in different browsers altogether (such as opening a terminal in Chrome and an IDE in Firefox).
@ -88,11 +88,11 @@ RabbitMQ%
ZMQ does not require a message broker daemon to be running in the background at all times
and as such has a much lower memory footprint while still providing various messaging
patterns and bindings for almost any widely used programming language.
An other --- yet untilized --- capability of this solution is that since ZMQ is capable
Another --- yet unutilized --- capability of this solution is that since ZMQ is capable
of using simple TCP sockets, we could even communicate with processes running on remote
hosts using the current architecture of the framework.
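A minimal pyzmq sketch of the publish-subscribe pattern mentioned below may help
illustrate how little code this takes compared to raw sockets (the port number and the
message contents are made up, and the real TFW wire format is not shown here):
\begin{lstlisting}[language=python,captionpos=b,
caption={A minimal publish-subscribe example using pyzmq}]
import time
import zmq

context = zmq.Context()

publisher = context.socket(zmq.PUB)
publisher.bind('tcp://127.0.0.1:5555')

subscriber = context.socket(zmq.SUB)
subscriber.connect('tcp://127.0.0.1:5555')
subscriber.setsockopt_string(zmq.SUBSCRIBE, '')  # subscribe to everything

time.sleep(0.1)  # give the subscriber time to connect (ZMQ "slow joiner")

publisher.send_json({'key': 'example', 'data': {'hello': 'world'}})
print(subscriber.recv_json())
\end{lstlisting}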
There are various lower level and higher level alternatives for IPC other than
There are various lower-level and higher-level alternatives for IPC other than
ZMQ which were also considered during the design process of the framework at some point.
A few examples of top contenders and reasons for not using them in the end:
\begin{itemize}
@ -101,13 +101,13 @@ A few examples of top contenders and reasons for not using them in the end:
all bytes are sent or received both require constantly checking the return values of the
libc \code{send()} and \code{recv()} system calls%
\footnote{Developers forget this very often, resulting in almost untraceable bugs
that seem to occour randomly.},
that seem to occur randomly.},
while ZMQ takes care of this
extra logic involved and even provides higher level messaging patterns such as
extra logic involved and even provides higher-level messaging patterns such as
publish-subscribe, which would need to be implemented on top of raw sockets again.
\item Using something like gRPC\footnote{\href{https://grpc.io}{https://grpc.io}}
or plain HTTP (both of which
are considered to be higher level than ZMQ sockets) would require
are considered to be higher-level than ZMQ sockets) would require
all processes partaking in the communication to be HTTP servers themselves,
which would make the framework
less lightweight and flexible: socket communication with or without ZMQ does not
@ -125,7 +125,7 @@ simultaneously.} in nature, which imposes certain design choices on code
Being familiar with the technological basis of the framework, we can now
discuss it in more detail.
\pic{figures/tfw_architecture.png}{An overwiew of the Tutorial Framework}
\pic{figures/tfw_architecture.png}{An overview of the Tutorial Framework}
Architecturally TFW consists of four main components:
\begin{itemize}
@ -138,10 +138,10 @@ Architecturally TFW consists of four main components:
\end{itemize}
It is important to keep in mind that, as I've mentioned previously,
the TFW server and event handlers reside in the \code{solvable} Docker container.
They all run in separate processes and only communicate with eachother using ZeroMQ sockets.
They all run in separate processes and only communicate with each other using ZeroMQ sockets.
In the following sections I am going to explain each of the main components in
greater detail, as well as how they interact with eachother,
greater detail, as well as how they interact with each other,
their respective responsibilities,
some of the design choices behind them and more.
@ -149,7 +149,7 @@ some of the design choices behind them and more.
All components in the Tutorial Framework use JSON%
\footnote{JavaScript Object Notation: \href{https://www.json.org}{https://www.json.org}}
messages to communicate with eachother.
messages to communicate with each other.
These messages must also comply with some simple rules specific to the framework.
Let's inspect further what a valid TFW message might look like:
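As a rough, hypothetical illustration (the field names are the ones referenced later in
this chapter; the exact schema may differ), such a message could be represented like this:
\begin{lstlisting}[language=python,captionpos=b,
caption={A hypothetical TFW message represented as a Python dictionary}]
message = {
    'key': 'fsm_update',   # determines which components handle the message
    'data': {},            # arbitrary, key-specific payload
    'signature': '...',    # optional HMAC signature on privileged messages
}
\end{lstlisting}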
@ -186,14 +186,14 @@ at a later point in this paper.
\subsection{Networking Details}
The default behaviour of the TFW server is that it forwards all messages from coming from
The default behavior of the TFW server is that it forwards all messages coming from
the frontend to the event handlers and vice versa.
So messages coming from the WebSockets of the frontend are forwarded to event handlers
via ZMQ and messages received on ZMQ from event handlers are forwarded to
the frontend via WebSockets.
The TFW server is also capable of ``reflecting'' messages back to the side they were
received from (to faciliate event handler to event handler communication for instance),
received from (to facilitate event handler to event handler communication for instance),
or of broadcasting messages to all components.
This is possible by embedding a whole TFW message in the \code{data} field of
an outer wrapper message with a special \code{key} that signals to the TFW server that
@ -236,10 +236,10 @@ at any time when the running of the tests would be required.
An interesting thing to mention is that there \emph{could} be event handlers which
broadcast messages with a \code{key} that they are also subscribed to.
This can distrupt their behaviour in weird ways if they are not prepared to
This can disrupt their behavior in weird ways if they are not prepared to
deal with their own ``echoes''.
The framework offers a solution for this by providing a special
event handler type, which is capable of filtering out it's own broadcasts.
event handler type, which is capable of filtering out its own broadcasts.
The way they do this is by caching the checksum of every message they broadcast,
and ignoring the first message that comes back with the same checksum.
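A rough sketch of this mechanism could look as follows (the choice of hash function and
the exact bookkeeping are my guesses, not the framework's actual code):
\begin{lstlisting}[language=python,captionpos=b,
caption={Sketch of checksum-based filtering of a component's own broadcasts}]
import hashlib
import json

class EchoFilter:
    """Remembers what we broadcast and drops the first echo of it."""

    def __init__(self):
        self._own_checksums = set()

    def _checksum(self, message: dict) -> str:
        payload = json.dumps(message, sort_keys=True).encode()
        return hashlib.md5(payload).hexdigest()

    def remember_broadcast(self, message: dict) -> None:
        # Called right before broadcasting one of our own messages.
        self._own_checksums.add(self._checksum(message))

    def is_own_echo(self, message: dict) -> bool:
        # Called for every incoming message; the first one that comes
        # back with a remembered checksum is ignored by the caller.
        checksum = self._checksum(message)
        if checksum in self._own_checksums:
            self._own_checksums.discard(checksum)
            return True
        return False
\end{lstlisting}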
@ -265,7 +265,7 @@ One being that Angular is essentially a complete platform that is very well
suitable for building complex architecture into a single page application.
Other reasons included that the frontend of the Avatao platform is also written
in Angular (bonus points for experienced team members in the company).
An other good thing going for it is that Angular forces you to use TypeScript%
Another good thing going for it is that Angular forces you to use TypeScript%
\footnote{\href{https://www.typescriptlang.org}{https://www.typescriptlang.org}}
which tries to remedy some of the issues\cite{JavaScript}
with JavaScript by being a language that transpiles to JavaScript while
@ -278,7 +278,7 @@ strongly encouraging things like static typing or Object Oriented Principles.
A good chunk of the framework codebase is a bunch of pre-made, built-in components
that implement commonly required functionality for developers to use.
These components usually involve an event handler and an Angular component
communicating with eachother to realize some sort of functionality.
communicating with each other to realize some sort of functionality.
An example would be the built-in code editor of the framework
(visible on the right side of Figure~\ref{figures/tfw_frontend.png}).
This code editor essentially is a Monaco editor%
@ -286,7 +286,7 @@ This code editor essentially is a Monaco editor%
{https://microsoft.github.io/monaco-editor/}}
instance integrated into Angular and upgraded with the capability to
exchange messages with an event handler to save, read and edit files
that reside in the writeable file system of the \code{solvable}
that reside in the writable file system of the \code{solvable}
Docker container.
All of the built-ins come with full API documentation explaining what they do
@ -319,7 +319,7 @@ search these messages for the given string.
The exact capabilities of these built-in components will be explained in greater
detail in Chapter~\ref{atouroftfw}.
Developers who are well-aware of these capabilites are able to use the framework in extremely
Developers who are well-aware of these capabilities are able to use the framework in extremely
creative ways allowing for very interesting functionality, such as the above example.
The components of TFW can often be combined to work together in unexpected, yet useful
ways, similarly to how command-line utilities on UNIX-like systems do.
@ -365,7 +365,7 @@ Depending on the results three cases are possible:
This example shows how content creators can create tutorials that could behave
in many different ways based on what the user does.
In high quality challenges developers can implement several ``paths'' to
In high-quality challenges developers can implement several ``paths'' to
a successful completion.
This is a very engaging feature that offers an immersive learning experience for
users, which many solutions for distance education lack so often.
@ -373,7 +373,7 @@ users, which many solutions for distance education lack so often.
Developers can use a YAML file or write Python code to implement finite
state machines in TFW\@. This is going to be further detailed in
Chapter~\ref{usingtfw}.
In the implementation of state machines it is also possbile to subscribe callbacks to be
In the implementation of state machines it is also possible to subscribe callbacks to be
invoked on certain events regarding the machine, such as before and after
state transitions, or on entering and exiting a state.
It is \emph{very} important to be aware of these callbacks, as much of the
@ -404,7 +404,7 @@ can be used to digitally sign messages (this is what the \code{signature} messag
field is designed for) using HMAC%
\footnote{Hash-based message authentication code}.
In this case the TFW server will only forward the privileged messages that
have a valid signature, and the evend handler managing the state machine
have a valid signature, and the event handler managing the state machine
will also validate the signature of messages it receives
(and sign the updates it broadcasts as well, so that other components can verify that
they come from a trusted source).
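A minimal sketch of such signing and verification with Python's standard library,
assuming SHA-256 and a canonical JSON serialization (the actual key handling and
serialization used by TFW may differ):
\begin{lstlisting}[language=python,captionpos=b,
caption={Sketch of HMAC signing and verification of a message}]
import hashlib
import hmac
import json

def sign(message: dict, key: bytes) -> str:
    # Serialize the message deterministically before signing.
    payload = json.dumps(message, sort_keys=True).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def verify(message: dict, signature: str, key: bytes) -> bool:
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sign(message, key), signature)
\end{lstlisting}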
@ -425,6 +425,6 @@ makes the Tutorial Framework suitable for implementing
traditional hacking challenges, such as exercises developed for CTF%
\footnote{A ``capture the flag'' game is a competition designed for professionals
--- or just people interested in the field --- to sharpen their skills in IT security.
Avatao often organises similar events.}
Avatao often organizes similar events.}
events, as the controller image is also capable of verifying the authenticity of
FSM update messages via inspecting their signatures.


@ -3,7 +3,7 @@
\section{Project Justification}
As the world is being completely engulfed by software, the need for accessible yet
high quality learning materials covering software engineering and especially secure software
high-quality learning materials covering software engineering and especially secure software
engineering is on the rise.
While we are enjoying the comfort that information technology provides us, we often forget
about the risks involved in relying so much on software in our everyday lives.
@ -25,7 +25,7 @@ sensitive data through our ill-protected smart phones\cite{Android} and IoT devi
What a time to be alive.
It is important to express that IT security is something that is \emph{really hard} to
get right.
Even if right often only means better then your neighbour, as perfect security is an utopia
Even if right often only means better than your neighbor, as perfect security is a utopia
that doesn't seem to exist\cite{NoPerfectSecurity}.
Often when large and reputable companies in the industry such as
CloudFlare\cite{CloudFlareLeak} or eBay\cite{EBayGit} can fail to get it right at times
@ -44,9 +44,9 @@ The only thing we can hope and work for is to become better and better as time
and generations pass by.
We \emph{must} do better, and education is the way forward.
The short term goal of this project --- and the goal of this thesis --- is to provide
The short-term goal of this project --- and the goal of this thesis --- is to provide
a new angle in the education of software engineering, especially secure software
engineering based on the aspirations above, with the long term goal of bringing
engineering based on the aspirations above, with the long-term goal of bringing
something new to the table in the matter of IT education as a whole
(not just for developers, but for users as well).
@ -63,17 +63,17 @@ universities around the world and providing a solution for companies in building
\emph{security consciousness} amongst their developer teams.
Since starting out we have amassed some experience in building fun challenges
that showcase the exploitation and fixing of relevant security vulnerabilites in code or
that showcase the exploitation and fixing of relevant security vulnerabilities in code or
configuration.
Traditionally these exercises revolved around offensive and defensive tasks, with challenges
often being split into two or more parts.
For example users would have to hack a website by exploiting a buffer overflow vulnerability,
then in the second challenge they would fix the code they've just exploited in a web based
then in the second challenge they would fix the code they've just exploited in a web-based
code editor.
These kind of exercises offer great flexibility to reflect real world security issues, as in
more complex challenges users might be required to exploit multiple vulnerabilites for success,
These kinds of exercises offer great flexibility to reflect real-world security issues, as in
more complex challenges users might be required to exploit multiple vulnerabilities for success,
and understand the ways they augment each other.
We often recreate real world scenarios based on incident reports released by companies for
We often recreate real-world scenarios based on incident reports released by companies for
added authenticity and relevance\cite{AkosFacebook}.
Our challenges usually involve some sort of website acting as a frontend for the vulnerable
application, or require the user to connect using SSH\@.
@ -83,24 +83,24 @@ application, or require the user to connect using SSH\@.
The Avatao platform relies heavily on Docker containers to spawn challenges,
which makes it extremely flexible in terms of what is possible to do when creating
content.
Essentially anything that you can do inside a Docker conainer can be done on
Essentially anything that you can do inside a Docker container can be done on
the Avatao platform as well.
Currently each challenge is implemented as a set of Docker images residing inside a
Git repository exclusive to the specific challenge in question.
Our content creation wokflow enables developers to create such repositories on GitHub,
Our content creation workflow enables developers to create such repositories on GitHub,
which are automatically set up with the proper webhooks, so that when their content gets
reviewed (and their feature branches merged), their changes will go live on the
platform as well.
In the future we also plan on supporting the use of virtual machines to implement
challenges, which could further increase this fexibility by addig the possiblity to do
challenges, which could further increase this flexibility by adding the possibility to do
things like exercises involving the use of Docker or Windows-based challenges.
\section{Emergence}\label{intro:emergence}
While working as a content creator I stumbled upon the idea of automating the completion
of challenges for QA\footnote{Quality Assurrance} and demo purposes.
of challenges for QA\footnote{Quality Assurance} and demo purposes.
I used to record short videos or GIFs to showcase my content to management.
In a certain scenario I was required to integrate a web based terminal emulator into a
In a certain scenario I was required to integrate a web-based terminal emulator into a
frontend application to improve user experience by making it possible to use a shell
right on the website rather than having to connect through SSH\@.
@ -110,7 +110,7 @@ as I have often found myself recording over and over again for a demo without an
During the time I was playing around with this idea, researching possible solutions led me
to a hidden gem of a project on GitHub called \code{demo-magic}%
\footnote{\href{https://github.com/paxtonhare/demo-magic}{https://github.com/paxtonhare/demo-magic}},
which is esentially a bash script that simulates someone typing into a terminal and executing
which is essentially a bash script that simulates someone typing into a terminal and executing
commands.
I have created a fork%
@ -123,7 +123,7 @@ the solution script with the challenge code itself, making it toggleable using b
variables.
Should the solution script be enabled, the challenge would automatically start%
\footnote{I did this by injecting the solution script into the user's \code{.bashrc} file.}
completing itself in the terminal integrated into it's frontend, often even explaining the
completing itself in the terminal integrated into its frontend, often even explaining the
commands executed during the solution process.
\lstinputlisting[
@ -153,9 +153,9 @@ a related task.
This teacher scenario would allow you to take the helm sometimes and try applying
your newfound skills in action immediately.
For example a chatbot would show you how to encrypt a file using GnuGP%
For example a chatbot would show you how to encrypt a file using GnuPG%
\footnote{\href{https://www.gnupg.org}{https://www.gnupg.org}},
then it would ask you to encrypt an other file similarly.
then it would ask you to encrypt another file similarly.
After this the bot could teach you how to configure a database server and then
ask you to write a configuration file yourself and then encrypt it because it might
contain sensitive data such as open ports, usernames and the like.
@ -177,20 +177,20 @@ If the user changes the source code of the application and clicks this button, t
should restart itself with the new code.
Let's say that the user comments out the part that authenticates a user.
In this case the application should let anyone log in with dummy credentials.
Meanwhile a console could show the output of the webserver.
Meanwhile a console could show the output of the web server.
For example if the source code the user tried to deploy was invalid, the framework
should report the exact exception raised while running the application.
\pic{figures/webapp_and_editor.png}{The code editor and web application example in TFW}
Even if we did all this, we would still need a way to integrate this whole thing into
a web based frontend with a file editor, terminal, chat window and stuff like that.
a web-based frontend with a file editor, terminal, chat window and stuff like that.
Turns out that today all this can be done by writing a few hundred lines of Python
code which uses the Tutorial Framework.
\pic{figures/webapp_and_editor_err.png}{Invalid code and deployment failure with process output}
Note that it is possible to try out the current version of the Tutorial Framewok
Note that it is possible to try out the current version of the Tutorial Framework
using a guest account on the Avatao platform on this
\href{https://platform.avatao.com/paths/d0ccef1f-0389-45bf-9d44-e85b86d66c49/challenges/a7e08c0a-199f-4f8d-aa7e-51b6e9bfcb15}{URL}%
\footnote{\href{https://platform.avatao.com/paths/d0ccef1f-0389-45bf-9d44-e85b86d66c49/challenges/a7e08c0a-199f-4f8d-aa7e-51b6e9bfcb15}
@ -202,7 +202,7 @@ Based on this it is now more or less possible to define requirements for the pro
The reason for the ``more or less'' part is that all of this is pretty much bleeding edge,
where the requirements could shift dynamically with time.
For this reason I am going to be as general as possible, to the point that some of
this might even sound vauge.
this might even sound vague.
To achieve our goals we would need:
\begin{itemize}
@ -210,8 +210,8 @@ To achieve our goals we would need:
\item a way to handle various events (i.e.\ we can react when
the user has edited a file, or has executed a command in the terminal)
\item a highly flexible messaging system, in which processes running on the backend and
frontend components running in a web browser could communicate with eachother
\item a web based frontend with lots of built-in options (terminal, file editor, chat
frontend components running in a web browser could communicate with each other
\item a web-based frontend with lots of built-in options (terminal, file editor, chat
window, etc.) that use said messaging system
\item stable APIs that can be exposed to content creators to work with (so that
framework updates won't break client code)
@ -220,11 +220,11 @@ To achieve our goals we would need:
\section{Early Development}
Around a year ago a good friend and collage of mine Bálint Bokros, the CTO of our company
Around a year ago a good friend and colleague of mine, Bálint Bokros, the CTO of our company,
Gábor Pék, and I started designing the TFW architecture.
In this early phase we would research solutions for the issues described such as
tracking user progress, process management, interprocess communication
and making a web based frontend application capable of communicatig with processes running
and making a web-based frontend application capable of communicating with processes running
inside a Docker container.
After seeing some sort of light at the end of the tunnel regarding what technologies could
@ -253,11 +253,11 @@ But since the project has followed Agile Methodology%
\footnote{Manifesto for Agile Software Development:
\href{https://agilemanifesto.org}{https://agilemanifesto.org}}
from the start, we were able to adapt to these changes without losing
the progess he made in said thesis. Quoting from the Agile Manifesto:
the progress he made in said thesis. Quoting from the Agile Manifesto:
``Responding to change over following a plan''.
This is a really important takeaway.
After becoming a full time employee at Avatao I was tasked with developing the project
After becoming a full-time employee at Avatao I was tasked with developing the project
with Bálint, who was later reassigned to work on the GDPR compliance of the platform.
Thus it became my job to turn the framework into a stable code base ready for
usage by content creators and to implement most of the features that we've envisioned


@ -18,13 +18,13 @@ be honest and admit that I have a sweet spot for this project.
Currently a total of 63 tutorials based on the framework are running in production,
with new ones being released on a weekly basis.
These exercises have been solved several hunders of times.
These exercises have been solved several hundred times.
User feedback is getting better and better as the project moves forward.
As a maintainer, currently I know about a single unfixed bug in the framework, which
is getting reported by users as well.
There are more, of course, the world is never going to run out of bugs to fix,
but at least I sleep well knowing that things aren't breaking on a constant basis.
Considering that this is a one year old project including initial development,
Considering that this is a one-year-old project including initial development,
I'd consider this a solid success.
We were able to achieve most --- if not all --- of the goals we have envisioned on the
@ -38,27 +38,27 @@ apart from implementing new features, as these will always keep coming in, and w
have some great ones planned, that I can promise.
First of all I think that we need to put more focus on developing TFW, as currently
other projects are often being priorized over it.
other projects are often being prioritized over it.
While some of these are understandable, the framework is a very promising project
with great potential and deserves more attantion from us.
with great potential and deserves more attention from us.
The fact that it is stable does not justify neglecting it.
I'd also like to concentrate on stabilizing the API of the framework.
Currently each major release lasts for a few months before I am forced to break something
to accomodate new features.
to accommodate new features.
While the communication of these breakages is fine --- we use mailing lists for this purpose
and our versioning scheme seems solid so far --- this forces developers
to constantly update older tutorials to comply new API.
to constantly update older tutorials to comply with the new API\@.
To make this better I'd need to consider planning ahead more, so that the newest API is flexible
enough to support new features on the roadmap and not get distracted as much by
other features emerging on the horizon.
An other thing is that I often feel like that there are some things in using TFW
Another thing is that I often feel that there are some things in using TFW
that could be made a lot easier. As a maintainer sometimes I find it hard to
tell what these things exactly are, as I know the framework inside out, having written most
of the codebase myself.
I'd like to set some time aside to create tutorials using the framework myself,
so I can better narrow these potential difficulities down.
so I can better narrow these potential difficulties down.
This would require me to be able to take things slow for a few weeks, as this is not
something that is possible to do effectively in a rush. In the summer months, maybe?
@ -81,12 +81,12 @@ as I just simply enjoy admiring quality typography which WYSIWYG%
I've spent a long time working on and maintaining the Tutorial Framework.
While the list of technical things I've learned is long and exciting, I also feel like
I've learned a lot about supporting other developers, project management and communication.
An other thing that I've been able to learn is to adopt a more patient mindset while
Another thing that I've been able to learn is to adopt a more patient mindset while
working. Back in the day I used to be nervous because of deadlines and things not
working how they were supposed to, but now I know that these things are a part
of the job and one must be able to deal with them without getting agitated.
Any time I feel like something is not OK, I just try to take a step back, relax a bit to
blow of steam and approach the issue without acting in haste.
blow off steam and approach the issue without acting in haste.
I think this is not too related to working as a software engineer, but something
that can be applied to anything we do.


@ -12,10 +12,10 @@ The main points include:
for this machine, such as ones that display messages to the user
\item Implementing the required event handlers, which may trigger state transitions in the FSM,
interact with non-TFW code and do various things that might be needed during an exercise,
such as compiling code written by the user or runnign unit tests
such as compiling code written by the user or running unit tests
\item Defining what processes should run inside the container besides the things TFW
starts automatically
\item Setting up reverse proxying for any user-facing network application such as webservers
\item Setting up reverse proxying for any user-facing network application such as web servers
\end{itemize}
At first all these tasks can seem quite overwhelming.
Remember that \emph{witchcraft} is what we practice here after all.
@ -58,9 +58,9 @@ understanding of how the framework interacts with client code.
The \code{config.yml} file is an Avatao challenge configuration file,
which is used to describe what kind of Docker containers implement a challenge,
what ports they expose and which protocols they speak, and to define the name of the
excercise, it's difficulity, and so on.
exercise, its difficulty, and so on.
Every Avatao challenge must provide such a file.
An other thing that is not even indicated on the structure above is the \code{metadata}
Another thing that is not even indicated on the structure above is the \code{metadata}
directory, which contains the short and long descriptions of challenges in
Markdown format.
@ -131,8 +131,8 @@ or kubernetes%
to orchestrate their containers).
This approach is not suitable for TFW, as it would require the framework to orchestrate
Docker containers from inside a container managed by the same Docker daemon, which is
feasible in theory but very hard and unservicable to do in practice.
This would require doing something like mounting the unix domain socket used
feasible in theory but very hard and unserviceable to do in practice.
This would require doing something like mounting the UNIX domain socket used
to manage the Docker daemon inside a running container managed by that daemon,
which is a fun thing to
play around with in my free time but not something suitable for running in production,
@ -146,7 +146,7 @@ process started, and who gets PID 1 traditionally.} 1, which in turn starts all
programs defined in the \code{solvable/supervisor} directory.
Content creators can use supervisor configuration files to define these programs.
For example, a developer would write a file similar to this one and place it into the
\code{solvable/supervisor} directory to run a webserver written in Python:
\code{solvable/supervisor} directory to run a web server written in Python:
\begin{lstlisting}
[program:yourprogram]
user=user
@ -155,7 +155,7 @@ command=python3 server.py
autostart=true
\end{lstlisting}
As mentioned earlier in~\ref{processmanagement}, any program that is started this way
can be managed by the framewok using API messages.
can be managed by the framework using API messages.
All this is possible through using the xmlrpc%
\footnote{\href{https://docs.python.org/3/library/xmlrpc.html}
{https://docs.python.org/3/library/xmlrpc.html}}
@ -170,44 +170,44 @@ invoke it's command line utility in a separate process when you need something d
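For illustration, managing the program defined above through supervisor's XML-RPC
interface could look roughly like this (it assumes supervisord exposes an HTTP endpoint,
e.g.\ via an \code{[inet_http_server]} section; the address TFW actually uses may differ):
\begin{lstlisting}[language=python,captionpos=b,
caption={Sketch of managing a supervisor program over XML-RPC}]
from xmlrpc.client import ServerProxy

# The endpoint below is an assumption for illustration purposes.
supervisor = ServerProxy('http://127.0.0.1:9001/RPC2').supervisor

supervisor.stopProcess('yourprogram')
supervisor.startProcess('yourprogram')
print(supervisor.getProcessInfo('yourprogram')['statename'])
\end{lstlisting}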
For simplicity, exercises based on the framework only expose a single port from the
\code{solvable} container.
This port is required to serve the frontend of the framework.
If this is the case, how do we run additional web applications to showcase vulnerabilies
If this is the case, how do we run additional web applications to showcase vulnerabilities
on during a tutorial?
Since one port can only be bound by one process at a time, we will need to
run a reverse-proxy%
\footnote{\href{https://www.nginx.com/resources/glossary/reverse-proxy-server/}
{https://www.nginx.com/resources/glossary/reverse-proxy-server/}} server inside the
container to
bind the exposed port and redirect traffic to other webservers binding non-exposed ports.
bind the exposed port and redirect traffic to other web servers binding non-exposed ports.
To support this, TFW automatically starts an nginx webserver. It uses this nginx
To support this, TFW automatically starts an nginx web server. It uses this nginx
instance to serve the framework frontend as well.
It is possible to supply additional configurations to this server in a convenient manner:
any configuration files placed into the \code{solvable/nginx} directory will be
interpreted by nginx once the container has started.
To set up the reverse-proxying of a webserver running on port 3333,
To set up the reverse-proxying of a web server running on port 3333,
one would write a configuration file similar to this one:
\begin{lstlisting}
location /yoururl {
proxy_pass http://127.0.0.1:3333;
}
\end{lstlisting}
Now the content served by this websever on port 3333
will be available on the url \code{<challenge-url>/yoururl} despite that port 3333
Now the content served by this web server on port 3333
will be available on the URL \code{<challenge-url>/yoururl} even though port 3333
does not accept connections from outside the container as it is not exposed.
It is very important to understand that developers
have to make sure that their web application \emph{behaves well} behind a reverse proxy.
What this means is that they are going to be served from a ``subdirectory'' of the top
level URL\@:
for example \code{/register} will be served under \code{/yoururl/register}.
This means that all links in the final HTML must refer to the proxied urls, e.g.\
\code{/yoururl/login}, and server side redirects must point to these correct hrefs as well.
This means that all links in the final HTML must refer to the proxied URLs, e.g.\
\code{/yoururl/login}, and server-side redirects must point to these correct hrefs as well.
Idiomatically this is usually implemented by supplying a \code{BASEURL}
to the application through an environment variable, so that it is able to set
itself up correctly.
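A tiny sketch of that idiom (the variable name \code{BASEURL} is just an example):
\begin{lstlisting}[language=python,captionpos=b,
caption={Prefixing links with a base URL taken from the environment}]
import os

BASEURL = os.environ.get('BASEURL', '')  # e.g. '/yoururl'

def href(path: str) -> str:
    # '/login' becomes '/yoururl/login' when served behind the proxy
    return BASEURL + path
\end{lstlisting}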
\subsection{Copying Configuration Files}
Behind the curtains, the Tutorial Framework uses some Dockerfile trickery to
faciliate the copying of supervisor and nginx configuration files to their correct
facilitate the copying of supervisor and nginx configuration files to their correct
locations.
Normally when one uses the \code{COPY}%
\footnote{\href{https://docs.docker.com/engine/reference/builder/\#copy}
@ -251,7 +251,7 @@ framework-specific directories.
The use of this directory is not mandatory, only a good practice, as developers
are free to implement the non-TFW parts of their exercises as they see fit
(the copying of these files into image layers using \code{solvable/Dockerfile}
is their resposibility as well).
is their responsibility as well).
\section{Configuring Built-in Components}
@ -268,11 +268,11 @@ initialized in, which exposes several communicative options through the
\code{__init__()} methods of these event handlers:
\lstinputlisting[
language=python,
caption={Example of inicializing some event handlers},
caption={Example of initializing some event handlers},
captionpos=b
]{listings/event_handler_main.py}
\section{Impelenting a Finite State Machine}
\section{Implementing a Finite State Machine}
The Tutorial Framework allows developers to define state machines in two ways,
as discussed before.
@ -281,7 +281,7 @@ to showcase the capabilities of the framework.
\subsection{YAML-based FSM}
YAML\footnote{YAML Ain't Markup Language: \href{http://yaml.org}{http://yaml.org}}
is a human friendly data serialization standard and a superset of JSON.
is a human-friendly data serialization standard and a superset of JSON\@.
It is possible to use this format to define a state machine like so:
\lstinputlisting[
caption={A Finite State Machine implemented in YAML},
@ -291,7 +291,7 @@ This state machine has two states, state \code{0} and \code{1}.
It defines a single transition between them, \code{step_1}.
On entering state \code{1} the FSM will write a message to the frontend messaging component
by invoking TFW library code using Python.
The transition can only occour if the file \code{allow_step_1} exists.
The transition can only occur if the file \code{allow_step_1} exists.
YAML based state machine implementations also allow the usage of the Jinja2%
\footnote{\href{http://jinja.pocoo.org/docs/2.10/}{http://jinja.pocoo.org/docs/2.10/}}
@ -338,7 +338,7 @@ I am going to use the Python programming language, but it isn't hard
to create event handlers in other languages, as the only thing
they have to be capable of is communicating with the TFW server using
ZeroMQ sockets, as previously discussed.
The library provided by the framework abstracts low level socket logic
The library provided by the framework abstracts low-level socket logic
away by implementing easy-to-use base classes.
Creating such base classes in a given language shouldn't take longer
than a few hours for an experienced developer.
@ -346,7 +346,7 @@ Our challenge creators have already implemented similar libraries for
Java, JavaScript and C++ as well.
\lstinputlisting[
language=python,
caption={A very simple event hander implemented in Python},
caption={A very simple event handler implemented in Python},
captionpos=b
]{listings/event_handler_example.py}
This simple event handler subscribes to the \code{fsm_update} messages,
@ -359,7 +359,7 @@ abstract method, which is used to, well, handle events.
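To give a feeling for how little such a library has to do, here is a rough, self-contained
sketch of an event handler built directly on pyzmq; the endpoint, the multipart framing and
the payload handling are all assumptions for illustration, as the real TFW base classes
hide these details:
\begin{lstlisting}[language=python,captionpos=b,
caption={Sketch of a bare-bones event handler built directly on pyzmq}]
import json
import zmq

SUBSCRIBE_ADDR = 'tcp://localhost:7654'  # assumed address

def handle_event(message: dict) -> None:
    # Plays the role of the abstract handler method mentioned above.
    print('Received an FSM update:', message)

context = zmq.Context()
subscriber = context.socket(zmq.SUB)
subscriber.connect(SUBSCRIBE_ADDR)
subscriber.setsockopt_string(zmq.SUBSCRIBE, 'fsm_update')

while True:
    key, payload = subscriber.recv_multipart()  # assumed framing: key + JSON body
    handle_event(json.loads(payload))
\end{lstlisting}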
\section{Setting Up a Developer Environment}\label{devenv}
To make getting started as smooth as possible I have created
a ``bootstrap'' script which is capable of creating a development envrionment from
a ``bootstrap'' script which is capable of creating a development environment from
scratch.
This script is distributed as the following bash one-liner:
\begin{lstlisting}[language=bash]
@ -370,7 +370,7 @@ This command downloads the script using \code{curl}%
In the open source community it is quite common to distribute installers this way%
\footnote{A good example of this is oh-my-zsh:
\href{https://github.com/robbyrussell/oh-my-zsh}{https://github.com/robbyrussell/oh-my-zsh}},
which might seem a little scary at first, but is not less safe then
which might seem a little scary at first, but is no less safe than
downloading and executing a binary installer from a website with a valid TLS certificate, as
\code{curl} will fail with an error message if the certificate is invalid.
This is because both methods place their trust in the PKI~\footnote{Public Key Infrastructure}
@ -400,13 +400,13 @@ mitigating MITM attacks.
The bootstrap script clones the three TFW repositories and performs several steps
to create a working environment in a single directory that is based on
test-tutorail-framework:
test-tutorial-framework:
\begin{itemize}
\item It builds the newest version of the TFW baseimage locally
\item It pins the version tag of this image in \code{solvable/Dockerfile},
so that this newly-built version will be used by the tutorial
\item It places the latest frontend in \code{solvable/frontend} with
depencendies installed
dependencies installed
\end{itemize}
It is important to note that this script \emph{does not} install anything system-wide,
it only works in the directory it is being executed from.
@ -456,7 +456,7 @@ to remove all the dependencies that won't be used when running the application%
\footnote{Otherwise it won't be possible to serve these applications efficiently
over the internet.}.
The problem is that these things can take a \emph{really} long time.
This is why today frontend builds usually take a lot longer than building anything
not involving JavaScript (such as C++, C\# or any other compiled programming language).
This mess presents its own challenges for the Tutorial Framework as well.
@ -475,7 +475,7 @@ To circumvent this, it is possible to entirely exclude the Angular frontend from
build, using build time arguments%
\footnote{In practice this is done by supplying the option
\code{--build-arg NOFRONTEND=1} to Docker.}.
But when doing so, developers would have to run the frondent locally with
But when doing so, developers would have to run the frontend locally with
the whole \code{node_modules} directory present.
The bootstrap script takes care of putting these dependencies there,
while the \code{tfw.sh} script is capable of starting a development server
@ -483,9 +483,9 @@ to serve the frontend locally using \code{ng serve} besides starting
the Docker container without the frontend.
As if this whole thing wasn't complicated enough, since Docker binds the port
the container is going to use, \code{tfw.sh} has to run the Angular dev server on
an other port, then use the proxying features of Angular-CLI to forward requests
from this port to the runnign Docker container when requesting resources
other then the entrypoint to the Angular application.
another port, then use the proxying features of Angular-CLI to forward requests
from this port to the running Docker container when requesting resources
other than the entrypoint to the Angular application.
This is the reason why the frontend is accessible through port \code{4200} (default
port for \code{ng serve}) when using \code{tfw.sh} to start a tutorial, but when running
@ -535,7 +535,7 @@ of cat, for the additional fun factor, and because we love cats.
The part after that is a timestamp of the day the release was made on.
I only change major versions when I introduce backwards incompatible changes in the
API of the framework; this way developers can trust that releases
with the same majors are compatible with eachother in regards to client code.
with the same majors are compatible with each other with regard to client code.
The \code{master} branches of the frontend-TFW and test-TFW repositories are always
kept compatible with the newest release tag of the baseimage.