Mostly finish writing thesis
parent
2206844a06
commit
766198cfd1
@ -137,3 +137,20 @@
year={2018},
month=sep,
}

@online{CyberArmsRace,
title={Inside the secret digital arms race: Facing the threat of a global cyberwar},
url={https://www.techrepublic.com/article/inside-the-secret-digital-arms-race/},
language={english},
author={Steve Ranger},
year={2014},
}

@online{MarriottBreach,
title={Marriott breach leaves 500 million exposed with passport, card numbers stolen},
url={https://arstechnica.com/information-technology/2018/11/marriott-breach-leaves-500-million-exposed-with-passport-card-numbers-stolen/},
language={english},
author={Megan Geuss},
year={2018},
month=nov,
}

@ -77,16 +77,28 @@ than showcasing actual, real-life API messages.
|
||||
|
||||
\section{Messages Component}
|
||||
|
||||
The framework must allow content creators to communicate their \emph{message} to the user.
|
||||
In other words, some way must be provided to ``talk'' to users.
|
||||
The framework must allow content creators to communicate with the user,
|
||||
and provide some mechanism to enable ``talking'' to them.
|
||||
This is the responsibility of the \emph{messages} component, which
|
||||
provides a chatbox-like element on the frontend.
|
||||
The simplest form of communication it accommodates is the insertion of text
|
||||
into the chatbox through API messages.
|
||||
Every message has an optional \emph{originator}, which serves to remind the user
|
||||
of the purpose of the given message.
|
||||
This component always expects messages it receives to be in Markdown%
|
||||
\footnote{\href{https://daringfireball.net/projects/markdown/}
|
||||
{https://daringfireball.net/projects/markdown/}} format,
|
||||
so that it is possible to nicely format any text that one might want to display.
|
||||
This is especially important when displaying inline code with text around it,
|
||||
so that it is easier to read for the user.
|
||||
Every message has an optional \emph{originator} field, which serves
|
||||
to remind the user of the purpose of the given message.
|
||||
These messages are also timestamped so that it is easier to navigate through them
|
||||
and look back at older messages from the user.
|
||||
If no timestamp is present in the API message, then it will be added on
|
||||
the frontend.
|
||||
This is useful, because this will use system time on the user's machine,
|
||||
and as such time zones will not be an issue (whereas if we suggested adding
|
||||
timestamps to the messages on the backend, content creators would have to
|
||||
deal with conversions between time zones).
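The fallback described above can be illustrated with a minimal sketch. It is written in Python for brevity, although the real logic lives in the frontend; the function name and the \code{timestamp} key are assumptions made for illustration, not the actual TFW message format.

```python
from datetime import datetime


def ensure_timestamp(message: dict) -> dict:
    """Return a copy of the message that is guaranteed to carry a timestamp.

    The 'timestamp' key is a hypothetical field name used for illustration.
    """
    stamped = dict(message)
    if not stamped.get("timestamp"):
        # astimezone() without arguments attaches the local time zone of the
        # machine running this code, so no manual conversion is needed.
        stamped["timestamp"] = datetime.now().astimezone().isoformat()
    return stamped
```

A message that already carries a timestamp is left untouched, so content creators can still supply their own value when they need to.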
|
||||
|
||||
\pic[width=.5\textwidth]{figures/chatbot.png}{The avataobot typing in the messages component}
|
||||
|
||||
@ -111,6 +123,8 @@ message lengths:
|
||||
|
||||
The value 5 comes from the fact that on average English words are 5
|
||||
characters long according to some studies.
|
||||
This value could be made configurable in the future, but currently there
|
||||
are no plans to make non-English challenges.
|
||||
|
||||
\section{IDE Component}\label{idecomponent}
|
||||
|
||||
@ -168,6 +182,13 @@ the selected file.
|
||||
The code making this feature possible is reused several times in the framework
|
||||
for interesting purposes such as monitoring the logs of processes.
|
||||
|
||||
It is also worth mentioning that integrating such file monitoring into the framework
|
||||
is not quite as simple as described above, because one has to deal with many issues like the
|
||||
semi-nondeterministic nature of how a single file modification can sometimes result in
|
||||
several inotify events, or implement rate limiting for the whole thing to avoid
|
||||
saturating the messaging system with file content updates when said events
are triggered too frequently.
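One way to handle both duplicate events and overly frequent updates is a small per-file rate limiter. The following is only a sketch of the general technique, not the code used by the framework; the class name and the interval are hypothetical.

```python
import time


class RateLimiter:
    """Allow at most one action per `min_interval` seconds for each key."""

    def __init__(self, min_interval: float, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock  # injectable so the behaviour can be tested
        self._last_allowed = {}

    def allow(self, key: str) -> bool:
        now = self.clock()
        last = self._last_allowed.get(key)
        if last is not None and now - last < self.min_interval:
            # A duplicate or too-frequent event for this key: drop it.
            return False
        self._last_allowed[key] = now
        return True
```

An event handler would call \code{allow(path)} for every inotify event and only broadcast the file contents when it returns \code{True}. Note that simply dropping events can lose the very last modification in a burst, so a real implementation would likely schedule a trailing update as well.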
|
||||
|
||||
The editor also allows content creators to completely control it using API messages.
|
||||
This involves the selecting, reading and writing of files as well as changing the
|
||||
selected directory.
|
||||
@ -175,6 +196,14 @@ These features allow content creators to ``guide'' a user through code bases
|
||||
for example, where in each step of a tutorial a file is opened and explained
|
||||
through messages sent to the chatbox of the messages component.
|
||||
|
||||
Developers have to \emph{explicitly} allow directories one by one to be listed by the
|
||||
editor. This is done to avoid access control issues in case the editor is
|
||||
running with more permissions than the user should have%
|
||||
\footnote{Actually this involves extra caution, such as dealing with
|
||||
symlinks in an allowed directory which could point to other, non-allowed locations}.
|
||||
It is also possible to blacklist file patterns (so that binary files can be
|
||||
excluded for example, as a text editor is not suitable to deal with these).
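The access check described above, including the symlink caveat from the footnote, can be sketched as follows. This is an illustration under assumptions: the function and parameter names are hypothetical and the actual framework may implement the check differently.

```python
import fnmatch
import os


def is_path_allowed(path, allowed_dirs, excluded_patterns=()):
    """Decide whether the editor may list or open `path`.

    `allowed_dirs` and `excluded_patterns` are hypothetical parameter names.
    """
    # realpath() resolves symlinks and '..' components, so a link placed in
    # an allowed directory cannot smuggle in a file from somewhere else.
    real = os.path.realpath(path)
    name = os.path.basename(real)
    if any(fnmatch.fnmatch(name, pattern) for pattern in excluded_patterns):
        return False
    for directory in allowed_dirs:
        real_dir = os.path.realpath(directory)
        if os.path.commonpath([real, real_dir]) == real_dir:
            return True
    return False
```

For example, \code{is\_path\_allowed(path, ["/home/user/workdir"], ["*.pyc"])} would reject compiled Python files as well as anything that resolves to a location outside the working directory.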
|
||||
|
||||
\section{Terminal Component}
|
||||
|
||||
This is a full-fledged xterm terminal emulator running right in the user's browser.
|
||||
@ -191,23 +220,27 @@ This small server is responsible for spawning bash sessions and
|
||||
unix pseudoterminals (or \code{pty}s) in the \code{solvable} Docker
|
||||
container.
|
||||
It is also responsible for connecting the master end of the \code{pty} to the
|
||||
emulator running in your browser and the slave end to the bash session it has
|
||||
emulator running in the browser and the slave end to the bash session it has
|
||||
spawned.
|
||||
This way users are able to work in the shell displayed on the frontend just like
|
||||
they would on their home machines, which allows for great tutorials
|
||||
explaining topics that involve the usage of a shell.
|
||||
Note that this allows coverion an extremely wide variety of topics using TFW: from compiling
|
||||
shared libraries for development, using cryptographic FUSE filesystems
|
||||
Note that this allows us to cover an extremely wide variety of topics using TFW:
|
||||
from compiling shared libraries for development, to using cryptographic FUSE filesystems
|
||||
for enhanced privacy%
|
||||
\footnote{\href{https://github.com/rfjakob/gocryptfs}{https://github.com/rfjakob/gocryptfs}},
|
||||
to automating cloud infrastructure using Ansible%
|
||||
or automating cloud infrastructure using Ansible%
|
||||
\footnote{\href{https://www.ansible.com}{https://www.ansible.com}}.
|
||||
|
||||
This component exposes several functions through TFW message APIs, such as injecting commands
|
||||
This component exposes several functions through TFW API messages, such as injecting commands
|
||||
to the terminal, reading command history and registering callbacks that are invoked when
|
||||
certain command are executed by the user.
|
||||
the user executes anything in the terminal.
|
||||
This allows content developers to implement functionality such as advancing the
|
||||
tutorial when a certain command is invoked, or detecting common mistakes in using certain
|
||||
tools, such as warning users when they try to use an outdated cipher when
|
||||
encrypting a file using \code{openssl}, or if they generate an RSA key with a small key size.
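A callback of this kind could look like the sketch below. It only shows the command analysis itself; how the callback is registered with the terminal component is not shown, and the cipher list and the key size threshold are illustrative assumptions rather than values taken from the framework.

```python
import shlex

# Illustrative assumptions, not an authoritative list of weak ciphers.
WEAK_CIPHERS = {"-des", "-des-cbc", "-rc2", "-rc4"}
MIN_RSA_BITS = 2048


def check_openssl_command(command: str):
    """Return a warning for risky openssl usage, or None if nothing was found."""
    try:
        args = shlex.split(command)
    except ValueError:
        return None  # unbalanced quotes, nothing sensible to analyse
    if len(args) < 2 or args[0] != "openssl":
        return None
    if args[1] == "enc":
        for arg in args[2:]:
            if arg.lower() in WEAK_CIPHERS:
                return f"{arg} selects an outdated cipher"
    if args[1] == "genrsa":
        # openssl genrsa takes the key size as its last positional argument.
        if args[-1].isdigit() and int(args[-1]) < MIN_RSA_BITS:
            return f"a {args[-1]} bit RSA key is too small"
    return None
```

The returned string could then be sent to the messages component so that the user is warned right after executing the command.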
|
||||
|
||||
\pic{figures/terminal.png}{The Frontend Terminal of TFW Running top}
|
||||
\pic{figures/terminal.png}{The frontend terminal of TFW running top}
|
||||
|
||||
The implementation of reading command history is quite an exotic one.
|
||||
The framework needs to be able to detect if the user has executed any command in the
|
||||
@ -216,19 +249,23 @@ This is not an easy thing to accomplish without relying on some sort of heavywei
|
||||
monitoring solution such as Sysdig%
|
||||
\footnote{\href{https://sysdig.com}{https://sysdig.com}}.
|
||||
I deemed most similar systems a huge overkill to implement this functionality, and their
|
||||
memory footprints are not something we could afford here.
|
||||
memory footprints are not something we could afford here%
|
||||
\footnote{These containers will be spawned on a per-user basis, so we must be as
|
||||
conservative with memory as possible}.
|
||||
Another way would be to use \code{pam_tty_audit.so} in the PAM%
|
||||
\footnote{Linux Pluggable Authentication Modules:
|
||||
\href{http://man7.org/linux/man-pages/man3/pam.3.html}
|
||||
{http://man7.org/linux/man-pages/man3/pam.3.html}}
|
||||
configurations responsible for logins, as this allows for various TTY auditing functions,
|
||||
but I have found an ever simpler approach to the problem in the end.
|
||||
By using the inotify system built into TFW, I can set up the user's environment in
|
||||
such a way, that I can enforce and determine the location of the bash \code{HISTFILE}%
|
||||
but I have found an even simpler approach to solving this problem in the end.
|
||||
It is possible to set up the user's environment in
|
||||
such a way during the build of the image, that I can enforce and determine the
|
||||
location of the bash \code{HISTFILE}%
|
||||
\footnote{This environment variable contains the path to the file bash writes command
|
||||
history to}
|
||||
of the user.
|
||||
This way I can monitor changes made to this file and read the commands executed
|
||||
By combining this with the inotify system built into TFW,
|
||||
the framework can monitor changes made to this file and read the commands executed
|
||||
by the user from it.
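Reading only the newly executed commands from the history file can be sketched as below. This is a simplified illustration and not the framework's implementation: the class name is hypothetical, and it assumes that bash is configured to append to the file after every command (for example with \code{PROMPT\_COMMAND='history -a'}), since by default bash writes history when the shell exits.

```python
import os


class HistoryReader:
    """Return the commands appended to a history file since the last read."""

    def __init__(self, histfile):
        self.histfile = histfile
        self._offset = 0

    def read_new_commands(self):
        # If the file shrank (for example it was truncated), start over.
        if os.path.getsize(self.histfile) < self._offset:
            self._offset = 0
        with open(self.histfile, "rb") as history:
            history.seek(self._offset)
            data = history.read()
            self._offset = history.tell()
        text = data.decode("utf-8", errors="replace")
        return [line for line in text.splitlines() if line.strip()]
```

The method would be invoked from the handler of the inotify event belonging to the history file, so every modification yields exactly the commands that were not seen before.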
|
||||
It is important to keep in mind that the user is able to ``sabotage'' this method%
|
||||
\footnote{By unsetting the \code{HISTFILE} envvar for example},
|
||||
@ -236,6 +273,18 @@ but that should not be an issue as this is not a feature that is intended to be
|
||||
used in competitive environments (and if the users of a tutorial intentionally
|
||||
break the system under themselves, well, good for them).
|
||||
|
||||
Another advantage of this method is that it can be applied to any interactive
|
||||
application that supports logging the commands executed in it in one way or another.
|
||||
A good example would be GDB%
|
||||
\footnote{\href{https://www.gnu.org/software/gdb/}{https://www.gnu.org/software/gdb/}},
|
||||
which supports an option called \code{set trace-commands on}. This option flushes
|
||||
command history to a file after every executed command.
|
||||
This feature can be combined with the file monitoring capabilities of the framework, and now
|
||||
we can even detect commands executed inside GDB by the user.
|
||||
This is a good example of the flexibility provided by this solution. Feature requests
|
||||
like ``I'd like to create a tutorial about <insert software here>'' are quite common, and
|
||||
supporting them is really easy using this extensible system.
|
||||
|
||||
\section{Console Component}
|
||||
|
||||
This component is a simple textbox that can be used to display anything to the user,
|
||||
@ -245,40 +294,55 @@ API through TFW messages to write and read it's contents.
|
||||
|
||||
It works great when combined with the process management capabilities of the framework:
|
||||
if configured to do so it can display the output of processes like webservers in real time.
|
||||
When using this next to the frontend editor of the framework, it allows for a development
|
||||
When using this next to the TFW frontend editor, it allows for a development
|
||||
experience similar to working in an IDE on your laptop.
|
||||
|
||||
\pic{figures/console_and_editor.png}{The Console Displaying Live Process Logs Next to the TFW Code Editor}
|
||||
Similarly to other developer tools, I chose to display this component inside the terminal
|
||||
window, so that the user can switch between the two using tabs in order to conserve space.
|
||||
It is also possible to configure which one should be displayed by default,
|
||||
as well as to switch between them mid-tutorial using API messages.
|
||||
|
||||
\pic{figures/console_and_editor.png}{The console displaying live process logs next to the TFW editor}
|
||||
|
||||
\section{Process Management}\label{processmanagement}
|
||||
|
||||
The framework includes an event handler capable of managing processes running inside
|
||||
the \code{solvable} Docker container.
|
||||
It's capabilities include starting, stopping and restarting processes.
|
||||
It is also capable of emitting the standard out or standard error logs of processes
|
||||
(by broadcasting TFW messages).
|
||||
This component can be iteracted with using TFW API messages.
|
||||
The capabilities of this component include the starting, stopping and restarting of processes,
|
||||
as well as emitting the standard out or standard error logs belonging to them, even
|
||||
in real-time (by broadcasting TFW messages).
|
||||
This logging feature allows for interesting possibilities such as the handling
|
||||
of live process output, or just requesting the logs belonging to a certain application when
|
||||
some sort of event has occurred (such as on errors).
|
||||
This component also can be interacted with using TFW API messages.
|
||||
The ``Deploy'' button on the code editor uses this component to restart
|
||||
processes.
|
||||
processes, and the console component also uses this event handler to display
|
||||
real-time logs.
|
||||
|
||||
The Tutorial Framework uses supervisor%
|
||||
\footnote{\href{http://supervisord.org}{http://supervisord.org}}
|
||||
to run multiple processes inside a Docker container
|
||||
(whereas usually Docker containers run a single process only).
|
||||
This is going to be explained further in a later chapter.
|
||||
All this is possible through using the xmlrpc%
|
||||
\footnote{\href{https://docs.python.org/3/library/xmlrpc.html}{https://docs.python.org/3/library/xmlrpc.html}}
|
||||
API exposed by supervisor, which allows the framework to interact with processes it controls.
|
||||
This is going to be explained further in Chapter~\ref{usingtfw}.
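To give an idea of what talking to supervisor over XML-RPC looks like, here is a minimal sketch built on the \code{startProcess}, \code{stopProcess} and \code{readProcessStdoutLog} methods of the supervisor XML-RPC API. The wrapper class, its method names and the default URL are assumptions made for illustration and do not reflect the actual TFW event handler.

```python
import xmlrpc.client


class ProcessManager:
    """Thin, hypothetical wrapper around the supervisor XML-RPC namespace."""

    def __init__(self, proxy):
        self.supervisor = proxy.supervisor

    def start(self, name):
        return self.supervisor.startProcess(name)

    def stop(self, name):
        return self.supervisor.stopProcess(name)

    def restart(self, name):
        self.stop(name)
        return self.start(name)

    def stdout_tail(self, name, length=4096):
        # A negative offset combined with zero length requests the last
        # `length` bytes of the stdout log of the process.
        return self.supervisor.readProcessStdoutLog(name, -length, 0)


def connect(url="http://localhost:9001/RPC2"):
    # The URL is an assumption: it depends on how supervisor is configured.
    return ProcessManager(xmlrpc.client.ServerProxy(url))
```

With a supervisor instance listening on the assumed address, \code{connect().restart("webservice")} would restart a process named \code{webservice}, where the process name is again only an example.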
|
||||
|
||||
It is also possible to find out which files a process writes its logs to.
|
||||
Combining this with the inotify capabilities of TFW explained
|
||||
briefly in~\ref{idecomponent}, it becomes possible to implement live log monitoring
|
||||
in the framework.
|
||||
The features involving the use of inotify were among the most difficult ones to implement,
|
||||
since the sheer number of impossible to debug issues that such
|
||||
due to the sheer number of almost impossible to debug issues that such
|
||||
a complex system could come with.
|
||||
|
||||
I'll briefly explain such a bug, which I've found to be immersely exciting.
|
||||
I'll briefly explain such a bug, which I've found to be immensely exciting.
|
||||
To understand this, it is necessary to note that the inotify API supplied by
|
||||
the Linux kernel is not capable
|
||||
of monitoring a single file, it is only able to watch whole directories.
|
||||
I was unaware of this fact before running into this issue, as the Python
|
||||
bindings I was using did not warn me about supplying a filename on top of
|
||||
the directory path, and just stripped down the filename silently%
|
||||
\footnote{In software development it is considered a bad practice to do such things
|
||||
implicitly. It is better to fail loud and clear instead of trying to figure out
|
||||
what the user meant to do.}.
|
||||
During the initial development of this feature all processes inside the
|
||||
\code{solvable} Docker container were writing their logs to files
|
||||
in the FHS%
|
||||
@ -291,21 +355,22 @@ This had caused an infinite recursion: when a process would write to \code{/tmp/
|
||||
inotify would invoke a process that would also log to that location causing the kernel to
|
||||
emit more inotify events, which in turn would cause more and more new processes to spawn
|
||||
and write to \code{/tmp/}, causing the whole procedure to repeat again and again.
|
||||
This continued until my machine would start to run out of memory and stat swapping
|
||||
This continued until my machine would start to run out of memory and begin swapping
|
||||
pages to disk%
|
||||
\footnote{When a modern operating system runs out of physical RAM, it is going to swap
|
||||
virtual memory pages to disk so it can continue to operate --- slowly}
|
||||
like crazy, causing the whole system to spiral downwards
|
||||
in a spectacular fashion until the whole thing managed to crash.
|
||||
It was an event of such chaotic beauty, that I often fondly recall it to this day.
|
||||
It was an event of such rare and chaotic beauty, that I often fondly recall it to this day.
|
||||
After my first encounter with the bug I decided to have lunch instead.
|
||||
Of course it would take me several hours to identify the exact causes behind this
|
||||
fascinating phenomenon, but those were \emph{very} fun hours.
|
||||
fascinating phenomenon, but those were \emph{very} fun hours at least.
|
||||
|
||||
\section{FSM Management}
|
||||
|
||||
I have already mentioned the event handler called \code{FSMManagingEventHandler},
|
||||
which is responsible for managing the framework FSM.
|
||||
For completeness I chose to include it on this chapter as well.
|
||||
For completeness I chose to include it in this chapter as well.
|
||||
The API it exposes through TFW messages allows client code to attempt stepping the
|
||||
state machine.
|
||||
As previously explained this is something that is considered to be a \emph{privileged}
|
||||
@ -342,26 +407,27 @@ thing work with the Same Origin Policy%
|
||||
{https://developer.mozilla.org/en-US/docs/Web/Security/Same-origin\_policy}}
|
||||
being in effect?
|
||||
The answer is that developers must use a \emph{relative URL}, that is, a URL relative
|
||||
to the entry pont of the TFW frontend itself.
|
||||
to the entry point of the TFW frontend itself.
|
||||
To allow serving several web applications from a single port the framework
|
||||
supports optional reverse-proxy configurations through the nginx%
|
||||
\footnote{\href{http://nginx.org}{http://nginx.org}} web server run by the framework.
|
||||
More on this in a later chapter.
|
||||
More on this in Chapter~\ref{usingtfw}.
|
||||
|
||||
\section{Various Frontend Features}
|
||||
|
||||
The angular frontend of features several different layouts.
|
||||
The Angular frontend of the framework features several different layouts.
|
||||
These layouts are useful to accommodate different workflows for users,
|
||||
such as the previous example of editing code and being able to view the
|
||||
result of said code in real time next to the editor.
|
||||
Another example would be editing Ansible playbooks in the file editor,
|
||||
and then trying to run them in the terminal.
|
||||
There are also almost full screen views for each component that makes sense
|
||||
to be used that way.
|
||||
to be used like that, for instance the code editor can be used to conveniently
|
||||
edit larger files this way.
|
||||
|
||||
The frontend was designed in a way to be fully responsive in windows sizes
|
||||
The frontend was designed in a way to be fully responsive in window sizes
|
||||
that still keep the whole thing usable (i.e.\ it would not be practical to start
|
||||
solving TFW tutorials on a smart phone, simply because of size limits, so they are
|
||||
solving TFW tutorials on a smart phone, simply because of size limits, so such small screens are
|
||||
not supported, but the frontend still behaves as expected on small laptops or bigger tablets).
|
||||
This is not an easy thing to implement and maintain due to the many small
|
||||
incompatibilities between browsers given the complexity of the frontend.
|
||||
@ -381,15 +447,34 @@ The framework frontend is built on grid layout and flexboxes%
|
||||
{https://developer.mozilla.org/en-US/docs/Web/CSS/CSS\_Grid\_Layout}},
|
||||
which gives us the best hopes of being able to maintain it down the line.
|
||||
It would involve unimaginable horrors to support this multi-layout
|
||||
frontend on older browsers, so browsers without flex and grid
|
||||
support are not supported by TFW.
|
||||
frontend on older browsers without flexboxes and grid, so I have
|
||||
decided to avoid spending development time on these and make sure that
|
||||
it works like a charm in reasonably modern browsers.
|
||||
Arguably this is a good thing, as people should keep their browsers up to date to
|
||||
follow frequent security patches anyway, so let this serve as a reminder to
|
||||
developers looking to get into IT security that the first step is to
|
||||
keep your software up to date.
|
||||
|
||||
The frontend of the framework exposes some additional APIs.
|
||||
These include the changing of layouts, selecting the terminal or console
|
||||
\pic{figures/tfw_grid.png}{The grid layout of the TFW frontend showcased from developer tools}
|
||||
|
||||
There are several additional APIs exposed by the frontend,
|
||||
which include the changing of layouts, selecting the terminal or console
|
||||
component to be displayed, the possibility of dynamically modifying
|
||||
frontend configuration values (such as the frequency of autosaving the files in the editor)
|
||||
and more.
|
||||
|
||||
To accommodate communication with the TFW server, the frontend of the framework
|
||||
comes with some library code which can be used to send and receive TFW messages.
|
||||
This code is mostly WebSockets combined with RxJS%
|
||||
\footnote{\href{https://rxjs-dev.firebaseapp.com}{https://rxjs-dev.firebaseapp.com}},
|
||||
which makes it easier to write completely asynchronous, callback-based code.
|
||||
The observables%
|
||||
\footnote{\href{http://reactivex.io/documentation/observable.html}
|
||||
{http://reactivex.io/documentation/observable.html}}
|
||||
provided by RxJS are used all around the TFW frontend, as our library code
|
||||
exposes the operation of receiving data from WebSockets as observables.
|
||||
Client code can subscribe to these observables using callbacks,
|
||||
which will be invoked when the observable emits a new value
|
||||
(i.e.\ when a new message was received on the WebSocket).
|
||||
When using Angular it is generally a good idea to get familiar with reactive programming,
|
||||
because it is very easy to get lost in the callback hell without it.
|
||||
|
@ -6,25 +6,29 @@ In this chapter I would like to express my gratitude towards great people who ha
|
||||
helped me in one way or another along the way.
|
||||
|
||||
First of all I would like to thank Bálint Bokros, my good friend and colleague for
|
||||
his awesome work during the initial phases of development and for always
|
||||
his awesome work regarding TFW and for always
|
||||
being open to provide useful input.
|
||||
He has also earned my gratitude by always being there to lift my spirits, be that
|
||||
with beer or general friendship things.
|
||||
with beer or friendship.
|
||||
|
||||
I'd also like to thank Gábor Pék and Márk Félegyházi,
|
||||
for letting us make this project possible by always setting reasonable goals,
|
||||
being there to provide feedback and encouraging us along the way.
|
||||
Gábor Pék also contributed to this in several ways, for which we always will be
|
||||
grateful.
|
||||
Gábor has also contributed even more directly, for which I will always be
|
||||
grateful: it was mostly due to his visions that we were able to
|
||||
dream this big, something that I've found extremely valuable when looking back.
|
||||
|
||||
I can't thank my consultant, Levente Buttyán enough for enduring my general
|
||||
inability to deal with deadlines and administration.
|
||||
|
||||
I also appreciate the company my colleagues have provided in the office,
|
||||
by always being there be that for work or fun. They also have my gratitude
|
||||
for general contributions to the framework, be that with ideas, assistance,
|
||||
I also appreciate the great morale my colleagues and friends provided in the office,
|
||||
by always being there, be that for work or fun. This project couldn't have been realised
|
||||
sitting in a depressing cube between 200 people hating their jobs. They also have my gratitude
|
||||
for direct contributions to the framework, be that with ideas, assistance,
|
||||
or actual code.
|
||||
|
||||
Finally I'd like to thank all the developers for using the framework and creating
|
||||
great tutorials with it. They always provide useful feedback, bug reports, and
|
||||
great tutorials with it. At the end of the day it feels awesome to know that
|
||||
my work helps other people, and it is content developers who make this possible.
|
||||
They always provide useful feedback, bug reports, and
|
||||
have great feature requests or even contributions.
|
||||
|
@ -8,8 +8,10 @@ engineering is on the rise.
|
||||
While we are enjoying the comfort that information technology provides us, we often forget
|
||||
about the risks involved in relying so much on software in our everyday lives.
|
||||
When taking a look at recent events, such as a cyber arms race taking place between leading
|
||||
powers, 50 million Facebook accounts being breached
|
||||
powers\cite{CyberArmsRace}, 50 million Facebook accounts being breached
|
||||
due to the incorrect handling of access tokens\cite{FacebookBreach},
|
||||
the very recent Marriott hack where sensitive data on 500 million customers
|
||||
was stolen\cite{MarriottBreach},
|
||||
or how China is building an Orwellian state of total digital surveillance%
|
||||
\cite{ChinaSurv}\cite{ChinaCredit},
|
||||
it becomes clear that security and privacy in the IT sector
|
||||
|
@ -3,38 +3,39 @@
|
||||
In this chapter I am going to evaluate the state of the project and set future goals for
|
||||
the framework.
|
||||
I'll also try and reflect on some of the most important things I have learned during
|
||||
working on this it, in case I've experienced something that might be useful for
|
||||
working on it, in case I've experienced something that might be useful for
|
||||
someone else reading this in the future.
|
||||
|
||||
\section{Project Evaluation}
|
||||
|
||||
How do we define if a project is a success or not?
|
||||
Instead of trying to do so, in this section I am going to express
|
||||
Instead of attempting to do so, in this section I am going to express
|
||||
my personal feelings and opinions about the Tutorial Framework.
|
||||
To get unbiased opinions I'd recommend asking someone who hasn't been maintaining
|
||||
this project for so long.
|
||||
I could promise to be as objective as possible, but I think that it is just better
|
||||
to admit that I have a sweet spot for this project.
|
||||
I could promise to be as objective as possible, but I think that it is just better to
|
||||
be honest and admit that I have a soft spot for this project.
|
||||
|
||||
Currently a total of 63 tutorials based on the framework are running in production,
|
||||
with new ones being released on a weekly basis.
|
||||
These exercises have been solved several hundred times.
|
||||
User feedback is getting better and better as the project moves forward.
|
||||
As a maintainer currently I know about a single unfixed bug in the framework, which
|
||||
As a maintainer, currently I know about a single unfixed bug in the framework, which
|
||||
is getting reported by users as well.
|
||||
There are more, of course, the world is never going to run out of bugs to fix,
|
||||
but I sleep well knowing that things aren't breaking on a constant basis.
|
||||
but at least I sleep well knowing that things aren't breaking on a constant basis.
|
||||
Considering that this is a one year old project including initial development,
|
||||
I'd consider this a solid success.
|
||||
|
||||
We were able to achieve most of the goals we have envisioned on the beginning of
|
||||
this journey, and considering some of the things we have planned for the future
|
||||
We were able to achieve most, if not all, of the goals we envisioned at the
|
||||
beginning of this journey, and considering some of the things we have planned for the future
|
||||
we are just getting started.
|
||||
|
||||
\section{I Have a Plan!}
|
||||
|
||||
In this section I'd like to set some goals regarding the future of the framework
|
||||
apart from implementing new features (new features will always come in as we go).
|
||||
apart from implementing new features, as these will always keep coming in, and we
|
||||
have some great ones planned, that I can promise.
|
||||
|
||||
First of all I think that we need to put more focus on developing TFW, as currently
|
||||
other projects are often being prioritized over it.
|
||||
@ -52,18 +53,20 @@ To make this better I'd need to consider planning ahead more, so that the newest
|
||||
enough to support new features on the roadmap and not get distracted as much by
|
||||
other features emerging on the horizon.
|
||||
|
||||
An other thing is that I often feel like that there are some things in using the framework
|
||||
Another thing is that I often feel that there are some things in using TFW
|
||||
that could be made a lot easier. As a maintainer sometimes I find it hard to
|
||||
tell what these things are, as I know TFW inside out, having written most
|
||||
tell what these things exactly are, as I know the framework inside out, having written most
|
||||
of the codebase myself.
|
||||
I'd like to set some time aside to create tutorials using the framework myself
|
||||
I'd like to set some time aside to create tutorials using the framework myself,
|
||||
so I can better narrow these potential difficulties down.
|
||||
This would require me to be able to take things slow for a few weeks, as this is not
|
||||
something that is possible to do effectively in a rush. In the summer months, maybe?
|
||||
|
||||
Currently the framework is proprietary software.
|
||||
While it is not feasible to go open source today or tomorrow for various reasons,
|
||||
we all believe that software which is free as in freedom \emph{is} the future.
|
||||
As such, at some point I'd like to open source the whole thing if the circumstances will allow
|
||||
us to do so.
|
||||
the company to do so.
|
||||
|
||||
\section{Things That I Have Learned}
|
||||
|
||||
@ -78,6 +81,14 @@ as I just simply enjoy admiring quality typography which WYSIWYG%
|
||||
I've spent a long time working on and maintaining the Tutorial Framework.
|
||||
While the list of technical things I've learned is long and exciting, I also feel like
|
||||
I've learned a lot about supporting other developers, project management and communication.
|
||||
Another thing that I've been able to learn is to adopt a more patient mindset while
|
||||
working. Back in the day I used to be nervous because of deadlines and things not
|
||||
working how they were supposed to, but now I know that these things are a part
|
||||
of the job and one must be able to deal with them without getting agitated.
|
||||
Any time I feel like something is not OK, I just try to take a step back, relax a bit to
|
||||
blow off steam and approach the issue without acting in haste.
|
||||
I think this is not too related to working as a software engineer, but something
|
||||
that can be applied to anything we do.
|
||||
|
||||
The most important thing, that I will always remember as a software engineer
|
||||
and is something that I've learned during this period
|
||||
|
@ -11,10 +11,11 @@ The main points include:
|
||||
\item Defining an FSM to describe the flow of the tutorial and implementing proper callbacks
|
||||
for this machine, such as ones that display messages to the user
|
||||
\item Implementing the required event handlers, which may trigger state transitions in the FSM,
|
||||
interact with non-TFW code and do various things that might be needed during an exercise
|
||||
interact with non-TFW code and do various things that might be needed during an exercise,
|
||||
such as compiling code written by the user or running unit tests
|
||||
\item Defining what processes should run inside the container besides the things TFW
|
||||
starts automatically
|
||||
\item Setting up reverse proxying for any user-facing network applications such as webservers
|
||||
\item Setting up reverse proxying for any user-facing network application such as webservers
|
||||
\end{itemize}
|
||||
At first all these tasks can seem quite overwhelming.
|
||||
Remember that \emph{witchcraft} is what we practice here after all.
|
||||
@ -53,14 +54,18 @@ understanding of how the framework interacts with client code.
|
||||
|--...
|
||||
\end{lstlisting}
|
||||
|
||||
\subsection{Avatao Configuration File}
|
||||
\subsection{Avatao Specific Files}
|
||||
The \code{config.yml} file is an Avatao challenge configuration file,
|
||||
which is used to describe what kind of Docker containers implement a challenge,
|
||||
which ports they expose and over which protocols, the name of the
|
||||
exercise, its difficulty, and so on.
|
||||
Every Avatao challenge must provide such a file.
|
||||
The Tutorial Framework does not use this file, this is only required to run
|
||||
the exercise in production, so it is mostly out of scope for this thesis.
|
||||
Another thing that is not even indicated in the structure above is the \code{metadata}
|
||||
directory, which contains the short and long descriptions of challenges in
|
||||
Markdown format.
|
||||
|
||||
The Tutorial Framework does not use these files in any way whatsoever,
|
||||
these are only required to make the tutorial function on the Avatao platform.
|
||||
|
||||
\subsection{Controller Image}
|
||||
It was previously mentioned that the \code{controller} Docker image is responsible
|
||||
@ -68,16 +73,16 @@ for the solution checking of challenges (whether the user has completed the exer
|
||||
Currently this image is maintained in the test-tutorial-framework repository.
|
||||
It is a really simple Python server which functions as a TFW event handler as well.
|
||||
It subscribes to the FSM update messages
|
||||
broadcast by the \code{FSMManagingEventHandler} we've previously discussed;
|
||||
broadcast by the \code{FSMManagingEventHandler} we have discussed previously;
|
||||
this way it is capable of keeping track of the state of the tutorial,
|
||||
which allows it to detect if the final state of the FSM is reached.
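The state tracking described above can be sketched in a few lines of Python.
This is only an illustration and not the actual controller code: the class name
\code{FinalStateTracker} and the \code{current_state} message key are assumptions,
since the exact format of FSM update messages is not shown in this section.

```python
from typing import Optional


class FinalStateTracker:
    """Remembers the last FSM state reported in update messages."""

    def __init__(self, final_state: str) -> None:
        self.final_state = final_state
        self.current_state: Optional[str] = None

    def handle_fsm_update(self, message: dict) -> None:
        # 'current_state' is a hypothetical key, not the real TFW message schema
        self.current_state = message.get('current_state', self.current_state)

    @property
    def solved(self) -> bool:
        # The exercise counts as completed once the final state is reached
        return self.current_state == self.final_state
```

A real controller would feed every received FSM update into
\code{handle\_fsm\_update} and report \code{solved} when the platform asks
whether the exercise is completed.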
|
||||
|
||||
\subsection{Solvable Image}
|
||||
Currently the Tutorial Framework is maintained in three git repositories:
|
||||
\begin{description}
|
||||
\item[baseimage-tutorial-framework] Docker baseimage (contains all backend logic)
|
||||
\item[frontend-tutorial-framework] Angular frontend
|
||||
\item[test-tutorial-framework] An example tutorial built using baseimage and frontend
|
||||
\item[baseimage-tutorial-framework:] Docker baseimage (contains all backend logic)
|
||||
\item[frontend-tutorial-framework:] Angular frontend
|
||||
\item[test-tutorial-framework:] An example tutorial built using baseimage and frontend
|
||||
\end{description}
|
||||
Every tutorial based on the framework must use the TFW baseimage as the parent of
|
||||
its own \code{solvable} image, using the \code{FROM}%
|
||||
@ -104,8 +109,11 @@ I am going to discuss these one by one.
|
||||
\subsection{Dockerfile}
|
||||
Since this is a Docker image it must define a \code{Dockerfile}.
|
||||
This image always uses the baseimage of the framework as a parent image.
|
||||
Besides this, developers can use this as a regular \code{Dockerfile} to work with as
|
||||
they see fit to implement their tutorial.
|
||||
Besides this, developers can use this as a regular \code{Dockerfile} to work with
|
||||
in any way they see fit to implement their tutorial.
|
||||
This means that developers looking to create content on Avatao, be that
|
||||
with the Tutorial Framework or without it, must be familiar with Docker,
|
||||
as they will have to set everything up to work inside a container.
|
||||
|
||||
\subsection{Frontend}
|
||||
This directory is designed to contain a clone of the frontend repository.
|
||||
@ -116,19 +124,29 @@ setup of the development environment.
|
||||
As previously mentioned, the framework uses supervisor to run several processes
|
||||
inside a Docker container.
|
||||
Usually Docker containers only run a single process and developers simply start
|
||||
more containers instead of processes if required.
|
||||
more containers instead of processes if required (and use tools such as docker-compose%
|
||||
\footnote{\href{https://docs.docker.com/compose/}{https://docs.docker.com/compose/}}
|
||||
or kubernetes%
|
||||
\footnote{\href{https://kubernetes.io}{https://kubernetes.io}}
|
||||
to orchestrate their containers).
|
||||
This approach is not suitable for TFW, as it would require the framework to orchestrate
|
||||
Docker containers from another container, which is feasible in theory but
|
||||
very hard and impractical to do in practice.
|
||||
Docker containers from inside a container managed by the same Docker daemon, which is
|
||||
feasible in theory but very hard and unserviceable to do in practice.
|
||||
This would require doing something like mounting the unix domain socket used
|
||||
to manage the Docker daemon inside a running container managed by that daemon,
|
||||
which is a fun thing to
|
||||
play around with in my free time but not something suitable for running in production,
|
||||
not to mention the severe security implications of doing something like that.
|
||||
|
||||
Supervisor is a process control system designed to be able to work with
|
||||
processes on UNIX-like operating systems.
|
||||
When a tutorial built on TFW is started, the framework starts supervisor with
|
||||
When a tutorial built on TFW is started, a Docker container starts with supervisor running as
|
||||
PID\footnote{Process ID, on UNIX-like systems the \code{init} program is the first
|
||||
process started} 1, which in turn starts all the programs defined
|
||||
in this directory using supervisor configuration files.
|
||||
For example, a developer would use a file similar to this to run a webserver
|
||||
written in Python:
|
||||
process started, and it traditionally gets PID 1.} 1, which in turn starts all the
|
||||
programs defined in the \code{solvable/supervisor} directory.
|
||||
Content creators can use supervisor configuration files to define these programs.
|
||||
For example, a developer would write a file similar to this one and place it into the
|
||||
\code{solvable/supervisor} directory to run a webserver written in Python:
|
||||
\begin{lstlisting}
|
||||
[program:yourprogram]
|
||||
user=user
|
||||
@ -138,35 +156,51 @@ autostart=true
|
||||
\end{lstlisting}
|
||||
As mentioned earlier in~\ref{processmanagement}, any program that is started this way
|
||||
can be managed by the framework using API messages.
|
||||
All this is possible through using the xmlrpc%
|
||||
\footnote{\href{https://docs.python.org/3/library/xmlrpc.html}
|
||||
{https://docs.python.org/3/library/xmlrpc.html}}
|
||||
API exposed by supervisor, which allows the framework to interact with it to control processes.
|
||||
This API is quite flexible and can be used to achieve a number of things which would be
|
||||
clumsy to do without using it (i.e.\ supervisor has a command line utility called
|
||||
\code{supervisorctl} that exposes similar functionality to the xmlrpc bindings,
|
||||
but it is better to communicate with the supervisor daemon directly than to
|
||||
invoke its command line utility in a separate process when you need something done).
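To illustrate what talking to supervisor directly looks like, here is a minimal
sketch using the \code{xmlrpc.client} module from the Python standard library.
It is not the code used by the framework: the helper name \code{restart\_process}
is made up for this example, and the address in the comment assumes that
supervisor's optional inet HTTP server is enabled on its customary port 9001.
The \code{getProcessInfo}, \code{stopProcess} and \code{startProcess} methods are
part of supervisor's XML-RPC interface.

```python
from xmlrpc.client import ServerProxy  # used in the usage comment below


def restart_process(supervisor, name: str) -> str:
    """Restart a supervisor-managed program and return its new state name."""
    # stopProcess raises a Fault if the process is not running, so check first
    if supervisor.getProcessInfo(name)['statename'] == 'RUNNING':
        supervisor.stopProcess(name)
    supervisor.startProcess(name)
    return supervisor.getProcessInfo(name)['statename']


# Usage, assuming supervisor listens on 127.0.0.1:9001 (an assumption):
#   proxy = ServerProxy('http://127.0.0.1:9001/RPC2')
#   restart_process(proxy.supervisor, 'yourprogram')
```

Compared to spawning \code{supervisorctl} in a subprocess, failures arrive as
Python exceptions instead of exit codes and text output that has to be parsed.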
|
||||
|
||||
\subsection{Nginx}
|
||||
For simplicity, exercises based on the framework only expose a single port from the
|
||||
\code{solvable} container.
|
||||
This port is required to serve the frontend of the framework.
|
||||
If this is the case, how do we run additional web applications to showcase vulnerabilities
|
||||
on during the tutorial?
|
||||
on during a tutorial?
|
||||
Since one port can only be bound by one process at a time, we will need to
|
||||
use a reverse-proxy to bind the port and redirect traffic to other
|
||||
webservers binding non-exposed ports.
|
||||
run a reverse-proxy%
|
||||
\footnote{\href{https://www.nginx.com/resources/glossary/reverse-proxy-server/}
|
||||
{https://www.nginx.com/resources/glossary/reverse-proxy-server/}} server inside the
|
||||
container to
|
||||
bind the exposed port and redirect traffic to other webservers binding non-exposed ports.
|
||||
|
||||
To support this, TFW automatically runs an nginx webserver (it uses this nginx
|
||||
process to serve the framework frontend as well) we can supply additional configurations to.
|
||||
Any configuration files placed into this directory will be interpreted by nginx
|
||||
once the container has started.
|
||||
To support this, TFW automatically starts an nginx webserver. It uses this nginx
|
||||
instance to serve the framework frontend as well.
|
||||
It is possible to supply additional configurations to this server in a convenient manner:
|
||||
any configuration files placed into the \code{solvable/nginx} directory will be
|
||||
interpreted by nginx once the container has started.
|
||||
To set up the reverse-proxying of a webserver running on port 3333,
|
||||
one would write a config file similar to this one:
|
||||
one would write a configuration file similar to this one:
|
||||
\begin{lstlisting}
|
||||
location /yoururl {
|
||||
proxy_pass http://127.0.0.1:3333;
|
||||
}
|
||||
\end{lstlisting}
|
||||
Now the content served by this webserver will be available on ``<challenge\_url>/yoururl''.
|
||||
Now the content served by this webserver on port 3333
|
||||
will be available on the URL \code{<challenge-url>/yoururl} even though port 3333
|
||||
does not accept connections from outside the container as it is not exposed.
|
||||
It is very important to understand that developers
|
||||
have to make sure that their web application \emph{behaves well} behind a reverse proxy.
|
||||
What this means is that they are going to be served from a ``subdirectory'' of a URL:
|
||||
for example ``/register'' will be served under ``/yoururl/register''.
|
||||
What this means is that they are going to be served from a ``subdirectory'' of the top
|
||||
level URL\@:
|
||||
for example \code{/register} will be served under \code{/yoururl/register}.
|
||||
This means that all links in the final HTML must refer to the proxied urls, e.g.\
|
||||
``/yoururl/register'' and server side redirects must point to the correct hrefs as well.
|
||||
\code{/yoururl/login}, and server side redirects must point to these correct hrefs as well.
|
||||
Idiomatically this is usually implemented by supplying a \code{BASEURL}
|
||||
to the application through an environment variable, so that it is able to set
|
||||
itself up correctly.
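A minimal sketch of this idiom in Python could look like the following.
The helper name \code{proxied\_url} is hypothetical and no particular web
framework is assumed; the only thing taken from the text is that the prefix
arrives in a \code{BASEURL} environment variable.

```python
import os


def proxied_url(path: str, base_url: str = '') -> str:
    """Prefix an application path with the URL it is proxied under."""
    # Fall back to the BASEURL environment variable when no prefix is passed
    base = base_url or os.environ.get('BASEURL', '')
    # Strip stray slashes so that the parts are joined by exactly one '/'
    parts = [part.strip('/') for part in (base, path)]
    return '/' + '/'.join(part for part in parts if part)
```

With \code{BASEURL=/yoururl}, a link to \code{/register} is rendered as
\code{/yoururl/register}, while running the application without the variable
leaves its URLs untouched.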
|
||||
@ -181,18 +215,18 @@ Normally when one uses the \code{COPY}%
|
||||
command to create a layer%
|
||||
\footnote{\href{https://docs.docker.com/storage/storagedriver/}
|
||||
{https://docs.docker.com/storage/storagedriver/}} in a Docker image,
|
||||
this action takes place when building that image (i.e.\ in the \emph{build context}
|
||||
this action takes place on building that image (i.e.\ in the \emph{build context}
|
||||
of that image).
|
||||
This is not good for this use case: when building the framework baseimage,
|
||||
these configuration files that will be written by content developers do not even
|
||||
exist.
|
||||
these configuration files that will be written by content developers using TFW in
|
||||
the future do not even exist yet.
|
||||
How could we copy files into an image layer that will be created in the future?
|
||||
|
||||
It is possible to use a command called \code{ONBUILD}%
|
||||
\footnote{\href{https://docs.docker.com/engine/reference/builder/\#onbuild}
|
||||
{https://docs.docker.com/engine/reference/builder/\#onbuild}}
|
||||
in the Dockerfile of a baseimage to delay another command
|
||||
to the point in time when other images will use the baseimage
|
||||
to the point in time where other images will use the baseimage
|
||||
as a parent with the \code{FROM} command. This makes it possible to execute
|
||||
commands in the build context of the descendant image.
|
||||
This is great, because the config files we need \emph{will} exist in the build
|
||||
@ -202,6 +236,12 @@ In practice this looks something like this in the baseimage \code{Dockerfile}:
|
||||
ONBUILD COPY ${BUILD_CONTEXT}/nginx/ ${TFW_NGINX_COMPONENTS}
|
||||
ONBUILD COPY ${BUILD_CONTEXT}/supervisor/ ${TFW_SUPERVISORD_COMPONENTS}
|
||||
\end{lstlisting}
|
||||
It is important to keep in mind, however, that the layers created by these
|
||||
\code{ONBUILD} commands will only be available \emph{after} the \code{FROM}
|
||||
command is executed when building the child image \emph{in the future}.
|
||||
This means that if you want to
|
||||
do something with these files in the baseimage build after they have
|
||||
been copied, those things must be done in \code{ONBUILD} commands as well.
|
||||
|
||||
\subsection{Source Directory}
|
||||
The \code{src} directory usually holds tutorial-specific code, such as
|
||||
@ -210,7 +250,8 @@ served by the exercise and generally anything that won't fit in the other,
|
||||
framework-specific directories.
|
||||
The use of this directory is not mandatory, only a good practice, as developers
|
||||
are free to implement the non-TFW parts of their exercises as they see fit
|
||||
(the copying of these files into image layers is their responsibility).
|
||||
(the copying of these files into image layers using \code{solvable/Dockerfile}
|
||||
is their responsibility as well).
|
||||
|
||||
\section{Configuring Built-in Components}
|
||||
|
||||
@ -257,6 +298,21 @@ YAML based state machine implementations also allow the usage of the Jinja2%
|
||||
templating language to substitute variables into the YAML file.
|
||||
These substitutions are really powerful, as one could even iterate through arrays
|
||||
or invoke functions that produce strings to be inserted using this method.
|
||||
This is very similar to how Ansible uses%
|
||||
\footnote{\href{https://docs.ansible.com/ansible/2.6/user_guide/playbooks_templating.html}
|
||||
{https://docs.ansible.com/ansible/2.6/user\_guide/playbooks\_templating.html}}
|
||||
Jinja2, and I was certainly inspired by this
|
||||
when coming up with this idea.
|
||||
For example, if we had an FSM with five states, we could use the following
|
||||
Jinja2 code to generate a transition called \code{step_next} between each state
|
||||
in a \code{for} loop:
|
||||
\begin{lstlisting}
|
||||
{% for i in range(5) %}
|
||||
- trigger: 'step_next'
|
||||
source: '{{i}}'
|
||||
dest: '{{i+1}}'
|
||||
{% endfor %}
|
||||
\end{lstlisting}
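To make it easier to see what the template loop above expands to, the same
list of transitions can be built in plain Python.
This is only an equivalent illustration (the function name
\code{step\_transitions} is made up), not how the framework renders templates.
Note that \code{range(5)} yields five transitions, which connect the states
\code{0} through \code{5}.

```python
def step_transitions(count: int) -> list:
    """Build the transitions the template loop emits for range(count)."""
    return [
        # Mirrors one rendered list item: trigger, source and dest as strings
        {'trigger': 'step_next', 'source': str(i), 'dest': str(i + 1)}
        for i in range(count)
    ]
```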
|
||||
|
||||
\subsection{Python based FSM}
|
||||
Optionally, the same state machine can be implemented like this in Python using
|
||||
@ -279,7 +335,7 @@ Python as well, it is going to be easier to interface with library code.
|
||||
In this section I am going to showcase how implementing event handlers is possible
|
||||
when using the framework.
|
||||
I am going to use the Python programming language, but it isn't hard
|
||||
to create event handlers in other languages, because the only thing
|
||||
to create event handlers in other languages, as the only thing
|
||||
they have to be capable of is communicating with the TFW server using
|
||||
ZeroMQ sockets, as previously discussed.
|
||||
The library provided by the framework abstracts low level socket logic
|
||||
@ -305,12 +361,11 @@ abstract method, which is used to, well, handle events.
|
||||
To make getting started as smooth as possible I have created
|
||||
a ``bootstrap'' script which is capable of creating a development environment from
|
||||
scratch.
|
||||
|
||||
This script is distributed as the following bash one-liner:
|
||||
\begin{lstlisting}[language=bash]
|
||||
bash -c "$(curl -fsSL https://git.io/vxBfj)"
|
||||
\end{lstlisting}
|
||||
This command downloads a script using \code{curl}%
|
||||
This command downloads the script using \code{curl}%
|
||||
\footnote{\href{https://curl.haxx.se}{https://curl.haxx.se}}, then executes it in bash.
|
||||
In the open source community it is quite common to distribute installers this way%
|
||||
\footnote{A good example of this is oh-my-zsh
|
||||
@ -340,7 +395,7 @@ then pipes it into a bash interpreter \emph{only if} the checksum
|
||||
of the downloaded string matches the one provided, otherwise it displays
|
||||
an error message.
|
||||
Software projects distributing their product as binary installers often
|
||||
display such checksums on their download pages with the purpose to potentially
|
||||
display such checksums on their download pages with the purpose of potentially
|
||||
mitigating MITM attacks.
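The check itself is simple, as the following Python sketch shows.
It assumes SHA-256 as the hash function, which is an assumption made for this
example, and the function name \code{checksum\_matches} is hypothetical.

```python
import hashlib
import hmac


def checksum_matches(payload: bytes, expected_hex: str) -> bool:
    """Return True only if the SHA-256 digest of payload equals expected_hex."""
    actual = hashlib.sha256(payload).hexdigest()
    # Normalise the published value, then compare in constant time
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A downloaded script would only be handed to an interpreter if this function
returns \code{True} for the checksum published by its authors.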
|
||||
|
||||
The bootstrap script clones the three TFW repositories and does several steps
|
||||
@ -348,7 +403,7 @@ to create a working environment into a single directory, that is based on
|
||||
test-tutorial-framework:
|
||||
\begin{itemize}
|
||||
\item It builds the newest version of the TFW baseimage locally
|
||||
\item It pins the version tag in \code{solvable/Dockerfile},
|
||||
\item It pins the version tag of this image in \code{solvable/Dockerfile},
|
||||
so that this newly-built version will be used by the tutorial
|
||||
\item It places the latest frontend in \code{solvable/frontend} with
|
||||
dependencies installed
|
||||
@ -376,15 +431,15 @@ Why is this the case?
|
||||
|
||||
\subsection{The Frontend Issue}
|
||||
|
||||
To be able to understand this, we will have to gain some understanding of the
|
||||
build process of Angular projects.
|
||||
To be able to understand this, we will have to gain some understanding of how the
|
||||
build process of Angular projects works.
|
||||
|
||||
When frontend developers work on Angular projects, they usually use the built-in
|
||||
developer tool of the Angular-CLI%
|
||||
\footnote{\href{https://cli.angular.io}{https://cli.angular.io}},
|
||||
\code{ng serve} to build and serve their application.
|
||||
\code{ng serve} to build and serve their applications.
|
||||
The advantage of this tool is that it automatically reloads the frontend
|
||||
when the code on disk is changed, and that it is generally very easy to work with.
|
||||
when the code on the disk is changed, and that it is generally very easy to work with.
|
||||
On the other hand, a disadvantage is that a \code{node_modules} directory
|
||||
containing all the npm%
|
||||
\footnote{\href{https://www.npmjs.com}{https://www.npmjs.com}}
|
||||
@ -405,13 +460,14 @@ This is why today frontend builds usually take a lot longer than building anythi
|
||||
not involving JavaScript (such as C++, C\# or any other compiled programming language).
|
||||
|
||||
This mess presents its own challenges for the Tutorial Framework as well.
|
||||
Since hundreds of megabytes of dependencies have no place inside Docker containers%
|
||||
\footnote{Otherwise it may take tens of seconds just to send the build context to
|
||||
Since hundreds of megabytes of npm dependencies have no place inside Docker images%
|
||||
\footnote{Or it may take tens of seconds just to send the build context to
|
||||
the Docker daemon, which means waiting even before the build begins},
|
||||
by default the framework will only place the results of a frontend production build
|
||||
by default the framework will only copy the results of a frontend production build
|
||||
of \code{solvable/frontend} into the image layers.
|
||||
This slows down the build time of TFW based challenges so much that instead of about
|
||||
30 seconds, they will often take 5 to 10 minutes.
|
||||
30 seconds, they could often take 5 to 10 minutes depending on what hardware
|
||||
you use.
|
||||
|
||||
\subsection{The Solution Offered by the Framework}
|
||||
|
||||
@ -426,7 +482,7 @@ while the \code{tfw.sh} script is capable of starting a development server
|
||||
to serve the frontend locally using \code{ng serve} besides starting
|
||||
the Docker container without the frontend.
|
||||
If this whole thing wasn't complicated enough, since Docker binds the port
|
||||
the container is going to use, \code{tfw.sh} has to run this dev server on
|
||||
the container is going to use, \code{tfw.sh} has to run the Angular dev server on
|
||||
another port, then use the proxying features of Angular-CLI to forward requests
|
||||
from this port to the running Docker container when requesting resources
|
||||
other than the entrypoint to the Angular application.
|
||||
|
BIN
figures/tfw_grid.png
Normal file
Binary file not shown.