Continue writing thesis with focus on architecture
This commit is contained in:
parent
65e6426fdc
commit
1ef4feb146
@ -80,7 +80,42 @@
|
||||
title={Education as a key factor in the process of building cybersecurity},
|
||||
url={https://2017.cybersecforum.eu/files/2016/12/ecj_vol2_issue1_i.albrycht_education_as_a_key_in_the_process_of_building_cybersecurity.pdf},
|
||||
language={english},
|
||||
author={IZABELA ALBRYCHT},
|
||||
author={Izabela Albrycht},
|
||||
year={2016},
|
||||
}
|
||||
|
||||
@online{EBayGit,
|
||||
title={Pwning eBay - How I Dumped eBay Japan's Website Source Code},
|
||||
url={https://slashcrypto.org/2018/11/28/eBay-source-code-leak/},
|
||||
language={english},
|
||||
author={David Wind},
|
||||
year={2018},
|
||||
month=nov,
|
||||
}
|
||||
|
||||
@online{CloudFlareLeak,
|
||||
title={Incident report on memory leak caused by Cloudflare parser bug},
|
||||
url={https://blog.cloudflare.com/incident-report-on-memory-leak-caused-by-cloudflare-parser-bug/},
|
||||
language={english},
|
||||
author={John Graham-Cumming},
|
||||
year={2017},
|
||||
month=feb,
|
||||
}
|
||||
|
||||
@online{NoPerfectSecurity,
|
||||
title={The Illusion Of Perfect Cybersecurity},
|
||||
url={https://www.forbes.com/sites/forbestechcouncil/2018/03/27/the-illusion-of-perfect-cybersecurity/},
|
||||
language={english},
|
||||
author={George Finney},
|
||||
year={2018},
|
||||
month=mar,
|
||||
}
|
||||
|
||||
@online{JavaScript,
|
||||
title={JavaScript is a Dysfunctional Programming Language},
|
||||
url={https://medium.com/javascript-non-grata/javascript-is-a-dysfunctional-programming-language-a1f4866e186f},
|
||||
language={english},
|
||||
author={Richard Kenneth Eng},
|
||||
year={2016},
|
||||
month=mar,
|
||||
}
|
||||
|
157
content/architecture.tex
Normal file
@ -0,0 +1,157 @@
|
||||
\chapter{Framework Architecture}
|
||||
\section{Core Technology}
|
||||
|
||||
The Tutorial Framework is currently implemented as two Docker images:
\begin{itemize}
\item the \texttt{solvable} image runs the framework and the client
code depending on it
\item the \texttt{controller} image is responsible for solution checking (deciding
whether the user has completed the tutorial or not)
\end{itemize}
Most of this chapter discusses the \texttt{solvable} Docker image,
with the exception of section \ref{solutioncheck}, where I dive into how the
\texttt{controller} image is implemented.
|
||||
|
||||
The most important feature of the framework is its messaging system.
What we need is a system in which processes running inside a Docker container
can communicate with each other.
This part is easy, with many possible solutions (named pipes, sockets or shared memory, to name a few).
The hard part is that frontend components running inside a web browser -- which could
potentially be on the other side of the planet -- also need to take part in this communication.
So what we need to create is a hybrid between an IPC system and something
that can communicate with JavaScript running in a browser connected to it.
The solution the framework uses is a proxy server, which connects to frontend components
on one side and handles interprocess communication on the other side.
This way the server is capable of proxying messages between the two sides, enabling
communication between them.
Notice that what we end up with is essentially an IPC system in which a web application
can ``act as if'' it were running on the backend: it can easily
communicate with processes on the backend, while in reality the web application
runs in the browser of the user, on a completely different machine.
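
To make the proxying idea more tangible, here is a minimal in-memory sketch in Python.
It is not the actual TFW server: the class \texttt{MessageProxy}, its method names and the
\texttt{key} field used for routing are assumptions made purely for illustration, and plain
callbacks stand in for the WebSocket and IPC connections.

```python
import json
from typing import Callable, Dict, List

Message = Dict[str, str]
Callback = Callable[[Message], None]


class MessageProxy:
    """Hypothetical stand-in for the proxy server, not the real TFW API."""

    def __init__(self) -> None:
        self.frontends: List[Callback] = []            # browser side
        self.handlers: Dict[str, List[Callback]] = {}  # process side

    def connect_frontend(self, send: Callback) -> None:
        """Register a callback that delivers messages to one browser."""
        self.frontends.append(send)

    def subscribe(self, key: str, handle: Callback) -> None:
        """Register a backend process interested in messages with this key."""
        self.handlers.setdefault(key, []).append(handle)

    def from_frontend(self, raw: str) -> None:
        """Forward a JSON message from a browser to interested processes."""
        message = json.loads(raw)
        for handle in self.handlers.get(message["key"], []):
            handle(message)

    def from_handler(self, message: Message) -> None:
        """Forward a message from a process to every connected browser."""
        for send in self.frontends:
            send(message)


proxy = MessageProxy()
received: List[Message] = []
proxy.connect_frontend(received.append)
# A pretend backend process that answers "shell" messages through the proxy.
proxy.subscribe(
    "shell",
    lambda m: proxy.from_handler({"key": "shell", "data": m["data"].upper()}),
)
proxy.from_frontend('{"key": "shell", "data": "ls"}')
```

Neither side knows about the other directly; both only talk to the proxy, which is
what lets the browser behave as if it were just another process on the backend.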
|
||||
|
||||
\begin{note}
|
||||
The core idea and initial implementation of this server come from Bálint Bokros.
I later redesigned and fully rewrote it to allow for greater flexibility
(such as connecting to more than a single browser at a time, different messaging modes,
message authentication, restoration of frontend state, a complete overhaul of the
state tracking system and the possibility of solution checking, among other things).
If you are interested in the differences between the original POC implementation
(which is out of scope for this thesis due to length constraints) and the current
framework, please consult Bálint's excellent paper and Bachelor's Thesis on it\cite{BokaThesis}.
|
||||
\end{note}
|
||||
|
||||
Now let us take a closer look:
|
||||
|
||||
\subsection{Connecting to the Frontend}
|
||||
|
||||
The old way of creating dynamic webpages was AJAX polling, which means sending
HTTP requests to a server at regular intervals from JavaScript to update the contents
of your website (and as such paying for a full HTTP request-response cycle, and often
a new TCP handshake, on each update).
This was superseded by WebSockets around 2011, which provide a full-duplex
communication channel over TCP between your browser and the server.
This is done by initiating a protocol handshake using the \texttt{Connection: Upgrade}
and \texttt{Upgrade: websocket} HTTP headers, which establishes a persistent socket
connection between the browser and the server.
This allows for communication with lower overhead and latency, facilitating efficient
real-time applications.
|
||||
|
||||
The Tutorial Framework uses WebSockets to connect to its web frontend.
The framework proxy server is capable of connecting to an arbitrary number of WebSockets,
which allows opening different components in separate browser windows and tabs, or even
in different browsers at once (such as opening a terminal in Chrome and an IDE in Firefox).
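
As an illustration of the upgrade handshake mentioned above, the following Python sketch
computes the \texttt{Sec-WebSocket-Accept} value a server has to send back, as specified
by RFC 6455. The function names are my own and the sketch only covers the handshake
response, not frame parsing; in practice a WebSocket library does all of this for you.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key: str) -> str:
    """Derive Sec-WebSocket-Accept from the client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def handshake_response(sec_websocket_key: str) -> str:
    """Build the HTTP response that switches the connection to WebSocket."""
    return (
        "HTTP/1.1 101 Switching Protocols\r\n"
        "Upgrade: websocket\r\n"
        "Connection: Upgrade\r\n"
        f"Sec-WebSocket-Accept: {websocket_accept(sec_websocket_key)}\r\n"
        "\r\n"
    )
```

After this response the same TCP connection stays open and both sides may send
frames at any time, which is what removes the per-update overhead of polling.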
|
||||
|
||||
\subsection{Interprocess Communication}
|
||||
|
||||
To handle communication with processes running inside the container TFW utilizes
|
||||
the asynchronous distributed messaging library ZeroMQ%
|
||||
\footnote{\href{http://zeromq.org}{http://zeromq.org}}, or ZMQ for short.
|
||||
The rationale behind this is that unlike other messaging systems such as
|
||||
RabbitMQ%
|
||||
\footnote{\href{https://www.rabbitmq.com}{https://www.rabbitmq.com}} or Redis%
|
||||
\footnote{\href{https://redis.io}{https://redis.io}},
|
||||
ZMQ does not require a daemon (message broker process) and as such
|
||||
has a much lower memory footprint while still providing various messaging
|
||||
patterns and bindings for almost any widely used programming language.
|
||||
Another -- as yet unutilized -- capability of this solution is that since ZMQ is capable
|
||||
of using simple TCP sockets, we could even communicate with processes running on remote
|
||||
hosts using the framework.
|
||||
|
||||
There are various lower level and higher level alternatives to ZMQ for IPC,
which were also considered during the design process of the framework.
A few examples of top contenders and the reasons for not using them in the end:
|
||||
\begin{itemize}
|
||||
\item Handling raw TCP sockets would involve lots of boilerplate logic that
already has quality implementations in messaging libraries: e.g. making sure that
all bytes are sent or received requires checking the return values of the
libc \texttt{send()} and \texttt{recv()} functions, while ZMQ takes care of this
extra logic and even provides higher level messaging patterns such as
publish-subscribe, which would otherwise need to be implemented on top of raw sockets.
|
||||
\item Using something like gRPC%
\footnote{\href{https://grpc.io}{https://grpc.io}} or plain HTTP (both of which
are considered to be higher level than ZMQ sockets) would require
all processes taking part in the communication to be HTTP servers themselves,
which would make the framework less lightweight and flexible:
socket communication with or without ZMQ does not
force you to write synchronous or asynchronous code, whereas common HTTP servers
are either async or pre-fork in nature, which imposes certain design choices on code
built on them.
|
||||
\end{itemize}
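
The kind of boilerplate meant in the first item can be sketched in Python as follows.
This is an illustration only and not code from the framework: the helper names
\texttt{send\_all} and \texttt{recv\_exactly} are hypothetical, and Python's own
\texttt{socket.sendall()} already covers the sending half.

```python
import socket


def send_all(sock: socket.socket, data: bytes) -> None:
    """Keep calling send() until every byte has been handed to the kernel."""
    view = memoryview(data)
    while len(view) > 0:
        sent = sock.send(view)  # may accept fewer bytes than requested
        view = view[sent:]


def recv_exactly(sock: socket.socket, size: int) -> bytes:
    """Keep calling recv() until exactly `size` bytes have arrived."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = sock.recv(remaining)  # may return fewer bytes than requested
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# A connected pair of sockets stands in for two communicating processes.
left, right = socket.socketpair()
send_all(left, b"hello from the other process")
payload = recv_exactly(right, len(b"hello from the other process"))
left.close()
right.close()
```

Note that the receiver has to know the message length in advance, so a real protocol
would also need framing on top of this, which is one more thing a messaging library
such as ZMQ provides out of the box.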
|
||||
|
||||
\section{High Level Overview}
|
||||
|
||||
Now that we are familiar with the technological basis of the framework, we can
discuss it in more detail.
|
||||
|
||||
\pic{figures/tfw_architecture.png}{An overview of the Tutorial Framework}
|
||||
|
||||
Architecturally TFW consists of four main components:
|
||||
\begin{itemize}
|
||||
\item \textbf{Event handlers}: processes running in a Docker container
|
||||
\item \textbf{Frontend}: web application running in the browser of the user
|
||||
\item \textbf{TFW (proxy) server}: responsible for message routing/proxying
|
||||
between the frontend and event handlers
|
||||
\item \textbf{TFW FSM}: a finite state machine responsible for tracking user progress,
|
||||
that is implemented as an event handler called \texttt{FSMManagingEventHandler}
|
||||
\end{itemize}
|
||||
Keep in mind that, as mentioned previously,
the TFW server and event handlers reside in the \texttt{solvable} Docker container.
|
||||
They all run in separate processes and only communicate using ZeroMQ sockets.
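
To give a rough idea of what tracking user progress with a finite state machine means,
consider the following Python sketch. It is not the implementation of
\texttt{FSMManagingEventHandler}: the class \texttt{ProgressFSM}, the state names and the
trigger names are all made up for this example.

```python
from typing import Dict, Tuple


class ProgressFSM:
    """Hypothetical progress tracker, for illustration only."""

    def __init__(self, initial: str, transitions: Dict[Tuple[str, str], str]) -> None:
        self.state = initial
        self.transitions = transitions  # maps (state, trigger) to the next state

    def step(self, trigger: str) -> bool:
        """Apply a trigger; return False if it is not allowed in this state."""
        next_state = self.transitions.get((self.state, trigger))
        if next_state is None:
            return False
        self.state = next_state
        return True


fsm = ProgressFSM(
    "start",
    {
        ("start", "file_encrypted"): "encrypted",
        ("encrypted", "config_written"): "done",
    },
)
skipped_ahead = fsm.step("config_written")  # not allowed yet, state is unchanged
first_step = fsm.step("file_encrypted")
second_step = fsm.step("config_written")
```

Triggers that do not belong to the current state are simply rejected, so the
user cannot skip ahead in the tutorial.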
|
||||
|
||||
In the following sections I am going to explain each of the main components in
|
||||
greater detail, as well as how they interact with each other,
|
||||
their respective responsibilities,
|
||||
some of the design choices behind them and more.
|
||||
|
||||
\subsection{Frontend}
|
||||
|
||||
This is a web application that runs in the browser of the user and uses
|
||||
multiple WebSocket connections to connect to the TFW server.
|
||||
Due to rapidly increasing complexity the original implementation (written in
|
||||
plain JavaScript with jQuery%
|
||||
\footnote{\href{https://jquery.com}{https://jquery.com}} and Bootstrap%
|
||||
\footnote{\href{https://getbootstrap.com}{https://getbootstrap.com}}) was becoming
|
||||
unmaintainable and the usage of some frontend framework became justified.
|
||||
|
||||
Several choices were considered, with the main contenders being:
|
||||
\begin{itemize}
|
||||
\item Angular\footnote{\href{https://angular.io}{https://angular.io}}
|
||||
\item React\footnote{\href{https://reactjs.org}{https://reactjs.org}}
|
||||
\item Vue.js\footnote{\href{https://vuejs.org}{https://vuejs.org}}
|
||||
\end{itemize}
|
||||
After comparing the above frameworks we decided to work with Angular for
several reasons.
One is that Angular is essentially a complete platform that is very well
suited for building complex architecture into a single page application.
Another is that the frontend of the Avatao platform is also written
in Angular (bonus points for experienced team members in the company).
A further advantage is that Angular forces you to use TypeScript%
\footnote{\href{https://www.typescriptlang.org}{https://www.typescriptlang.org}},
which tries to remedy the issues\cite{JavaScript}
with JavaScript by being a language that transpiles to JavaScript while
strongly encouraging things like static typing and object-oriented principles.
|
||||
|
||||
\subsection{Messaging}
|
||||
\subsection{TFW Finite State Machine}
|
||||
\subsection{Solution Checking}\label{solutioncheck}
|
@ -21,9 +21,16 @@ a new age of digital wild west, which could involve us running around in vulnera
|
||||
driving cars\cite{SelfDriving} with power over life and death, while exposing all our
|
||||
sensitive data through our ill-protected smart phones\cite{Android} and IoT devices\cite{IoTDDoS}.
|
||||
What a time to be alive.
|
||||
Unless we want to disconnect all our devices from all networks and ban USB sticks, the best
|
||||
lines of defense are going to be people -- a new generation of \emph{security conscious}
|
||||
users and developers.
|
||||
It is important to stress that IT security is something that is \emph{really hard} to
get right,
even if ``right'' often only means better than your neighbour, as perfect security is a utopia
that doesn't seem to exist\cite{NoPerfectSecurity}.
|
||||
People often only start to grasp how difficult it actually is when large and
reputable companies in the industry such as
Cloudflare\cite{CloudFlareLeak} or eBay\cite{EBayGit} fail to get it right.
|
||||
This is why unless we want to disconnect all our devices from all networks and ban USB
|
||||
sticks, the best lines of defense are going to be people -- a new generation
|
||||
of \emph{security conscious} users and developers.
|
||||
|
||||
Among many other things outside IT, this is only possible with education\cite{ITSecEdu}.
|
||||
We need to come up with engaging, addictive and fun ways to learn (and teach), so that
|
||||
@ -35,10 +42,10 @@ The only thing we can hope and work for is to become better and better as time
|
||||
and generations pass.
|
||||
We \emph{must} do better, and education is the way forward.
|
||||
|
||||
The short term goal of this project -- and thesis -- is to provide a new angle
|
||||
in the education of software engineering, especially secure software engineering
|
||||
based on the aspirations above, with the long term goal of bringing something new
|
||||
to the table in the matter of IT education as a whole
|
||||
The short term goal of this project -- and the goal of this thesis -- is to provide
|
||||
a new angle in the education of software engineering, especially secure software
|
||||
engineering based on the aspirations above, with the long term goal of bringing
|
||||
something new to the table in the matter of IT education as a whole
|
||||
(not just developers, but users as well).
|
||||
|
||||
\section{A Short Introduction to Avatao}
|
||||
@ -46,7 +53,7 @@ to the table in the matter of IT education as a whole
|
||||
The goal of Avatao as a company is to help software developers in building a \emph{culture} of
|
||||
security amongst themselves, with the vision that if the world is going to be taken over by
|
||||
software no matter what, that software might as well be \emph{secure software}.
|
||||
To achieve this goal we have been working on an online e-learning platform with hundreds\
|
||||
To achieve this goal we have been working on an online e-learning platform with hundreds%
|
||||
\footnote{654 exercises as of today, to be exact}
|
||||
of hands-on learning exercises to help students and professionals
|
||||
master IT security, collaborating with
|
||||
@ -69,6 +76,8 @@ added authenticity and relevance \cite{AkosFacebook}.
|
||||
Our challenges usually involve some sort of website acting as frontend for the vulnerable
|
||||
application, or require the user to connect using SSH.
|
||||
|
||||
\pic{figures/avatao_challenge.png}{An offensive challenge on the Avatao platform}
|
||||
|
||||
The Avatao platform relies heavily on Docker containers to spawn challenges,
|
||||
which makes it extremely flexible in terms of what is possible to do when creating
|
||||
content.
|
||||
@ -87,7 +96,7 @@ things like exercises involving the use of Docker or Windows based challenges.
|
||||
\section{Emergence}
|
||||
|
||||
While working as a content creator I have stumbled into the idea of automating the completion
|
||||
of challenges for QA\footnote{Quality Assurance} and demo purposes\
|
||||
of challenges for QA\footnote{Quality Assurance} and demo purposes%
|
||||
\footnote{I used to record short videos or GIFs to showcase my content to management}.
|
||||
In a certain scenario I was required to integrate a web based terminal emulator in a
|
||||
frontend application to improve user experience by making it possible to use a shell
|
||||
@ -96,18 +105,19 @@ After I got this working I was looking into writing hacky bash scripts to automa
|
||||
required to complete the challenge in order to make it easier for me to record the solution,
|
||||
as I have often found myself recording over and over again for a demo without any mistakes.
|
||||
While I was playing around with this idea, researching possible solutions led me
|
||||
to a hidden gem of a project on GitHub called \texttt{demo-magic}\
|
||||
to a hidden gem of a project on GitHub called \texttt{demo-magic}%
|
||||
\footnote{\href{https://github.com/paxtonhare/demo-magic}{https://github.com/paxtonhare/demo-magic}},
|
||||
which is essentially a bash script that simulates someone typing into a terminal and executing
|
||||
commands.
|
||||
I have created a fork\
|
||||
\footnote{The source code is available at
|
||||
\href{https://git.strongds.hu/mrtoth/demo.sh/src/master/demo.sh}{https://git.strongds.hu/mrtoth/demo.sh/src/master/demo.sh}}
|
||||
I have created a fork%
|
||||
\footnote{
|
||||
\href{https://git.strongds.hu/mrtoth/demo.sh/src/master/demo.sh}
|
||||
{https://git.strongds.hu/mrtoth/demo.sh/src/master/demo.sh}}
|
||||
of the project and integrated it into my challenge.
|
||||
Soon after recording demo videos was not even necessary anymore, as I have started to distribute
|
||||
the solution script with the challenge code itself, making it toggleable using build-time
|
||||
variables.
|
||||
Should the solution script be enabled, the challenge would automatically start\
|
||||
Should the solution script be enabled, the challenge would automatically start%
|
||||
\footnote{I did this by injecting the solution script into the user's \texttt{.bashrc} file}
|
||||
completing itself in the terminal integrated into its frontend, often even explaining the
|
||||
commands executed during the solution process.
|
||||
@ -123,7 +133,7 @@ but what I did not know was that I have accidentally
|
||||
done something far more than a hacky bash script solving challenges, as this little script
|
||||
would help formulate the idea of the project \emph{Tutorial Framework} or just \emph{TFW}.
|
||||
|
||||
\section{Introducing the Tutorial Framework}
|
||||
\section{Vision of the Tutorial Framework}
|
||||
|
||||
The whole ``challenges that solve themselves'' thing seemed like an idea that has great
|
||||
potential if developed further.
|
||||
@ -141,7 +151,7 @@ your newfound skills in action immediately.
|
||||
|
||||
For example a chatbot would show you how to encrypt a file using GnuPG,
|
||||
then it would ask you to encrypt another file similarly.
|
||||
After this the bot could show you how to configure a database server and then
|
||||
After this the bot could teach you how to configure a database server and then
|
||||
ask you to write a configuration file yourself and then encrypt it because it might
|
||||
contain sensitive data such as open ports, usernames and such.
|
||||
|
||||
@ -157,6 +167,28 @@ a web based frontend with a file editor, terminal, chat window and stuff like th
|
||||
Turns out that today all this can be done by writing a few hundred lines of Python
|
||||
code which uses the Tutorial Framework.
|
||||
|
||||
\subsection{Project Requirements}\label{requirements}
|
||||
|
||||
Based on this it is now more or less possible to define requirements for the project.
|
||||
The reason for the ``more or less'' part is that all of this is pretty much bleeding edge,
|
||||
where the requirements could shift dynamically with time.
|
||||
For this reason I am going to be as general as possible, to the point that some of
|
||||
this might even sound vague.
|
||||
To achieve our goals we would need:
|
||||
|
||||
\begin{itemize}
|
||||
\item a way to keep track of user progress
|
||||
\item a way to handle various events (e.g. reacting when
|
||||
the user has edited a file, or has executed a command in the terminal)
|
||||
\item a highly flexible messaging system, in which processes and
|
||||
frontend components (running in a web browser) could communicate with each other
|
||||
\item a web based frontend with lots of built-in options (terminal, file editor, chat
|
||||
window, etc.) that use said messaging system
|
||||
\item stable APIs that can be exposed to content creators to work with (so that
|
||||
framework updates won't break client code)
|
||||
\item tooling for development (distributing, building and running)
|
||||
\end{itemize}
|
||||
|
||||
\section{Early Development}
|
||||
|
||||
Around a year ago a good friend and colleague of mine Bálint Bokros, the CTO of our company
|
||||
@ -174,9 +206,27 @@ Bachelor's Thesis\cite{BokaThesis}.
|
||||
Although not much of the original code base has remained due to intense refactoring
|
||||
and all around changes, the result would serve as a solid foundation for further development,
|
||||
and the architecture is mostly the same to this day.
|
||||
The resulting code would be the first working POC\
|
||||
The resulting code would be the first working POC%
|
||||
\footnote{Proof of Concept} of the framework showcasing the fixing of an SQL Injection
|
||||
attack.
|
||||
This initial version included the foundations of the framework:
|
||||
a working messaging system, event handling and state tracking.
|
||||
These provided a great basis
|
||||
despite the fact that the core codebase of the framework was almost
|
||||
completely rewritten due to an increased focus on code quality,
|
||||
extensibility and API stability required by new features.
|
||||
|
||||
It is worth noting that I had good reason to keep the project requirements
general on purpose (\ref{requirements}).
Looking at the requirements in Bálint's Thesis, many of them
are completely obsolete by now.
|
||||
But since the project has followed Agile Methodology%
|
||||
\footnote{Manifesto for Agile Software Development:
|
||||
\href{https://agilemanifesto.org}{https://agilemanifesto.org}}
|
||||
from the start, we were able to adapt to these changes without losing
|
||||
the progress he made in said Thesis. Quoting from the Agile Manifesto:
|
||||
``Responding to change over following a plan''.
|
||||
This is a really important takeaway.
|
||||
|
||||
After becoming a full time employee at Avatao I was tasked with developing the project
|
||||
with Bálint, who was later reassigned to work on the GDPR compliance of the platform.
|
||||
|
BIN
figures/avatao_challenge.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 79 KiB |
BIN
figures/tfw_architecture.png
Normal file
Binary file not shown.
After Width: | Height: | Size: 46 KiB |
@ -10,7 +10,8 @@
|
||||
sectsty,
|
||||
xcolor,
|
||||
microtype,
|
||||
tabto
|
||||
tabto,
|
||||
amsthm
|
||||
}
|
||||
\RequirePackage[bottom,hang,flushmargin]{footmisc}
|
||||
|
||||
@ -18,6 +19,8 @@
|
||||
\sethlcolor{andigray}
|
||||
\newcommand{\code}[1]{\hl{\mbox{#1}}}
|
||||
|
||||
\newtheorem*{note}{Note}
|
||||
|
||||
\newcommand{\pic}[3][width=\textwidth]
|
||||
{
|
||||
\begin{figure}[H]
|
||||
|
@ -41,7 +41,9 @@
|
||||
\include{content/declaration}
|
||||
\include{content/abstract}
|
||||
\include{content/introduction}
|
||||
\include{content/architecture}
|
||||
|
||||
\listoffigures
|
||||
\lstlistoflistings
|
||||
|
||||
\renewcommand\bibname{References}
|
||||
|