\chapter{Introduction}

\section{Project Justification}
As the world is being completely engulfed by software, the need for accessible but
high quality learning materials on software engineering, and especially secure software
engineering, is on the rise.
While we are enjoying the comfort that information technology provides us, we often forget
about the risks involved in relying so much on software in our everyday lives.
When taking a look at recent events, such as the cyber arms race taking place between leading
powers, 50 million Facebook accounts being breached
due to the incorrect handling of access tokens\cite{FacebookBreach},
or how China is building an Orwellian state of total digital surveillance%
\cite{ChinaSurv}\cite{ChinaCredit},
it becomes clear that security and privacy in the IT sector
are more important now than ever.

With all of our data slowly crawling towards the cloud and an IoT revolution breathing down
our necks, we as an industry must face the music and start actually doing something before
we enter a new age of digital wild west, which could involve us running around in vulnerable
self-driving cars\cite{SelfDriving} with power over life and death, while exposing all our
sensitive data through our ill-protected smartphones\cite{Android} and IoT devices\cite{IoTDDoS}.
What a time to be alive.
It is important to express that IT security is something that is \emph{really hard} to
get right,
even if ``right'' often only means better than your neighbour, as perfect security is a utopia
that doesn't seem to exist\cite{NoPerfectSecurity}.
It is often only when large and reputable companies in the industry such as
CloudFlare\cite{CloudFlareLeak} or eBay\cite{EBayGit} fail to get it right
that people start to grasp how difficult it actually is.
This is why, unless we want to disconnect all our devices from all networks and ban USB
sticks, the best line of defense is going to be people: a new generation
of \emph{security conscious} users and developers.

As with many things outside IT, this is only possible through education\cite{ITSecEdu}.
We need to come up with engaging, addictive and fun ways to learn (and teach), so that
more and more people will be motivated to do so and the drive to acquire and share
knowledge is something that comes naturally, rather than something we have to struggle for.
I believe that this is something that \emph{can} and \emph{should} be applied to
everything we do as a society.
The only thing we can hope and work for is to become better and better as time
and generations pass.
We \emph{must} do better, and education is the way forward.

The short term goal of this project, and the goal of this thesis, is to provide
a new angle on the education of software engineering, especially secure software
engineering, based on the aspirations above, with the long term goal of bringing
something new to the table in IT education as a whole
(not just for developers, but for users as well).

\section{A Short Introduction to Avatao}
The goal of Avatao as a company is to help software developers build a \emph{culture} of
security amongst themselves, with the vision that if the world is going to be taken over by
software no matter what, that software might as well be \emph{secure software}.
To achieve this goal we have been working on an e-learning platform with hundreds%
\footnote{654 exercises at the time of writing, to be exact}
of hands-on learning exercises to help students and professionals
master IT security, collaborating with
universities around the world and providing a solution for companies to build
\emph{security consciousness} amongst their developer teams.

Since starting out we have amassed some experience in building fun challenges
that showcase the exploitation and fixing of relevant security vulnerabilities in code or
configuration.
Traditionally these exercises revolved around offensive and defensive tasks, with challenges
often being split into two or more parts.
For example, users would have to hack a website by exploiting a buffer overflow vulnerability,
then in the second challenge they would fix the code they have just exploited in a web based
code editor.
These kinds of exercises offer great flexibility to reflect real world security issues, as in
more complex challenges users might be required to exploit multiple vulnerabilities to succeed,
and to understand the ways they augment each other.
We often recreate real world scenarios based on incident reports released by companies for
added authenticity and relevance\cite{AkosFacebook}.
Our challenges usually involve some sort of website acting as a frontend for the vulnerable
application, or require the user to connect using SSH\@.

\pic{figures/avatao_challenge.png}{An offensive challenge on the Avatao platform}

The Avatao platform relies heavily on Docker containers to spawn challenges,
which makes it extremely flexible in terms of what is possible when creating
content.
Essentially anything that you can do inside a Docker container can be done on
the Avatao platform as well.
Currently each challenge is implemented as a set of Docker images residing inside a
Git repository exclusive to that specific challenge.
Our content creation workflow enables developers to create such repositories on GitHub,
which are automatically set up with the proper webhooks, so that when their content gets
reviewed (and their feature branches merged), their changes go live on the
platform as well.
In the future we also plan on supporting the use of virtual machines to implement
challenges, which could further increase this flexibility by adding the possibility of
exercises that involve using Docker itself, or of Windows based challenges.

\section{Emergence}\label{intro:emergence}
While working as a content creator I stumbled upon the idea of automating the completion
of challenges for QA\footnote{Quality Assurance} and demo purposes%
\footnote{I used to record short videos or GIFs to showcase my content to management}.
In one scenario I was required to integrate a web based terminal emulator into a
frontend application to improve user experience by making it possible to use a shell
right on the website rather than having to connect through SSH\@.
After I got this working I looked into writing hacky bash scripts to automate the steps
required to complete the challenge in order to make it easier for me to record the solution,
as I had often found myself recording over and over again to get a demo without any mistakes.
While I was playing around with this idea, researching possible solutions led me
to a hidden gem of a project on GitHub called \code{demo-magic}%
\footnote{\href{https://github.com/paxtonhare/demo-magic}{https://github.com/paxtonhare/demo-magic}},
which is essentially a bash script that simulates someone typing into a terminal and executing
commands.
I created a fork%
\footnote{%
\href{https://git.strongds.hu/mrtoth/demo.sh/src/master/demo.sh}
{https://git.strongds.hu/mrtoth/demo.sh/src/master/demo.sh}}
of the project and integrated it into my challenge.
Soon after, recording demo videos was not even necessary anymore, as I had started to distribute
the solution script with the challenge code itself, making it toggleable using build-time
variables.
If the solution script was enabled, the challenge would automatically start%
\footnote{I did this by injecting the solution script into the user's \code{.bashrc} file}
completing itself in the terminal integrated into its frontend, often even explaining the
commands executed during the solution process.

\lstinputlisting[
language=bash,
caption={Example for a solution script},
captionpos=b
]{listings/demosh.example}

I was quite pleased with myself, no longer having to do the busywork of recording videos,
but what I did not know was that I had accidentally
done something far more than write a hacky bash script solving challenges, as this little script
would help formulate the idea of the project \emph{Tutorial Framework}, or just \emph{TFW}.

\section{Vision of the Tutorial Framework}
The whole ``challenges that solve themselves'' thing seemed like an idea with great
potential if developed further.
We envisioned something that resembles a learning video, except that it is real, actual
software running and interacting with itself to showcase different topics to the user:
something that would allow users to stop at any given time, take a breath, interact
with the environment on their own (e.g.\ take a look at the directory structure or a file,
try what happens if a command is executed somewhat differently, etc.) and then
continue with the tutorial.
We wanted to create something that would feel as if an actual teacher were standing
next to you, explaining topics to you at your own pace, while showing you how to solve
a related task.
This teacher scenario would allow you to take the helm sometimes and try applying
your newfound skills immediately.

For example, a chatbot would show you how to encrypt a file using GnuPG,
then it would ask you to encrypt another file similarly.
After this the bot could teach you how to configure a database server and then
ask you to write a configuration file yourself and then encrypt it, because it might
contain sensitive data such as open ports, usernames and such.

Technically, however, this is far from trivial: we would have to keep track of the user's
progress at all times, and be able to actually check whether the user has successfully encrypted
the file by decrypting it and then checking whether the configuration file is valid
(this would practically require trying to start a database server with it).
After all this we would still have to offer \emph{relevant} and helpful assistance if
something went wrong.

Another scenario we envisioned was the following: imagine a code editor on the
right which contains the authentication logic of a website.
On the left, imagine the website that the code in the editor
implements. Note that the website is completely real: it is an actual, functional web
application users can interact with (e.g.\ navigate through the pages, register or log in).
The code editor has a button titled ``Deploy'' on it.
If the user changes the source code of the application and clicks this button, the application
should restart itself with the new code.
Let's say that the user comments out the part that authenticates a user.
In this case the application should let anyone log in with dummy credentials.
Meanwhile a console could show the output of the web server.
For example, if the source code the user tried to deploy was invalid, the framework
should report the exact exception raised while running the application.

\pic{figures/webapp_and_editor.png}{The Code Editor and Web Application Example in TFW}
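
The ``Deploy'' step of this scenario can be sketched in a few lines of Python.
This is only a minimal illustration and not the actual implementation of the framework:
the class name \code{Deployer}, its methods and the grace period are hypothetical, and the
user's code is assumed to be a single Python script that is started in a fresh interpreter.
The sketch stops the previously deployed process, starts the new one, and if the new process
dies during startup it returns whatever was written to standard error, which is where the
traceback of an uncaught Python exception ends up.
A real deployment would also have to stream the output of a healthy process to the console.

\begin{lstlisting}[
language=Python,
caption={Hypothetical sketch of restarting an application and reporting startup errors},
captionpos=b
]
import subprocess
import sys
from typing import Optional


class Deployer:
    """Restarts a child process from source code (illustrative only)."""

    def __init__(self) -> None:
        self.process: Optional[subprocess.Popen] = None

    def deploy(self, source: str, grace: float = 5.0) -> Optional[str]:
        """Start `source` in a fresh interpreter.

        Returns None if the process exits cleanly or is still running
        after `grace` seconds, otherwise the text it wrote to stderr.
        """
        self.stop()
        self.process = subprocess.Popen(
            [sys.executable, "-c", source],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.PIPE,
            text=True,
        )
        try:
            self.process.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            return None  # still running, so startup is assumed to be fine
        if self.process.returncode == 0:
            return None
        return self.process.stderr.read()

    def stop(self) -> None:
        """Terminate the currently deployed process, if there is one."""
        if self.process is None:
            return
        if self.process.poll() is None:
            self.process.terminate()
            self.process.wait()
        if self.process.stderr is not None:
            self.process.stderr.close()
        self.process = None
\end{lstlisting}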
Even if we did all this, we would still need a way to integrate the whole thing into
a web based frontend with a file editor, terminal, chat window and similar components.
It turns out that today all of this can be done by writing a few hundred lines of Python
code which uses the Tutorial Framework.

\pic{figures/webapp_and_editor_err.png}{Invalid Code and Deployment Failure with Process Output}

\subsection{Project Requirements}\label{requirements}

Based on this it is now more or less possible to define requirements for the project.
The reason for the ``more or less'' part is that all of this is pretty much bleeding edge,
where the requirements could shift with time.
For this reason I am going to be as general as possible, to the point that some of
this might even sound vague.
To achieve our goals we would need:

\begin{itemize}
    \item a way to keep track of user progress
    \item a way to handle various events (e.g.\ so that we can react when
          the user has edited a file, or has executed a command in the terminal)
    \item a highly flexible messaging system, in which processes and
          frontend components (running in a web browser) can communicate with each other
    \item a web based frontend with lots of built-in components (terminal, file editor, chat
          window, etc.) that use said messaging system
    \item stable APIs that can be exposed to content creators to work with (so that
          framework updates won't break client code)
    \item tooling for development (distributing, building and running)
\end{itemize}
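
The first two requirements fit together naturally if user progress is modelled as a small
state machine that is driven by events.
The following Python sketch illustrates the idea under that assumption.
It is not the API of the Tutorial Framework: the class \code{TutorialProgress} as well as
the state and event names are made up for this example.

\begin{lstlisting}[
language=Python,
caption={Hypothetical sketch of event-driven progress tracking},
captionpos=b
]
from typing import Callable, Dict, List, Tuple


class TutorialProgress:
    """Tracks the current step of a tutorial; events move it forward."""

    def __init__(self, initial: str) -> None:
        self.state = initial
        self.transitions: Dict[Tuple[str, str], str] = {}
        self.listeners: List[Callable[[str, str], None]] = []

    def add_transition(self, source: str, event: str, target: str) -> None:
        """Allow `event` to move the tutorial from `source` to `target`."""
        self.transitions[(source, event)] = target

    def subscribe(self, listener: Callable[[str, str], None]) -> None:
        """Register a callback invoked as listener(old_state, new_state)."""
        self.listeners.append(listener)

    def handle(self, event: str) -> bool:
        """Apply `event`; return True if it advanced the tutorial."""
        target = self.transitions.get((self.state, event))
        if target is None:
            return False  # the event is not relevant in the current step
        previous, self.state = self.state, target
        for listener in self.listeners:
            listener(previous, target)
        return True


# Usage: the encryption exercise, with made-up event names.
progress = TutorialProgress("start")
progress.add_transition("start", "file_encrypted", "encrypted")
progress.add_transition("encrypted", "config_written", "configured")
log: List[Tuple[str, str]] = []
progress.subscribe(lambda old, new: log.append((old, new)))
progress.handle("config_written")  # ignored: the file is not encrypted yet
progress.handle("file_encrypted")  # advances from "start" to "encrypted"
\end{lstlisting}

Events that do not match the current step are simply ignored in this sketch, so a single
handler could receive every event that a messaging system delivers and still only react
to the ones that matter at that point of the tutorial.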
\section{Early Development}

Around a year ago a good friend and colleague of mine Bálint Bokros, the CTO of our company
Gábor Pék, and I started designing the TFW architecture.
In this early phase we researched solutions for the issues described above, such as
tracking user progress, process management, interprocess communication
and making a web based frontend application capable of communicating with processes running
inside a Docker container.

After seeing some sort of light at the end of the tunnel regarding what technologies could
be applied, and coming up with several good alternatives, Bálint Bokros was tasked with
developing the first proof of concept and laying the foundations of the framework in his
Bachelor's Thesis\cite{BokaThesis}.

Although not much of the original code base has remained due to intense refactoring
and all around changes, the result served as a solid foundation for further development,
and the architecture is mostly the same to this day.
The resulting code was the first working POC%
\footnote{Proof of Concept} of the framework, showcasing the fixing of an SQL injection
vulnerability.
This initial version included the foundations of the framework:
a working messaging system, event handling and state tracking.
These provided a great basis,
despite the fact that the core codebase of the framework was almost
completely rewritten due to an increased focus on code quality,
extensibility and API stability required by new features.

It is interesting to note that I had good reason to keep the project requirements
general on purpose (Section~\ref{requirements}).
When taking a look at the requirements in Bálint's thesis, much of it
is completely obsolete by now.
But since the project has followed the Agile methodology%
\footnote{Manifesto for Agile Software Development:
\href{https://agilemanifesto.org}{https://agilemanifesto.org}}
from the start, we were able to adapt to these changes without losing
the progress he made in said thesis. Quoting from the Agile Manifesto:
``Responding to change over following a plan''.
This is a really important takeaway.

After becoming a full time employee at Avatao I was tasked with developing the project
with Bálint, who was later reassigned to work on the GDPR compliance of the platform.
Thus it became my job to turn the framework into a stable code base ready for
use by content creators and to implement most of the features that we had envisioned
earlier.