\chapter{Introduction}

\section{Project Justification}

As software engulfs the world, the need for accessible yet
high quality learning materials on software engineering, and especially secure software
engineering, is on the rise.
While we enjoy the comfort that information technology provides, we often forget
about the risks involved in relying so much on software in our everyday lives.
When taking a look at recent events, such as the cyber arms race taking place between leading
powers, 50 million Facebook accounts being breached
due to the incorrect handling of access tokens\cite{FacebookBreach},
or how China is building an Orwellian state of total digital surveillance%
\cite{ChinaSurv}\cite{ChinaCredit},
it becomes clear that security and privacy in the IT sector
are more important now than ever.

With all of our data slowly crawling towards the cloud and an IoT revolution breathing down
our necks, we as an industry must face the music and start actually doing something before we
enter a new age of digital wild west, which could involve us riding around in vulnerable
self-driving cars\cite{SelfDriving} with power over life and death, while exposing all our
sensitive data through our ill-protected smartphones\cite{Android} and IoT devices\cite{IoTDDoS}.
What a time to be alive.
It is important to stress that IT security is something that is \emph{really hard} to
get right, even if ``right'' often only means better than your neighbour, as perfect security
is a utopia that does not seem to exist\cite{NoPerfectSecurity}.
It is often only when large and reputable companies in the industry, such as
CloudFlare\cite{CloudFlareLeak} or eBay\cite{EBayGit}, fail to get it right
that people start to grasp how difficult it actually is.
This is why, unless we want to disconnect all our devices from all networks and ban USB
sticks, the best line of defense is going to be people: a new generation
of \emph{security conscious} users and developers.

Among many other things outside IT, this is only possible with education\cite{ITSecEdu}.
We need to come up with engaging, addictive and fun ways to learn (and teach), so that
more and more people are motivated to do so, and the drive to acquire and share
knowledge comes naturally rather than being something we have to struggle for.
I believe that this is something that \emph{can} and \emph{should} be applied to
everything we do as a society.
The only thing we can hope and work for is to become better and better as time
and generations pass.
We \emph{must} do better, and education is the way forward.

The short term goal of this project, and the goal of this thesis, is to provide
a new angle on the education of software engineering, especially secure software
engineering, based on the aspirations above.
The long term goal is to bring something new to the table in IT education as a whole
(not just for developers, but for users as well).

\section{A Short Introduction to Avatao}

The goal of Avatao as a company is to help software developers build a \emph{culture} of
security amongst themselves, with the vision that if the world is going to be taken over by
software no matter what, that software might as well be \emph{secure software}.
To achieve this goal we have been working on an e-learning platform with hundreds%
\footnote{654 exercises as of today, to be exact}
of hands-on learning exercises to help students and professionals
master IT security, collaborating with
universities around the world and providing a solution for companies to build
\emph{security consciousness} amongst their developer teams.

Since starting out we have amassed some experience in building fun challenges
that showcase the exploitation and fixing of relevant security vulnerabilities in code or
configuration.
Traditionally these exercises revolved around offensive and defensive tasks, with challenges
often being split into two or more parts.
For example, users would have to hack a website by exploiting a buffer overflow vulnerability,
then in the second challenge they would fix the code they have just exploited in a web based
code editor.
These kinds of exercises offer great flexibility to reflect real world security issues, as in
more complex challenges users might be required to exploit multiple vulnerabilities to succeed,
and to understand the ways they augment each other.
We often recreate real world scenarios based on incident reports released by companies for
added authenticity and relevance\cite{AkosFacebook}.
Our challenges usually involve some sort of website acting as a frontend for the vulnerable
application, or require the user to connect using SSH\@.

\pic{figures/avatao_challenge.png}{An offensive challenge on the Avatao platform}

The Avatao platform relies heavily on Docker containers to spawn challenges,
which makes it extremely flexible in terms of what is possible when creating
content.
Essentially anything that you can do inside a Docker container can be done on
the Avatao platform as well.
Currently each challenge is implemented as a set of Docker images residing inside a
Git repository exclusive to that specific challenge.
Our content creation workflow enables developers to create such repositories on GitHub,
which are automatically set up with the proper webhooks, so that when their content gets
reviewed (and their feature branches merged), their changes go live on the
platform as well.
In the future we also plan on supporting the use of virtual machines to implement
challenges, which could further increase this flexibility by adding the possibility to create
things like exercises involving the use of Docker itself or Windows based challenges.

\section{Emergence}

While working as a content creator I stumbled upon the idea of automating the completion
of challenges for QA\footnote{Quality Assurance} and demo purposes%
\footnote{I used to record short videos or GIFs to showcase my content to management}.
In one scenario I was required to integrate a web based terminal emulator into a
frontend application to improve user experience by making it possible to use a shell
right on the website rather than having to connect through SSH\@.
After I got this working I looked into writing hacky bash scripts to automate the steps
required to complete the challenge in order to make it easier for me to record the solution,
as I had often found myself recording over and over again to get a demo without any mistakes.
While I was playing around with this idea, researching possible solutions led me
to a hidden gem of a project on GitHub called \texttt{demo-magic}%
\footnote{\href{https://github.com/paxtonhare/demo-magic}{https://github.com/paxtonhare/demo-magic}},
which is essentially a bash script that simulates someone typing into a terminal and executing
commands.
I created a fork%
\footnote{
\href{https://git.strongds.hu/mrtoth/demo.sh/src/master/demo.sh}
{https://git.strongds.hu/mrtoth/demo.sh/src/master/demo.sh}}
of the project and integrated it into my challenge.
Soon after, recording demo videos was not even necessary anymore, as I started to distribute
the solution script with the challenge code itself, making it toggleable using build-time
variables.
Should the solution script be enabled, the challenge would automatically start%
\footnote{I did this by injecting the solution script into the user's \texttt{.bashrc} file}
completing itself in the terminal integrated into its frontend, often even explaining the
commands executed during the solution process.

\lstinputlisting[
    language=bash,
    caption={Example for a solution script},
    captionpos=b
]{listings/demosh.example}
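The core trick behind such a script, printing a command as if someone were typing it and only then executing it, can be sketched in a few lines.
The snippet below is a minimal Python illustration under stated assumptions, not the code of \texttt{demo-magic} or of the fork mentioned above (both are bash scripts): the names \texttt{type\_out} and \texttt{run\_step}, the prompt format and the delay values are made up for this example.

```python
import shlex
import subprocess
import sys
import time


def type_out(text, out=sys.stdout, delay=0.05):
    """Write text one character at a time to mimic a person typing."""
    for char in text:
        out.write(char)
        out.flush()
        time.sleep(delay)
    out.write("\n")


def run_step(comment, argv, out=sys.stdout, delay=0.05):
    """Explain a step, 'type' the command, then actually execute it."""
    type_out("# " + comment, out, delay)
    type_out("$ " + shlex.join(argv), out, delay)
    # check=True raises CalledProcessError if the command fails.
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    out.write(result.stdout)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical step: use the current Python interpreter as the "command".
    run_step("print a greeting", [sys.executable, "-c", "print('hello')"], delay=0.01)
```

Passing the output stream and the delay as parameters keeps the sketch easy to try without a real terminal; a solution script would simply call \texttt{run\_step} once per command it wants to demonstrate.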

I was quite pleased with myself, no longer having to do the busywork of recording videos,
but what I did not know was that I had accidentally
created something far more than a hacky bash script solving challenges, as this little script
would help formulate the idea of the project \emph{Tutorial Framework}, or just \emph{TFW}.

\section{Vision of the Tutorial Framework}

The whole ``challenges that solve themselves'' concept seemed like an idea with great
potential if developed further.
We envisioned something that resembles a learning video, except that it is real, actual
software running and interacting with itself to showcase different topics to the user.
Something that would allow users to stop at any given time, take a breath, interact
with the environment on their own (e.g.\ take a look at the directory structure or a file,
try what happens if a command is executed somewhat differently, etc.) and then
continue on with the tutorial.
We wanted to create something that would feel as if an actual teacher were standing
next to you, explaining topics to you at your own pace, while showing you how to solve
a related task.
This teacher scenario would allow you to take the helm sometimes and immediately put
your newfound skills into action.

For example, a chatbot would show you how to encrypt a file using GnuPG,
then it would ask you to encrypt another file similarly.
After this the bot could teach you how to configure a database server and then
ask you to write a configuration file yourself and then encrypt it, because it might
contain sensitive data such as open ports, usernames and such.

Technically, however, this is far from trivial: we would have to keep track of the user's
progress at all times, and be able to actually check whether the user has successfully encrypted
the file by decrypting it and then checking whether the configuration file is valid
(this would practically require trying to start a database server with it).
After all this we would still have to offer \emph{relevant} and helpful assistance if
something went wrong.
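One way to picture these three duties (tracking progress, checking a step and giving relevant help) is a list of steps, each carrying its own check and hint.
The Python sketch below is purely illustrative and is not the Tutorial Framework's API: \texttt{Step}, \texttt{Tutorial}, the file names and the toy checks are assumptions, and a real check would decrypt the file or try to start the database server rather than inspect a dictionary.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    name: str
    check: Callable[[Dict[str, str]], bool]  # inspects the (toy) environment
    hint: str  # shown when the check fails


class Tutorial:
    """Tracks which step the user is on and verifies each attempt."""

    def __init__(self, steps: List[Step]):
        self.steps = steps
        self.current = 0

    @property
    def finished(self) -> bool:
        return self.current >= len(self.steps)

    def attempt(self, env: Dict[str, str]) -> str:
        """Check the current step against env; advance only on success."""
        if self.finished:
            return "Tutorial already completed."
        step = self.steps[self.current]
        if step.check(env):
            self.current += 1
            return "Step '" + step.name + "' done."
        return "Hint: " + step.hint


# Toy stand-ins for real checks (hypothetical file names and contents).
steps = [
    Step("encrypt file",
         lambda env: "secret.txt.gpg" in env,
         "Encrypt secret.txt so that secret.txt.gpg exists."),
    Step("write config",
         lambda env: "port=" in env.get("db.conf", ""),
         "db.conf must set a port, e.g. port=5432."),
]
tutorial = Tutorial(steps)
```

Keeping the hint next to the check is what makes the assistance relevant in this sketch: a failed attempt always answers with advice about the step the user is actually stuck on.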

Even if we did all this, we would still need a way to integrate the whole thing into
a web based frontend with a file editor, terminal, chat window and similar components.
It turns out that today all this can be done by writing a few hundred lines of Python
code that uses the Tutorial Framework.

\subsection{Project Requirements}\label{requirements}

Based on this it is now more or less possible to define requirements for the project.
The reason for the ``more or less'' part is that all of this is pretty much bleeding edge,
so the requirements could shift over time.
For this reason I am going to be as general as possible, to the point that some of
this might even sound vague.
To achieve our goals we would need:

\begin{itemize}
    \item a way to keep track of user progress
    \item a way to handle various events (e.g.\ so that we can react when
          the user has edited a file or has executed a command in the terminal)
    \item a highly flexible messaging system, in which processes and
          frontend components (running in a web browser) could communicate with each other
    \item a web based frontend with lots of built-in options (terminal, file editor, chat
          window, etc.) that use said messaging system
    \item stable APIs that can be exposed to content creators to work with (so that
          framework updates won't break client code)
    \item tooling for development (distributing, building and running)
\end{itemize}
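To make the event handling and messaging requirements more tangible, the following Python sketch shows a tiny in-process publish/subscribe bus in which handlers react to keyed messages.
It is an illustration under heavy assumptions only: the class \texttt{MessageBus}, the message keys and the handler are hypothetical, and a real implementation has to pass messages between separate processes and the browser, which this sketch does not attempt.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Message = Dict[str, Any]
Handler = Callable[[Message], None]


class MessageBus:
    """Routes each message to the handlers subscribed to its key."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, key: str, handler: Handler) -> None:
        self._handlers[key].append(handler)

    def publish(self, message: Message) -> int:
        """Deliver message to matching handlers; return how many ran."""
        handlers = self._handlers.get(message["key"], [])
        for handler in handlers:
            handler(message)
        return len(handlers)


bus = MessageBus()
edited_files: List[str] = []

# Hypothetical event: the frontend reports that the user saved a file.
bus.subscribe("file.edited", lambda msg: edited_files.append(msg["path"]))

delivered = bus.publish({"key": "file.edited", "path": "db.conf"})
ignored = bus.publish({"key": "terminal.command", "command": "ls"})
```

Here \texttt{delivered} counts the one handler that ran, while the terminal message is dropped because nothing subscribed to its key; routing by key is what lets new components join without changes to existing ones.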

\section{Early Development}

Around a year ago three of us started designing the TFW architecture:
Bálint Bokros (a good friend and colleague of mine), Gábor Pék (the CTO of our company)
and myself.
In this early phase we researched solutions for the issues described, such as
tracking user progress, process management, interprocess communication
and making a web based frontend application capable of communicating with processes running
inside a Docker container.

Once we saw some light at the end of the tunnel regarding which technologies could
be applied and had come up with several good alternatives, Bálint Bokros was tasked with
developing the first proof of concept and laying the foundations of the framework in his
Bachelor's Thesis\cite{BokaThesis}.

The resulting code was the first working POC%
\footnote{Proof of Concept} of the framework, showcasing the fixing of an SQL injection
vulnerability.
This initial version included the foundations of the framework:
a working messaging system, event handling and state tracking.
Although not much of the original code base remains, as the core of the framework was almost
completely rewritten due to an increased focus on code quality,
extensibility and the API stability required by new features,
the result served as a solid foundation for further development,
and the architecture is mostly the same to this day.

It is interesting to note that I had good reason to keep the project requirements
general on purpose (see Section~\ref{requirements}).
When taking a look at the requirements in Bálint's thesis, much of that
is completely obsolete by now.
But since the project has followed the Agile methodology%
\footnote{Manifesto for Agile Software Development:
\href{https://agilemanifesto.org}{https://agilemanifesto.org}}
from the start, we were able to adapt to these changes without losing
the progress he made in said thesis. Quoting from the Agile Manifesto:
``Responding to change over following a plan''.
This is a really important takeaway.

After becoming a full time employee at Avatao I was tasked with developing the project
together with Bálint, who was later reassigned to work on the GDPR compliance of the platform.
Thus it became my job to turn the framework into a stable code base ready for
use by content creators and to implement most of the features that we had envisioned
earlier.