diff --git a/content/a_tour_of_tfw.tex b/content/a_tour_of_tfw.tex index 130eb72..105a16f 100644 --- a/content/a_tour_of_tfw.tex +++ b/content/a_tour_of_tfw.tex @@ -10,23 +10,26 @@ A \emph{very important} point to keep in mind is that most of this exercise-specific logic will be implemented in \textbf{FSM callbacks} and custom \textbf{event handlers}. The whole framework is built in a way to faciliate this process and developers -who understand this mindset are almost always guaranteed to create great -content using TFW. +who understand this mindset almost always find it a breeze to create great +content using TFW\@. The purpose of this chapter is to further detail the built-in components provided by the framework. -As previously mentioned, these components are implemented as event handlers -running in the \code{solvable} Docker container and frontend +As previously mentioned, these components are implemented as event handlers +running in the \code{solvable} Docker container which communicate with frontend components written in Angular. -For instance the built-in code editor requires a frontend component and an event -handler to function properly, while the frontend component responsible for -drawing out and managing other components implements no -event handler, so it only exists on the frontend. +Some components might only feature one of these, however: +for instance the built-in code editor requires a frontend component and an event +handler to function properly, while the component responsible for +drawing out and managing frontend components implements no +event handler, so it purely exists in Angular. +Another example of a purely frontend component would be the messages component, +which is used to display messages to the user. In the Tutorial Framework most of the built-ins define APIs, which are TFW messages that can be used to interact with them.
For example, to inject a command into the terminal one would use a message like this: -\begin{lstlisting}[captionpos=b,caption={An API Message Capable of Writing in the Terminal}] +\begin{lstlisting}[captionpos=b,caption={An API message capable of writing to the terminal}] { "key": "shell", "data": @@ -36,15 +39,15 @@ For example, to inject a command into the terminal one would use a message like } } \end{lstlisting} -Notice the \code{\n} at the end of the command. -By including a newline character, we are also capable of executing commands in the -user's terminal. +Notice the ``\code{\n}'' at the end of the command. +By including a newline character, we are also capable of executing commands directly +in the user's terminal (and the user can see this). Were this newline omitted, the command would only be written to the terminal -(but not automatically executed) for users to inspect and potentially execute themselves. +--- but not automatically executed --- for users to inspect and potentially execute themselves. Some components emit or broadcast messages on specific events, for instance the \code{FSMManagingEventHandler} broadcasts the following message on state transitions: -\begin{lstlisting}[captionpos=b,caption={An FSM Update message}] +\begin{lstlisting}[captionpos=b,caption={An FSM update message}] { "key": "fsm_update", "data" : @@ -65,38 +68,38 @@ Some components emit or broadcast messages on specific events, for instance the } \end{lstlisting} As you can see this message contains loads of useful information regarding what is -exactly happening in the tutorial at a given point and can be used by client code +exactly happening in the tutorial at a given point in time and can be used by client code to make informed decisions based on this knowledge. 
-It is not the purpose of this text to provide a complete API documentation, so in the +It is not the purpose of this text to provide a complete API documentation however, so in the following I am only going to explain possibilities provided by given components rather than showcasing actual, real-life API messages. \section{Messages Component} The framework must allow content creators to communicate their \emph{message} to the user. -In other words some way must be provided to ``talk'' to users. -This is the responsibility of the \emph{messages} frontend component, which -provides a chatbox-like element on the web application the framework can send -messages to. +In other words, some way must be provided to ``talk'' to users. +This is the responsibility of the \emph{messages} component, which +provides a chatbox-like element on the frontend. The simplest form of communication it accomodates it the insertion of text into the chatbox through API messages. -Every message has an optional \emph{originator}, which serves signal to the user -on the purpose of the message. -These messages are also timestamped. +Every message has an optional \emph{originator}, which serves to remind the user +of the purpose of the given message. +These messages are also timestamped so that it is easier to navigate through them +and look back older messages from the user. -\pic[width=.5\textwidth]{figures/chatbot.png}{The avataobot Typing in the Messages Component} +\pic[width=.5\textwidth]{figures/chatbot.png}{The avataobot typing in the messages component} A particularly interesting feature of the messages component is that TFW client code -can queue a bunch of messages for the component to send one by one, separated by +can queue a bunch of messages for the component to display one by one, separated by appropriate pauses in time so that the user is capable of conveniently reading through all off them. 
Similarly to a real chat application, some -``jumping dots'' indicate if the bot is still ``typing''. +``jumping dots'' indicate if the bot is still ``typing'' something. The timing of pauses and messages is based on the \emph{WPM} --- or Words Per Minute --- -set by developers according to their specific use cases. +set by developers according to their specific requirements. This creates an experience similar to chatting with someone in real time, as the time -it takes for each message to be displayed is depending on the lenght of the previous message. +it takes for each message to be displayed depends on the length of the previous message. This illusion is made possible through appropriate \code{setTimeout()} calls in TypeScript and some elementary math to calculate the proper delays in milliseconds based on message lengths: @@ -106,14 +109,16 @@ message lengths: \[ timeoutSeconds = lastMessageLength / charactersPerSeconds \] \[ timeoutMilliseconds = timeoutSeconds * 1000 \] +The value 5 comes from the fact that on average English words are 5 +characters long according to some studies. + \section{IDE Component}\label{idecomponent} This is the code editor integrated into the frontend of the framework. It allows users to select, display and edit files. -Developers can configure which directory on the file system of the \code{solvable} -container should the editor list files from. +Developers can configure which directory of the file system the editor should list files from. The editor features the ``Deploy'' button referred to earlier in this paper, which is -capable of restarting processes that might execute a file visible in the editor. +capable of restarting processes that might be running code from a file visible in the editor.
To implement this IDE% \footnote{Integrated development environment} @@ -124,7 +129,7 @@ This involves commnication with an event handler dedicated to this feature, which is capable of reading and writing files to disk, while sending and receiving editor content from the frontend component. The interaction of this event handler and the Monaco editor provides a seamless -editing expirience, featuring autosave at configurable intervals, code completion, +editing experience, featuring autosave at configurable intervals, code completion, automatic code coloring for several programming languages and more. Perhaps the most ``magical'' feature of this editor is that if any process @@ -141,7 +146,7 @@ appears. If I select it I can confirm that I have successfully created an empty file. After this let's run a \code{while} cycle in the command line which peroadically appends some text to \code{file.txt}: -\begin{lstlisting}[captionpos=b,caption={Bash While Cycle Writing to a File Periodically}, +\begin{lstlisting}[captionpos=b,caption={Bash while cycle writing to a file periodically}, language=bash] while true do @@ -150,7 +155,7 @@ do done \end{lstlisting} The results speak for themselves: -\pic{figures/ide_demo.png}{The Editor Demo Involving Automatic File Refreshing} +\pic{figures/ide_demo.png}{The editor demo involving automatic file refreshing} As you can see, the file contents are automatially updated as the bash script appends to the file. This feature is implemented by using the inotify API% @@ -158,13 +163,13 @@ This feature is implemented by using the inotify API% {http://man7.org/linux/man-pages/man7/inotify.7.html}} provided by the Linux kernel to monitor file system events involving the directory listed by the editor. The event handler of the editor hooks callbacks to said events which notify the -Tutorial Framework to reload the list of files in the directory and the contents of -the selected files. 
+Tutorial Framework to reload the list of files in the directory as well as the contents of +the selected file. The code making this feature possible is reused several times in the framework for interesting purposes such as monitoring the logs of processes. The editor also allows content creators to completely control it using API messages. -This involves selecting, reading and writing files as well as changing the +This involves the selecting, reading and writing of files as well as changing the selected directory. These features allow content creators to ``guide'' a user through code bases for example, where in each step of a tutorial a file is opened and explained @@ -253,6 +258,8 @@ It's capabilities include starting, stopping and restarting processes. It is also capable of emitting the standard out or standard error logs of processes (by broadcasting TFW messages). This component can be iteracted with using TFW API messages. +The ``Deploy'' button on the code editor uses this component to restart +processes. The Tutorial Framework uses supervisor% \footnote{\href{http://supervisord.org}{http://supervisord.org}} diff --git a/content/abstract.tex b/content/abstract.tex index da75e21..eceaafb 100644 --- a/content/abstract.tex +++ b/content/abstract.tex @@ -6,10 +6,11 @@ interactive tutorials running inside Docker containers, semi-automatically showc IT topics in real time. The user is guided through exercises using real environments with real software, all with the possibility of interaction at any time. This technology can supplement/improve the way e-learning is usually done today ---- which is mostly articles and learning videos --- and help users get hands-on experience -on their way of acquiring knowledge. +--- which is mostly through articles and learning videos --- and help users get hands-on +experience on their way of acquiring knowledge. 
Currently more than 60 learning exercises based on this framework are available on the e-learning platform called Avatao, with more being released every week. This text is going to justify the need for such technology, explain the ideas leading -to it, discuss architecture, use-cases and more. +to it, discuss use-cases, architecture, the features of the framework and how +developers can use it to create learning exercises. diff --git a/content/architecture.tex b/content/architecture.tex index 75ee8d9..1fba80e 100644 --- a/content/architecture.tex +++ b/content/architecture.tex @@ -6,29 +6,35 @@ two Docker images: \begin{itemize} \item the \code{solvable} image is responsible for running the framework and the client code depending on it - \item the \code{controller} image is responsible for solution checking (to figure out - whether the user completed the tutorial or not) + \item the \code{controller} image is responsible for solution checking: to figure out + whether the user has successfully completed the tutorial or not \end{itemize} -During most of this capter I am going to be discussing the \code{solvable} Docker image, +During most of this chapter I am going to be discussing the \code{solvable} Docker image, with the exception of Section~\ref{solutioncheck}, where I will dive into how the \code{controller} image is implemented. The most important feature of the framework is it's messaging system. Basically what we need is a system where processes running inside a Docker container would be allowed to communicate with eachother. -This is easy with lots of possible solutions (named pipes, sockets or shared memory to name a few). -The hard part is that frontend components running inside a web browser --- which could be -potentially on the other side of the planet --- would also need to partake in said communication. +This task is very easy to solve, with lots of possible solutions +(named pipes, sockets or shared memory to name a few). 
+The hard part is that frontend components running inside a web browser --- which could +potentially be located on the other side of the planet% +\footnote{Potentially introducing all sorts of issues regarding latency} --- would +also need to partake in said communication. So what we need to create is something of a hybrid between an IPC system and something that can communicate with JavaScript running in a browser connected to it. The solution the framework uses is a proxy server, which connects to frontend components on one side and handles interprocess communication on the other side. This way the server is capable of proxying messages between the two sides, enabling communitaion between them. -Notice that this way what we have is essentially an IPC system in which a web application +Notice that this way what we have is essentially an IPC% +\footnote{Interprocess communication} system in which a web application can ``act like'' it was running on the backend in a sense: it is easily able to -communicate with processes on the backend, while in reality the web application -runs in the browser of the user, on a completely different machine. +communicate with processes running there, while in reality the web application +is running in the browser of the user, on a completely different machine and it uses +some means of communication that is routed through the public internet to achieve this +effect. \begin{note} The core idea and initial implementation of this server comes from Bálint Bokros, @@ -38,54 +44,65 @@ message authentication, restoration of frontend state, a complete overhaul of th state tracking system and the possibility for solution checking among other things). If you are explicitly interested in the differences between the original POC implementation (which is out of scope for this thesis due to lenght constraints) and the current -framework please consult Bálint's excellent paper and Bachelor's Thesis on it\cite{BokaThesis}. 
+framework please consult Bálint's excellent paper and Bachelor's thesis on it\cite{BokaThesis}. \end{note} -Now let us take a closer look: +Now let us take a closer look at the technology used to implement such a server and +some of the design decisions behind this: \subsection{Connecting to the Frontend} -The old way of creating dynamic webpages was AJAX polling, which is basically sending +The old way of creating dynamic webpages was AJAX% +\footnote{AJAX stands for Asynchronous JavaScript And XML, despite usually not having +anything to do with XML in practice} +polling, which is basically sending HTTP requests to a server at regular intervals from JavaScript to update the contents of your website (and as such requiring to go over the whole TCP handshake and the HTTP request-response on each update). This has been superseded by WebSockets around 2011, which provide a full-duplex communication channel over TCP between your browser and the server. -This is done by initiation a protocol handshake using the \code{Connection: Upgrade} +This is done by initiating a protocol handshake using the \code{Connection: Upgrade} HTTP header, which establishes a premanent socket connection between the browser and the server. This allows for communication with lower overhead and latency facilitating efficient -real-time applications. +real-time applications, which were not always possible to create before due to +the overheads% +\footnote{In some applications this overhead could be bigger than the actual data sent, +such as signaling} introduced by AJAX polling. The Tutorial Framework uses WebSockets to connect to it's web frontend. -The framework proxy server is capable to connecting to an arbirary number of websockets, -which allows opening different components in separate browser windows and tabs, or even -in different browsers at once (such as opening a terminal in Chrome and an IDE in Firefox).
+The TFW proxy server is capable of connecting to an arbitrary number of WebSockets, +which allows the framework to simultaneously connect to components running in +separate browser windows and tabs, or even +in different browsers altogether (such as opening a terminal in Chrome and an IDE in Firefox). \subsection{Interprocess Communication} To handle communication with processes running inside the container TFW utilizes -the asynchronous distributed messaging library ZeroMQ% +the asynchronous distributed messaging library called ZeroMQ% \footnote{\href{http://zeromq.org}{http://zeromq.org}} or ZMQ as short. The rationale behind this is that unlike other messaging systems such as RabbitMQ% \footnote{\href{https://www.rabbitmq.com}{https://www.rabbitmq.com}} or Redis% \footnote{\href{https://redis.io}{https://redis.io}}, -ZMQ does not require a daemon (message broker process) and as such -has a much lower memory footprint while still providing various messaging +ZMQ does not require a message broker daemon to be running in the background at all times +and as such has a much lower memory footprint while still providing various messaging patterns and bindings for almost any widely used programming language. An other --- yet untilized --- capability of this solution is that since ZMQ is capable of using simple TCP sockets, we could even communicate with processes running on remote -hosts using the framework. +hosts using the current architecture of the framework. There are various lower level and higher level alternatives for IPC other than -ZMQ which were also considered during the desing process of the framework at some point. +ZMQ which were also considered during the design process of the framework at some point.
A few examples of top contenders and reasons for not using them in the end: \begin{itemize} \item The handling of raw TCP sockets would involve lot's of boilerplate logic that already have quality implementations in messaging libraries: i.e.\ making sure that - all bytes are sent or received both require checking the return values of the - libc \code{send()} and \code{recv()} system calls, while ZMQ takes care of this + all bytes are sent or received both require constantly checking the return values of the + libc \code{send()} and \code{recv()} system calls% +\footnote{Developers forget this very often, resulting in almost untraceable bugs +that seem to occur randomly}, + while ZMQ takes care of this extra logic involved and even provides higher level messaging patterns such as subscribe-publish, which would need to be implemented on top of raw sockets again. \item Using something like gRPC\footnote{\href{https://grpc.io}{https://grpc.io}} @@ -95,11 +112,15 @@ A few examples of top contenders and reasons for not using them in the end: which would make the framework less lightweight and flexible: socket communication with or without ZMQ does not force you to write synchronous or asynchronous code, whereas common HTTP servers - are either async or pre-fork in nature, which extort certain design choices on code + are either async% +\footnote{Async servers use the \code{select} or \code{epoll} system calls among others +to avoid blocking on IO} or pre-fork% +\footnote{Pre-fork servers spawn multiple processes and threads to handle requests +simultaneously} in nature, which imposes certain design choices on code built on them. \end{itemize} -\section{High Level Overview} +\section{Architectural Overview} Now being familiar with the technological basis of the framework we can now discuss it in more detail.
@@ -116,11 +137,11 @@ Architecturally TFW consists of four main components: that is implemented as an event handler called \code{FSMManagingEventHandler} \end{itemize} Note that it is important to keep in mind that as I've mentioned previously, -the TFW Server and event handlers reside in the \code{solvable} Docker container. -They all run in separate processes and only communicate using ZeroMQ sockets. +the TFW server and event handlers reside in the \code{solvable} Docker container. +They all run in separate processes and only communicate with each other using ZeroMQ sockets. In the following sections I am going to explain each of the main components in -greater detail, as well as how they interact with each other, +greater detail, as well as how they interact with each other, their respective responsibilities, some of the design choices behind them and more. @@ -149,7 +170,10 @@ Let's inspect further what a valid TFW message might look like: All valid messages \emph{must} include a \code{key} field as this is used by the framework for addressing: event handlers and frontend components subscribe to one -or more \code{key}s and only receive messages with \code{key}s they have +or more of these \code{key}s and only receive% +\footnote{In reality they do receive them, just like how network interfaces receive all +Ethernet frames, they just choose to ignore the ones not concerning them} +messages with \code{key}s that they have subscribed to. It is possible to send a message with an empty key, however these messages will not be forwarded by the TFW server (but will reach it, so in case the target of a message @@ -165,12 +189,12 @@ at a later point in this paper. The default behaviour of the TFW server is that it forwards all messages from coming from the frontend to the event handlers and vice versa.
So messages coming from the WebSockets of the frontend are forwarded to event handlers -via ZMQ and messages received through ZMQ from event handlers are forwarded to +via ZMQ and messages received on ZMQ from event handlers are forwarded to the frontend via WebSockets. The TFW server is also capable of ``reflecting'' messages back to the side they were -received on (to faciliate event handler to event handler for instance), or broadcast -messages to all components. +received from (to faciliate event handler to event handler communication for instance), +or broadcast messages to all components. This is possible by embedding a whole TFW message in the \code{data} field of an outer wrapper message with a special \code{key} that signals to the TFW server that this message requires special attention. @@ -181,7 +205,7 @@ An example of this would be: "data": { ... - The message you want to broadcast or mirror + The whole message you want to broadcast or mirror (with it's own "key" and "data" fields) ... } @@ -198,7 +222,7 @@ As discussed earlier, using ZeroMQ allows developers to implement event handlers in a wide variety of programming languages. This is very important for the framework, as content creators often create challenges that are very specific to a language, for example the showcasing -of a security vulnerability in an older version of Java. +of a security vulnerability in an older version of the Java standard library. These event handlers are used to write most of the code developers wish to integrate with the framework. @@ -210,11 +234,20 @@ based on this knowledge. An event handler such as this could be invoked by sending a message to it at any time when the running of the tests would be required. +An interesting thing to mention is that there \emph{could} be event handlers which +broadcast messages with a \code{key} that they are also subscribed to. 
+This can disrupt their behaviour in weird ways if they are not prepared to +deal with their own ``echoes''. +The framework offers a solution for this by providing a special +event handler type, which is capable of filtering out its own broadcasts. +The way they do this is by caching the checksum of every message they broadcast, +and ignoring the first message that comes back with the same checksum. + \subsection{Frontend} This is a web application that runs in the browser of the user and uses -multiple WebSocket connections to connect to the TFW server. -Due to rapidly increasing complexity the original implementation (written in +multiple WebSockets to connect to the TFW server. +Due to rapidly increasing complexity, the original implementation (written in plain JavaScript with jQuery% \footnote{\href{https://jquery.com}{https://jquery.com}} and Bootstrap% \footnote{\href{https://getbootstrap.com}{https://getbootstrap.com}}) was becoming @@ -234,7 +267,7 @@ Other reasons included that the frontend of the Avatao platform is also written in Angular (bonus points for experienced team members in the company). An other good thing going for it is that Angular forces you to use TypeScript% \footnote{\href{https://www.typescriptlang.org}{https://www.typescriptlang.org}} -which tries to remedy the issues\cite{JavaScript} +which tries to remedy some of the issues\cite{JavaScript} with JavaScript by being a language that transpiles to JavaScript while strongly encouraging things like static typing or Object Oriented Principles. @@ -244,11 +277,11 @@ strongly encouraging things like static typing or Object Oriented Principles. A good chunk of the framework codebase is a bunch of pre-made, built-in components that implement commonly required functionality for developers to use. -These components usually involve an event handler and an Angular component which -communicates with it to realize some functionality.
+These components usually involve an event handler and an Angular component +communicating with each other to realize some sort of functionality. An example would be the built-in code editor of the framework -(visible on the left side of Figure~\ref{figures/tfw_frontend.png}). -This code editor is essentially a Monaco editor% +(visible on the right side of Figure~\ref{figures/tfw_frontend.png}). +This code editor essentially is a Monaco editor% \footnote{\href{https://microsoft.github.io/monaco-editor/} {https://microsoft.github.io/monaco-editor/}} instance integrated into Angular and upgraded with the capability to @@ -256,21 +289,23 @@ exchanges messages with an event handler to save, read and edit files that reside in the writeable file system of the \code{solvable} Docker container. -All of the built-ins come with full API documentation explaining what they do -on receiving specific messages, and what messages they emit on different events. +All of the built-ins come with full API documentation explaining what they do +on receiving specific messages, and what kinds of messages they may emit on different events.
This greatly expands the capabilities of the framework, since it allows developers to do things including, but not limited to: \begin{itemize} \item making the code editor automatically appear in sections - of the tutorial where the user needs to use it + of the tutorial where the user needs to use it, then disappear + when it is no longer needed to conserve space \item inject commands into the user's terminal - \item hook into messages emitted from components to detect events, such as + \item hook callbacks to run code on messages emitted from components to + detect events, such as to detect if the user has clicked a button or executed a command in the terminal - \item monitor the logs (stdout or stderr) of a given process + \item monitor the logs (stdout or stderr) of a given process in real time \end{itemize} Every pre-made component is designed with the mindset to allow flexible -and creative usage by developers, with the possibility of future extensions. +and creative usage by developers, with the added possibility of future extensions. Often when developers require certain new features, they open an issue on the git repository of the framework for me to review and possibly implement later. @@ -279,18 +314,22 @@ One example would be when a developer wanted to automatically advance the tutori when the user has entered a specific string into a file. This one didn't even require a new feature: I recommended him to implement an event handler listening to the messages of the built-in file editor, filter the messages -which contain file content that is going to be written to disk, and simply +which contain file content that is being sent to be written to disk, and simply search these messages for the given string. The exact capabilities of these built-in components will be explained in greater -detail in a later chapter. +detail in Chapter~\ref{atouroftfw}. 
+Developers who are well aware of these capabilities are able to use the framework in extremely +creative ways allowing for very interesting functionality, such as the above example. +The components of TFW can often be combined to work together in unexpected, yet useful +ways, similarly to how command-line utilities on UNIX-like systems do. \subsection{TFW Finite State Machine} An important requirement we have specified during~\ref{requirements} was that the framework must be capable of tracking user progress. TFW allows developers to define a \emph{finite state machine} -which is capable of describing the desired ``story'' of a tutorial. +which is capable of describing the desired ``story'' of a learning exercise. The states of the machine could be certain points in time during the completion of the tutorial envisioned and transitions could be events that influence the state, such as the editing of files, execution of commands and so on. @@ -301,23 +340,25 @@ Take the fixing of a SQL Injection% vulnerability as an example. Let's assume, that the source code is vulnerable to a SQL injection attack because it tries to compose a query with string concatenation instead of -using a parameterized query provided by the database library. +using a prepared statement provided by the database library. A challenge developer could implement an FSM in the framework that looks like this: -\pic[width=.6\textwidth]{figures/tfw_fsm.png}{An Example for a Finite State Machine in TFW} +\pic[width=.6\textwidth]{figures/tfw_fsm.png}{An example for a finite state machine in TFW} In case the source file has been edited, the unit test cases designed to detect whether the code is vulnerable or not are invoked.
Depending on the results three cases are possible: \begin{description} - \item[All test cases have succeeded:] If all the tests succeeded then the user has managed + \item[All test cases have succeeded:] If all the test cases have run successfully, + then the user has managed to fix the code properly and we can display a congratulating message accordingly. - \item[All test cases have failed:] In this case the solution is incorrect - and we can offer some hints. + \item[All test cases have failed:] In this case the submitted solution is incorrect + and we should offer some hints, so that the user can try again more effectively, + optionally displaying more and more hints with each successive failure. \item[Some test cases have succeeded:] It is possible that the based on the test cases - that have succeeded and failed we can determine that the user tried to blacklist - certain SQL keywords. This is a common, but incorrect solution of fixing a SQL + that have succeeded and failed we can determine that the user has tried to blacklist + certain SQL keywords. This is a common, but incorrect ``solution'' of fixing a SQL injection vulnerability. Now we can explain to users why their solution is wrong, and give them helpful tips. \end{description} @@ -330,10 +371,11 @@ This is a very engaging feature that offers an immersive learning experience for users, which many solutions for distance education lack so often. Developers can use a YAML file or write Python code to implement finite -state machines. -In state machine implementations it is possbile to subscribe callbacks to be +state machines in TFW\@. This is going to be further detailed in +Chapter~\ref{usingtfw}. +In the implementation of state machines it is also possible to subscribe callbacks to be invoked on certain events regarding the machine, such as before and after -state transitions, or onentering and exiting a state. +state transitions, or on entering and exiting a state.
It is \emph{very} important to be aware of these callbacks, as much of the actual tutorial logic is often going to be implemented in these. @@ -351,22 +393,28 @@ The \code{trigger} field of a message can be used to step the framework FSM if all preconditions are met. The way this works is if the TFW server encounters a message with a \code{trigger} defined, it notifies the event handler managing -the state machine. +the state machine so it can attempt activating said \code{trigger}. -Since messages can come from unauthenticated sources, it is possible to +Since messages in the system can come from unauthenticated sources (such as the frontend), +it is possible to enforce the authentication of privileged messages, such as messages containing a \code{trigger}. -The framework allows trusted code to access a cryptographic key on the file system, which +The framework allows trusted code to access a cryptographic key stored on the file system +with proper permissions, which can be used to digitally sign messages (this is what the \code{signature} message -field is designed for). -In this case the TFW server will only forward privileged messages that -have a valid signature. +field is designed for) using HMAC% +\footnote{Hash-based message authentication code}. +In this case the TFW server will only forward the privileged messages that +have a valid signature, and the event handler managing the state machine +will also validate the signature of messages it receives +(and sign the updates it broadcasts as well, so that other components can verify that +they come from a trusted source). \subsection{Solution checking}\label{solutioncheck} Traditionally most challenges on the Avatao platform implement a Docker image called \code{controller}, which is responsible for detecting the successful solution of a challenge.
-When using the Tutorial Framework a pre-implemented \code{controller} +When using the Tutorial Framework, a pre-implemented \code{controller} image is available, which listens to messages emitted by the framework FSM, and detects if the final state defined by developers is reached. This means that if content creators implement a proper FSM, the solution checking @@ -378,4 +426,5 @@ traditional hacking challenges, such as exercises developed for CTF% \footnote{A ``capture the flag'' game is a competition designed for professionals --- or just people interested in the field --- to sharpen their skills in IT security. Avatao often organises similar events.} -events. +events, as the controller image is also capable of verifying the authenticity of +FSM update messages via inspecting their signatures. diff --git a/content/introduction.tex b/content/introduction.tex index c3cea40..26b155a 100644 --- a/content/introduction.tex +++ b/content/introduction.tex @@ -3,7 +3,7 @@ \section{Project justification} As the world is being completely engulfed by software, the need for accessible, but -high quality learning materials on software engineering and especially secure software +high quality learning materials covering software engineering and especially secure software engineering is on the rise. While we are enjoying the comfort that information technology provides us, we often forget about the risks involved in relying so much on software in our everyday lives. @@ -39,14 +39,14 @@ knowledge is something that comes naturally, rather than something we have to st I believe that this is something that \emph{can} and \emph{should} be applied to everything we do as a society. The only thing we can hope and work for is to become better and better as time -and generations pass. +and generations pass by. We \emph{must} do better, and education is the way forward. 
The short term goal of this project --- and the goal of this thesis --- is to provide a new angle in the education of software engineering, especially secure software engineering based on the aspirations above, with the long term goal of bringing something new to the table in the matter of IT education as a whole -(not just developers, but users as well). +(not just for developers, but for users as well). \section{A Short Introduction to Avatao} @@ -96,11 +96,12 @@ things like exercises involving the use of Docker or Windows based challenges. \section{Emergence}\label{intro:emergence} While working as a content creator I have stumbled into the idea of automating the completion -of challenges for QA\footnote{Quality Assurrance} and demo purposes% -\footnote{I used to record short videos or GIFs to showcase my content to management}. -In a certain scenario I was required to integrate a web based terminal emulator in a +of challenges for QA\footnote{Quality Assurance} and demo purposes. +I used to record short videos or GIFs to showcase my content to management. +In a certain scenario I was required to integrate a web based terminal emulator into a frontend application to improve user experience by making it possible to use a shell right on the website rather than having to connect through SSH\@. + After I got this working I was looking into writing hacky bash scripts to automate the steps required to complete the challenge in order to make it easier for me to record the solution, as I have often found myself recording over and over again for a demo without any mistakes. @@ -109,6 +110,7 @@ to a hidden gem of a project on GitHub called \code{demo-magic}% \footnote{\href{https://github.com/paxtonhare/demo-magic}{https://github.com/paxtonhare/demo-magic}}, which is esentially a bash script that simulates someone typing into a terminal and executing commands.
+ I have created a fork% \footnote{ \href{https://git.strongds.hu/mrtoth/demo.sh/src/master/demo.sh} @@ -131,7 +133,7 @@ commands executed during the solution process. I was quite pleased with myself, no longer having to do the busywork of recording videos, but what I did not know was that I have accidentally did something far more than a hacky bash script solving challenges, as this little script -would help formulate the idea of the project \emph{Tutorial Framework} or just \emph{TFW}. +would help formulate the idea of the \emph{Tutorial Framework} or just \emph{TFW}. \section{Vision of the Tutorial Framework} @@ -177,14 +179,20 @@ Meanwhile a console could show the output of the webserver. For example if the source code the user tried to deploy was invalid, the framework should report the exact exception raised while running the application. -\pic{figures/webapp_and_editor.png}{The Code Editor and Web Application Example In TFW} +\pic{figures/webapp_and_editor.png}{The code editor and web application example in TFW} Even if we did all this, we would still need a way to integrate this whole thing into a web based frontend with a file editor, terminal, chat window and stuff like that. Turns out that today all this can be done by writing a few hundred lines of Python code which uses the Tutorial Framework. 
-\pic{figures/webapp_and_editor_err.png}{Invalid Code and Deployment Failure with Process Output} +\pic{figures/webapp_and_editor_err.png}{Invalid code and deployment failure with process output} + +Note that it is possible to try out the current version of the Tutorial Framework +using a guest account on the Avatao platform on this +\href{https://platform.avatao.com/paths/d0ccef1f-0389-45bf-9d44-e85b86d66c49/challenges/a7e08c0a-199f-4f8d-aa7e-51b6e9bfcb15}{url}% +\footnote{\href{https://platform.avatao.com/paths/d0ccef1f-0389-45bf-9d44-e85b86d66c49/challenges/a7e08c0a-199f-4f8d-aa7e-51b6e9bfcb15} +{https://platform.avatao.com/paths/d0ccef1f-0389-45bf-9d44-e85b86d66c49/challenges/a7e08c0a-199f-4f8d-aa7e-51b6e9bfcb15}}. \subsection{Project Requirements}\label{requirements} @@ -199,8 +207,8 @@ To achieve our goals we would need: \item a way to keep track of user progress \item a way to to handle various events (i.e.\ we can react when the user has edited a file, or has executed a command in the terminal) - \item a highly flexible messaging system, in which processes and - frontend components (running in a web browser) could communicate with eachother + \item a highly flexible messaging system, in which processes running on the backend and + frontend components running in a web browser could communicate with each other \item a web based frontend with lots of built-in options (terminal, file editor, chat window, etc.) that use said messaging system \item stable APIs that can be exposed to content creators to work with (so that @@ -236,14 +244,14 @@ completely rewritten due to an increased focus on code quality, extensibility and API stability required by new features. It is interesting to note, that when I've mentioned that the project requirements -were kept general on purpose (\ref{requirements}) I had good reason to do so.
-When taking a look at the requirements of Bálint's Thesis, much of that +were kept general on purpose in~\ref{requirements}, I had good reason to do so. +When taking a look at the requirements of Bálint's thesis, much of that is completely obsolete by now. But since the project has followed Agile Methodology% \footnote{Manifesto for Agile Software Development: \href{https://agilemanifesto.org}{https://agilemanifesto.org}} from the start, we were able to adapt to these changes without losing -the progess he made in said Thesis. Quoting from the Agile Manifesto: +the progress he made in said thesis. Quoting from the Agile Manifesto: ``Responding to change over following a plan''. This is a really important takeaway. diff --git a/content/summary.tex b/content/summary.tex index d0e57ba..b403f39 100644 --- a/content/summary.tex +++ b/content/summary.tex @@ -67,17 +67,31 @@ us to do so. \section{Things That I Have Learned} -I've spent a long time working on and maintaining this project. +Despite being an enthusiast of \LaTeX{} for a few years now, I still managed to learn a great +deal about it while working on this text. +This might seem like something unrelated, but most documentation issues with software often +come from the fact that developers usually dislike writing documentation. +Since working with \LaTeX{} I \emph{love} writing larger bodies of text such as this, +as I just simply enjoy admiring quality typography which WYSIWYG% +\footnote{What You See Is What You Get} editors just seem unable to produce. + +I've spent a long time working on and maintaining the Tutorial Framework. While the list of technical things I've learned is long and exciting, I also feel like I've learned a lot about supporting other developers, project management and communication.
-A thing that I will always remember as a software engineer and I've learned during this period +The most important thing, that I will always remember as a software engineer +and is something that I've learned during this period is to never, ever lower my expectations regarding code quality. -No matter what anybody tells you about ``but we have to finish until'' and stuff like that, -in the long run it is always like shooting yourself in the foot. -We as professionals must always thrive for excellence, and must always express our +No matter what anybody tells you about things like ``but we have to make haste and finish in time'', +in the long run, making compromises in code quality is always like shooting yourself in the leg. +We as professionals must always \emph{strive} for excellence, and must always express our deepest respect towards our craft. -The only way we can do this is by creating quality software as craftsmen. +The only way we can do this is by creating quality software while being a responsible +\emph{craftsman}. +It is a thing of great importance, which cannot be stressed enough, that in the software +field \emph{craftsmanship}% +\footnote{\href{http://manifesto.softwarecraftsmanship.org} +{http://manifesto.softwarecraftsmanship.org}} is what matters most. Many developers fail to understand that no matter how insignificant the code you write today may seem, software is art, and art is something worth pursuing just for the sake of doing art itself.
diff --git a/content/using_the_framework.tex b/content/using_the_framework.tex index 057812d..0a8c378 100644 --- a/content/using_the_framework.tex +++ b/content/using_the_framework.tex @@ -1,4 +1,4 @@ -\chapter{Using the Framework} +\chapter{Using the Framework}\label{usingtfw} In this section I am going to dive into further detail on how client code is supposed to use the framework, some of the design decisions behind this and how everything is diff --git a/figures/tfw_fsm.png b/figures/tfw_fsm.png index 09ca27b..2d5ebe8 100644 Binary files a/figures/tfw_fsm.png and b/figures/tfw_fsm.png differ