Continue writing thesis

This commit is contained in:
Kristóf Tóth 2018-12-02 18:02:48 +01:00
parent ce8cad2ac4
commit 2f728dc353
2 changed files with 118 additions and 10 deletions


@ -119,3 +119,12 @@
year={2016}, year={2016},
month=mar, month=mar,
} }
@online{NodeModules,
title={Whats really wrong with node\_modules and why this is your fault},
url={https://hackernoon.com/whats-really-wrong-with-node-modules-and-why-this-is-your-fault-8ac9fa893823},
language={english},
author={Mateusz Morszczyzna},
year={2017},
month=dec,
}


@ -273,29 +273,30 @@ initialized in, which exposes several communicative options:
captionpos=b captionpos=b
]{listings/event_handler_main.py} ]{listings/event_handler_main.py}
\section{Setting Up a Developer Environment} \section{Setting Up a Developer Environment}\label{devenv}
To make getting started as smooth as possible I have created To make getting started as smooth as possible I have created
a ``bootstrap'' script which is capable of creating a development environment from a ``bootstrap'' script which is capable of creating a development environment from
scratch. scratch.
This script is distributed as a bash one-liner command, like so: This script is distributed as the following bash one-liner:
\begin{lstlisting}[language=bash] \begin{lstlisting}[language=bash]
bash -c "$(curl -fsSL https://git.io/vxBfj)" bash -c "$(curl -fsSL https://git.io/vxBfj)"
\end{lstlisting} \end{lstlisting}
This command downloads a script using \code{curl}, then executes the downloaded This command downloads a script using \code{curl}%
script in bash. \footnote{\href{https://curl.haxx.se}{https://curl.haxx.se}}, then executes it in bash.
In the open source community it is quite common to distribute installers this way% In the open source community it is quite common to distribute installers this way%
\footnote{A good example of this is oh-my-zsh \footnote{A good example of this is oh-my-zsh
\href{https://github.com/robbyrussell/oh-my-zsh}{https://github.com/robbyrussell/oh-my-zsh}}, \href{https://github.com/robbyrussell/oh-my-zsh}{https://github.com/robbyrussell/oh-my-zsh}},
which might seem a little scary at first, but is no less safe than which might seem a little scary at first, but is no less safe than
downloading and executing a binary installer from a website with a valid TLS certificate. downloading and executing a binary installer from a website with a valid TLS certificate, as
\code{curl} will fail with an error message if the certificate is invalid.
This is because both methods place their trust in the PKI~\footnote{Public Key Infrastructure} This is because both methods place their trust in the PKI~\footnote{Public Key Infrastructure}
to defend against man-in-the-middle% to defend against man-in-the-middle%
\footnote{\href{https://www.owasp.org/index.php/Man-in-the-middle_attack} \footnote{\href{https://www.owasp.org/index.php/Man-in-the-middle_attack}
{https://www.owasp.org/index.php/Man-in-the-middle\_attack}} attacks. {https://www.owasp.org/index.php/Man-in-the-middle\_attack}} attacks.
Debating the security of this infrastructure is certainly something that we Debating the security of this infrastructure is certainly something that we
as an industry should constantly do, but it is out of scope for this paper. as an industry should constantly do, but it is out of the scope of this paper.
Nevertheless I have also created a version of this command that Nevertheless I have also created a version of this command that
checks the SHA256 checksum of the bootstrap script before executing it checks the SHA256 checksum of the bootstrap script before executing it
@ -312,10 +313,12 @@ then pipes it into a bash interpreter \emph{only if} the checksum
of the downloaded string matches the one provided, otherwise it displays of the downloaded string matches the one provided, otherwise it displays
an error message. an error message.
Software projects distributing their product as binary installers often Software projects distributing their product as binary installers often
display such checksums on their download pages. display such checksums on their download pages in order to
help mitigate MITM attacks.
The bootstrap script clones the three TFW repositories and does several steps The bootstrap script clones the three TFW repositories and does several steps
to create a working environment: to create a working environment in a single directory that is based on
test-tutorial-framework:
\begin{itemize} \begin{itemize}
\item It builds the newest version of the TFW baseimage locally \item It builds the newest version of the TFW baseimage locally
\item It pins the version tag in \code{solvable/Dockerfile}, \item It pins the version tag in \code{solvable/Dockerfile},
@ -325,8 +328,104 @@ to create a working environment:
\end{itemize} \end{itemize}
It is important to note that this script \emph{does not} install anything system-wide, It is important to note that this script \emph{does not} install anything system-wide,
it only works in the directory it is being executed from. it only works in the directory it is being executed from.
This is a good practice, as many users --- including me --- find scripts that
write files all around the system intrusive if they could provide the same functionality
while working in a single directory.
It would be a lot easier to simply use Docker Hub% It is also worth mentioning that it would have been a lot easier to simply use Docker Hub%
\footnote{\href{https://hub.docker.com}{https://hub.docker.com}}, \footnote{\href{https://hub.docker.com}{https://hub.docker.com}},
but since the code base is currently proprietary we cannot distribute but since the code base is currently proprietary we cannot distribute
it using a public medium. it using a public medium, and we use our own image registry to store private Docker
images.
\section{Building and Running a Tutorial}
After the environment has been created using the script described in~\ref{devenv},
it is possible to simply use standard Docker commands to build and run the tutorial.
Yet the \code{hack} directory of test-TFW also provides a script called
\code{tfw.sh} that developers prefer to use for building and running their
exercises.
Why is this the case?
\subsection{The Frontend Issue}
To understand this, we first need to look at the
build process of Angular projects.
When frontend developers work on Angular projects, they usually use the built-in
developer tool of the Angular-CLI%
\footnote{\href{https://cli.angular.io}{https://cli.angular.io}},
\code{ng serve} to build and serve their application.
The advantage of this tool is that it automatically reloads the frontend
when the code on disk is changed, and that it is generally very easy to work with.
On the other hand, a disadvantage is that a \code{node_modules} directory
containing all the npm%
\footnote{\href{https://www.npmjs.com}{https://www.npmjs.com}}
dependencies of the project must be present while doing so.
The problem with this is that because the JavaScript ecosystem is a \emph{huge}
mess~\cite{NodeModules}, these dependencies can easily get up to
\emph{several hundreds of megabytes} in size.
To solve this issue, when creating production builds,
Angular uses various optimizations such as tree shaking%
\footnote{\href{https://webpack.js.org/guides/tree-shaking/}
{https://webpack.js.org/guides/tree-shaking/}}
to remove all the dependencies that won't be used when running the application%
\footnote{Otherwise it would not be possible to serve these applications efficiently
over the internet}.
The problem is that these optimizations can take a \emph{really} long time.
This is why today frontend builds usually take a lot longer than building anything
not involving JavaScript (such as C++, C\# or any other compiled programming language).
This mess presents its own challenges for the Tutorial Framework as well.
Since hundreds of megabytes of dependencies have no place inside Docker containers%
\footnote{Otherwise it may take tens of seconds just to send the build context to
the Docker daemon, which means waiting even before the build begins},
by default the framework will only place the results of a frontend production build
of \code{solvable/frontend} into the image layers.
This slows down the build of TFW-based challenges so much that instead of about
30 seconds, it will often take 5 to 10 minutes.
\subsection{The Solution Offered by the Framework}
To circumvent this, it is possible to entirely exclude the Angular frontend from a TFW
build, using build time arguments%
\footnote{In practice this is done by supplying the option
\code{--build-arg NOFRONTEND=1} to Docker}.
But when doing so, developers would have to run the frontend locally with
the whole \code{node_modules} directory present.
The bootstrap script takes care of putting these dependencies there,
while the \code{tfw.sh} script is capable of starting a development server
to serve the frontend locally using \code{ng serve} besides starting
the Docker container without the frontend.
As if this weren't complicated enough, since Docker binds the port
the container is going to use, \code{tfw.sh} has to run this dev server on
another port, then use the proxying features of Angular-CLI to forward requests
from this port to the running Docker container when requesting resources
other than the entrypoint to the Angular application.
This is the reason why the frontend is accessible through port \code{4200} (default
port for \code{ng serve}) when using \code{tfw.sh} to start a tutorial, but when running
a self-contained container built with the frontend included it is accessible on port \code{8888}
(the default port TFW uses).
While it also provides lots of other functionality, this is one of the reasons why
the \code{tfw.sh} script is a bash script several hundred lines long.
The implementation of making the frontend toggleable during Docker builds requires some
of the \code{ONBUILD} instructions discussed earlier:
\begin{lstlisting}[language=bash]
ONBUILD RUN test -z "${NOFRONTEND}" &&\
cd /data && yarn install --frozen-lockfile || :
ONBUILD RUN test -z "${NOFRONTEND}" &&\
cd /data && yarn build --no-progress || :
ONBUILD RUN test -z "${NOFRONTEND}" &&\
mv /data/dist ${TFW_FRONTEND_DIR} && rm -rf /data || :
\end{lstlisting}
Remember that \code{ONBUILD} commands run in the build context of the child image.
These commands check whether the \code{NOFRONTEND} build argument
is present, and only deal with the frontend if this argument is not defined.
The \code{|| :} notation in bash basically means ``or true'', which is required
to avoid aborting the build due to the non-zero return code produced
by the \code{test} command if the build arg is defined.
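The behaviour of this idiom can be tried out in a plain bash shell, outside of Docker.
The following is only a minimal sketch: the \code{frontend\_step} function and the
message it prints are hypothetical stand-ins for one of the \code{ONBUILD} steps above,
and its first parameter plays the role of the \code{NOFRONTEND} build argument.
\begin{lstlisting}[language=bash]
# Hypothetical helper imitating a single ONBUILD step (illustrative only)
frontend_step() {
    # $1 stands in for the NOFRONTEND build argument
    local NOFRONTEND="$1"
    # Runs the "build" only if NOFRONTEND is empty, and always exits with 0
    test -z "${NOFRONTEND}" && echo "building frontend" || :
}

frontend_step ""    # argument not defined: the frontend step runs
frontend_step "1"   # argument defined: the frontend step is skipped
echo "exit status: $?"  # still 0, so a Docker build would not abort
\end{lstlisting}
One property of this construct is worth keeping in mind: in \code{a \&\& b || :}
the \code{:} command also runs when \code{b} itself fails, so a failure of the
guarded command would be masked by a zero exit status as well.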