Literate Programming, Reproducible Research, and "Clean Code" + Docstrings
What is “Literate Programming”?
Donald Knuth formally introduced the idea of “Literate Programming” in 1984 (Knuth, 1992). In his book he suggested the following change in attitude within the construction of programs:
Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want the computer to do.
At Stanford University the idea of literate programming was realized through
the WEB
system. It involves combining a document formatting
language with a programming language (Knuth, 1992). In the original report
submitted to The Computer Journal in 1983, the example presented of the
WEB
system consisted of the combination of TeX
, and PASCAL
. This
choice is not surprising, as Donald Knuth created TeX
. The code was written
in a single .WEB
file. A reason Knuth was happy with the term WEB
is
because he considered a complex piece of code as a web simple
materials delicately placed together (Knuth, 1992). The ‘weaving’ process
involves separating out the document formatting language into a separate
file. While the ‘tangling’ entails extracting the programming language to
produce a machine-executable code (Knuth, 1992). This process was depicted
as figure 1 (Knuth, 1992).

Figure 1: Adopted image from the original paper showing dual usage of a .WEB
file. The ‘weaving’ separates out the TeX
from the .WEB
file compiles it to an output document, while the ‘tangling’ extracts the PASCAL
code to produce a machine-executable program.
If you consider using a tool for literate programming their website has a long list of tools which can be used for writing code with literate programming (“More Tools”). While it is comprehensive, it does not list a fantastic tools which will be mentioned later. There are also several other projects online which seem to be popular, such as Zachary Yedidia’s Literate project.
Data Science and Reproducible Research
The term “data science” likely first appeared in 1974 and initially defined by the following quote (Cao, 2017).
The science of dealing with data, once they have been established, while the relation of the data to what they represent is delegated to other fields of data.
The term has increased in popularity since 2012 (Cao, 2017). This increase in interest is likely due to the emergence of data mining, big data and the advances in Artificial Intelligence (AI) and Machine Learning (ML). Due to the accumulation of large amounts of data, interdisciplinary tools are being combined with to form a “new” field. A more modern and comprehensive definition of data science is as follows (Cao, 2017).
From the disciplinary perspective, data science is a new interdisciplinary field that synthesizes and builds on statistics, informatics, computing, communication, management, and sociology to study data and its environments (including domains and other contextual aspects, such as organizational and social aspects) in order to transform data to insights and decisions by following a data-to-knowledge-to-wisdom thinking and methodology.
A central point in literate programming is the intent of ‘weaving’ and ‘tangling’ the original document into two separate files. The original web document was not necessarily intended to be used as the final result. However, the modern use of “reproducible research” implies using a similar literate programming web document as the final result (Goodman, Fanelli, & Ioannidis, 2018).

Figure 2: Example of a Jupyter Notebook showing the mix of rich text, code block, along with output.
The computer scientist Jon Claerbout coined the term “reproducible research” and described it with a reader of a document being able to see the entire process from the raw data to the code producing tables and figures (Goodman et al., 2018). The term can be slightly confusing, as reproducibility has long been an important part of the research and the scientific method.
One of the issues in with the use of software in scientific field generally, is the reproducibility of the research being conducted (Hinsen, 2013). Therefore, the greater context surrounding the code is more important and possibly not self explanatory through the code alone. Experts within the field of exploratory data science even struggle keeping track of the experiments they conduct. It can lead to confusion of what experiments have been performed and how they came by the results (Hinsen, 2013). Since digitalization and digital tools have become a more important part of many fields, reproducible research has become popular in such fields such as epidemiology, computational biology, economics, and clinical trials (Goodman et al., 2018).
While literate programming and reproducible research are closely related in form, they were intended for slightly different use-cases. Literate programming is not specifically for research, but is a more broad approach to programming. While reproducible research focuses on providing the full context of an experiment or study and not making the code itself more legible.
Some of the more popular tools for reproducible research are
Jupyter Notebooks for programming in python (example in figure
2), and the integration of SWeave
and knitr
into R
Markdown for programming in the statistical language R (Kery, Radensky, Arya, John, & Myers, 2018).
Jupyter Notebooks is an open source spin-off project from the IPython
(Interactive Python) project. Python is considered a fairly easy language to
learn, there are several python projects for both data manipulation, and
support for different machine learning frameworks such as Tensorflow and
PyTorch. This makes Jupyter Notebooks particularly well suited for data
science.
Readable and Clean Code
Knuth’s attitude change mentioned above was primarily about making code more readable for humans. Literate programming is a method of accomplishing this goal, but it is not the only avenue to take. What if the code was written so well, it was completely self-explanatory.
Robert C. Martin, affectionately called “Uncle Bob”, is a software engineer and the author of the bestselling book Clean Code: A Handbook of Agile Software Craftsmanship (Martin & Coplien, 2009). He has an interesting talk on clean code where he explains that code should read like “well written prose”. In his book he writes (Martin & Coplien, 2009):
Indeed, the ratio of time spent reading versus writing is well over 10 to 1. We are constantly reading old code as part of the effort to write new code. …[Therefore,] making it easy to read makes it easier to write.
While writing readable code is something to always strive towards, it will likely never be completely self-explanatory. Therefore, it would likely be beneficial to write good documentation to complement the code.
Self-Documentation with Docstrings
Emacs has a nice method of self-documentation which is very helpful
to understanding the editors source code written in elisp
. It
uses documentation strings (or docstrings), which are comments directly
written into the code. Writing comments in code to describe functions is not
at all uncommon. However, traditional in-file documentation is stripped
from the source during runtime, such as Javadocs. Docstrings are maintained
throughout the runtime of the code in which it is written. The programmer
can then interactively inspect this throughout the runtime of the program.
Some notable languages which support Docstrings include:
- Lisp/Elisp
- Python
- Elixir
- Clojure
- Haskell
- Gherkin
- Julia
Emacs is almost like working in an lisp
terminal. The editor is written
entirely in a variant of lisp
, called elisp
. For example, in order to
split a window vertically, the function split-window-below
is called. The
code snippet below is the start of the source code for this function.
(defun split-window-below (&optional size)
"Split the selected window into two windows, one above the other.
The selected window is above. The newly split-off window is
below and displays the same buffer. Return the new window.
If optional argument SIZE is omitted or nil, both windows get the
same height, or close to it. If SIZE is positive, the upper
\(selected) window gets SIZE lines. If SIZE is negative, the
lower (new) window gets -SIZE lines.
If the variable `split-window-keep-point' is non-nil, both
windows get the same value of point as the selected window.
Otherwise, the window starts are chosen so as to minimize the
amount of redisplay; this is convenient on slow terminals."
(interactive "P")
(let ((old-window (selected-window))
(old-point (window-point))
(size (and size (prefix-numeric-value size)))
moved-by-window-height moved new-window bottom)
(when (and size (< size 0) (< (- size) window-min-height))
;; `split-window' would not signal an error here.
(error "Size of new window too small"))
;; ...
;; ...
;; ...
When calling the function describe-function
and inputting the function
name, “split-window-below”, it will generate the following output in a
mini-buffer (a kind of little buffer). This documentation is created
dynamically during runtime.
split-window-below is an interactive compiled Lisp function in ‘window.el’.
It is bound to SPC w -, SPC w s, M-m w -, M-m w s, C-x 2,
. (split-window-below &optional SIZE)
Probably introduced at or before Emacs version 24.1.
Split the selected window into two windows, one above the other. The selected window is above. The newly split-off window is below and displays the same buffer. Return the new window.
If optional argument SIZE is omitted or nil, both windows get the same height, or close to it. If SIZE is positive, the upper (selected) window gets SIZE lines. If SIZE is negative, the lower (new) window gets -SIZE lines.
If the variable ‘split-window-keep-point’ is non-nil, both windows get the same value of point as the selected window. Otherwise, the window starts are chosen so as to minimize the amount of redisplay; this is convenient on slow terminals.
[back]
Comparing the docstring in the source code and the output of
describe-function
, there is more information added to the output
documentation. This was a slight aside, but using describe-*
functions in
Emacs is probably of the most useful and helpful ways to learn emacs.
However, this does show a benefit of how docstrings are used in emacs.
Org-Mode for Literate Programming and Reproducible Research
While Jupyter Notebooks are a fantastic way of writing reproducible
research, they are not a method of literate programming as they are not
intended to be ‘weaved’ and ‘tangled’. The original tool such as WEB
system for literate programming, does not allow for compiling embedded code
interactively. However, Org-mode was the first to provide full support for
reproducible research and literate programming (Schulte, Davison, Dye, & Dominik, 2012).
Org-mode is something called a major mode in emacs. An .org
file is essentially
a plain text markup language, but there are so many things that can be done
in org-mode it is mind-boggling. With some configuration it can be used as
almost anything including an advanced agenda system, calendar, financial
ledger, zettlekasten note taking system (org-roam), and an exporter into
almost any text formatting. Even that was an inadequate list, but for those
interested this is a more complete list.

Figure 3: An example of a .org
file with code blocks in emacs-lisp
which will be tangled to a file called spacemacs.el
.
Using a package called org-babel
it can allow for both ‘weaving’ using
org-mode’s built in export functionality, and ‘tangling’ using
org-babel-tangle
of the entire file (see figure 3). As with other markup languages,
.org
files already support rich text formatting of code blocks. With
simple tags and commands the code can run code blocks can be run dynamically
by using certain session
tags. Running embedded code blocks in this manner
through the same, or multiple, sessions in the same file is important for
reproducible research. More details on using org-mode for both literate
programming and reproducible research is the paper by (Schulte et al., 2012).
A blog post which explains and demonstrates how to use org-babel
can be
found here.
Closing thoughts
While literate programming is the a useful tool for providing more context to programming projects, in many cases writing self-explanatory code might be the best avenue for pure software engineering. Reproducible research is more suited for explaining a study or an experiment which is particularly useful in the scientific practice today as the use of digital tools continues to grow. Contrary to software engineering, these use-cases aren’t well suited for simply writing self-explanatory code. Literate programming and reproducible research attack mixed natural language and computational language documents for different ends (Schulte et al., 2012). While literate programming introduces natural language to programming files, reproducible research adds embedded executable code into natural language documents (Schulte et al., 2012).
If humans were capable of easily writing in machine code, there would be no need for all the different types of programming languages. The reason we use programming languages is to be able to interpret and understand what we are instructing the computer to do. Therefore, making the code understandable should be a top priority. Even though Knuth’s literate programming might not be suitable in all settings, I think his attitude about telling the reader what we are instructing the machine to perform is valuable in any programming situation. This is especially true for collaboration and long term complexity of coding projects.
While some of the differences between literate programming and reproducible research have been mentioned, particular examples of literate programming were not discussed. Configuration files are ideal for literate programming. They typically require explanations of the context of the settings or an explanation of the setting itself. After writing the literate configuration, they can be tangled to generate the actual configuration. I hope to write a post soon about showing how I created a literate configuration for spacemacs.
Bibliography
Cao, L. (2017, June). Data science: A comprehensive overview. ACM Comput. Surv., Vol. 50, pp. 1–42. Association for Computing Machinery. Retrieved from https://dl.acm.org/doi/10.1145/3076253
Goodman, S. N., Fanelli, D., & Ioannidis, J. P. (2018). What does research reproducibility mean? In Get. to Good Res. Integr. Biomed. Sci. (Vol. 8, pp. 96–102). Springer International Publishing. Retrieved from www.ScienceTranslationalMedicine.org
Hinsen, K. (2013). Software development for reproducible research. Comput. Sci. Eng., /15/(4), 60–63.
Kery, M. B., Radensky, M., Arya, M., John, B. E., & Myers, B. A. (2018). The Story in the Notebook: Exploratory Data Science using a Literate Programming Tool. Retrieved from https://doi.org/10.1145/3173574.3173748
Knuth, D. E. (1992). Literate programming. In Cent. Study Lang. Inf.
Martin, R. C., & Coplien, J. O. (2009). Clean code: a handbook of agile software craftsmanship. Upper Saddle River, NJ [etc.]: Prentice Hall. Retrieved from https://www.amazon.de/gp/product/0132350882/ref=oh%5Fdetails%5Fo00%5Fs00%5Fi00
Schulte, E., Davison, D., Dye, T., & Dominik, C. (2012). A multi-language computing environment for literate programming and reproducible research. J. Stat. Softw., /46/(3), 1–24. Retrieved from https://www.jstatsoft.org/index.php/jss/article/view/v046i03/v46i03.pdf https://www.jstatsoft.org/index.php/jss/article/view/v046i03
More Tools. Lit. Program. Retrieved from http://www.literateprogramming.com/tools.html