PUBLICATION: Indexes as Hypertext

June 1st, 2015

Abstract: Digital media presents several challenges to the index, but this ignores the fact that the index has played an important role in the development of the computer. Hypertext, or links between chunks of text, is a vital concept in computation, and one which can be traced back to the index. The author explores the link between indexes and hypertext through three case studies of novels with indexes: Vladimir Nabokov’s Pale fire, Mark Z. Danielewski’s House of leaves and Steven Hall’s The raw shark texts. This analysis reveals how indexes can be used as a subversive part of experimental fiction that authors employ to encourage the reader to move beyond superficial forms of reading.

Simon Rowberry, “Indexes as Hypertext.” The Indexer. June 2015, pp. 50-56

PRESENTATION: 1984 Redux: The long term materiality of the Kindle infrastructure

May 22nd, 2015

Abstract: The launch of the Kindle in 2007 marked the arrival of the eBook as a marketable phenomenon, and in the following years the eBook marketplace has gone from strength to strength. Amazon has consolidated its position as the market leader by creating a complex proprietary infrastructure that has locked users into the Kindle system. This spans the ubiquitous hardware, software, large store and range of services which constitute the Kindle brand.

This has a caveat, as it means that all the data and infrastructure are reliant on Amazon’s continual investment in the Kindle brand. Due to the cloud-based storage of the Kindle’s data and the limited lifespan of the hardware, users are reliant on Amazon’s continual support. This transition from book-as-object to book-as-service offers some exciting opportunities but leaves consumers, and book historians, vulnerable to losing important historical data. The removal of data is not without precedent, as a copy of George Orwell’s 1984 was removed directly from users’ Kindles once it was discovered the publisher did not own the rights to the novel. More recently, Amazon discontinued their Kindle Popular Highlights website, which offered an annotation corpus of over one million individual highlights and is no longer available.

In order to understand the complex materiality of the Kindle’s infrastructure, it is important to understand how it has created a precarious reliance on Amazon to preserve that infrastructure. The current project explores the precarious materiality of the Kindle infrastructure and the difficulties it presents for contemporary and future book historians who wish to delineate a comprehensive account of digital book culture in the early twenty-first century. As a corollary, the paper will suggest some solutions to the problem that can be undertaken now, including the urgent need to preserve the evidence that is proliferating on the Kindle infrastructure.

The strange orthography of ebooks

February 9th, 2015

While the ebook has become a familiar concept since 2007 and the launch of the Kindle, there appears to be little consensus over how exactly to spell it. There appear to be three main contenders: e-book, ebook, and eBook.

Unfortunately, it is difficult to trace usage of small orthographic differences to see the popularity of each over time, but there are clear comparisons with the term ‘email,’ which started off with the hyphen (e-mail) but is now normally simply spelled email as it has become the standard form of communication over the postal system. In early discussions around ebooks, a hyphen similarly marked the emergent form as alien and distinct from its printed counterpart. Perhaps over time we will drop the hyphen and this ellipsis will demonstrate how the ebook has become embedded within contemporary culture, as it is possible to trace with email.

But this leaves the question of the third orthographic variation, ‘eBook.’ While this may look like a riff on Apple’s branding for the iPod and associated devices, its history goes back much further to the first generation of commercial ebook devices, and the Rocket eBook in particular. A couple of other devices borrowed the orthography, and it appears to have caught on beyond the brand. Interestingly, since the ebook revival in 2006, this orthographic convention has not been widely copied, perhaps due to the dominance of Apple with that kind of orthography. Given its awkwardness, particularly when using the word at the beginning of a sentence, perhaps it should be used only with reference to these historic devices.

PRESENTATION: Twitter as a Site of Worship

September 22nd, 2014

Abstract: Robert Darnton’s communication model of the book trade closes with a feedback loop from the reading public back to the author. Traditionally, this would have happened through private correspondence or small-scale public events. The development of large social network sites such as Facebook and Twitter has scaled up these interactions, as well as made them visible to a wider audience, as readers can directly and publicly show their affection and support for their favourite authors.

In recent years, the rise of Twitter has been linked to its successes as a marketing and news network since it functions as a one-way broadcast medium, with many authors using Twitter to engage with an audience. In Twitter parlance, the audience of a Twitter account is referred to as “followers,” a quasi-religious term that demonstrates the relationship between the authors and their readers in many interactions. The Twitter platform has also opened up the possibilities for systematic research of reception, as users can mine the large dataset of tweets for mentions of a particular book or author.

While many authors only use their Twitter account for publicity reasons, if indeed the work is not outsourced, some authors have embraced the medium as a form of communication. Margaret Atwood (@MargaretAtwood, c.450,000 followers), William Gibson (@greatdismal, c.150,000 followers), Neil Gaiman (@neilhimself, c.2,000,000 followers) and E. L. James (@E_L_James, c.450,000 followers) represent four high profile examples of authors using Twitter as both a personal and professional tool. The current project examines the messages sent publicly to the authors as evidence of contemporary readership and the ways in which these interactions demonstrate the reception of twenty-first century authors. The writers’ mixture of tweets about contemporary issues and the creative process, alongside their broadcasting of some fans’ requests, reveals a new and interesting way for authors to engage with their audience. The data reveals that these authors choose to engage with some elements of their contemporary readership, but other comments go unanswered as “prayers,” since the overwhelming volume of requests and messages is unmanageable for an author on their own.

EN3061: Text(ure)

September 17th, 2014

I had a little tinker after last year to focus more on digital culture and less on traditional bibliographical methods. Readings are also updated in places, since, as ever, innovations in digital culture wait for no one.

Learning Outcomes

(a)        Demonstrate proficiency in skills necessary to analyse traces of the production and reception of texts in a variety of formats, both print and digital.

(b)       Have a sophisticated understanding of how a single text may exist in many different formats and how this may fundamentally alter the reception of the text.

(c)        Show an advanced awareness of contemporary techniques for analysing texts using digital tools.

(d) Critically evaluate interdisciplinary data available digitally.

Module Structure

Week 1:          Introduction

Week 2:          Literate, Oral and Tactile
Key terms: oral, literate, modality, tactile
Rubery, M. “Canned Literature: The Book after Edison.” Book History 16.1 (2013): 215-245.

Week 3:          Signs & Symbols
Key terms: writing systems, Unicode, typography, punctuation, emojis
Selections from Houston, K. 2013. Shady Characters: The Secret Life of Punctuation, Symbols and Other Typographical Marks. New York: W. W. Norton
Coulmas, F., 1989. The writing systems of the world, Oxford: Basil Blackwell. Chapter 2

Week 4:          Cryptography (and poster workshop)
Key terms: code, cracking, information theory, cipher
Excerpts from Kahn, D., 1996. The codebreakers: the story of secret writing. New York: Scribner.

Week 5:          Book History
Key terms: publishing, reception, materiality
Darnton, R., 2007. “What is the History of Books?” Revisited. Modern Intellectual History, 4(03), pp.495–508.
Anderson, B., 1991. Imagined Communities: Reflections on the Origin and Spread of Nationalism Revised., London: Verso. Chapter 3.

Week 6:          Born digital
Key terms: hypertext, code, platform, software
Barnet, B., 2012. Machine Enhanced (Re)minding: the Development of Storyspace. Digital Humanities Quarterly, 6(2).

Week 7:          POSTER SESSION

Week 8:          Digitization workshop
Key terms: Facsimile, scanning, OCR
Mak, B. 2014. Archaeology of a digitization. Journal of the Association for Information Science and Technology, 65: 1515–1526.
Spend at least 30 minutes acquainting yourself with one or more of these resources: HathiTrust, NYPL Menus, Project Gutenberg or EEBO

Week 9:          eBook History
Key terms: eBook, formats, updatability
Maxwell, J. 2013. E-Book Logic: We Can Do Better. Papers Of The Bibliographical Society Of Canada, 51(1).

Week 10:        Social Texts
Key terms: annotation, marginalia, reception
Find one or two annotated books in the library/your own collection or the Harvard Views of Readers, Readership and Reading History collection
Sherman, W.H., 2008. Used Books: Marking Readers in Renaissance England, Philadelphia: University of Pennsylvania Press. Introduction & Excerpts (LN)
Jackson, H.J., 2002. Marginalia: Readers Writing in Books, New Haven and London: Yale University Press. Chapter 3

Week 11:        Artists’ Books (case study workshop)
Key terms: book-as-object, form vs. content, artists’ book
Excerpts from Drucker, J., 1995. The Century of Artists’ Books, New York: Granary Books.

Week 12:        Automated reading & writing
Key terms: searching, automation, bot
Rosenberg, D., 2014. “Stop, Words.” Representations, 127(1), pp. 83-92.

Semester 2, Week 1: Case Studies Due

PRESENTATION: Indexes as Hypertext

September 5th, 2014

Abstract: The digital revolution has led to the development of new forms of literature, including hypertext fiction. Hypertext, most commonly known as links on the Internet, is not exclusive to digital media, but instead has a long history in print. One of the ways in which hypertext can appear in print is through creative use of indexes to form a conceptual network on top of the linear text. With reference to three novels—Vladimir Nabokov’s Pale Fire, Mark Danielewski’s House of Leaves, and Steven Hall’s Raw Shark Texts—this talk will demonstrate the ways in which indexes are used in fiction to encourage readers to search through the text to assemble their own interpretation of the text. These innovative uses of indexes in fiction offer a blueprint for the creative appropriations of the index in digital fiction.

POSTER: Marked E-Books and Kindle’s popular highlight culture.

July 10th, 2014

You can find further information about the project here:

Abstract: The current project analyses the evidence of readership available through the public-facing popular highlights feature of Amazon’s Kindle platform. In order to be considered a popular highlight, the text must be shared by three users. There are over one million quotations that meet this basic criterion and can be analyzed in similar ways to evidence of marginalia and provenance in book historical research. The present research analyses the popular highlights as a measure of various genres’ popularity as well as observing usage patterns of the highlighting and sharing features.

The static e-book has become embedded in the public’s imagination as an exemplar of the future of reading on the screen. The Kindle is one of the forerunners in the commercial e-book marketplace, encompassing a range of both software and hardware platforms and offering millions of titles. While others have begun to explore the impact of e-book culture (Galey 2012; Lang 2012; Wu 2013; Thomas & Round 2013), the current project focuses on the traces readers leave directly on their Kindles. Amazon offer tools to share annotations and highlights of their eBooks to replicate print marginalia. The data for popular highlights is shared on a public-facing webpage (Amazon.com, Inc. 2013) that can be collected for analysis. This research offers an approach to the empirical study of reception on an unprecedented scale and offers an insight into what users find interesting about the material they are reading.

The data was collected using wget on the Kindle Popular Highlights website, as Amazon does not currently offer an API for the dataset. The project focused on the Popular Highlights feature and the metadata pertaining to the book title, author, quotation and number of highlights. While this does not provide evidence of individual readers, it can be used to analyse patterns of readership and marginalia. An initial foray produced the first 100,000 popular highlights (out of a dataset of over 1,000,000 highlights), which were generated by over 8 million shared highlights. Unfortunately this method left many artefacts when converting certain characters, so the data was cleaned and organized.
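The cleaning step is not described in detail. As a minimal sketch, assuming the artefacts were unresolved HTML entities and stray whitespace left over from the page markup (an assumption, since the text does not say which characters were affected), a cleaning pass in Python might look like the following. The function name clean_highlight is hypothetical and is not taken from the project.

```python
import html
import re
import unicodedata


def clean_highlight(raw: str) -> str:
    """Normalise one scraped quotation (hypothetical helper, not the project's code)."""
    # Resolve HTML entities such as &amp; or &#8217; into real characters.
    text = html.unescape(raw)
    # Use one canonical Unicode form so identical quotations compare equal.
    text = unicodedata.normalize("NFC", text)
    # Collapse runs of whitespace, including newlines and non-breaking spaces.
    text = re.sub(r"\s+", " ", text)
    return text.strip()


print(clean_highlight("It&#8217;s  a\n &amp; b&nbsp;"))
```

Normalising before any counting matters here because the same passage scraped from two pages would otherwise be treated as two different quotations.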

The initial results revealed some interesting patterns. The most highlighted books were primarily Young Adult (YA) fiction, literary classics, pop science and self-help. Individual passages can be highlighted more than 1,000 times, with a quotation from Catching Fire (The Second Book of the Hunger Games) receiving over 17,000 highlights. Each genre’s annotations often fit into roughly categorized groups: literary classics and pop science produce pithy aphorisms; self-help books are quoted for their instructions; and YA readers generally highlight “spoilers” and dialogue that is central to the novel’s plot. Over 90% of the quotations are under 350 characters, although occasionally readers will highlight a whole page. Since one of the core features of the popular highlight function is the ability to re-use the quotations as tweets, brevity of quotation length is expected, and this is confirmed as 42% of the highlights are tweetable. As the number of highlights falls, the books’ genres tend to become more esoteric and the highlights become fuzzier. Some of these bear the marks of experimenting with the feature or more playful purposes, such as “THE” in the New Oxford American Dictionary receiving 73 highlights.
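Length figures of this kind can be computed with a short script. This is a sketch rather than the project’s own code: it assumes each record is a (quotation, highlight count) pair, and that “tweetable” means 140 characters or fewer, Twitter’s limit at the time (the original analysis may have reserved room for a link). The sample records are invented for illustration and are not project data.

```python
from typing import Iterable, Tuple

TWEET_LIMIT = 140  # assumption: Twitter's character limit in 2014


def share_within(records: Iterable[Tuple[str, int]], max_chars: int) -> float:
    """Return the fraction of quotations no longer than max_chars."""
    lengths = [len(quote) for quote, _count in records]
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n <= max_chars) / len(lengths)


# Invented sample records for illustration only, not project data.
sample = [
    ("A short aphorism.", 120),
    ("x" * 200, 45),
    ("y" * 400, 3),
    ("z" * 100, 9),
]

print(f"tweetable: {share_within(sample, TWEET_LIMIT):.0%}")
print(f"under 350 characters: {share_within(sample, 349):.0%}")
```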

The analysis comes with a few caveats: (1) the Kindle is only one eBook provider and is not representative of digital reading; (2) it is unknown to what degree this data is representative of reading on the Kindle in general; (3) the sample analysed so far omits roughly 90% of the full dataset; and (4) without a finer breakdown of the users’ demographics, the data can only tell us so much about what the readers are attempting to do through highlighting. Nonetheless, the Kindle Popular Highlights dataset offers a snapshot of the possible ways in which book historical research can be conducted in the early twenty-first century.

References

Amazon.com, Inc., 2013. Most Highlighted Passages of All Time. Available at:

Galey, A., 2012. The Enkindling Reciter: E-Books in the Bibliographical Imagination. Book History, 15(1), pp.210–247.

Lang, A. ed., 2012. From Codex to Hypertext: Reading at the Turn of the Twenty-First Century, Amherst and Boston: University of Massachusetts Press.

Thomas, B. & Round, J., 2013. Digital Reading Network. Available at: [Accessed October 27, 2013].

Wu, Y.-H., 2013. Kindling, Disappearing, Reading. , 7(1). Available at: [Accessed October 27, 2013].

PRESENTATION: Reading the Kindle’s (Non-)Readers

June 24th, 2014

Abstract: As readers have migrated to eBooks and similar digital forms, there has been a transformation in the manner in which they leave marks on books. With this shift, there has been a movement from the widespread ability to find evidence of individual readers towards an aggregation of this information as a monolithic entity often described as “big data,” offering little discrimination between various people. The Kindle contains a database of several million individual highlights that cannot be analysed in great detail on the individual level, but only on a global level, where nuance is lost and analysis becomes vaguer.

One way in which we can trace the reader in these new forms is by instead looking at the collective markings of a single book. Surprisingly, the most popular book to annotate is The New Oxford American Dictionary, which has been pre-installed on all the devices sold in the US. Over 1000 people have left annotations on the dictionary, but in an atypical way, characterizing many of Leah Price’s arguments about non-responsive readers. These notes are not close readings of the dictionary but rather something more complex and social, as a group has formed, primarily of pseudonymous teenagers, who use the dictionary in order to chat in a space that has been unrestricted by their parents or educators.

This paper examines this pocket of activity and the degree to which it is representative of all digital reading practices and evidences. As the text is transformed into a social network and data to be mined for a variety of companies, to what degree can we see these readers as representative of the new forms of reading on top of the book rather than through or into it?

“Reading the Kindle’s (Non-)Readers.” Real, ideal or implied…? The Reader in Stylistics. June 2014. University of Nottingham.

PRESENTATION: Used eBooks

June 19th, 2014

In lieu of an abstract: in this presentation I looked at the methodological barriers to studying eBooks and how we can reconcile the ability to distant read millions of shared highlights with the lack of access to users’ personal used eBooks.

“Used eBooks.” Digital Reading Network Symposium, June 2014. University of Bournemouth.

Twitterbots: Reading Automata

June 9th, 2014

One of Twitter’s unique selling points as a social network is its unerring focus on text. Even posting a picture or video independently generates a textual anchor for the media in the form of a URL. As a textual medium, Twitter’s primary currency is reading. The politics of reading, and more typically, not reading, characterizes a user’s relationship with their audience of followers and beyond.

It turns out that some of Twitter’s most voracious readers are not human at all, but rather the range of artisan bots that have emerged in the last couple of years. While they vary in type dramatically (Tully Hansen’s taxonomy covers this territory well), at the most fundamental level these bots are reading machines. On a basic level, the tweets emerge from the bots reading, and enacting, their source algorithms, but their literacy extends beyond that. These bots do not generate material out of nothing but rather read a variety of sources (Twitter searches, the dictionary, novels, ROM texts, headlines, and other assorted materials) and present their readings as new writings.

This sleight of hand, based upon the process of reading to write, is reminiscent of automata that have intrigued countless historical audiences. Through use of clockwork and other mechanisms, automata maintain an illusion of autonomy. Once the underlying mechanics have been figured out, the automata become either trivial or a source of joy in appreciating those mechanics. Twitterbots garner similar reactions once the processes have been understood, although many make use of dynamic and timely reading sources to ensure proceedings do not become stale. Nonetheless, Adam Parrish’s @everyword, a project that did exactly what its name promised in alphabetical order, is probably the most popular bot, despite its stable reading material and its relatively predictable ending (which was eventually subverted because words starting with é were collated after z).

@horse_ebooks, the most contentious, and previously beloved, of all Twitterbots, has a special place in this analogy: the machine that was disappointingly all too human in the end. The grand reveal that the account which generated bizarrely poetic and uncommercial spam was in fact a human performance mirrors the trajectory of a human-run automaton affectionately known as the Mechanical Turk (picked as the name for Amazon’s crowdsourcing service due to its namesake’s “artificial artificial intelligence”). The Turk appeared to be a brilliantly gifted automatic chess-player but was actually entirely controlled by a human hidden in a secret compartment behind the false clockwork. The performance that fueled @horse_ebooks’s final years equally represented a kind of reverse Turing Test, whereby a human attempted to appropriate the linguistic tics of a bot.

These reading automata become much more interesting when considering the ways in which they challenge our notions of reading. Take, for example, Mark Sample’s Station 51000 (@_LostBuoy_), which plays with the tensions of reading on various levels currently being teased out in humanities research.  The bot mixes a reading of sections of Moby Dick with live data from the unmoored buoy classified as station 51000. Despite the specificity of location, the buoy still transmits a range of maritime data. The mash-up of a single, fixed, canonical work of literature with an erratic stream of nautical data goes beyond a comical clash of high culture, low culture and data—it reflects upon digital methodologies of reading that have emerged in recent decades including the use of “big data.” Of course, I’m not the first one to notice this, and the trend in Twitterbots more generally:

As automata, it is not up to these bots to make aesthetic decisions, as evidenced by Station 51000’s mixture of the literary and real-time feed. Instead they can be used to push the limits of what reading means, and occasionally make us smile or laugh.