PUBLICATION: DIY Peer Review and Monograph Publishing in the Arts and Humanities

Butchard, Dorothy, Simon Peter Rowberry & Claire Squires (2018) “DIY Peer Review and Monograph Publishing in the Arts and Humanities”. Convergence. Online First.

Abstract: In order to explore monograph peer review in the arts and humanities, this article introduces and discusses an applied example, examining the route to publication of Danielle Fuller and DeNel Rehberg Sedo’s Reading Beyond the Book: The Social Practices of Contemporary Literary Culture (2013). The book’s co-authors supplemented the traditional ‘blind’ peer-review system with a range of practices including the informal, DIY review of colleagues and ‘clever friends’, as well as using the feedback derived from grant applications, journal articles and book chapters. The article ‘explodes’ the book into a series of documents and non-linear processes to demonstrate the significance of the various forms of feedback to the development of Fuller and Rehberg Sedo’s monograph. The analysis reveals substantial differences between book and article peer-review processes, including an emphasis on marketing in review forms and the pressures to publish, which the co-authors navigated through the introduction of ‘clever friends’ to the review processes. These findings, drawing on science and technology studies, demonstrate how such a research methodology can identify how knowledge is constructed in the arts and humanities and potential implications for the valuation of research processes and collaborations.

PRESENTATION: The End of Ebooks

Abstract: Amazon have dominated the ebook market since the launch of the Kindle in 2007, but the next decade may be defined by the merger of the International Digital Publishing Forum (IDPF) with the World Wide Web Consortium (W3C) in January 2017. The merger resulted in the formation of the W3C Publishing Working Group with the remit to maintain the EPUB standard while working to future-proof digital publications as “first-class entities on the Web” in the form of Packaged Web Publications (PWP). The proposed PWP specification would mark a paradigm shift for the book trade with ebooks gaining all the features of the modern Web rather than the more conservative EPUB specification.

The PWP specification is yet to be finalized, but during its development Working Group participants have extensively debated the limits of the book and its digital representation. The new standard must satisfy a broad range of use cases including trade publishing, scholarly communication, journalism, and grey literature. In this presentation, I conduct an analysis of the consensuses and fractures that will shape the presentation of books in browsers from a Science and Technology Studies perspective. The W3C offer an unprecedented level of transparency in decision making compared to prior ebook standards such as EPUB, revealing the human decisions behind algorithmic interventions by mark-up validators, InDesign export wizards, and web browsers. These on-going discussions will not only shape the future of digital publishing, but return to the question of “what is a book?” in the context of the early twenty-first century.

PUBLICATION: Continuous, not discrete

Rowberry, Simon (2018), “Continuous, not discrete: The mutual influence of digital and physical literature.” Convergence. Online First.

Abstract:  The use of computational methods to develop innovative forms of storytelling and poetry has gained traction since the late 1980s. At the same time, legacy publishing has largely migrated to using digital workflows. Despite this possible convergence, the electronic literature community has generally defined their practice in opposition to print and traditional publishing practices more generally. Not only does this ignore a range of hybrid forms, but it also limits non-digital literature to print, rather than considering a range of physical literatures. In this article, I argue that it is more productive to consider physical and digital literature as convergent forms as both a historicizing process, and a way of identifying innovations. Case studies of William Gibson et al.’s Agrippa (A Book of the Dead) and Christian Bök’s The Xenotext Project’s playful use of innovations in genetics demonstrate the productive tensions in the convergence between digital and physical literature.

DOI: 10.1177/1354856518755049

Open Access Version (Stirling Repository)

PRESENTATION: Strategies for reconstructing the pre-history of the ebook through catalogue archives

Abstract: Amazon’s dominance of the ebook trade since 2007 can be credited to their erasure of evidence about the historical development of ebooks prior to the launch of the Kindle. This activity included removing catalogue records for their ‘Ebook and E-Doc’ store, a strategy Amazon repeated with the removal of old public domain Kindle titles in 2014. Early ebook experiments prior to the Kindle were not financially lucrative but provided the foundation for the platform’s future success. In this presentation, I will explore the challenges of analysing contemporary digital publishing due to the shifting landscape prior to the Kindle’s entry to the market. I will use a case study of Microsoft’s LIT format (discontinued in 2012) and Microsoft’s dedicated catalogue of ebook titles to demonstrate the importance of the catalogue website for contemporary book historical research.
The preservation of the original ebooks is an optimistic ideal for platforms that have shut down, since the files are only available to consumers who have kept backups from at least half a decade ago. As a consequence, catalogues are vital evidence of what titles were available for sale. The reconstruction and preservation of these corporate catalogue records, which are only partially available through the Internet Archive’s Wayback Machine, is therefore a priority. The preservation of these metadata sources allows for a more comprehensive understanding of the history of the ebook and the flow of content from platforms as they fall in and out of fashion. In this paper, I present some initial findings from reconstructing this catalogue and highlight the importance of archiving contemporary ebook catalogues to preserve important evidence of early twenty-first century publishing practices.

PUBLICATION: Peer Review in Practice

Butchard, Dorothy, Simon Rowberry, Claire Squires, & Gill Tasker (2017), “Peer Review in Practice”. BOOC.

Preface: The report Peer Review in Practice was originally published in beta version during Peer Review Week 2016. It was the first stage in a mini-project focusing on peer review as part of the broader Academic Book of the Future project. It reviews the existing literature on peer review and builds models for understanding traditional and emerging peer review practices.

The report underwent its own peer review. The beta version allowed readers to make comments upon the report, and a peer review was also commissioned by UCL Press. The former are still available on the beta version, while the latter is available here. The author of the latter (Professor Jane Winters of the School of Advanced Study, University of London) made her peer review anonymously, but agreed on request that her comments be made public and her identity revealed.

The comments we received on the beta version were from a small number of individuals, and provided some useful additional resources and suggestions. As discussed in much of the literature of peer review, however, it was difficult to encourage substantial numbers of scholars to participate in the open, post-publication peer review. We also noted that the comment function led to responses being made about individual sentences or paragraphs, rather than providing overall analysis of the report. Overall, as an experiment in open post-publication peer review, we had hoped to receive more responses that would enable the report to develop further an ongoing core of knowledge and analysis of peer review. This current version of the report also has a commenting function, and we encourage the scholarly and publishing community to engage further with our report, in order to make it a useful ongoing resource.

One of the points made in the traditional peer review was about the lack of information about monograph publishing, something which we flag up in the introduction to our report. There is little research currently written on this subject, although as part of our mini-project, we are working on a forthcoming journal article focusing on peer review and monograph publishing in the Arts and Humanities. There are also further research projects focusing on peer review, including that encapsulated in a report by Fyfe et al., Untangling Academic Publishing: A History of the Relationship Between Commercial Interests, Academic Prestige and the Circulation of Research (May 2017), and the forthcoming project on ‘Reading Peer Review’, headed by Professor Martin Eve. The next stage in our own research into peer review is examining the language of peer review in Arts and Humanities journals.

DOI: 10.14324/111.9781911307679.15

Repository Version

PRESENTATION: Resurrecting the Ebook: A media archaeological excavation of the Kindle’s development, 1930-2007.

I recently gave a talk for the Media History Seminar at the Institute of English Studies. I took the opportunity to link earlier ebook developments to the success of the Kindle.

Abstract: Amazon’s launch of the Kindle in 2007 was lauded as the moment when ebooks finally became economically viable for publishers. This success was facilitated by Amazon’s careful analysis of previous failed attempts to commercialize ebooks since the early 1990s, and earlier theoretical models developed since the 1930s. This presentation will explore how the Kindle’s reputation stems from a mixture of adapting pre-existing technology and the right social-technological context rather than a complete revolution in ebook design.


PUBLICATION: Commonplacing the public domain: Reading the classics socially on the Kindle

Rowberry, Simon (2016), “Commonplacing the Public Domain: reading the classics socially on the Kindle”. Language and Literature. 25.3. 211–225.

This was published as part of a great special issue of Language and Literature on ‘Reading in the Age of the Internet’ edited by Daniel Allington and Stephen Pihlaja.

Abstract: Amazon leads the market in ebooks with the Kindle brand, which encompasses a range of dedicated e-reader devices and a large ebook store. Kindle users are able to share the experience of reading ebooks purchased from Amazon by selecting passages of text for upload to the Kindle Popular Highlights website. In this article, I propose that the Kindle Popular Highlights database contains evidence that readers are re-appropriating commonplacing – the act of selecting important passages from a text and recording them in a separate location for later re-use – while reading public domain titles on the Kindle. An analysis of keyness in a corpus of 34,044 shared highlights from public domain titles suggests that readers focus on words relating to philosophy and values to draw an understanding of contemporary society from these classic works. This form of highlighting takes precedence over understanding and sharing key narrative moments. An examination of the top ten most popular authors in the corpus, and case studies of Jane Austen’s Pride and Prejudice and William Shakespeare’s Hamlet, demonstrate variation in highlighting practice as readers are choosing to shorten famous commonplaces in order to change their context for an audience that extends beyond the original reader. Through this analysis, I propose that Kindle users’ highlighting patterns are shaped by the behaviour of other readers and reflect a shared understanding of an audience beyond the initial highlighter.

PRESENTATION: A historiography of the ebook

I was invited to give a talk for the Centre for the History of the Book at the University of Edinburgh. I took the opportunity to talk through some of the methodological challenges facing researchers of ebooks.


Social Reading of Harry Potter on the Kindle (from a distance)

I’ve been seriously working on research for my history of the Kindle for a couple of years now and I’m still figuring out how to capture the impact of the Kindle on the scale of both the publishing/technology industry and the individual reader.

This tension is clearest when looking at the available data on reading and the shared highlights. There are a large number of individuals making personal choices behind the 500,000 shared highlights of a single edition of Wuthering Heights. If we scale this to over 4 million ebooks and 40 million Kindle users, it becomes extremely difficult to focus on both the local and global trends (and doubly so when access to the data is obfuscated or entirely unavailable): What counts as an appropriate sample? To what degree can individual highlights link to the mass of activity? How much data can I even get hold of?

While I ponder these questions, there’s still the problem of method. In order to figure this out, here’s a pilot study of the Harry Potter series as a complete unit that is manageable yet has received a fair amount of attention.

On the global level, shared highlights might not be able to tell us much about readership because an unknown number of readers choose not to highlight or share their efforts. The benefit of using Harry Potter, however, comes from the fact it is possible to gauge popularity across the series.

In recent versions of the Kindle software, a helpful “About This Book” pop-up box appears when opening a title for the first time. Luckily, this pop-up contains the total number of shared highlights and how many unique sections of the title have been highlighted. (These may not necessarily be up-to-date, but all the data here comes from 20 October 2015.)

The data from the Harry Potter series reveals some interesting patterns. Figure 1 shows the total volume of shared highlights for each title, while figure 2 looks at the number of unique highlights per title. The most striking part of figure 1 is that the visible highlights (the top 10 most shared highlights) barely represent 10% of all shared highlights for any individual title.

Figure 1. Total highlights for each Harry Potter title and the visible top 10 highlights


Figure 2. Unique highlights for each Harry Potter title

While the two graphs appear to show that the popularity of the series plummets after the first novel, picks up towards the middle and drops again at the end, there is a far simpler explanation: the longer books receive more highlights as there is more text to highlight.

The only notable exception is Harry Potter and the Philosopher’s Stone, where more readers are focusing on particular passages. The large increase in total highlights without a similar increase in unique highlights likely indicates that more people are reading the first book than the rest of the series, or at the very least, they lose enthusiasm after the first book.

The second macroscopic view we can get from the Popular Highlights is the location of the shared highlights. Jordan Ellenberg has coined the Piketty Index as a way of using popular highlight locations to see how far through a book a reader got before quitting. From the evidence I’m gathering, it looks like the top 10 shared highlights are more likely to appear at the beginning of a book than the end, but what about the Harry Potter series?


Figure 3. Top 10 Shared Highlights for each Harry Potter title

As a series, readers are more likely to highlight passages at the end of the book than the beginning. Not only does this suggest that readers are likely to finish the books, but through looking at the content of the highlights from the end of the book, it is clear that some of the most popular parts of the titles are Dumbledore’s speeches to Harry and the denouement of the narrative. Given the make-up of Rowling’s series and the slow start of most of the books, this inversion makes sense.
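A location-based measure of this kind can be sketched in a few lines. The version below is only an approximation of the idea, assuming the index is the mean position of the top shared highlights divided by the length of the book; the function name and the example locations are hypothetical and are not data from the Harry Potter titles.

```python
from statistics import mean


def highlight_position_index(highlight_locations, book_length):
    """Mean position of the highlights as a fraction of the book's length.

    A value near 0 means highlights cluster at the start of the book,
    a value near 1 means they cluster at the end.
    """
    if book_length <= 0:
        raise ValueError("book_length must be positive")
    if not highlight_locations:
        raise ValueError("at least one highlight location is required")
    return mean(highlight_locations) / book_length


# Hypothetical locations for illustration only (not measured values).
example_locations = [100, 200, 300, 400, 500]
index = highlight_position_index(example_locations, 1000)
print(f"{index:.0%}")
```

A higher value would match the pattern described above, with popular highlights clustering towards the end of a title.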

And that’s about as much as you can deduce from looking at the global level as far as I can tell. Once I’ve dug into the more traditional annotations and highlights of individual readers, I’ll compare the results with the broad patterns identified here.

Between Format and Platform: problems with standardizing ebook bibliographic descriptions

August 4th, 2015

One of the problems with studying digital texts is coming up with a bibliographic description that captures enough information for others to identify the object (and often to replicate its conditions). Unsurprisingly, ebooks have thrown up some interesting challenges for budding digital bibliographers.

Alan Galey has explored this issue across formats in The Enkindling Reciter. From this analysis, it is clear that the format of the ebook is important to record. For example, when talking about Walter Isaacson’s biography of Steve Jobs, the bibliographic record should indicate that the text was the ‘[Kindle edition]’ or ‘[EPUB]’. This is becoming standard practice in several venues, but is this sufficient to identify an edition?

Unfortunately, ebooks are likely to automatically update. Luckily, Amazon have several ways of identifying versions of a text:

  • The Amazon Standard Identification Number (ASIN), the 10-character string which identifies each record in Amazon’s catalogue, which can vary between separate editions of the same ebook. For Walter Isaacson’s biography of Steve Jobs, the Little, Brown Book edition is B005J3IEZQ, while the Simon & Schuster is B004W2UBYW. This is not a case of the same book reskinned for different markets, as the Simon & Schuster file is eight times larger than the Little, Brown edition, which I will discuss here.
  • The APNX file (used to generate page numbers) contains a ‘fileRevisionId’ (1378512022867) and ‘acr’, an identifier for a Palm database (often a lengthy string, such as ‘CR!EBPXHWBERS4VV2GK50GFF58D17NS’). These values, while not infallible, can be used to match similar files.
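The matching described in these two points can be sketched as follows. This is a rough illustration rather than a description of Amazon’s own tooling: the ASIN pattern assumes ten upper-case letters or digits (which fits the two examples above), the dictionary layout and function names are hypothetical, and only the two ASINs quoted above are used as data.

```python
import re

# Assumption: an ASIN is ten upper-case letters or digits, as in the examples above.
ASIN_PATTERN = re.compile(r"[A-Z0-9]{10}")

# Identifiers named in the post that may help to match two files.
IDENTIFIERS = ("asin", "fileRevisionId", "acr")


def looks_like_asin(value):
    """Check only the shape of the string, not whether the record exists."""
    return ASIN_PATTERN.fullmatch(value) is not None


def likely_same_file(a, b):
    """True if the records share at least one identifier and none of them differ."""
    shared = [key for key in IDENTIFIERS if key in a and key in b]
    return bool(shared) and all(a[key] == b[key] for key in shared)


little_brown = {"asin": "B005J3IEZQ"}
simon_schuster = {"asin": "B004W2UBYW"}

print(looks_like_asin(little_brown["asin"]))
print(likely_same_file(little_brown, simon_schuster))
```

As the post notes, these values are not infallible, so a match here is a hint rather than proof that two files are identical.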

Even this information is not sufficient for an accurate bibliographic description, since as I have argued elsewhere, the ebook must be considered as a platform of at least four different layers: hardware, software, format and content. Without mapping all of these elements, it is impossible to accurately describe an ebook.

Just five words from Isaacson’s biography (“KOBUN CHINO. A Sōtō Zen…”) are sufficient to demonstrate why we need to pay closer attention to more than just the format of an ebook.

In the paperback edition, the text is formatted with small caps and macrons on both the ‘o’s in Sōtō:


Walter Isaacson (2013) Steve Jobs. New York: Simon & Schuster, xiii.

The second generation Kindle renders this in a slightly different manner:


Kindle 2 

This in turn is slightly different from the Kindle for Android, iPad, Mac & Cloud Reader edition:


Android 4.4.2 (Sony Xperia D2005 | Kindle for Android)


iOS 8.4 (iPad MD522B/A | Kindle for iPad 4.10)


Mac OS X 10.10.4 (Kindle for Mac 1.11.2 [40670])


Kindle Cloud Reader (Chrome 44.0.2403.125 | Mac OS X 10.10.4)

Variation in font and reading preferences aside, there are clear differences between versions that are of interest for the descriptive bibliographer. There are two major differences I want to highlight:

  1. Sōtō doesn’t look right in any of the Kindle editions.
  2. Kobun Chino’s name appears in small caps in the original print version, but not all Kindle platforms replicate this.

The first is a clear limitation of the Kindle platform and its design. Rather than using the rich and varied palette of a Unicode encoding such as UTF-8 (allowing users to include a wide range of alphabets, and more importantly, emoji!), Amazon chose the much more restrictive Latin-1 encoding, which includes a range of diacritics and punctuation common to Latinate alphabets but not a lot else.

Unfortunately, this did not include the ‘o’ with macron, which just so happens to appear twice in a single word. Luckily, rather than simply removing the macrons, the producers have used a workaround by including an image of the character. Unfortunately, the image does not properly scale with the text and it only works with black text on a white background.
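The encoding gap is easy to reproduce. The snippet below is a small illustration using Python’s built-in codecs, not anything taken from the Kindle toolchain: ‘ō’ is U+014D, which lies beyond the U+00FF ceiling of Latin-1.

```python
# 'ō' is U+014D; Latin-1 only covers code points up to U+00FF.
word = "S\u014dt\u014d"

try:
    word.encode("latin-1")
    encodable = True
except UnicodeEncodeError:
    encodable = False

print(encodable)

# UTF-8 can represent the word; each 'ō' takes two bytes.
utf8_bytes = word.encode("utf-8")
print(len(utf8_bytes))

# The characters that Latin-1 cannot represent.
missing = [ch for ch in word if ord(ch) > 0xFF]
print(missing)
```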

This has a couple of consequences for the ebook itself too, since it makes it impossible to search for ‘Sōtō’, as the text is either rendered into two single character words, or worse, turned into ‘St’. Not only does this make the word difficult to search for, but it also affects the quality of the Kindle’s text-to-speech facilities.


Sōtō rendered as “saint”
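The two outcomes described above can be modelled with a toy example. This is a hypothetical model of the workaround rather than the Kindle’s actual indexing logic: the object replacement character stands in for each inline image, and the two branches correspond to treating an image as a word break or dropping it altogether.

```python
# Stand-in for an inline image of 'ō' (hypothetical; the real markup is not shown here).
IMAGE = "\ufffc"

# The phrase from the biography, with both macron characters replaced by images.
stored = f"A S{IMAGE}t{IMAGE} Zen"


def searchable_text(text, image_as_break):
    """Return the text a search index might see under one of two assumptions."""
    return text.replace(IMAGE, " " if image_as_break else "")


as_break = searchable_text(stored, image_as_break=True).split()
dropped = searchable_text(stored, image_as_break=False).split()

print(as_break)  # the word falls apart into single characters
print(dropped)   # the word collapses to 'St'

query = "S\u014dt\u014d"
print(query in as_break, query in dropped)
```

Under either assumption the original word never reaches the index, which is consistent with the search and text-to-speech problems noted above.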

While the first bibliographic glitch was readily visible, the second would be difficult to spot without comparing different versions of the same edition. Formatting standards such as HTML, which ebooks use as their basic logic, are not hard laws, but recommendations for how to display text which can vary between different interpreters. Small caps is one of those features which is not universally supported by different instances of the Kindle application.

This may appear to be a minor aesthetic variation, but once again, it has an effect on the functionality of the ebook. Due to the variation in parsing the ‘small caps’ formatting tag, different versions of the Kindle software do not agree on whether the start of the ‘small caps’ formatting represents the start of a new word.

For example, Kobun Chino’s second name is rendered as ‘C hino’ on the iPad version, but remains ‘Chino’ on the Kindle for Mac version. This is a problem for readers who try to look up the name through the dictionary, Wikipedia or X-Ray, as the surname may be rendered as two separate words. Again, the text-to-speech functions of the Kindle stumble on this split word too, rendering some of the accessibility functions difficult to navigate.
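The disagreement can be illustrated with a short sketch. The markup below is hypothetical, since the actual source of the Kindle file is not documented here, and the two settings of the parser simply stand for two interpreters that disagree about whether a formatting tag begins a new word.

```python
from html.parser import HTMLParser

# Hypothetical markup: a full capital followed by a small-caps span.
MARKUP = 'C<span class="smallcaps">hino</span>'


class TextCollector(HTMLParser):
    """Collect text nodes, optionally treating each opening tag as a word break."""

    def __init__(self, tag_is_word_break):
        super().__init__()
        self.tag_is_word_break = tag_is_word_break
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.tag_is_word_break:
            self.parts.append(" ")

    def handle_data(self, data):
        self.parts.append(data)


def extract_words(markup, tag_is_word_break):
    collector = TextCollector(tag_is_word_break)
    collector.feed(markup)
    collector.close()
    return "".join(collector.parts).split()


print(extract_words(MARKUP, tag_is_word_break=True))   # surname split in two
print(extract_words(MARKUP, tag_is_word_break=False))  # surname kept whole
```

A dictionary or Wikipedia lookup built on the first interpretation would receive ‘C’ or ‘hino’ rather than the full surname.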


It is clear that identifying the brand and associated file format alone will not suffice, and even the file format may not be enough due to variation among platforms. Hardware and software configurations make a real difference in the version and behavior of the file. Since Amazon’s file formats (AZW, PRC, KF8 and so forth) are not openly documented, it is insufficient to look at the source code; noting the software and OS may be a necessary step in ensuring the replicability and accurate documentation of Kindle ebooks. Even this may not be enough to stave off the constantly updating Kindle infrastructure, but at least it’s a start towards documenting a specific moment in time.
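One way to keep track of these layers is a small record type. The sketch below is only a suggestion: the class, its field names and the note format are hypothetical, and the values are those quoted in this post, paired in a single record for illustration on the assumption that the iPad screenshot is of the Little, Brown edition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EbookDescription:
    """One observed instance of an ebook, described across its layers."""

    content: str           # the text and its catalogue identifier
    file_format: str       # the format or edition label
    software: str          # reading application and version
    operating_system: str  # OS the application ran on
    hardware: str          # device model

    def citation_note(self):
        """Format the layers as a bracketed note (a hypothetical style)."""
        return (
            f"{self.content} [{self.file_format}; {self.software}; "
            f"{self.operating_system}; {self.hardware}]"
        )


ipad_copy = EbookDescription(
    content="Walter Isaacson, Steve Jobs (ASIN B005J3IEZQ)",
    file_format="Kindle edition",
    software="Kindle for iPad 4.10",
    operating_system="iOS 8.4",
    hardware="iPad MD522B/A",
)

print(ipad_copy.citation_note())
```

Two records that differ in any layer compare as unequal, which mirrors the argument that the same file can behave as a different text on a different platform.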