Simon Rowberry

PRESENTATION: An archaeology of patent databases as material objects

September 9th, 2019

Patent databases including Espacenet and the USPTO Patent Full-Text and Image Database (PatFT) offer rich sources for big data analysis to document the evolution of a technology or to demonstrate the value of a filing through citation networks. While scholars such as Manuel Trajtenberg, Douglas O’Reagan and Lee Fleming have complicated the relationship between citations and chains of influence between inventors, less attention has been paid to the affordances and limitations of the databases storing the underlying data. Researchers can access patent data through searchable online databases or via third-party sources such as the National Bureau of Economic Research (NBER) US Patent citation data files, but how do the structure, type, and availability of data shape our understanding of patents as historical evidence?

In this paper, I offer a case study of the layers of digitisation embedded within USPTO PatFT and Patent Application Full-Text and Image Database (AppFT) to analyse what Ian Milligan terms the “illusionary order” of patent databases. PatFT and AppFT combined provide access to all granted patents in the United States from Samuel Hopkins’s filing for the manufacture of potash in 1790 to weekly updates of new patents in early 2019. There is a clear separation between patents filed before and after 1976, with the latter available as full searchable text, and the former available only as photographic facsimile copies of the original document. While the USPTO began transitioning to digital workflows in the 1970s, the full text from this time can rely on automatic Optical Character Recognition (OCR) processes, leading to a difference between the facsimile PDF copy of the patent and the semantically rich full text.

Discrepancies proliferate beyond simple historical distortions. PDF documents can differ from the HTML full text, and updates to patent classifications can create different versions of the same patent without acknowledgement of an update. Some files were scanned from earlier efforts at microfilm preservation. Original copies of up to 5 million patent documents stored in Franconia, Virginia, were destroyed in 2018, while the National Archives primarily holds material filed prior to 1978. As a result, many patents filed prior to the complete adoption of digital workflows are only available as digital surrogates without a print copy of the original.

The provenance of these patents as digital objects is therefore uneven and can create an incomplete image of patent filings, which is only exacerbated in other national databases where even photo-facsimiles of all patents remain unavailable. This paper offers a bibliographic and media archaeological excavation of the database documenting the digitalisation of the USPTO’s workflows to contextualise the types of error that may affect our understanding of patents as documents.
