For our XML encoding challenge project, Mike and I decided to digitize an entry from Samuel Johnson’s unpublished edits to the “B” section of his dictionary. We found this material in a secondary source published by Johnson scholar Allan Reddick. In this secondary source, Reddick transcribed the handwritten edits and provided additional information to help clarify the kinds of changes Johnson was suggesting. We decided to include Reddick’s annotations in our digitization as well. We wanted our XML file to provide a synthesis of the information in the primary and secondary texts, a reading experience that is more difficult to achieve in a physical book.
At first, we thought we could transcribe an entire page of the revised dictionary text, and we chose the “BON” section because it had a number of different revision marks. We thought it would be interesting to try to encode these various marks in XML. When we sat down to start coding, we realized we had overestimated what we could do. Although TEI does have specific guidelines for dictionaries, it took a lot of time and discussion for us to map out the structure and hierarchies at play. We spent a lot of time talking through the differences between the <entry> tag and the <entryFree> tag, traipsing back and forth through the guidelines to figure out which tags could contain which other tags, how to use attributes, whether tags could skip “generations”, etc. We actually spent way more time talking and debating than we did transcribing or coding.
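One way to picture the <entry> versus <entryFree> question: as we read the TEI guidelines, <entry> expects its content to be wrapped in structural elements, while <entryFree> tolerates loose text and punctuation between elements. The Python sketch below is only an illustration of that distinction, not a TEI validator. The two sample strings use placeholder wording (nothing here is transcribed from Johnson or Reddick), and the helper names `has_loose_text` and `suggested_wrapper` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder samples: the wording is invented for illustration only.
STRUCTURED = (
    "<entry><form><orth>headword</orth></form>"
    "<sense><def>placeholder definition</def></sense></entry>"
)
FREE = (
    "<entryFree><orth>headword</orth>, "
    "<def>placeholder definition</def>.</entryFree>"
)


def has_loose_text(elem):
    """Return True if elem directly contains non-whitespace character data."""
    # In ElementTree, text sitting between child elements is stored as the
    # preceding child's .tail, so both .text and every .tail must be checked.
    chunks = [elem.text] + [child.tail for child in elem]
    return any(chunk and chunk.strip() for chunk in chunks)


def suggested_wrapper(xml_text):
    """Rule of thumb (an assumption, not a TEI rule check): loose text
    directly inside the entry points toward <entryFree>."""
    root = ET.fromstring(xml_text)
    return "entryFree" if has_loose_text(root) else "entry"


if __name__ == "__main__":
    print(suggested_wrapper(STRUCTURED))
    print(suggested_wrapper(FREE))
```

A check like this only looks at the top level of the entry. Deciding which elements may nest inside which others still means going back to the guidelines or validating against the TEI schema.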
In the end, after all the discussions, we decided to focus on encoding a single entry. The entire dictionary is made up of entries, so if we could get one entry right we could, in principle, encode the entire text. We decided to work on the entry for “Bombast” because it had some elements crossed out, an addition in the margin, and an annotation from Reddick. It contained numerous items that would be a challenge to encode.
As we started coding, we kept drawing out our tag structure to make sure that what we were doing followed the TEI guidelines and made sense in terms of the content of our entry. We also ended up having many conversations about interface, trying to figure out the most useful way to display the primary and secondary information. In the end, we decided that it would be most useful to provide multiple views of the entry: 1) clean, as the entry was first published; 2) the entry with Johnson’s revisions; 3) the entry with Reddick’s annotations added to Johnson’s revisions; 4) the entry with full citations for the quotes Johnson deploys as examples (which were provided by Reddick’s text). Understanding the interface was integral to our coding: we kept tacking back and forth to make sure that the kinds of tags we were using would actually allow us to provide all four of these views.
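To show how one encoded entry can feed all four views, here is a minimal Python sketch. It rests on several assumptions: the XML is a simplified, TEI-style stand-in rather than our actual file; deletions are tagged <del>, the marginal addition <add>, the editorial annotation <note>, and full citations <bibl>; all wording is placeholder text, not a transcription; and view 4 is taken to layer citations on top of view 3. The names `VIEWS`, `MARKS`, and `render` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical TEI-style entry with placeholder wording.
ENTRY_XML = """
<entry>
  <form><orth>BOMBAST</orth></form>
  <sense>
    <def>printed definition <del>struck words</del> <add place="margin">marginal words</add></def>
    <cit>
      <quote>example quotation</quote>
      <bibl>full citation</bibl>
    </cit>
  </sense>
  <note resp="#reddick">editorial annotation</note>
</entry>
"""

# Each view lists the tags it hides and whether visible markup gets brackets.
VIEWS = {
    "published": {"hide": {"add", "note", "bibl"}, "mark": False},
    "revised": {"hide": {"note", "bibl"}, "mark": True},
    "annotated": {"hide": {"bibl"}, "mark": True},
    "cited": {"hide": set(), "mark": True},
}

# Plain-text brackets standing in for real typographic display.
MARKS = {
    "del": ("[-", "-]"),
    "add": ("{+", "+}"),
    "note": ("(note: ", ")"),
    "bibl": ("(", ")"),
}


def render(entry, view):
    """Flatten an entry to one line of text for the named view."""
    cfg = VIEWS[view]
    parts = []

    def walk(node):
        if node.tag in cfg["hide"]:
            return
        opener, closer = MARKS.get(node.tag, ("", "")) if cfg["mark"] else ("", "")
        parts.append(opener)
        parts.append(node.text or "")
        for child in node:
            walk(child)
            # The tail belongs to the parent's text, so keep it even when
            # the child itself is hidden in this view.
            parts.append(child.tail or "")
        parts.append(closer)

    walk(entry)
    # Collapse the indentation whitespace carried over from the XML source.
    return " ".join("".join(parts).split())


if __name__ == "__main__":
    entry = ET.fromstring(ENTRY_XML)
    for name in VIEWS:
        print(f"{name}: {render(entry, name)}")
```

The point of the sketch is the design choice we kept circling back to: if every revision, annotation, and citation sits in its own tag, each view becomes a matter of choosing which tags to show, and the transcription itself never has to be duplicated.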
Neither Mike nor I had ever spent so much time reading, discussing, and talking through a single dictionary. When we were done, we certainly had a much greater appreciation for the work that goes into this kind of text. We also felt confident in the decisions we made. Since we spent so much time thinking about it, we felt we had made the best decisions we could for the goal we set out for ourselves. And as crazy as it sounds, we’re both interested in trying to encode more entries to see if the tag structure we came up with would work for them too!