The goal of this project is to make The Dictionary of Love (1753) accessible online for anyone who might find its content or its history useful or interesting. Although the analysis and annotations anticipate an audience of English-literature students, eighteenth-century scholars, sociolinguists, and lexicographers, even the curious layperson should have no trouble navigating the site and will, ideally, find it a worthwhile and entertaining discovery.
I first ran across the DOL in the English Short Title Catalogue while researching more traditional eighteenth-century dictionaries. At the time, aside from several original editions in rare book collections across the United States and Britain, it was only available on microfilm and as a digital facsimile in Thomson-Gale's proprietary database Eighteenth Century Collections Online.
Rather than put another facsimile on the Web, my intention was to produce an annotated transcript with variant readings from the several English editions of the work that appeared in the latter half of the eighteenth century. After I discussed the nature of the text and the goals of the project with my committee, we determined that eXtensible Markup Language (XML) would be the most advantageous means of putting the DOL on the web. I began by drafting a skeletal methodology that laid out which encoding scheme I would use and which textual features I would highlight.
My plan was to use XML to encode the text, parsing the terms and definitions to identify various linguistic elements and appending to each entry explanatory comments such as examples from contemporary literature and variations of the entry in subsequent editions. In addition to the transcription and annotations of the text, the final project would also include several essays that detailed the publication history of the dictionary and its place in eighteenth-century courtship conventions, and examined its sociolinguistic and literary implications. All of this information would be displayed on a website by using XSLT to convert the XML files to web-compatible XHTML.
I immersed myself in XML and began to develop familiarity with the syntax I would be using to encode the text of the DOL. Initially, I used online research to teach myself how to use XML, testing sections of code as I went along. The primary advantage of using XML instead of another mark-up language, such as HTML, is that it allows the user to define tags according to the specific needs of the project.
This freedom can be daunting, however, and with it comes a greater potential for making ineffective choices, analytical oversights and time-consuming errors in coding. Almost as soon as I commenced working, I discovered that my initial proposal was an inadequate blueprint for this project: sparse and lacking in specifics, it provided no clear instruction for accomplishing the goal it outlined. Before I could begin coding, I needed to determine which aspects of the text required annotation and which elements needed to be tagged. Furthermore, I needed to figure out how to create a workable hierarchy, one that would account for all the information but would arrange it in a uniform way that would be easy to navigate and manipulate later. This was the first of many stumbling blocks.
Overcoming this first hurdle required that I reevaluate the plan and reestablish, in more specific terms, what the final project needed to include and how it should work. However, coming up with a new plan was not so much a matter of creating a correct foundation as it was a matter of simply choosing a different direction and striking out; trial and error was often the name of the game.
The revised first step, then, became to devise a tag set that would aid potential users in their intellectual pursuits, a task that necessitated at least a basic understanding of the audiences' fields of knowledge. Therefore, time not devoted to studying coding rules and practices was spent looking for the answers to questions about eighteenth-century culture and the DOL itself: Who first translated the dictionary into English and who was the target audience? How do the editions differ from each other and what do those differences imply about changing societal or literary ideas? How were the dictionaries used or received when they were originally published?
Based on the answers I found to these questions—or on the further questions that these queries unearthed—I could begin creating a tag set. Because I had not yet done much literary research on the document, the first tags I created focused on basic linguistic and formatting elements: parts of speech, proper names, dialogue, synonyms, etc. As I progressed, I began to create analytical tags that added information to or commented on the entries rather than simply encoding what was there. Some of the tags included in this initial brainstorm were courtship stage, gender, slang vs. standard language, emotional state, and style height of the word. Clearly, some of the tags overlapped in the information they encoded. For example, slang vs. standard language and style height were two versions of the same element; however, the former allowed only a binary interpretation, whereas the latter opened up a wider range of possibilities.
A wider range, however, was not always desirable. Many of the analytical tags never made it into the final document because they were too broad or ambiguous to be useful. The tag <gender>, for example, became confusing to use, because it referred to different things for different parts of speech and even varied among words within the same part of speech: nouns referring to people (e.g., coquette, fop, beau) were usually straightforward enough; however, the word 'fribble' threw a monkey wrench into even this most basic application:
Fribble: This word signifies one of those ambiguous animals, who are neither male nor female; disclaimed by his own sex, and the scorn of both....
Additionally, tagging nouns that referred to things required that I determine the gender of words like 'inclination,' 'sacrifice,' and 'obstacle,' which quickly proved to be a pointless and random exercise, since I could not satisfactorily explain, even to myself, why such words would be described as male or female. For verbs and adjectives, it was unclear whether gender should refer to the subject or the object of the action or to the person using the word or the person being described, respectively.
In short, it began to feel like imposing the gender tag on the text limited rather than enhanced its usefulness: I was not creating a guide or revealing a latent pattern; I was damming possible interpretive pathways. When it became clear that a tag was more of a hindrance than a help, I eliminated it and revisited the text and my notes to look for other possibilities.
Most of the analytical tags changed several times over the course of the project in response to new research. Likewise, my research was often redirected as I developed a better understanding of tagging procedures and coding rules. All of this information worked together in a feedback loop that necessitated constant reevaluation and revision.
After researching and drafting for roughly two months, I had a base tag set, ready to be translated into the DTD, which would provide the framework for the subsequent XML document. The set included both structural tags (formatting; basic linguistic components such as parts of speech and sections of dialogue) and analytical tags (commentary on the entries, such as which stage of courtship an entry falls under; descriptions of the variations among editions): for example, each entry in the DOL would be marked up to identify the term (<mainword>), any synonyms listed (<syn>) and the definition of the term (<definition>). Within the <definition> tag, there were elements to parse, among other things, dialogue, foreign-language phrases, and examples of the term in context. Analytical tags such as <pos> (part of speech), <courtshipstage>, and <variation> (between editions) completed the set.
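A hypothetical encoded entry, using the element names mentioned above, might look something like the following. The nesting and placeholder content are my assumptions for illustration, not the project's actual markup; 'coquette' is simply one of the nouns the dictionary defines.

```xml
<entry>
  <mainword>Coquette</mainword>
  <pos>noun</pos>
  <syn>...</syn>
  <definition>
    ...transcribed definition text, with nested elements marking
    dialogue, foreign-language phrases, and examples of the term
    in context...
  </definition>
  <courtshipstage>...</courtshipstage>
  <variation>...note on how a later edition alters the entry...</variation>
</entry>
```

The structural tags record what is on the page; the analytical tags (<pos>, <courtshipstage>, <variation>) layer commentary over the transcription.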
One of the most challenging aspects of creating a tag set is that it must be able to fit into the hierarchical structure of a DTD. This means considering if and how the elements are related and developing a nesting system that is both consistent and comprehensive. For example, some tags may be needed for only one or two entries, but they must be incorporated into the DTD using the same rules that govern the structuring of the most frequently used tags. Depending on how specific the coder wants to get, a DTD can be complex and lengthy. Additionally, the format and content of the original text affect how detailed the DTD must be. The terms defined in the DOL, for instance, are not all one word: many of the verbs are listed as infinitives, such as 'To Love' and 'To Address.' This can make alphabetization problematic; spelling variations can also create problems.
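The project's actual DTD is not reproduced in this essay, but the kind of hierarchy it describes might be sketched as follows. The content models, and the child-element names inside <definition>, are assumptions; the `?` and `*` occurrence indicators show how rarely used tags are declared optional under the same rules as the common ones.

```dtd
<!-- Hypothetical sketch of an entry declaration; not the project's DTD. -->
<!ELEMENT entry (mainword, altword?, pos, altpos?, syn*,
                 definition, courtshipstage?, variation*)>
<!ELEMENT mainword   (#PCDATA)>
<!ELEMENT syn        (#PCDATA)>
<!-- mixed content: prose interleaved with dialogue, foreign phrases,
     and contextual examples -->
<!ELEMENT definition (#PCDATA | dialogue | foreign | example)*>
```

Even an element used for a single entry, such as an alternate part of speech, must be declared here and slotted into the sequence, which is why a detailed text produces a lengthy DTD.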
Other entries attach two words to one definition, as in the record 'Sick. Sickness.' The issue here was whether the word should be recorded as a noun or an adjective. To solve this problem, I revised the DTD to make it possible for an entry to have two parts of speech (<pos> and <altpos>). While I wanted to represent the text accurately, the variation among terms would have made organizing the information in XSLTs confusing: for example, if I tried to sort by the first letter of the <mainword> content, all of the verbs would be grouped with the 'Ts' because of their infinitive construction. Here again, I solved the problem by creating two elements, <mainword> and <altword>: the former contained a regularized version of the term (i.e. one word) and the latter allowed me to encode the term as it was originally published. In the end, I used multiple tags to incorporate all of the material in several categories, including the entry word, the part of speech and the variations among editions.
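The two-element solutions just described can be sketched as follows; the wrapper element and the particular values are illustrative assumptions, but 'To Love' and 'Sick. Sickness.' are the examples discussed above.

```xml
<!-- 'To Love': regularized form for sorting; printed form preserved -->
<entry>
  <mainword>Love</mainword>
  <altword>To Love</altword>
  <pos>verb</pos>
  ...
</entry>

<!-- 'Sick. Sickness.': one definition, two parts of speech -->
<entry>
  <mainword>Sick</mainword>
  <altword>Sick. Sickness.</altword>
  <pos>adjective</pos>
  <altpos>noun</altpos>
  ...
</entry>
```

Sorting on <mainword> now files 'Love' under L rather than lumping all the infinitives under T, while <altword> keeps the text as originally published.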
Once the DTD was complete, I began transcribing the DOL into XML. Although I had a better grasp of the concept of markup as well as of the specific language I was using, I still struggled with an outline that failed to anticipate many coding and research questions. This time, however, most of the difficulties lay in trying to manage the dialogue between the analytical coding and literary research aspects of the project. Encoding the structural elements of the text took only a few weeks, but considering how to fill the analytical tags was more complicated. Although the ultimate goal was to create a digital humanities resource, I found that working on the two aspects of the project separately and in a piecemeal fashion was inefficient.
For several months, I worked on the project in separate chunks: coding one day, research and writing the next. While compartmentalizing this way made the workload more manageable, it also made it difficult to envision how the two halves were going to create a whole. The methods I used and information I gathered for one aspect of the project did not inform my work on the other: I did not consider, for example, how the digital format influences a traditional literary analysis. As a result, nothing seemed to be working; no real progress was being made. I was soon in a stalemate with my project: it stubbornly resisted conforming to the schedule and plan I had devised, and I refused to alter my approach. Indeed, I was so attached to the ideas I had developed about how this was going to work, that I wasn't even sure where to start changing gears.
I reexamined my research: I reread the articles, reviewed the coding practices and reorganized the information. I came across John Unsworth's "The Importance of Failure," in which he argues that "... hypertext research projects should be expected to address unsolved problems...and the proof of their having done so should be that they culminate in a new plateau of ignorance, a new set of unsolved problems." By these standards, I was making great strides, at least in my own accumulation of knowledge. When I considered the problems as part of my research instead of as discrete forces working against the project, I found that my efforts thus far were not as fruitless as I believed.
What had I learned then? Generally, that any scholarly enterprise has an energy and a dynamic of its own, that one makes progress in such an endeavor by following the research trajectory, on whatever tangent it offers. But digital humanities projects are even more unruly, since one must consider how the ultimate form of the project and the information and argument being presented interact with and influence one another.
More specifically, and as it relates to text encoding, I discovered that what I previously understood to be inert scaffolding is instead an intricate structure whose function is as much exegetical as formal and analytical. There is, however, no clear method of integrating the traditional and digital analyses of a text: no precedent determining which part to do first, how or when to compare the parts, which part should take priority or how to represent that hierarchy. In short, I finally understood that I could not rely on a "right way" to work on this project. So after months of balking at the egregious disconnect between my expectations and the reality of the situation, I turned my attention to creating and documenting what Unsworth champions as "the ignorance that was uniquely yours..."
My experience suggested that it would have been more practical to have created the analytical content of the project before beginning the digitizing process. However, at this stage, the necessary adjustment was one of perspective rather than procedure: digital projects are constantly in flux, and it is all about making the rules up as you go. By abandoning any preconceived notions about how the two halves of the project should work together, I could better visualize the possibilities and therefore more easily anticipate and troubleshoot problems.
Once I had the XML file of the DOL, I focused on creating the XSLTs that would determine how the information would display. The XSLT stylesheet converts the tagged XML document into an HTML document that can be read as a webpage. Assisting me with the XSLT development was a group of students from one of the committee members' Writing for the Web class. I had never even created a basic HTML website, so using XSL to convert XML to XHTML was a bit like a monolingual English speaker trying to translate between French and German. The group created a template that illustrated many of the parts of the XSL syntax that would be the most useful in manipulating this particular XML file; however, the design possibilities are endless, and the template could not provide more than a glimpse of these options. Ultimately, I benefited most, not from using the template per se, but from being able to study the code alongside its corresponding website and thereby see exactly how the various expressions interpret and display the data. Although detailed coding manuals and handbooks provided some guidance and served as beneficial resources, here again, trial and error was the best teacher.
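A minimal stylesheet of the kind described, assuming entries with <mainword> and <definition> children as sketched earlier, might look like this. It is a bare illustration of the mechanism, not the project's actual (far more elaborate) templates.

```xslt
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <body>
        <!-- sort on the regularized form, not the printed one -->
        <xsl:apply-templates select="//entry">
          <xsl:sort select="mainword"/>
        </xsl:apply-templates>
      </body>
    </html>
  </xsl:template>

  <!-- each entry becomes a paragraph: bold headword, then definition -->
  <xsl:template match="entry">
    <p>
      <b><xsl:value-of select="mainword"/></b>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="definition"/>
    </p>
  </xsl:template>
</xsl:stylesheet>
```

Even in this tiny example, the stylesheet is interpretive: the choice to sort on <mainword> rather than <altword> enacts the editorial decision to regularize the infinitive headwords.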
As the coding became less demanding, partly because it was nearing at least an interim stage of completion and partly because it was no longer a wholly foreign language, the literary research component of the project became more manageable. For approximately nine months, I had devoted the majority of my time to learning XML, creating a DTD, transcribing the DOL and developing XSLTs. Now I began to pull together the supplementary research I had gathered: literary examples of the terms in context; reviews of the DOL in contemporary magazines; biographical facts on its author, translator and publisher; information about the eighteenth-century's "culture of improvement" (Borsay); and sociolinguistic theories regarding politeness and courtship.
Although the gap between 'digital' and 'humanities' was closing, coding the XSLTs continued to raise questions and introduce considerations about the critical nature of the project. The XML tags provided one level of analysis, and the XSLTs had to incorporate that analysis into a visual organization of the data. In some ways, the XSLTs would simply display the XML content, but they also had the potential to create additional dimensions of interpretation. Thus, the several aspects of the project became even more tightly interwoven, creating a shorter distance between cause and effect and expediting the discovery process.
Again, I was forced to reevaluate which elements of the text I wished to highlight, in what way and for what purpose. Until this point I had not given much thought to the principles of visual rhetoric. Sketching images of a webpage seemed like a secondary concern, as if bringing together the necessary pieces would somehow automatically and clearly suggest a certain design. To some extent, this is true: the way the tags are defined in a DTD creates the basic hierarchy that will appear, to some degree, on the webpage. However, beyond this, there are questions of font size, color, graphics, page arrangement, and linkage. In this way, coding is an ideal tool for any kind of research project because it forces the user to define and test their goals, theories and assumptions over and over, from a slightly different point of view each time. It demands and ensures a comprehensive study, if not understanding, of the material.
Once a foundational site of the DOL has been established, the opportunities for development and change are endless. My philosophy for this project echoes John Lavagnino's argument in favor of a "criterion [of] adequacy...[or the] criterion of incompleteness" (71). He asserts that using digital technology does not alter the essential pattern of humanities scholarship, which often consists of examining the same text numerous times, from various perspectives. Hence, just as we might reject the contention that there can be a definitive edition of a written text, so too should we eschew the idea that an encoded text can ever be complete. Indeed, Lavagnino writes that "to do everything would require not only that we know what everything is, but that we also be able to adopt all critical positions ourselves" (71). In this light, one might consider XML, or any markup language, as an editorial tool more than a computer language, in which case, the digital and print worlds are more partners than rivals. To use Harold Short's words, "[this 'hybrid' is] perhaps even in some way [an] 'ideal,' in which a paper or material original is preserved and treasured for what it is, and the electronic is exploited for what it makes possible" (16).
I would like to continue amending this project when new research or ideas emerge. Additionally, I would certainly encourage implementing users' suggestions for markup and including further analysis of the DOL.
[May-June 2007. Emily Davis defended her thesis with distinction in April 2007. In preparing the project for digital publication it was necessary to make but few changes, chiefly simplifying the DTD to reflect work completed as opposed to work intended, and cleaning up the text and markup. Page design and the PHP scripts used to process the XML and interact with the document were done by CATH. — David Hill Radcliffe.]