Brown & Clements: The Orlando Project

Brown & Clements, "Tag Team: Computing, Collaborators, and the History of Women's Writing in the British Isles"

4. Use of Technology

We have chosen, despite the fact that the scholars who initiated the project had no prior experience in humanities computing, to make computing technology the intimate companion of this history which aims to account for both the complexity of the moment and for larger temporal shifts. We opened this paper with the image of a bursting book, since the roominess, the sheer information-bearing capacity of the electronic medium, together with its ability to enable complex indexing, provided our initial motivation for turning to computing. But the project is now well beyond the purposes of indexing and deeply involved in something which has far more impact on the way we conduct our work as literary historians. That is the structuring of the information we are gathering.

For the Orlando Project, technology has become much more than a simple tool. It is altering the way we are conducting our research and changing the ways in which we approach the problems of literary history. We think the technology will allow us to do a different, and in some ways a better, kind of history. We want computers to help us to bring together and into focus the complex relationships that inform literary history. The hype of hypertext's ability to create multiple pathways for users through electronic material has become a bit stale, but the possibility of offering multiple rather than single trajectories, of fracturing the single, monological narrative, retains its freshness for us. The possibility of offering parallel and intersecting narratives, interlinked with non-narrative material, allows us to make the user of our electronic history an active partner -- another collaborator -- in the history. Electronic forms permit, and may indeed embrace, the display of difference, of competing narratives or explanatory paradigms, or even of contradiction in the materials presented, offering alternatives to more linear paradigms. We don't, of course, think that books are merely linear: the footnoted, annotated scholarly text is its own kind of hypertext. But we do believe, with Jerome McGann, that computers can help to dispel "the illusion that eventual relations are and must be continuous, and that facts and events are determinate and determinable" (McGann 1991: 197). We think that the computing tools we are using should help to make evident the patterns and meanings immanent in massed historical detail. We also hope to find new ways of addressing the gaps, discontinuities and unknowable silences in the history of women's writing by devising new ways of seeing and taking stock of what isn't there as well as what is.
So we want our logo, an image of an oak tree, to be a pun, a technologized paronomasia, underlining the points of similarity and the points of difference between the organic/literary Oak Tree with which Orlando struggles for centuries and the branching structures of the literary/technological tools we are building.

5. Choice of Standard Generalized Markup Language

Before we give some examples, we need briefly to outline some features of the encoding language we are using, though these will be familiar to some readers of this paper. Standard Generalized Markup Language -- SGML -- is an international standard that exists independently of proprietary computer programs. Our use of it means that the material we are encoding will remain available to later generations of programs, and ensures that the usable life of our history does not depend on Bill Gates's marketing plans. The Text Encoding Initiative is a project which is developing protocols for encoding electronic texts in SGML. [2] TEI-conformant SGML is increasingly the choice of academic electronic edition projects. Our use of SGML, then, will make it that much easier for us to link our project to other humanities computing projects, such as the growing corpus of on-line texts, including the Women Writers Project at Brown University and the Victorian Women Writers Project at Indiana.

SGML adds "descriptive markup" in the form of tags -- or "elements" -- that describe various features of the text which is being tagged. The different tags are related to each other according to a set of rules that governs their hierarchical, branching relationship. Those rules are called a Document Type Definition or DTD. How a DTD is conceived has an immense impact on the shape of an encoded text, since the DTD structures the encoding of the information and largely determines what may be gleaned from the material much further down the line. Different DTDs can be created to suit different types of documents and different tagging purposes. The language used by the World Wide Web, Hypertext Markup Language, is itself defined by an SGML DTD, and that DTD governs what you can and cannot do with HTML tags. SGML thus offered us an excellent compromise between standardization and customization, by allowing us to develop tools tailored to the project's needs.
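
For readers unfamiliar with the idea of a DTD as a set of nesting rules, a minimal sketch in Python may help. It is an illustration only, not one of the project's DTDs: the element names (ENTRY, DIV, P, POLITICS, OCCUPATION) and the ALLOWED_CHILDREN table are hypothetical, and XML-style syntax read with Python's standard xml.etree.ElementTree stands in for a real SGML parser. A real DTD also governs order, repetition and attributes, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: which child elements each element may contain.
# These names are invented for illustration; they are not the project's tags.
ALLOWED_CHILDREN = {
    "ENTRY": {"DIV"},
    "DIV": {"P"},
    "P": {"POLITICS", "OCCUPATION"},
    "POLITICS": set(),
    "OCCUPATION": set(),
}


def find_violations(element, path=""):
    """Return a message for every nesting that the rules do not permit."""
    here = f"{path}/{element.tag}"
    if element.tag not in ALLOWED_CHILDREN:
        return [f"{here}: unknown element"]
    problems = []
    for child in element:
        if child.tag not in ALLOWED_CHILDREN[element.tag]:
            problems.append(f"{here}: <{child.tag}> not allowed here")
        else:
            # The child is permitted here, so check its own children in turn.
            problems.extend(find_violations(child, here))
    return problems


# A fragment that follows the hypothetical rules.
valid_doc = ET.fromstring(
    "<ENTRY><DIV><P>She joined the <POLITICS>WSPU</POLITICS>.</P></DIV></ENTRY>"
)
# A fragment that breaks them: a P placed directly inside ENTRY.
invalid_doc = ET.fromstring("<ENTRY><P>A paragraph outside any DIV.</P></ENTRY>")

print(find_violations(valid_doc))    # []
print(find_violations(invalid_doc))  # ['/ENTRY: <P> not allowed here']
```

The point of the sketch is that the rules are fixed before any text is written: a document either conforms to the hierarchy or is rejected, which is why the design of the rules shapes everything encoded under them.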

5.1 Orlando's Use of SGML

The Orlando Project's work in humanities computing is experimental to the extent that the work we are doing in SGML is different from what is being done in other SGML projects we know of. For one thing, we are not encoding preexisting texts. SGML has typically been used by scholars to describe the structural features of texts -- their titles, annotations, stanzas, paragraphs, and so on -- for the purposes of making such structural features available for various kinds of scholarly analysis and of governing their presentation in print or on the Web. The Brown Women Writers Project, the Victorian Women Writers Project, the Perseus Project and the Model Editions Partnership are all using SGML for such purposes. [3] Unlike them, however, we are not creating an electronic archive based on existing text: the text we are tagging is the one we are writing.

Our attempt to produce the tools and the text simultaneously has far-reaching impacts on our research and writing practices. It means that the tagging -- the whole process of tagging from the initial development stage to the final writing up of research -- is in a reciprocal creative relationship with the research and the writing. The tagging embodies a set of judgments about which information we wish to present and the ways in which we wish to describe it. Once the tagging is structured, the tagging directs the research. Neither the text being tagged nor the tagging method has been imported, predetermined or preproven, into the Orlando Project, so we face a different challenge from that presented by a prefab computing program. We think that this is what must be done to domesticate the tools of computing for use in humanities research. Projects like the Text Encoding Initiative are addressing the problem of how best to translate products of the pen or press into electronic form. We are participating in that large project from the slightly different angle of exploring how best to produce the texts that we want to write in this new medium.

For those of us who work in the language, the restriction of the tagging process is initially -- and continuingly -- uncomfortable. It is an experience which sometimes makes the problem of the bursting book look elementary. Though this forest is endless, the tracks through it must be set in advance, and though the tracks will be very numerous, there is no such thing as free navigation. The idea of total freedom in electronic space is a marketing myth of hypertext, and the structures we put in place now will help determine which tracks are and aren't available when our work is done. The structures for tagging, the DTDs, are the central site of the integration of computing and literary history on the Orlando Project. We have developed three DTDs to date: a Biography DTD, a Writing DTD and an Events DTD. These present us with the means of gathering and entering information about writers' lives, their texts and the historical conditions in which they worked and with which they constantly interacted.

5.2 DTD Development

Construction of these Biography, Writing, and Events DTDs is at the heart of the collaboration: we have spent countless hours together, deciding exactly what it is we find important about, for instance, a woman's life. What elements of women's lives do we want to be accessible when the writing is complete? What kinds of questions do we want the materials to be able to answer? In this collaboration, the literary members of the team try to make their processes of evaluation and judgment as explicit and as concrete as possible; the computing members of the team push for specificity and attempt to systematize the process of structuring the texts we write. The DTD which emerges from our Document Analysis Sessions must amply and accurately reflect the sense of values of the research team (each member of which entered the conversation with an individually shaped set of literary and scholarly values, not to mention individually defined feminisms) and, at the same time, it must be workable for the people who are building the DTDs. Our analysis and the DTD construction are presided over by our sense of our multiple hypothetical future reader, who will need to find in what we construct a critically intelligent, usable instrument, serviceable in her research. The process of DTD development has forced us to make the conceptual organization of the project -- from our initial, strategic, if uncomfortable, division of its work into the categories of biography, writing and events -- explicit. We have had to perform close, careful analysis of the categories of significance in the kinds of interpretive statements we anticipate wanting to make as a result of the very research we want to use these tools to undertake.

Struggling to make the language of computing serve the needs of our scholarship, we are acutely aware of the ceaseless shifting of both language and history. As previously noted by a non-computing worker in the language, words "slip, slide, perish, / Decay with imprecision, will not stay in place, / Will not stay still", and everything we are working with changes its meaning through history (Eliot 1963: 194). In this collaborative project, specialists in different periods of women's writing have different perspectives on the meanings of key critical matters, such as class or race. These differences present our computing colleagues with the challenge of making the language of encoding able not only to reflect nuanced judgments but also to represent these historical changes of value and these multiplicities of critical perspective. Not a small challenge for a medium apparently so situated on the yes/no divide, so fixed, static in structure.

5.3 Limitations of SGML

We ask ourselves constantly what it means to be dependent on a language of binaries. The language of computers is, of course, in its most fundamental aspect, dependent on binaries, and we feel that the rigidity, exclusivity and lack of ambiguity this implies carry over into the tools of the project. Though we know that nothing is ever only one thing, in SGML we are confronted with exclusive choices. Does this tag apply: yes or no? A piece of text is either in the tag or it is not. If a woman writer has worked for the Women's Social and Political Union, do we wrap our discussion of that fact in a tag labeled "Politics" or a tag labeled "Occupation"? There are no shades of gray, and we need to make choices about how deeply to tag for the sake of productivity. Sometimes we double tag, spreading our discussion of an issue across more than one element because both aspects of it are important to capture for retrieval purposes, but in other cases we choose not to.
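
The consequence of this choice for retrieval can be shown with a small sketch in Python. Everything in it is an assumption made for illustration: the tag names POLITICS and OCCUPATION echo the labels mentioned above but are not taken from the project's DTDs, the sentences are invented, and XML-style markup parsed with the standard xml.etree.ElementTree stands in for SGML.

```python
import xml.etree.ElementTree as ET

# Hypothetical "double tagged" entry: the discussion is spread across two
# elements, so either heading will lead a reader to it.
double_tagged = ET.fromstring(
    "<ENTRY>"
    "<POLITICS>She campaigned for the Women's Social and Political Union.</POLITICS>"
    "<OCCUPATION>She was a paid organizer for the Women's Social and Political Union.</OCCUPATION>"
    "</ENTRY>"
)

# Hypothetical entry where the encoder made one exclusive choice.
single_tagged = ET.fromstring(
    "<ENTRY>"
    "<POLITICS>She worked for the Women's Social and Political Union.</POLITICS>"
    "</ENTRY>"
)


def retrieve(doc, tag):
    """Return the text of every passage wrapped, at any depth, in `tag`."""
    return ["".join(element.itertext()) for element in doc.iter(tag)]


# A search under "Occupation" finds the double-tagged discussion...
print(retrieve(double_tagged, "OCCUPATION"))
# ...but returns nothing when only "Politics" was chosen.
print(retrieve(single_tagged, "OCCUPATION"))
```

In the second case the passage still exists in the text, yet a search by occupation cannot reach it. That is the cost being weighed against the extra labor of tagging more deeply.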

As we seek to overcome some of the limitations of literary history, then, we are confronted in various ways with the need to overcome limitations in computer encoding. The technology is both enabling and confining. For instance, we have three DTDs which permit us to encode important information and critical analysis. But we have had to sort our historical writing into three distinct areas, despite our view that such division is a distortion. Our division among biography, writing and world events is a strategic choice arising from the complexity of the tagging we want to do. We couldn't design a DTD that would allow us to talk about everything at once -- indeed, that would frustrate many of the aims of our computer encoding -- so we were faced with a trade-off.

We have devised a number of interlocking strategies for working against these difficulties. We allow for some nesting of tags so some things can be more than one thing at once. We will, of course, use hypertext links between different documents, regardless of those three divisions. We can use keywords such as "activism" or organizational names, such as the Women's Social and Political Union, to group documents together, and we are working towards a structured vocabulary or thesaurus that will serve as an entry-point into the material. Nor will the documents we are creating appear as isolated texts in the final version: they will be linked sequentially or interleaved in various ways. We are currently working on a system to put our SGML texts into "database" form, so we can pull together bits and pieces from different types of documents. Thus, for instance, we might produce a WSPU mini-web, which traces the relationships between women writers and the WSPU through social and political events, events in the lives of women writers, and the texts that they wrote, thus making visible the reciprocal relationship between textual, material, and ideological factors. We are already at the stage where we can draw together events from all three kinds of documents, so that biographical, publishing, and social and political material appears in a single chronology.
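
The idea of drawing events from all three kinds of documents into one chronology, optionally narrowed by a keyword, can be sketched in Python as follows. This is not the project's system, whose design is not described here: the Event record, the chronology function, and all of the dates and descriptions are hypothetical placeholders invented for the example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Event:
    when: date
    source: str                      # kind of document: biography, writing or events
    description: str
    keywords: frozenset = frozenset()


# Placeholder data only; none of these dates or events is historical.
biography_events = [
    Event(date(1908, 6, 1), "biography", "Writer A joins an organization", frozenset({"WSPU"})),
    Event(date(1899, 2, 1), "biography", "Writer A begins school"),
]
writing_events = [
    Event(date(1910, 9, 1), "writing", "Writer A publishes a pamphlet", frozenset({"WSPU"})),
]
world_events = [
    Event(date(1909, 1, 1), "events", "A public demonstration takes place", frozenset({"WSPU"})),
]


def chronology(*collections, keyword=None):
    """Merge events from any number of collections into one date-ordered list.

    If `keyword` is given, keep only the events that carry that keyword.
    """
    merged = [
        event
        for collection in collections
        for event in collection
        if keyword is None or keyword in event.keywords
    ]
    return sorted(merged, key=lambda event: event.when)


# A single chronology across all three document types.
for event in chronology(biography_events, writing_events, world_events):
    print(event.when.isoformat(), event.source, event.description)

# A keyword-based "mini-web": only the events tagged with WSPU.
wspu_only = chronology(biography_events, writing_events, world_events, keyword="WSPU")
print([event.source for event in wspu_only])  # ['biography', 'events', 'writing']
```

Because each record keeps a note of the document type it came from, the merged list can still show the reader whether an item is biographical, textual or social and political, even though the three divisions no longer determine the order of presentation.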

The yes-no language of computing is, we hope, overridden by the structural dynamics of these tools as a whole; they create multiple binaries in various relationships to each other which are in some sense analogous to the layerings of ideologies or discourses. We can't operate without discursive structures or ideological categories, nor would we wish to, but the computing language we are learning to use has some strengths for us. It pushes us to articulate our positions in as clear and detailed a way as possible and to make it possible for our users to assess more easily the steps in the process by which we have made our history. That commitment to ongoing collaborative process defines many aspects of the Orlando Project's computer work, down to the fact that each document has attached to it a history of responsibility which records who worked on a document in its various stages. Letting the user into this process is one of the ways in which what we are creating differs markedly from a book: the user is more active in the process of creating and assessing meaning, and indeed may view herself as the latest in a series of collaborators who produce the text that she reads.


Notes

[2] Text Encoding Initiative Home Page is found at <URL: http://www.uic.edu:80/orgs/tei/>.

[3] Information about these projects can be found via their home pages: The Brown University Women Writers Project <URL: http://www.wwp.brown.edu/>; Victorian Women Writers Project <URL: http://www.indiana.edu/~letrs/vwwp/>; The Perseus Project <URL: http://www.perseus.tufts.edu/>; The Model Editions Partnership <URL: http://mep.cla.sc.edu/MEP-Home.HTM>.