Oral History and Archives(1)
TRANSCRIPTION
Once recorded, the tape, cassette, disk, CD or file is brought to the Office and a record is created noting the date of the recording, the place, the person interviewed, the timing of the recording and other details of the object itself such as notes as to its condition and composition. A duplicate copy is made and sent out for transcription. We have always transcribed and continue to do so for a number of reasons. Firstly, a paper transcript, as far as we know, is still the most efficient and cost effective way to organize and administer the interview; to build finding aids such as summaries, catalog entries and indexes; and to ease access for researchers. This last is simply a matter of common sense. It takes but a few minutes to read an hour’s worth of transcript but over an hour to listen to the audio while taking notes. Secondly, the transcript sets a standard for use of the interview text. Each researcher will cite and quote the interview testimony in the same format with the same orthography. As you can easily understand, changing spoken syntax into written syntax can be quite subjective at times; different people will hear and transcribe the stories in slightly different forms, which can have decidedly different meanings. A transcript produced by the archive standardizes the form and makes peer review that much easier and more certain. Lastly, because we send the initial transcript back to the interviewee for review and correction, which we will talk about in a minute, the final transcript carries with it a certain certification that the interview says what the interviewee intended to say in the form he or she wanted to say it. All of this is, of course, the imposition of systems of authority upon the text.
We send the initial transcript back for review for a number of reasons. Probably most important is our desire to make the interview widely available to the public broadly conceived. In our view, the words spoken are the property of the interviewee until they are legally assigned to the institution or interviewer or author. This is a particular reading of common law practice in relation to intellectual property. Obviously journalists and others have widely divergent views on this matter. But because we seek to obtain all rights to the interview we want to ensure that the donor fully understands the nature of the gift, and is given the right to be the first reviewer. A second reading is one way to make sure this is the case. Also, since we believe that the interview is a process that begins when research begins and ends only when the interview is fully processed, we believe that the interview session is only one step in the process. We do not fetishize the moment of presentation. A second, more relaxed review of the interview, we believe, will yield a better document, one that has the benefit of a more nuanced consideration. Our experience returning interviews justifies this belief. Most transcripts are returned with minimal editing, for the most part stylistic. A fair number are returned with additional commentary, sometimes asked for, that improves the usefulness of the interview. Over the past twenty years there have been only three cases where the interviewee totally rewrote the interview.
It must be noted here that in many cases it is financially or otherwise impossible to try to transcribe interviews. This may be particularly true of academic researchers or others on limited budgets and tight time lines. In those cases ways should be found to consult with those interviewed on the correctness of the use of the testimony. But, on archival projects we must assume that responsible budgeting will take into account ways to be both accountable to those interviewed and to ease access for those using the interviews. Given the flexible tools for annotation and indexing in digital media, an alternative would be to carefully describe and index topics in the interview that can be easily accessed by the reader/listener.
Transcription of recordings was traditionally done on a typewriter and a great deal of time and effort went into the process. When an edited transcript was returned it often had to be retyped and then copy edited and proofread. Some oral history projects took great care to produce a finely edited and attractive transcript. Others did not. At Columbia the only editing done by staff on a corrected transcript was to ensure it could be easily read in the case of difficult handwriting; it was retyped only if editing had been extensive. Thus the researcher was presented with a document containing the editing of the interviewee. Within a few years the Office had gone from detailed editing to almost none. The factor was cost. With the advent of word processors the whole process was made much easier since corrections could be entered into the text and a clean copy produced with about half the staff time or even less. However, we have found that our patrons much prefer the earlier form of the text, where they could see the interviewee’s editing, to the present form, where these edits are no longer visible. In addition, word processing did not save us much space since we have had to keep on file, for legal reasons, the interviewee-edited copy in case there is a dispute as to what was contained in the transcript. As an aside, our transition to word processing was not smooth. The initial step was a changeover to a dedicated word processor. This decision was made because the institution within which we were located had made a collective decision on the purchase of such equipment even though PCs were already available. After two years such equipment was out of date, the manufacturer had gone out of business and it was only by luck that we were able to migrate our large eight inch discs to floppies for use with PCs.
We have since migrated all of these files onto a secure on-site server and are currently looking into options for translating the word processing documents into XML for use on the web, as well as for future digital migrations.
Today the Oral History Research Office is experimenting with more stable and easily migrated methods of transcription deliverables. In some cases, we have transcriptions done completely in XML, eliminating the Microsoft Word file/document, enabling us to make the material available on the web immediately. This also produces a file that experts suggest will be more easily migrated over into future systems, perhaps bypassing issues of outdated word processing files, fonts and formatting. In other cases, we have been experimenting with sending word processing documents and/or scans of paper transcripts out to be marked up in XML, producing the same files for web dissemination. The benefits of using XML also run far beyond file readability. XML makes it possible to keyword search a document online, which in many cases eliminates the need for a proper name index, or at the very least enhances the ability of a researcher to search for and find elements of the transcribed interview. We have also created PDF copies of Microsoft Word documents to serve as both a backup and an alternative to photocopies for patrons. The Office is still determining our policy on the digital distribution of transcripts. However, if the first ten years of the twenty-first century are any indication, we can guess that such distribution will become more commonplace shortly.
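To make the point about keyword searching concrete, here is a minimal sketch in Python of how a transcript marked up in XML might be searched. The element and attribute names (transcript, turn, n, speaker), the sample text and the function keyword_search are all assumptions invented for illustration; they are not the Office's actual markup or tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical transcript markup; the element and attribute names are
# illustrative assumptions, not the Office's actual schema.
SAMPLE = """<transcript interviewee="Jane Doe" date="1998-05-12">
  <turn n="1" speaker="Q">When did you first come to New York?</turn>
  <turn n="2" speaker="A">I came in 1946, right after the war.</turn>
  <turn n="3" speaker="Q">What was the city like then?</turn>
  <turn n="4" speaker="A">New York was crowded, loud, and full of work.</turn>
</transcript>"""


def keyword_search(xml_text, keyword):
    """Return (turn number, speaker, text) for each turn containing keyword."""
    root = ET.fromstring(xml_text)
    needle = keyword.lower()
    hits = []
    for turn in root.iter("turn"):
        # itertext() also collects text nested inside any child elements.
        text = "".join(turn.itertext())
        if needle in text.lower():
            hits.append((int(turn.get("n")), turn.get("speaker"), text))
    return hits


if __name__ == "__main__":
    for n, speaker, text in keyword_search(SAMPLE, "new york"):
        print(f"turn {n} [{speaker}]: {text}")
```

Because each hit carries the turn number and speaker, a search of this kind can take a reader to a place in the interview rather than only to a name, which is the advantage over a proper name index noted above.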
In the past few years there have been a number of attempts to optimize the digital revolution by devising schemes to index the sound recordings and to retrieve information through these indexes. Two in particular in the United States deserve mention: those at the California State University at Long Beach and those at Randforce at the State University of New York in Buffalo. Both systems hold enormous promise but at this writing they are much more time consuming and expensive than traditional practice since a very large amount of staff time, highly paid and highly skilled, must be devoted to the proper indexing. Until these problems are solved it does not seem that there is yet an alternative to transcribing for easy access. (Since this presentation both projects have made remarkable progress although costs are still high.)
Even when such efforts are successful, I think we will continue to transcribe. Digitally recorded sound is incredibly easy to edit with properly trained staff. Thus with the recording alone as a document one must trust the staff of the depository that what is on the disk or hard drive is what was actually said. An interviewee-vetted transcript obviates that need. Again the problem of a medium and technology dependent document must be raised. Quick hardware and software change is only one risk factor. In addition, one must be aware of the possible effects of magnetic fluctuation, the separation of information from the substrate of CDs caused by mild environmental disturbances, accidental overwrites, and bit rot. Files must be ‘refreshed’ often and preservation programs must be active and aggressive in migrating from any one software platform and any one hardware platform to another. With limited budgets, although everyone agrees that digitization is the way to go, transcribing will always be a fallback.
Transcribing and editing sometimes confuse patrons seeking access to the recordings. Part of the problem concerns one’s legal or ethical obligations, which we will discuss in more detail later. For now, the particular problem of transcription and tape should be mentioned. Our traditional policy, in those cases where for one reason or another we did not have full title, and where editing of the transcript had been extensive, was to refer the patron to the person interviewed, or to heirs and executors, for permission to listen to the tapes. If they agreed we allowed listening with the understanding that all citation would be from the transcript. Now, with requests for use of the interviews for non-print media, the fact that the transcript, which has been vetted by the interviewee and legally deposited in the Collection, varies from the recording presents a major problem; a problem exacerbated by the possibility of posting the voice of the interviewee on the Internet. The solution was to amend our releases to secure the rights to such a posting and to inform the interviewee that the unedited raw testimony will thus be available to hundreds if not millions of others. We have developed a slightly similar release for video interviews since it has always been impossible to edit those interviews in the same manner as a transcript. In those cases we made a distinction between the two forms of the interview, the video and the audio transcript, and secured two releases. To date we have not had any problems but we expect them to emerge. A proper archival solution is very much a work in progress.
No matter what that solution may be, it seems clear that as long as we adhere to the practice of sending the interview to the interviewee for review and correction we will have to produce two different forms of the interview, one that can easily be edited and emended and one that will remain unedited. That is the case because, to date, we have not found a way for the person interviewed to undertake, with ease and low cost, the kind of intensive and detailed review and editing demanded on the sound document, be it on a hard drive, a CD, an MD, a cassette or tape. Of course, digital recordings offer wonderful opportunities for such editing, and some of the equipment needed to do so is readily available and inexpensive. But such editing does not allow for the interjection of added testimony, which is one of the aims of offering review and editing. The same considerations would apply to video editing.
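The worry that a digital file may change silently, whether through deliberate editing, an accidental overwrite or bit rot, can at least be detected mechanically. The following Python sketch shows one common approach, a checksum manifest, which is not a practice described in this paper and is offered only as an illustration. The function names and the file name interview_001.wav are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(paths):
    """Record a digest for each file at the time of deposit."""
    return {str(p): sha256_of(p) for p in paths}


def changed_files(manifest):
    """Return recorded paths that are missing or whose digest now differs."""
    changed = []
    for name, recorded in manifest.items():
        path = Path(name)
        if not path.exists() or sha256_of(path) != recorded:
            changed.append(name)
    return changed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        audio = Path(tmp) / "interview_001.wav"  # hypothetical file name
        audio.write_bytes(b"stand-in for audio data")
        manifest = build_manifest([audio])
        print(changed_files(manifest))  # nothing has changed yet
        audio.write_bytes(b"stand-in for audio dat4")  # simulate corruption
        print(changed_files(manifest))  # the altered file is reported
```

A check of this kind tells the depository that a file has changed, not what was originally said, so it complements rather than replaces the interviewee-vetted transcript.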
A larger and more theoretical consideration is the issue of control. The practices we devised assured control over the product by both the institution and the person interviewed. Every agreement was enforceable because a researcher had to come to the depository where staff controlled access to the artifact itself, either tape or transcript. But digital information is intangible. It is available to be ‘output’ to any number of media for access but has no intrinsic physical form. There is not as great a need to consult with an archivist onsite, and the institution must be willing to cede control in order to provide greater public access.
LEGAL ISSUES
After transcription, as noted already, and for reasons already outlined, the transcript is returned to the interviewee for review. At that time a legal release is sent as well. This release is essentially a document whereby the interviewee makes a donation to the institution of the interview, in all of its formats, and states what restrictions apply to its use. In general, we ask for full copyright and offer three choices as to access: open the interview immediately, close the interview until some specified (and reasonable) time, or require anyone seeking to use the interview to obtain, for a specified time, the permission of the interviewee to do so. We often receive a variety of additional requests, which we, through negotiations, try to resist. These include passing rights to the interview to children, spouses or executors, requiring researchers to vet their use of the interviewee prior to any use, a commitment to pay royalties on various uses, and commitments to secure publication in any particular form.
Obviously all of this influences what we feel we can and cannot post on the Internet. So far we have not experienced the tension many archives with commercially profitable holdings have had between the drive on the part of donors to protect copyright limits for longer and longer periods of time and the use of digital and broadband potential for distribution. One reason, of course, is the limited commercial potential of our documents but part is also the result of the nature of our agreements.
The Office is currently grappling with this web rights issue. At the present time we are faced with two different rights situations: older, digitized interviews with releases that did not take into account the broader distribution of the Internet, and “born digital” audio, video and transcripts that are governed by our modern releases, granting us digital rights. In the case of the former, we currently review materials on a case-by-case basis. Many of our older releases prohibit the duplication of interview transcripts; in this situation, it appears that we will be unable to serve the digitized transcript to the public online. There is, however, the option of making the digitized material available to patrons on site. In other cases, there are no duplication restrictions, allowing the Office to interpret the releases more liberally. Despite this, we still try to go back to interviewees and/or their heirs to seek permission and a new, modern legal release, if possible. For “born digital” interviews, we are in most instances free to make the material available online with little to no restrictions, unless specified by the interviewee. In some situations in which interview content is particularly sensitive, we will ask the interviewee again if they are comfortable with the material being shared online. We have encountered little resistance to this thus far.
The release, while it certifies the gift of the interview to the archive, in turn licenses the interviewee to retain all rights to use the interview in any manner they desire. Turning an oral history interview into a publishable memoir is a long, difficult and tedious task. However, on the face of it, it looks to be easy, and, in some cases, of course, has been done. Many interviewees desire to publish their interviews; few do. So we offer the option. Indeed we often encourage some interviewees to contract with editors and publish the memoir because it publicizes the Collection, draws researchers to the full and unedited interview and eventually to other interviews in the Collection.
If we have dwelt too much on these issues it is because in the United States, and perhaps the common law world, they determine much of archival practice. I do not know if they resonate with the legal systems and traditions elsewhere, but I do know that they can be the basis for a discussion of the ethics of oral history interviewing and the intersubjective relations between the two parties in the interview. They should be viewed as the formal arrangements that reflect the shared authority deeply embedded in our fieldwork practice.
ACCESS
Once the reviewed, edited transcript and legal release are returned, a final archival copy is produced as well as a copy for the interviewee. A foreword is written explaining the conditions of the interview and its relevant information such as dates and places of the interview itself. In some projects a much more elaborate introduction is composed to accompany the transcript. The transcript is also indexed, a summary is prepared for the files and a catalog entry is written describing the topics of the interview and all relevant information concerning its provenance and production. This entry becomes, ideally, the basis for inclusion into a printed catalog produced every few years or, now, a currently maintained database of the Collection.
All of this processing is part of an access policy, both internal for staff and external for patrons. As should be clear access is part of the thinking at every stage of the process from creation to the development of finding aids. In actual practice we build finding aids throughout the process, even though each step requires special skills, such as cataloging and indexing. At the start I talked briefly about the needs to maintain records of the initiation of the project and research files as well as all correspondence with the interviewee and all agreements. Obviously any project must maintain files whose organization is understandable and which require a minimal amount of staff time. Cataloging and indexing are far more complex because they demand a settled upon language of description and a fairly extensive knowledge of the materials to be described. They are metadata crucial for locating and making available the information on the record or in the transcript.
At Columbia the two most important finding aids are CLIO, an online catalog organized according to the strictures of the Research Libraries Information Network (RLIN) that lists a sizable portion of the Collection, and a Master Biographical Index. We have major problems with both that should serve as object lessons for others. In the first case, that of the Catalog, the problem is one of what happens after you get a grant. In the second we face the limitations imposed by budget, size of collection and scope and tradition.
Cataloging of the Collection in its early years was a process of building descriptions of individual interviews and of projects and then publishing those descriptions in hardbound editions, with a listing of interviews in a few subject areas such as law or medicine. In the early 80s we received a grant to computerize the catalog, and were able to develop files for roughly 3500 of our then 5000 interviews. We were never able to secure funding to complete the project or to continue such cataloging. The experience, however, was useful because we did learn how to develop a common thesaurus for describing the Collection, as well as how to develop the fields necessary for such cataloging. When and if we begin to try to construct an on line catalog this may prove useful.
Indexing has always been a problem. We have always indexed by proper name only. There is no subject index, although at one time one was attempted. While this index has worked all right for researchers, especially biographers, (although it must be admitted that one reason it did so was remarkable staff stability) it is useless in the age of the Internet. It offers no clue to the information available within the interviews. In our one attempt so far to post interviews on the web it was necessary to devise a Rube Goldberg form of Table of Contents to provide users with a better sense of the interview as well as a set of terms to which one could turn to get them to the proper place in the interview where something of interest would have been discussed.
The finding aids for an oral history collection, in general a catalog and index, are in many ways no different than those for other records. They must be durable in the sense that they do not change too much in style and format over time, especially in an age of ever changing web pages. The choices made in their formation (nowadays choices as to software and hardware) should be documented so that updating, migration and refreshing are possible. They must be constructed in familiar formats easily understood by researchers. Wherever possible they should adhere to a consistent language and set of descriptors, even if, like Library of Congress subject headings used in the United States, there are significant ideological problems. And they should be general enough in the catalog to direct patrons to the collection and specific enough in the index to allow pinpoint searches but also allow for browsing.
The Office is currently working with Columbia’s Libraries Digital Preservation team to design an online portal for the searching and access of its oral history interviews. In its beta phase, the portal allows researchers to limit their archival search to specifically oral histories, and to utilize keyword searching to find relevant material. For the first time, patrons will be able to search for an interview without a specific individual’s name in mind. While a simple concept, this is revolutionary for our office, opening up wonderful new opportunities for patrons to unearth material perhaps buried in the archive until now.
Comprehensive metadata is essential for audio preservation. The Columbia University Libraries, the institutional home of the Oral History Office, has chosen to employ METS with its digital management system, Fedora, not only for audio but for all digital content. While it is an established standard, METS has a great deal of flexibility and tends to be site-specific in the details of how it is implemented locally. Other institutions may choose other means of managing their digital content and have no need of METS for their audio files, or they may implement METS differently than the Columbia University Libraries do. The important point is that each institution should record and maintain the data needed for file management and long term preservation of its audio files. Such data includes: the technical description of the original sound object and description of its content, a description of the digital process, the structural information that allows content to be accessed in correct order, and the technical description of the audio files. Thus from little or no concern for problems of preservation as they applied to access, we have moved to a sophisticated system of data management.
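The kinds of data just listed can be pictured as a simple record. The Python sketch below is only an illustration of those categories under assumed names; it is not METS, it is not how Fedora or the Columbia University Libraries store this information, and every class name, field name and sample value in it is hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AudioFile:
    """Technical description of one digital audio file (illustrative fields)."""
    filename: str
    sequence: int        # playback order within the interview
    file_format: str
    sample_rate_hz: int
    bit_depth: int


@dataclass
class PreservationRecord:
    """The kinds of data named above as plain fields (not a METS document)."""
    source_object: str         # technical description of the original object
    content_summary: str       # description of its content
    digitization_process: str  # description of the digital process
    files: list = field(default_factory=list)

    def files_in_order(self):
        """Structural information: the files in correct playback order."""
        return sorted(self.files, key=lambda f: f.sequence)

    def missing_fields(self):
        """List the parts of the record that are still empty."""
        names = ("source_object", "content_summary", "digitization_process")
        missing = [n for n in names if not getattr(self, n).strip()]
        if not self.files:
            missing.append("files")
        return missing


if __name__ == "__main__":
    # All values below are invented for the example.
    record = PreservationRecord(
        source_object="Audio cassette, 60 minutes, good condition",
        content_summary="Session 1: childhood and schooling",
        digitization_process="",
        files=[
            AudioFile("side_b.wav", 2, "WAV", 96000, 24),
            AudioFile("side_a.wav", 1, "WAV", 96000, 24),
        ],
    )
    print([f.filename for f in record.files_in_order()])
    print(record.missing_fields())
```

Whatever the local system, a record that can report its own gaps makes it easier to see which interviews lack the data needed for long term preservation.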
Oral history poses unique problems. Because the information is located within a conversation there can be, and often is, a great deal of repetition of story elements dispersed over the length of the interview. There is no consistent language and different people use different words for different meanings with ethnic, regional, class, or racial uniqueness, and an indexer often has to interpret what is meant in each case. A decision has to be made whether or not to index the questions of the interviewer, and most of all room has to be made for the subjective. An interview is more than a repository of facts and descriptions of events. It is also often an exercise in identity, an expression of ideology and values, and locus of metaphor and narrative power. In the world of today’s research we can no longer assume that anyone using our work is interested solely in recreating the events of the past. Some of the most interesting forms of information encased in the dynamic of the interview are those that document culture as well as society.
The complexity of these problems is not diminished by the new technology. Our future program, I think, is clear. The path to follow may not be so. Whatever it is it will be bounded by our past policies and practices. Some will have to be discarded. Others may be so intrinsic to the oral history process that we would destroy what it is we seek to accomplish when we embark on oral history.
An added consideration is the change in the nature of the Oral History Office over the years. Once almost solely devoted to collection development and therefore focused on the problems posed by archives practice, the Office now engages in a number of projects and activities ranging over the full field of oral historical practice including: the hosting of seminars and workshops, the presentation of our holdings in various formats, educational outreach and community support efforts. Most significantly, in 2008 we founded a Master’s degree in Oral History (Master of Arts in Oral History, http://iserp.columbia.edu/education-programs/ohma) under the auspices of the Institute for Social and Economic Research and Policy at Columbia University. Mary Marshall Clark, current director of the Oral History Research Office, and Peter Bearman, one of the United States’ leading sociologists, collaborated to define the degree as an interdisciplinary form of social and cultural research, designed to deepen research in the arts and humanities as well as the socio-historical sciences. The archive is in many ways the heart of the MA program, as students can reflect on the ways in which interviewing methodologies have developed and changed over time, and must continue to change, in order to embrace the challenges of recording histories from a global perspective.
The Collection, of course, remains the heart of the work of the Office but more and more of our efforts are devoted to devising methods of making the Collection more available for the widest variety possible of research and teaching both internal and external in line with the aims of the larger university community.
Ronald J. Grele (PhD)
Former director of Columbia University Oral History Research Office