A Proposal for a Powerful New Research Tool – Organizing Information for Dissertation Writing – Part 3 of 3

In the first and second postings on this topic, I described my approach to a lack of connections between my notes on my sources and my broader dissertation outline. I explained how I organized my material and how I’m trying to use my task management software as a way to create a link between the increasingly large number of note files and sections of note files on individual sources and the broader outline of the dissertation I will begin writing this year.

In this posting I will describe a kind of outlining software that could largely resolve the organizational problem I have described in my previous two postings without having to navigate between several applications. The features I propose could be added as a mode or layer to existing outlining software. I am thinking of OmniOutliner, which is what I use, but I think the kinds of modifications I am suggesting could be added to most other outlining software, or serve as the foundation of a new solution based on the organizing principles described here. The result, I hope, will be an environment which will allow researchers to adopt a smooth workflow that unites the highest level of a research outline and the tiniest fragments of notes on sources, or the sources themselves.

My own favorite note taking application, OmniOutliner, like most kinds of outlining software, allows me to record notes hierarchically, in bullet point fashion, on the various materials I come across, ranging from historical sources to books and articles on any topic. I am always impressed with the attention to detail and flexibility that software created by Omni Group shows, so if you are using a Mac, I recommend you give their offerings a look. When we make use of the information we find, we will usually mark the deployment of such information in our academic writing with citations and sometimes with direct quotations we have recorded. This means that an important part of historical research, as well as research in many other fields, is keeping careful track of exactly what information comes from what source. Two approaches in note-taking spring to my mind:

1) Somehow keep notes on sources separated by the source, whether in different sections of a single file, or different files. Make note of page numbers as one records notes from different parts of the source.

2) Organize all one’s notes by an arbitrarily determined collection of ideas (or chapters, or chapter sections, or themes, etc.). Each time one reads a new source directly input information worth remembering into the note file or section of a note file dedicated to the given idea, chapter, or theme.

The problem with the first approach, which I presume to be the more common, is that if one has a very large quantity of notes on many different sources, when one shifts into the writing mode, one has to hunt through one’s notes looking for the information relevant to the claims one wishes to make in that particular section of the dissertation. This is the problem of the lack of the “middle layer” of organization that I referred to in my first posting and for which I presented a temporary and imperfect solution in the second posting.

For example, I have somewhere between three and four hundred note files on various sources related or potentially useful to my PhD dissertation. Some of them are less than a page long with a few brief points of interest. Others, like my note files on various newspapers and archival collections, are extremely long files of several dozen pages divided into sections by year, issue, or specific document in a collection. Hunting through them all, or more likely, a quickly chosen sub section of these files in search of useful information I have recorded will be highly time consuming and risks missing some great gems that I have since forgotten about.

The second approach seems to provide a much faster transition from research to writing since the researcher can sit down and immediately begin writing a chapter based on the notes collected together under certain idea or chapter headings. The problem with the second approach is twofold: 1) When fragments are completely extracted from their original context you lose the important sense of connection between that information and other fragments or notes you have recorded from the same source. Call this the “problem of context.” 2) Each time you enter a fragment from a source into this chapter/idea/theme based note file you will have to also make note of the exact source and location in that source. This means the time required to record any fragment can potentially double. Call this the “problem of reference.”

Tagging and the Problem of Context

Those familiar with the dramatic rise of “tagging” of information online might have already thought of a way to resolve the problem of context. If information is tagged, you get the best of both worlds: a tagged object can be linked to many different ideas or multiple chapters while still remaining in whatever structure it was found in. There is no need to take the second approach above because we can dynamically create such an idea/theme list of fragments based on particular tags given to those fragments within the note file for the original source. When a picture in a Flickr set entitled, say, “trip to the lake” is tagged “bird”, the addition of the tag does not remove it from its initial context, that is to say, the fact that it is a picture posted to the Flickr collection of kmlawson and that I have placed it in that set.[1]

An organizational application which supports tagging, such as Yojimbo, Sente, Evernote, Leap, Yep, etc. allows you to easily tag and display a list of files by tags. iPhoto allows you to tag your images, and I use an applescript to add tags to my songs in iTunes to support my dynamically generated ‘smart’ playlists. However, from the perspective of the historian writing a book or a PhD student writing their dissertation, these tagging applications aren’t quite enough. All these applications are tagging at the level of files, which is not really the level of detail we need in creating a rich web of connections for our research.

Tags must go beyond the level of the file and down to the level of a bullet point within one’s notes. We need a way to easily and quickly tag individual fragments of information within the sources we find so we can easily deploy them in our academic writing.

Let me give a concrete example. This is a fragment from one of my note files from a local police report from my trip to the archives yesterday. I also list where this fragment is within my system of organization:

In my folder “Dissertation”:
In my subfolder: “Related Notes”
In the file “Shandong Provincial Archives”:
Under the heading for file G042-01-0283-007 全县反奸诉苦大会总结 1946.4.3 日照县公安局
I have the following fragment from a local trial of a traitor:
“12/107 a woman’s father was executed by shooting by the accused traitor during the J occupation. When she 控訴ed the traitor she cried the whole time. She 上台 and using a stone, beat the 犯人. This scene made the masses 感動 and 落淚”

This is a very brief but rich piece of historical detail from a police report showing how a woman mounted the stage and beat an accused pro-Japanese collaborator who had executed her father when the town was under Japanese occupation, and it records the subsequent emotional impact this scene had upon the ‘masses’ in attendance. In a report that deals mostly with generalities and statistics this is a powerful little anecdote that might potentially make an appearance in my dissertation. This fragment might help me in one or two ways: 1) It can help me describe the way the ‘masses’ got directly involved during the local treason trials and carried out acts of direct violence against the accused during the course of the trials without interference, despite party directives that no violence or beatings should be carried out upon the accused, especially prior to conviction, and that any eventual executions should only be carried out by a bullet fired by ‘special’ public security officers. 2) It can also help me make the argument I hope to make, which is that Communist cadres were very interested in recording the way that these treason trials aroused the masses, sparked emotion in them, and helped organize them in the context of movements carried out under party control. In this case, and in many others I have found, the reaction of the masses is carefully recorded.

Now, how shall I preserve this fragment for easy access later when I begin writing? I might want to give several tags to this fragment. Perhaps I want to tag it with things like ‘Treason and Social Reform’ (the chapter this might be used in), ‘local trials’, ‘reaction of the masses’, ‘women’, ‘beatings’, etc. Of course, to reduce the problem of tagging the same thing with slightly different tags, auto-complete should be available to guess tags as I begin typing them.
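The auto-complete idea can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the function name `suggest_tags`, the case-insensitive prefix matching, and the limit on suggestions are my own choices, not features of any existing outliner.

```python
def suggest_tags(prefix, known_tags, limit=5):
    """Return already-used tags that start with what has been typed so far.

    Matching ignores case so that typing "t" still offers
    "Treason and Social Reform" instead of spawning a near-duplicate tag.
    """
    needle = prefix.casefold()
    matches = [tag for tag in sorted(known_tags) if tag.casefold().startswith(needle)]
    return matches[:limit]


# Tags taken from the example fragment above.
known = {
    "Treason and Social Reform",
    "local trials",
    "reaction of the masses",
    "women",
    "beatings",
}

print(suggest_tags("re", known))
print(suggest_tags("t", known))
```

Offering only tags that already exist is what keeps the vocabulary from drifting into ‘beating’, ‘beatings’, and ‘beaten’ over a year of note-taking.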

It would also be nice if there were cascading tags. That is to say, I could assign certain tags to the entire file, Shandong Provincial Archives, with tags like “China”, “Shandong” which trickled down to every fragment bullet point in the file, and also tag the sub-section of that file for the document G042-01-0283-007 全县反奸诉苦大会总结 1946.4.3 日照县公安局 so that all fragments of notes taken from that document had the tags ‘日照’, ‘公安局’, ‘反奸訴苦’, ‘1946’, etc.

It does me little good to use Leap, or Yojimbo etc. to tag the whole file, now several dozen pages long, with all my notes from the Shandong Provincial Archives, or even a file specifically for G042-01-0283-007 which included other useful information that I might want to tag in other ways (like a table of statistics of how many ‘masses’ were ‘organized’ as a result of carrying out the anti-treason campaign). An ideal outline software solution for academic researchers would allow tagging at the level of the bullet point.
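Here is a rough sketch of how bullet-level tags and cascading tags might fit together. It assumes a simple tree in which files, section headings, and bullet points are all the same kind of node; the class name `Node` and the method `effective_tags` are hypothetical names of my own, not part of OmniOutliner or any other application.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    """One row in an outline: a file, a section heading, or a bullet point."""
    text: str
    tags: set = field(default_factory=set)
    parent: Optional["Node"] = field(default=None, repr=False)
    children: list = field(default_factory=list)

    def add(self, text, tags=()):
        """Attach a child row and return it."""
        child = Node(text, set(tags), parent=self)
        self.children.append(child)
        return child

    def effective_tags(self):
        """Own tags plus every tag that cascades down from the ancestors."""
        inherited = self.parent.effective_tags() if self.parent else set()
        return inherited | self.tags


# The file, the archival document inside it, and one fragment.
archive = Node("Shandong Provincial Archives", {"China", "Shandong"})
document = archive.add("G042-01-0283-007", {"日照", "公安局", "反奸訴苦", "1946"})
fragment = document.add("12/107 a woman's father was executed ...", {"women", "beatings"})

print(sorted(fragment.effective_tags()))
```

The fragment itself only stores ‘women’ and ‘beatings’; the place, institution, and year tags are computed from its position in the outline, so moving a bullet to another section would change what it inherits.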

The Problem of Reference and Creating Smart or Dynamic Outlines

Of course, this system would have to work both ways. Let us say I have finished my research in the various archives, libraries, and online databases and completed the taking of all those note files. Let us say my software has allowed me to tag all the more useful bullet points, and allowed these bullet points to receive the cascading tags of their section headers and the note files themselves.

Ideally, I should now be given some kind of clean view of all fragments of information that match certain tags. Besides the fact that these have now been dynamically collected for me and displayed in a list, the most important thing I need to know now is what source they came from. Thus, every fragment listed in this way should be able to display a column or otherwise make apparent the source. This system would ideally account for the fact that the source does not always correspond to the name of the originating file, or the header for the section from which the fragment was taken.

To accommodate the fact that a fragment’s source is not necessarily reflected in the file name or section name of its origin, I suggest the outlining software allow the user to designate certain blocks in a note file, whether it is the whole file (I have many files dedicated to a single book or article), or merely a section of a file (I have a single file for all the documents I viewed from some archives such as Shandong Provincial Archive, Korean National Archive, RG242 of the US National Archives) as belonging to a source. The actual citation for this source might be kept internally within the application. However, since there are many great tools out there for managing academic resources and their citations that a student or academic might already have a preference for (Zotero, Sente, Endnote, etc.) I believe the best solution would be to provide some form of a link to a source entry in these external resources, whether they be within an offline application or an online format.
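One way to picture this is that each fragment looks upward through the outline for the nearest enclosing block that has been designated as a source. The sketch below assumes exactly that rule. The class `Block` and the `source_key` strings are invented for illustration; in practice the key might be whatever identifier a citation manager uses for the entry, which I do not attempt to model here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Block:
    """A file, a section of a file, or a single fragment."""
    text: str
    source_key: Optional[str] = None  # hypothetical link to a citation entry
    parent: Optional["Block"] = field(default=None, repr=False)

    def source(self):
        """Return the key of the nearest enclosing block marked as a source."""
        node = self
        while node is not None:
            if node.source_key is not None:
                return node.source_key
            node = node.parent
        return None  # nothing above this fragment was designated as a source


# One file holds many archival documents, so the *section* is the source.
archive_file = Block("Shandong Provincial Archives")
police_report = Block("G042-01-0283-007", source_key="sdpa-G042-01-0283-007", parent=archive_file)
fragment = Block("12/107 a woman's father was executed ...", parent=police_report)

# A file dedicated to a single book is itself the source.
book_file = Block("Notes on a monograph", source_key="example-book-key")
book_fragment = Block("p. 45 ...", parent=book_file)

print(fragment.source())
print(book_fragment.source())
```

Because the lookup stops at the nearest marked ancestor, the same mechanism covers both of my cases: whole files devoted to one book, and long archive files in which each section is a different document.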

Finally, viewing a list of fragments by a single tag or combination of tags merely gives you an overview of one idea (or a chapter if you have tags for chapter themes). The application should allow you to create ‘smart outline’ files which are essentially dynamically created mega-outlines, ‘notes on notes’ or a kind of complex ‘smart playlist’ of points to be made for each argument or chapter. Here is what I am thinking of: The user could create a ‘smart’ note file called, say, “Dissertation Outline” and then write out their broad outline divided into chapters and the major arguments they wish to make. Then, they could non-exclusively assign certain tags to chapters or arguments within those chapters in a special way that allowed “and/or/not” constructions to limit the hits. This is similar to the process of tagging fragments within note files described above, with one important exception: in this case, tagging these chapters or arguments allows the user to list all fragments associated with those tags under those sections in this mega outline. Thus, at a glance, the researcher can view a dynamic and self-updating outline of their dissertation outline with the major sections and arguments directly inputted, but with each of these chapters or arguments containing within them a smart list of all fragments that contains certain tags associated with these chapters or arguments.

A few more features I believe this smart outline view ought to support: this view would also include a feature that could show a list of “orphans” which are tagged fragments which have not yet been assigned a location in the smart outline. It is very likely that we have tagged many fragments in ways that at a later date turn out not to be the most obvious when we enter the writing process. This orphan view can rescue important fragments from obscurity.
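A minimal sketch of the orphan view, under a simplifying assumption: each section of the smart outline is reduced to a plain set of tags, and a fragment counts as placed if it shares at least one tag with any section. The function name `find_orphans` and the sample data are hypothetical.

```python
def find_orphans(fragments, section_tags):
    """List tagged fragments that no section of the smart outline picks up.

    fragments:    mapping of fragment text -> set of tags
    section_tags: mapping of section heading -> set of tags
    """
    claimed = set().union(*section_tags.values())
    return [
        text
        for text, tags in fragments.items()
        if tags and not (tags & claimed)  # tagged, but matched nowhere
    ]


fragments = {
    "woman beats the accused on stage": {"women", "beatings"},
    "table of masses organized": {"statistics"},
    "untagged passing remark": set(),
}
sections = {
    "Turning to the Masses": {"群眾化", "beatings"},
}

print(find_orphans(fragments, sections))
```

Here the statistics table is the orphan: it was tagged at some point, but no chapter currently asks for that tag, which is exactly the kind of fragment I would otherwise forget.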

Also, since such powerful and probably huge smart lists will likely include a large number of duplicate or less than useful fragments, the user should be able to easily hide single fragments or groups of fragments which, despite their promising tags, are irrelevant to a chapter/argument. There might also be check marks so that a researcher can check off fragments as they include them in the written work.

This view should also account for the very likely possibility that the researcher anticipates using material from a source for which they have no notes. Just as fragments are associated with sources, in this mega-outline or smart outline view the researcher should be able to easily drag and drop in references to sources they think will be relevant to specific arguments or chapters but for which they have no notes or fragments at all. Again, these can come from an internally managed list of sources within the application but would ideally be compatible with various existing solutions like Zotero, Sente, etc.

Let me give an example of the ‘Smart Outline’ feature of the software as I imagine it:

Let us say I’m writing a book just on the Treason Elimination Squads in China (rather than divided between two chapters as I currently plan to). After tagging hundreds or several thousand bullet points in dozens or hundreds of note files managed by my outlining application, I write up a (very boring) book outline using this ‘smart outline’ feature:

Introduction
Formation of the Treason Elimination Bureau in Shandong
Early Excesses and the Anti-Trotskyist Movement
Balancing the Three Treasonous Enemies
Reckless Arrests, Reckless Killings, and Attempts at Reform
Turning to the Masses
Liberation and the Anti-Treason Campaign
Conclusion: Continuity and Change in the Civil War

While I won’t list them here, let us say I also add some sections for each of those chapters with major arguments I want to make in the book under the headings for each chapter. Now, I set about attaching relevant tags for each chapter or argument/section within a chapter (and some of these tags might be tags I have created specifically for chapters):

Introduction
Formation of the Treason Elimination Bureau in Shandong
-([not 1940 or 1941 or … 1949],’Treason Elimination Bureau’,Shandong)
Early Excesses and the Anti-Trotskyist Movement
-(托匪,’Treason Elimination Bureau’,torture,executions,excesses,湖西錯誤,[泰山 and 1942],[濱海 and 1942])
Balancing the Three Treasonous Enemies
-(托匪,國特,敵偽,’Treason Elimination Bureau’ etc.)
Reckless Arrests, Reckless Killings, and Attempts at Reform
-(亂捕亂殺,’Treason Elimination Bureau’ etc.)
Turning to the Masses
-(群眾化, ‘Treason Elimination Bureau’ etc.)
Liberation and the Anti-Treason Campaign
-(反奸訴苦, 1946, etc.)
Conclusion: Continuity and Change in the Civil War
-(反奸防匪,[Shandong and [1947 or 1948 or 1949]])

Having thus assigned certain tags (in some cases the same tags are listed in multiple chapters) I should, in the software I am imagining, be able to view a dynamic list of all fragments with those tags (or in some cases, complex combinations of tags like Taishan and 1942 so I get only fragments that are likely to refer to the Taishan killings of that year) under the relevant headings. I should be able to independently re-order the displayed fragments and hide those that I determine are irrelevant in preparation for writing. These smart lists should be ‘live’ so if I go back to the library or archive and add more fragments in some of my note files with the relevant tags, they should appear in the ‘smart outline’ which lists these tags.
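To make the “and/or/not” constructions concrete, here is a small sketch of how a smart outline might evaluate them. The nested-tuple query format, the function names `matches` and `smart_outline`, and the three sample fragments are all invented for illustration. I have also written every operator out explicitly rather than guess how a bare comma-separated list of tags should combine.

```python
def matches(query, tags):
    """Evaluate a tag query against one fragment's set of tags.

    A query is either a tag (a string) or a tuple whose first item is
    "and", "or", or "not", followed by sub-queries.
    """
    if isinstance(query, str):
        return query in tags
    op, *args = query
    if op == "and":
        return all(matches(arg, tags) for arg in args)
    if op == "or":
        return any(matches(arg, tags) for arg in args)
    if op == "not":
        (arg,) = args  # "not" takes exactly one sub-query
        return not matches(arg, tags)
    raise ValueError(f"unknown operator: {op!r}")


def smart_outline(sections, fragments):
    """Recompute, for every heading, the fragments its query currently matches."""
    return {
        heading: [text for text, tags in fragments if matches(query, tags)]
        for heading, query in sections.items()
    }


# Placeholder fragments; the tags follow the examples in the outline above.
fragments = [
    ("report on the Taishan killings", {"泰山", "1942", "executions"}),
    ("arrest figures for Binhai", {"濱海", "1943"}),
    ("Rizhao accusation meeting", {"反奸訴苦", "1946", "日照"}),
]
sections = {
    "Early Excesses and the Anti-Trotskyist Movement":
        ("or", ("and", "泰山", "1942"), ("and", "濱海", "1942")),
    "Liberation and the Anti-Treason Campaign":
        ("and", "反奸訴苦", "1946"),
}

outline = smart_outline(sections, fragments)
for heading, hits in outline.items():
    print(heading, hits)
```

The ‘live’ behaviour falls out of recomputing the outline from the current fragments each time it is displayed: a fragment added after another trip to the archive shows up under the right heading the next time `smart_outline` runs. Manual re-ordering and hiding would need extra per-section state that this sketch leaves out.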

I believe that what I have described above can serve as a rough blueprint for a very powerful application that will allow researchers to have a fully integrated web of information between different levels of organization – at the highest outline level and the lowest level of note taking on sources.

Putting it All Together

So, putting it together, here is what I am imagining as a powerful note taking and organizational solution for academic research:

A powerful and flexible hierarchical bullet point outlining application, such as OmniOutliner

Which allows the ability to add multiple and autocompleting tags to any fragment of information within a file represented by a bullet point (and any bullet points it contains below its level)

Which allows cascading tags, so that note files and sections can be tagged and the fragments within them inherit those tags

Which allows whole files or sections of files to be designated as coming from specific sources so that all fragments within those files/sections know what source they come from

Which allows sources that are associated with files or sections of files to either be managed within the application, or ideally, be linked to the entries for these sources in external citation software (Zotero, Sente, Endnote, etc.) or some online equivalent (Refworks, Zotero, etc.).

Which allows the convenient listing of all fragments of information corresponding to certain tags

Which provides the means of easily viewing the source for all such fragments listed by certain tags

Which allows the creation of dynamic ‘smart outline files’ which are partially composed by the user. The sections composed by the user can be assigned a collection of tags (that might include logical boolean constructions of multiple tags)

Each section of a ‘Smart Outline’ can expand to show all fragments matching the tags assigned to that section

These displayed fragments in ‘Smart outlines’ are live so that fragments added with the given tags are dynamically added, can be arbitrarily re-ordered by the user, and hidden if they are determined to be irrelevant by the user.

Fragments displayed in ‘Smart outlines’ should optionally display check marks so they can be marked off when they are incorporated into the written work.

‘Smart Outlines’ should offer the ability to open a window displaying ‘orphaned fragments’ (all fragments minus tagged fragments already present in the smart outline) listed by tags to prevent important fragments that are badly tagged from being left out.

In the ‘smart outline’ the user should be able to drop in references to specific sources under certain sections to account for useful sources for which there are no note files and fragments.

In short, this application is an outline or note-taking application which supports sub-file level tagging of bullet points along with a powerful ‘smart outline’ view that allows users to create high-level, dynamic outlines that list all possibly useful fragments of supporting evidence, what sources they come from, or simply references to sources for which there are no notes. It should ideally interface with existing mature citation software solutions, either on or offline, which already have wide adoption within the academic world.

  1. Of course, in the case of such digital resources, we can imagine these initial contextual traits as really just being two more special kinds of tags: one exclusive tag to indicate the owner, and a list of tags, one for each set I have put the picture into.

6 thoughts on “A Proposal for a Powerful New Research Tool – Organizing Information for Dissertation Writing – Part 3 of 3”

  1. What about VoodooPad?

    My friend Kerim made a number of comments via twitter which I think I ought to respond to. First is the suggestion of making use of Voodoo Pad.

    I love VoodooPad, which I bought in its earliest days and used for some time until eventually abandoning it. It is a great app for the Mac which allows you to create your own personal offline and online Wiki. It has an easy way of creating wikilinks within each document that can connect to other pages on your own personal wiki. It also supports some hierarchical outline features like most traditional outline software applications. It also has the ability to check “backlinks” to see what links to a given page.

    Using this solution it is possible to emulate some of the features I have described here. You can use wikilinks as a kind of tagging feature, typing a wikilink corresponding to a tag or idea after lines of text, but this does not allow you to do anything I described in the last half of this posting regarding the dynamic display of fragments in the form of a ‘smart outline’.

    Also, even in the most recent version, the hierarchical outlining features of VoodooPad are still very basic – it is not really what it is designed for and programs like OmniOutliner are much more robust and rich in features as an outliner in this regard. Tagging is supported in the latest version of the application, but again at the level of files, not fragments from individual sources unless one creates a new file for every fragment from a source.

    Thus, while I heartily recommend this wonderful little application for many basic personal organizational needs, I don’t think it can serve as this kind of powerful solution as I described above.

  2. What is the relationship between what I’m proposing and software used by Ethnographers known as “qualitative research software”?

    Can some of their solutions work for a historian or other academic researcher in the way I described? I took a look at some of the apps Kerim proposed and will continue to explore them.

    The short answer – from what I can tell so far, none of these solutions provide anywhere near the simplicity and straightforward middle layer of organization I am looking for. The biggest reason is that none of them work well at the historian’s primary task of taking notes on documents, and all of them assume you are working with digital copies of your primary materials. They are also all prohibitively expensive.

    Kerim has pointed to a category of software which could potentially overlap closely with what I’m talking about and I’ll continue to investigate but so far I am very disappointed at the clunky and hugely expensive packages that are anything but intuitive to work with. I think what I have described above is both simpler and can serve as well or better than these solutions. The “coding” approach of social science is intriguing but doesn’t seem to fit well the workflow I’m familiar with as a historian. Perhaps I need a better appreciation for this alternative approach but I would need considerable persuasion.

    Hyperresearch $370 – I gave this a spin through their limited edition and I’m afraid I find the interface terrible. I would never want to spend more than a few minutes in this application. Also, I can’t imagine how one would adopt this as a note-taking environment, interface aside. The features I described above should be built upon a strong foundation of an outliner. If the application is not a good solid and comfortable outliner, above all things, then no one will want to actually use the software.

    MaxQDA (Windows Only) 900 Euros or 430 Euros educational or 115 Euros for students. This is primarily a text analysis application. It is designed to work with the primary texts themselves. You can replicate some of what I have described by treating your own notes as primary documents but it twists the purpose of the application well beyond what it is designed for. I will keep investigating once I get the trial version up but I don’t see this serving what I described.

    Atlas TI (Windows Only) – $140 for students, $595 – still looking more into this. Will report back here. Looks very powerful but again, treating your own notes as the kind of primary documents of analysis here seems to stretch beyond what these are designed for.

    nVivo (formerly also NUD*IST) $240 ($595) – looking into it, but from the tutorials it looks like it has some of the same problems as the other “coding” apps mentioned above.

    AskSam (Windows Only) – better described as a database application and while it seems to overlap with above solutions seems to have no basic outlining features and trying to be “everyone’s” tool so that it doesn’t fit nicely the needs of a historian like me. Also, tags (keywords) appear to be only file based, not fragment based.

  3. Why not just do a full text search?

    Some readers might think a full text search through one’s documents will be enough when you go back to assemble the notes one will deploy in a dissertation. That may be the case depending on your academic field and the kinds of materials and sources one is working on.

    However, during the course of my historical research I have learned the hard way that for the several hundred note files I have (not to even include the digital copies of some of the sources I have) full text search yields less than a quarter of the potential. I have also heard many horror stories from my fellow PhD students on how hard it is to track down information when going back to hunt through one’s year or two of notes taken for a dissertation.

    The problems with this “full text search” approach to one’s own notes for a dissertation such as mine are legion. Let us say I want to write a section of my dissertation on women spies or traitors (I do). But the dozens of fragments I have on this through my many files may refer to words like “woman”, “female”, “女”, “破鞋”, or merely the names of the women in question (川島芳子). Searches for some of these terms, like woman or female, will result in hundreds of hits that are completely irrelevant, whereas the approach I described will only turn up items I have tagged with something of this nature. While many irrelevant items may still turn up, and I may have multiple related tags, the software might offer the ability to look through a list of all tags (and merge some of them as delicious.com does) or at any rate construct certain logical connections, like things tagged “female” and “spy”.

    Adding a habit of tagging can promote the thinking of one’s fragments as belonging to certain categories of information.

    Full text search is powerful and one cannot live without it, but by itself I have time and time again found that I have failed to find (again) many great fragments of historical material, or, spent far too much time playing a search guessing game. Tagging, or the whole approach I suggested above does not completely eliminate the possibility of information getting lost due to poor tagging (or not being able to predict how certain information will be useful) but I think the success of tagging in many applications all over the net shows the advantages of this extra layer of meta data.

    Again, this may not be the case if one is dealing with different sets of information or smaller sets of notes, or if one maintains disciplined consistency in the words one uses to describe certain things when taking notes. However, this isn’t the case for me, nor, I suspect, for many historians out there doing research.

  4. “Tags must go beyond the level of the file and down to the level of a bullet point within one’s notes. ”

    Another way of approaching this problem (although not identical) is to be able to view and edit several files as if they were one big file (sort of like the “scrivenings mode” in Scrivener).

    Now if only Scrivener had proper PIM functions…
