A Simple Collaboration System
(Proposal)

Author: Eric Armstrong, Feb 2001

Motivation for a Simple Tool

Ultimately, the world needs a highly-interactive, knowledge-based system for collaborating on large-scale, complex problems. Such a system is described in general terms in the Project Concept document, and discussed at length in Requirements for a Collaborative Design/Discussion/Decision System.

To summarize that goal: We need (and plan to build) a distributed system that will permit collaboration by a team of people who are focused on a common goal. The Engelbart "knowledge repository" that results from the collaboration could be virtual (the product of peer-to-peer interactions) or real (instantiated in a centralized location). The system will allow users to interact with multiple repositories and cross-fertilize what they find from one project to another. The system will "participate" in team conversations, like a child at a banquet. In the same way that a child is taught, users will whenever possible empower the system to answer questions that are within its grasp. (For example, by recategorizing things that the system already "knows".) As a result of those interactions, the system will evolve, becoming smarter and more capable over time. (Over time, a special class of knowledge workers will undoubtedly emerge for that purpose.)

However, the project has a serious chicken-and-egg problem. Many motivated, interested individuals are at far ends of the continent, and at points between. Many potentially valuable contributors are located in Japan, Europe, and points East. But to build a collaborative system collaboratively, we need at least a rudimentary collaboration system! That is the nature of the "bootstrap" problem that Doug Engelbart has spent a lifetime tackling.

So we need something much simpler in the near time frame -- something that can be put together with minimal development, using existing tools, but which is powerful enough to materially advance our collaborative efforts. This document proposes such a tool, centered around:

(These will be explained shortly. First, we'll explore the alternatives we've investigated.)

Evaluating the Alternatives

We thought for a time that a group of physically co-located individuals centered here in Silicon Valley could produce an initial system that would allow remote folks to join subsequent discussions. I'm glad to say that we are making (slow) progress towards that goal.

However, that alternative has turned out to be less efficacious than originally hoped, for two reasons:

  1. Because it is a part-time effort, these folks only come together once a week, at most, for a couple of hours. That is simply an insufficient amount of time to make any kind of rapid progress. Real progress is being made (honest!), but the pace proceeds so glacially that outside observers sometimes have a hard time discerning it.
     
  2. The group lacks the benefit of several keen thinkers who could make substantive contributions from their far-flung locations. (A few that come to mind: Paul Fernhout, Frode Hegland, Ken Holman).

The simple fact of the matter is that those of us who are physically co-located need an online collaboration tool as much as anyone! We are not so physically adjacent that we can walk down the hall to talk things out. Instead, we have a few spare moments here and there throughout the day. We need an online system so we can carry forward the collaboration in the time available, without subjecting ourselves to additional travel to do it.

So far, the group has experimented with different kinds of information sharing systems. All have proven unequal to the task, for the reasons noted below:

Shared Document Systems
Systems like ZWIKI allow multiple users to update a remote document. Those systems appear to work fine for an end-product document. If three of us are writing a paper, for example, and the final result will bear all of our names, then it doesn't matter who makes which edits. However, in a wide-ranging investigation like a design discussion, it is imperative to keep track of who said what. So attribution is a critical missing ingredient for such systems.

(Understanding the reasons for that fact could well be a study for psychologists. But experience with such systems seems to indicate that we have an intuitive understanding of the notion that there is no such thing as a single "truth" and that, even if there were, no one set of statements would capture it exactly. In a conversational scenario, then, it seems that we unconsciously filter and interpret what we read according to who said it. We compensate for their biases, and fill in gaps with an understanding of their intentions. When statements are presented on their own, without being attributed to their author, they seem somehow dry and lifeless.)
 
Email Lists
Passing plain text or HTML documents in email is one way to solve the attribution problem, since the author of the revised version is clearly noted at the top of the message, and their comments are distinguished from the original text. However, only the latest author's comments are attributed (other attributions are lost as the document is forwarded). More importantly, after a few versions, the indentation that results from marking different contributions becomes excessive. (On each revision, the text is shifted to the right two spaces. After a few revisions, the document is no longer usable.)
 
Topical Chat Systems
Topical chat systems are great for party conversations, where a single subject is carried on for a while, and then dropped in favor of another. Comments are attributed and archived, which is helpful. But there is little if any structure to such conversations. For a complex, lengthy, multi-faceted discussion, such systems quickly break down, because they provide no mechanisms for organizing (and, more importantly, reorganizing) the discussion.
 
Word Documents
Although we haven't actually tried this, it is something to consider. Word documents would allow revisions to be highlighted. However, attribution is still an issue, the documents are not editable everywhere, and they are not web accessible. So maybe it is not something we should consider for very long...

So, as I originally proposed during Doug Engelbart's year-2000 colloquium, we need to design and develop a useful collaboration system for online design discussions -- a system that not only makes it possible to reach decisions, but which records decisions, and their alternatives, so that downstream it becomes possible to answer the all-important question, "Why?". (The question "What?" can be answered by looking at code. But the question "Why?" is often virtually impossible to answer.)

Such a system would be of enormous benefit to the open source community, for the very simple reason that while we can share source code and manage bug reports over the web, there are no tools which make it possible to fruitfully carry on a design discussion.

Such a system would also be useful for the other kinds of Wicked Problems that Jeff Conklin and William Weil describe. Anytime an investigative discussion is required to clarify a situation and resolve issues, anytime a focused, goal-oriented discussion needs to take place to solve a complex problem, such a system can play an important role.

Finally, focusing on the issue of online design discussions has the most significant "bootstrapping" effect. By developing a tool that open source designers use, we will obtain the benefit of their advice (and ultimately their help) in improving the system. And by improving our own ability to carry on online discussions, we both improve our ability to design the target system, and open up our deliberations for significant contributions by others.

Overview of the Simple System

A simple tool for collaboration can be built around:

This section explores those concepts.

The Advantages of IBIS

In The IBIS Manual, Jeff Conklin gives a superb introduction to the concept of Issue-Based Information Systems (IBIS -- "eye-biss"). To summarize that paper very briefly:

In a totally successful discussion, all questions are answered, and all participants agree on the answers. One suspects that such halcyon scenarios are probably not the norm. But those who have experience with such discussions report several beneficial effects:

Calm
The single most noticeable effect is on the participants. Rather than breaking up into heated, vehement battles, IBIS discussions tend to be calm, rational affairs. That effect tends to result from the next two characteristics of IBIS-style discussions.
 
Alternatives are Always Allowed
Because no proposal is ever allowed to stand on its own, there is never an instance of a "bald assertion" which must be attacked and dragged down in order to make room for one's own pet proposal or theory. Instead, the question that the proposal addresses is first adduced. As a result, a "hook" always exists upon which to hang an alternative. (Consequently, discussions tend to be more cooperative and exploratory, rather than argumentative and confrontational.)
 
All Alternatives are Evaluated
In the IBIS methodology, a decision is never reached until all alternatives have been evaluated. Having once gotten an idea into the system, then, its proponent can relax, calm in the knowledge that eventually it will get its "day in court", and the arguments in its favor will be heard.

Note that IBIS-style discussions are usually carried out in a meeting, and they are led by a moderator. Jeff Conklin took a stab at creating an online version of such a system with his Graphical IBIS (gIBIS -- "gibb-us"). However, a paper at http://web.uvic.ca/~ckeep/hfl0104.html (which no longer seems to be present), as well as anecdotal reports, have shown that users often have problems with automated versions of such systems. The major issues are:

Graphical complexity
If memory serves, this is one of the issues that users mentioned when using gIBIS. If not, I suspect that it rapidly becomes an issue for any complex discussion.
 
Cognitive overhead
The difficulty of keeping track of things increased as the system grew. (This was the core observation of the paper noted above.)
 
The need to pre-define a comment category
For example, one might not know whether one was arguing "for" or "against" a proposition when starting out to trace the implications. Indeed, the same implication could be either a point in favor or a point against, depending on the context.

The Advantages of XML

Using XML as a basis for an IBIS-style discussion language has several advantages:

While the XML structure does not solve the pre-selection problem, it does eliminate the issue of graphical complexity (by eliminating the graphics). However, the fact that the documents are stored in XML makes it possible to retarget them for other systems at a later date -- graphical systems, when they are devised, or perhaps more robust knowledge repositories.

Requirements for the Simple System

There are a number of general requirements and several highly specific editing requirements that go into making a viable system. These requirements are fairly easy to meet, using off-the-shelf tools:

XML->HTML Server
Some mechanism is needed to translate XML documents into HTML pages and deliver them to web browsers. That way, documents can be browsed by anyone, even if editing is restricted to a chosen few. XSLT mechanisms can handle the translation, and servlets are fairly easy to build, so this requirement is easily met.
 
Source Control
A CVS archive, for example, makes it possible to ensure that only one person is editing the document at a time. CVS is not totally ideal, since it is a plain-text system, but it will do until a good system based on XML-differencing is devised.
 
Notifications
Given that the documents are shared, and that participants are remotely located, it is important that each document has a notification list associated with it, and that each person on that list gets a message when the document has been changed. This requirement is moderately difficult, but not terribly so.
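
As a rough illustration of the XML->HTML requirement above, the sketch below renders a small discussion document as nested HTML lists. It uses Python's standard xml.etree.ElementTree in place of XSLT and a servlet, purely to stay self-contained. The sample document and the names render_html and render_page are assumptions made for the example, not part of the proposal.

```python
import html
import xml.etree.ElementTree as ET

# Illustrative sample only; the wording is borrowed from the Source Control item.
SAMPLE = """<query status="open">Which archive should hold the documents?
  <alternative status="unrated">CVS
    <pro>Ensures only one person edits a document at a time</pro>
    <con>It is a plain-text system</con>
  </alternative>
</query>"""


def render_html(node):
    """Render one element and its descendants as a nested HTML list item."""
    text = html.escape((node.text or "").strip())
    label = html.escape(node.tag)
    children = "".join(render_html(child) for child in node)
    if children:
        children = "<ul>" + children + "</ul>"
    return "<li><b>" + label + ":</b> " + text + children + "</li>"


def render_page(xml_text):
    """Wrap the rendered tree in a minimal HTML page."""
    root = ET.fromstring(xml_text)
    return "<html><body><ul>" + render_html(root) + "</ul></body></html>"


page = render_page(SAMPLE)
print(page)
```

A real deployment would more likely keep the presentation rules in a stylesheet, so that the group can change the display without touching code.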

The two requirements that pose the most difficulty are the issue-based discussion language and the editor. Those requirements are discussed in the remaining two sections.

An XML-based Language for IBIS

Carrying on an IBIS-style discussion in an online XML document requires an IBIS-equivalent language. This section discusses the required language.

In true IBIS fashion, let's start by asking, "What should the language look like?". The table below presents one possible proposal, along with synonyms and other interesting alternatives.

Proposed Language            Notes & Possible Alternatives
<query status="">            issue, question, ques, quest
  <alternative status="">    candidate, claim, option, possibility, proposal, proposition
     <pro>                   for
     <con>                   against
     <rating value="">       evaluation
     <endorse>username       vote, recommend

Occur anywhere:              Contents:
<info>                       A single paragraph (node)
<notes>                      A list containing one or more paragraphs (nodes)
  <item>+

Attributes:                  Values:
query: status=               open, decided
alternative: status=         unrated, rated, rejected, selected
alternative: rating=         distasteful-1, implausible-2, viable-3, likely-4, favored-5
                             (or: preferred-5)
query: rating_avg=           (average of ratings, as a single-decimal text string)

Notes:

More Questions

Here are some more questions, and possible answers:

?-What to call the language?
  • IBDL: Issue-Based Discussion Language
  • LIDIA: Language for Investigation and Discussion of Issues and Alternatives
  • BIBL (bible): Basic Issue-Based Language
  • LIBL (libel): Local Issue-Based Language
  • KIBL (kibble): Knowledge-oriented Issue-Based Language
  • NIBL (nibble): Nodal Issue-Based Language
  • QIBL/QUIBL (quibble): Questioned Issue-Based Language
  • ?-other words for the acronyms above?
  • ?-other names?
     
?-how to change an argument to "countered"
Suppose you add an argument <con> against an alternative, and someone counters that argument. How should the counter-argument and the final result be stored in the argument hierarchy? Should the counter argument go under the original argument? Should the <con> element change to some other element, like <con-countered>? Should an attribute be added: <con status="countered">? Or should a special <countered> section be added so countered arguments can be moved to that section and taken out of the way?

Note the recursion here. There is always an implicit question: "Has this argument been countered?" Answering that question may well require the <endorse> and <rating> on counter arguments. And since multiple counter-arguments are possible, perhaps a <counter> element is needed that can occur under <pro> or <con> elements. (But then, what about counter counter-arguments, etc? -- the element structure would be different at the top of the hierarchy than further down.)

Conclusion: <pro> and <con> tags should be allowed under <pro> and <con> arguments. A <pro> under an argument indicates support, while a <con> indicates a counter argument.
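
The nesting rule in that conclusion can be sketched in a few lines of Python. The example walks a small document and labels each argument by where it sits: a <con> directly under an alternative is an argument against it, while a <con> under another argument is a counter-argument. The sample text and the function name describe are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only.
SAMPLE = """<alternative status="unrated">Store documents in CVS
  <con>CVS is a plain-text system
    <con>It will do until XML differencing is devised
      <pro>No good XML-differencing system exists yet</pro>
    </con>
  </con>
</alternative>"""


def describe(node, parent_tag=None, depth=0, out=None):
    """Collect (depth, role, text) for every <pro> and <con> in the tree."""
    if out is None:
        out = []
    if node.tag in ("pro", "con"):
        nested = parent_tag in ("pro", "con")
        if node.tag == "pro":
            role = "support" if nested else "argument for"
        else:
            role = "counter-argument" if nested else "argument against"
        out.append((depth, role, (node.text or "").strip()))
    for child in node:
        describe(child, node.tag, depth + 1, out)
    return out


rows = describe(ET.fromstring(SAMPLE))
for depth, role, text in rows:
    print("  " * depth + role + ": " + text)
```

Because the same two tags are reused at every level, counter counter-arguments need no extra elements; only the depth changes.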

The DTD

Here is the Document Type Definition (DTD) for the language, assuming that Schematron is required, and that <content> elements are therefore not needed:

<?xml version='1.0' encoding='ISO-8859-1'?>
<!--
  DTD for an issue-based discussion language.
  -->
<!ELEMENT query (#PCDATA | alternative | info | notes | query)* >
<!ELEMENT alternative (#PCDATA | pro | con | rating | endorse | info | notes | query)* >
<!ELEMENT pro (#PCDATA | info | notes | pro | con | query)* >
<!ELEMENT con (#PCDATA | info | notes | pro | con | query)* >
<!ELEMENT rating (#PCDATA)* >
<!ELEMENT endorse (#PCDATA)* >
<!ELEMENT info (#PCDATA)* >
<!ELEMENT notes (item)* >
<!ELEMENT item (#PCDATA)* >
<!ATTLIST query status (open | decided) #IMPLIED >
<!ATTLIST alternative status (unrated | rated | rejected | selected) #IMPLIED >
<!ATTLIST rating value (distasteful | implausible | viable | likely | favored) #IMPLIED >
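
To make the containment rules in the DTD concrete, here is a minimal Python sketch that checks which elements may appear under which. It is not a DTD validator: the ALLOWED table is transcribed by hand from the element declarations above, attributes are not checked, and the function name containment_errors is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

# Allowed child elements, transcribed by hand from the DTD above.
ALLOWED = {
    "query": {"alternative", "info", "notes", "query"},
    "alternative": {"pro", "con", "rating", "endorse", "info", "notes", "query"},
    "pro": {"info", "notes", "pro", "con", "query"},
    "con": {"info", "notes", "pro", "con", "query"},
    "rating": set(),
    "endorse": set(),
    "info": set(),
    "notes": {"item"},
    "item": set(),
}


def containment_errors(node, path=""):
    """Return a list of messages for children that appear in the wrong place."""
    here = path + "/" + node.tag
    if node.tag not in ALLOWED:
        return [here + ": unknown element"]
    errors = []
    for child in node:
        if child.tag not in ALLOWED[node.tag]:
            errors.append(here + ": <" + child.tag + "> not allowed here")
        else:
            errors.extend(containment_errors(child, here))
    return errors


good = ET.fromstring("<query>Q<alternative>A<pro>P<con>C</con></pro></alternative></query>")
bad = ET.fromstring("<query>Q<pro>An argument with no alternative</pro></query>")
print(containment_errors(good))
print(containment_errors(bad))
```

The second document is rejected because an argument must hang under an alternative (or another argument), never directly under a query.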

Requirements for an Issue-Based Editor

The one requirement (besides the language) that poses the greatest challenge is the requirement for a good XML-based editor, mostly because there are some specific requirements for editing that normal XML editors are unlikely to handle.

Fortunately, Warner Ornstine has been working on an editor for the Extensible Development Environment Project (eXtenDE) that may prove to be extensible in the directions we need. That open source development project is still in its early stages, so we can help define the architecture that produces extensibility in the editor.

Here are the specific editing requirements that add difficulty to the project:

Context Display
When questions are indented under questions, and you're down several levels deep in the hierarchy, it's hard to determine the context for the particular question you're looking at. (Was it under the last question, or beside it?) So the editor must have some mechanism that makes it possible to determine the context (ancestor hierarchy, and ideally relatives of those ancestors) at any given point in the tree.

I believe Warner Ornstine has already anticipated this requirement in the eXtenDE editor, at least in theory. My own idea of how this should be done is to drop a thin dotted line between the "+" icons that let you collapse and expand nodes. When the cursor hovers near that line, a popup could display the item at the top of the line.
 
Multi-line
As described in Design Notes for an XML Editor, a good structured editor must be capable of displaying multiple-line elements, and it must also be tree-structured. Most editors do one or the other, but not both. Or they make you switch modes.
 
Blank Sheet of Paper
One user who reviewed an early version of the eXtenDE editor used this image to describe the desired behavior for the editor. It's a great image. Although each element is a distinct object, the appearance of the objects in the editor should be as though they were on a blank sheet of paper. In other words, it should not be necessary to select an entry to begin editing it. Instead, the internal boundaries between elements should be seamless, so that cursor motions take you naturally from one element to another.
 
Auto-Fill Attributes
These represent a novel concept in XML editing. But the editor needs to store the date/time that a node was created or modified, along with the date/time that the list under it was modified, as attributes of each and every node. It also needs to store the identification information for the person making the changes.

Note:
Ideally, it would be possible to note such attributes and elements as "auto-fill" in the schema. (I somehow doubt that XML-schema has anticipated that need, but haven't looked closely enough to tell.) The autofill=date attribute would instruct the editor to store date/time info. The autofill=text attribute would instruct the editor to prompt the user for the string to use -- and to save it for future use thereafter.
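
A minimal Python sketch of the auto-fill behavior follows. It stamps each node with creation and modification times and the author, and records on the parent when its list of children changed. The attribute names (created, created-by, modified, modified-by, list-modified) and the helper names are hypothetical; the proposal does not fix them.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone


def stamp(node, author, now=None):
    """Record who touched a node and when, as attributes on the node."""
    if now is None:
        now = datetime.now(timezone.utc)
    text = now.isoformat(timespec="seconds")
    if node.get("created") is None:
        # First time this node is seen: fill in the creation data once.
        node.set("created", text)
        node.set("created-by", author)
    node.set("modified", text)
    node.set("modified-by", author)
    return node


def add_child(parent, tag, body, author, now=None):
    """Append a stamped child and note that the parent's list changed."""
    child = ET.SubElement(parent, tag)
    child.text = body
    stamp(child, author, now)
    parent.set("list-modified", child.get("modified"))
    return child


when = datetime(2001, 2, 1, 12, 0, 0, tzinfo=timezone.utc)
query = ET.Element("query")
alt = add_child(query, "alternative", "Use CVS", "eric", when)
print(ET.tostring(query, encoding="unicode"))
```

In an editor, the author string would be prompted for once and remembered, which matches the autofill=text behavior described in the note.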
 
Ontology Mapping
The ideal editor would also make it possible to change the elements used in the display. For example, although <query> elements would be stored in the file, the user may well want to display such elements using <Q>, to conserve display space. The existence of the mapping makes a common interchange standard easier to swallow, since you don't have to look at it if you don't like it.
 
Multiple Schemas
The editor must stand ready to validate a document multiple times, using different schema mechanisms, before it can declare the document to be 100% valid. For example, the list of autofill-attributes might need to come from a separate file, given that the autofill specification can't be specified in a standard schema.

And, while in most cases a hierarchical schema provides more expressive power, in this particular case, an XML Document Type Definition (DTD) is the most natural way to express the structure relationships. The reason for making that statement is the fact that DTD definitions are not hierarchical. For highly structured data, that typically presents more problems than it solves. But in this case, the modular reusability of elements is a big plus.
So, in a DTD, it's easy to specify:

<!ELEMENT query (alternative | info | notes | query)* >
<!ELEMENT alternative (pro | con | rating | endorse | info | notes | query)* >

Since queries can have queries under them, and alternatives can have queries under them, deep nested structures can occur, and be validated. In a hierarchical schema, such recursion is much more difficult to specify.

Finally, the editor needs the ability to validate with the assertion-based Schematron schema system. That mechanism achieves two important goals.
1) It allows content (text and inline tags) to be placed in the most natural position in the element, as described in Design Notes for an XML Editor. (However, since most schema mechanisms have no such capabilities, a special <content> element may well be defined anyway.)
2) It allows sophisticated validation, like ensuring that the status-attribute for a query is "decided" only if no un-evaluated alternatives remain.
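
The second kind of check can be sketched in plain Python (not in Schematron itself) to show what such an assertion has to test. The sketch treats an alternative as un-evaluated when its status is "unrated" or missing; that reading, and the function name decided_too_early, are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def decided_too_early(root):
    """Return the decided queries that still have un-evaluated alternatives."""
    offenders = []
    for query in root.iter("query"):
        if query.get("status") != "decided":
            continue
        for alt in query.findall("alternative"):
            # Assumption: a missing status counts as "unrated".
            if alt.get("status", "unrated") == "unrated":
                offenders.append(query)
                break
    return offenders


premature = ET.fromstring(
    '<query status="decided">Q'
    '<alternative status="selected">A</alternative>'
    '<alternative status="unrated">B</alternative>'
    '</query>'
)
settled = ET.fromstring(
    '<query status="decided">Q'
    '<alternative status="selected">A</alternative>'
    '<alternative status="rejected">B</alternative>'
    '</query>'
)
print(len(decided_too_early(premature)), len(decided_too_early(settled)))
```

A check like this depends on values in two different elements at once, which is what a grammar-based schema such as a DTD cannot express.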
 
Validation Errors, Handling of
The importance of validating documents brings to mind the question of how and when the editor validates documents. On-the-fly editing ensures that the document is always in a legal state. But such restrictions typically hamper users who are making significant changes.

A better alternative is probably to provide a menu choice for validation, and to automatically validate when saving a document (while making it easy to find and correct errors). If validation fails, the user should be asked if they want to save the document, before doing so. (Since they might say yes, the editor must be capable of reading in a document that fails validation.)
 
Hierarchy-aware Scripts
Ratings for an alternative need to be accumulated and averaged among all those who evaluate it. So the ideal editor will add a scripting capability that understands hierarchy and can propagate information either upward through the ancestor chain and/or downward through the descendant tree.
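
As a sketch of such a script, the Python below averages the <rating> elements under each alternative and pushes the result up onto the alternative as a single-decimal string. The 1-to-5 scores follow the value list given with the language. Storing the result in a rating_avg attribute on the alternative itself is an assumption made for this example, as is the function name propagate_ratings.

```python
import xml.etree.ElementTree as ET

# Numeric weights for the rating values listed with the language.
SCORES = {"distasteful": 1, "implausible": 2, "viable": 3, "likely": 4, "favored": 5}


def propagate_ratings(root):
    """Average each alternative's ratings and store the result on the alternative."""
    for alt in root.iter("alternative"):
        values = [
            SCORES[rating.get("value")]
            for rating in alt.findall("rating")
            if rating.get("value") in SCORES
        ]
        if values:
            alt.set("rating_avg", "%.1f" % (sum(values) / len(values)))
    return root


doc = ET.fromstring(
    '<query status="open">Q'
    '<alternative>A<rating value="viable"/><rating value="likely"/></alternative>'
    '<alternative>B</alternative>'
    '</query>'
)
propagate_ratings(doc)
first, second = doc.findall("alternative")
print(first.get("rating_avg"), second.get("rating_avg"))
```

Alternatives with no ratings are left untouched, so a later display step can still tell "unrated" apart from "rated low".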

Extensions to the Simple System

Given a basis for collaboration, it's easy to start thinking about extra features it could have. It is too early to say whether it makes more sense to build a more sophisticated system, or to extend the simple system. But I'll trot out the ideas for extensions here, for future reference.

Display Filters

Given that XML data is stored, and that it must be converted to HTML for display on the Web, it makes sense to think about the "Web-Based Intermediaries" developed at IBM that Doug Engelbart has favored. WBI filters (or other such filters) could for example sort alternatives so that the highest-ranking, most heavily endorsed ideas appear first.

An interactive filter would let you choose which items you want to display. You could filter out <info> and <notes>, for example, to get a better outline of the arguments. Or you could eliminate comments by certain individuals. You might want to put unrated items first in the list (because they have yet to be visited, and are therefore blocking a decision), or last (so you can focus on the most heavily rated, most thoroughly evaluated alternatives). Or maybe alternatives should be ordered alphabetically by name, alphabetically by user, or randomized to ensure unbiased investigation.
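
One such filter might look like the Python sketch below. It works on a copy of the document, drops <info> and <notes>, and orders the alternatives under each query by average rating, with unrated alternatives last. The rating_avg attribute on alternatives, the sample data, and the function name filtered_view are assumptions made for the example.

```python
import copy
import xml.etree.ElementTree as ET

# Illustrative sample only.
SAMPLE = (
    '<query>Which option?'
    '<info>Background material</info>'
    '<alternative rating_avg="2.0">First option</alternative>'
    '<alternative rating_avg="4.5">Second option</alternative>'
    '<alternative>Third option</alternative>'
    '</query>'
)


def filtered_view(root, hide=("info", "notes")):
    """Return a copy with hidden elements dropped and alternatives ranked."""
    view = copy.deepcopy(root)
    for parent in list(view.iter()):
        for child in list(parent):
            if child.tag in hide:
                parent.remove(child)
    for query in list(view.iter("query")):
        children = list(query)
        ranked = iter(sorted(
            (c for c in children if c.tag == "alternative"),
            key=lambda alt: float(alt.get("rating_avg", "0")),
            reverse=True,
        ))
        # Keep non-alternative children in place; refill the alternative slots.
        query[:] = [next(ranked) if c.tag == "alternative" else c for c in children]
    return view


original = ET.fromstring(SAMPLE)
view = filtered_view(original)
print([alt.text for alt in view.findall("alternative")])
```

Because the filter returns a copy, the stored document keeps its original order and content, and each reader can apply a different filter to the same file.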

One filter would create the default view. It might operate when the data is stored, to put things in the order that the group has determined to be the most generally useful. Other filters might operate when the data is delivered. (Most important, of course, is the interaction mechanism that lets you specify the filter's operation.)

In the most ideal of all worlds, the editor would be able to understand and interact with the filter mechanisms, so you could have the same display when editing the XML that you can get when viewing the HTML versions.

Difference Highlighting

The ability to see what has changed is highly desirable. But it adds several degrees of complexity:

Deleted Information in the File
Maintaining deleted information in the file requires an additional attribute. And it adds complexity when moving and inserting new information. But it is necessary to see what has been removed, as well as what has been added.
 
Difference Highlighting in the Editor
In the best of all possible worlds, the editor will make it easy to spot differences. Several possible mechanisms come to mind, including "modified" markers in the file, an XmlDiff process against a saved version of the file, or showing all changes after a given date/time.
(The date/time mechanism may be the only viable choice for shared multiple-user documents for the simple reason that user "A" will want to clear the change-highlighting on sections that have not yet been seen by user "B".)

Note:
For the date/time differencing mechanism to be effective, deleted information must be stored in the XML file so it can be displayed as a change. The corollary is that it must be possible to hide such information when differences are not being displayed.
 
Basis for Differencing
Since each document needs an associated notification list (to inform interested parties when a document has changed) the last date that the user accessed the document can be stored in that list. Changes after that date can be highlighted. (A useful option would be the ability to include or exclude your own changes). The control file can also keep track of modification dates and the comments that summarize the changes that were made (as, for example, in CVS).

Note:
With this mechanism you can turn off all highlighting at once, or you can leave it on, but there is still no good way to turn off highlighting as changes are reviewed. Each client would need its own capability to do so, and ideally that capability would persist between editing sessions. If some mechanism were added to the language to handle this, it would have to contain a pointer to a local file that kept track of which changes were visited and which were not.
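
The date/time approach to differencing can be sketched in Python as follows. Given the date a user last accessed the document, the function returns the nodes changed since then, optionally leaving out the user's own changes. The modified and modified-by attribute names, the sample data, and the function name changed_since are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Illustrative sample only.
SAMPLE = (
    '<query modified="2001-02-01T10:00:00" modified-by="eric">Q'
    '<alternative modified="2001-02-10T09:00:00" modified-by="paul">A</alternative>'
    '<alternative modified="2001-02-11T09:00:00" modified-by="eric">B</alternative>'
    '</query>'
)


def changed_since(root, last_seen, viewer=None):
    """Return nodes modified after last_seen, skipping the viewer's own edits."""
    changed = []
    for node in root.iter():
        stamp = node.get("modified")
        if stamp is None:
            continue
        if datetime.fromisoformat(stamp) <= last_seen:
            continue
        if viewer is not None and node.get("modified-by") == viewer:
            continue  # optionally exclude the viewer's own changes
        changed.append(node)
    return changed


root = ET.fromstring(SAMPLE)
last_access = datetime(2001, 2, 5)
everything = changed_since(root, last_access)
from_others = changed_since(root, last_access, viewer="eric")
print([n.text for n in everything], [n.text for n in from_others])
```

The last-access date would come from the per-user entry in the document's notification list, so no per-node "seen" flags are needed in the shared file.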

Automated Implication Engine

A sophisticated query mechanism would be useful. Enabling such a mechanism requires extending the issue-based discussion language to allow more complex relationships to be described. For example, in a complex scenario, the answer to one question impacts the answer to several others. Consider a case like this:

  Q1:                 (query)
    A1:               (alternative)
    A2:
    A3:
  Q2:
    B1:
      +-if Q1A1       (pro, if A1 is chosen for Q1)
      /-if Q1A2       (con, if A2 is chosen)
    B2:
      +-if Q1A3
      /-if Q1A1

Were the language extended to permit such statements, perhaps with tags like <proif> and <conif>, it would be possible to:

Make suppositions
It should be possible to "suppose" that A1 has been selected, and derive Q2B1 as a result.
(suppose Q1A1 => Q2B1). Playing with the system and making different suppositions would make it possible to, in effect, experiment with different designs.
 
Find consistent sets of alternatives
Given the conditional knowledge embedded in the system, it should be possible to enumerate and list the set of possible designs, where each design consists of the set of consistent alternatives. In the example above, the design set would be
{Q1A1+Q2B1, Q1A2+Q2B2, Q1A3+Q2B2}

A special query tool would undoubtedly be needed for this purpose, but the beneficial impact on design discussions would be considerable.
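
To show what such a query tool might compute, here is a small Python sketch of the example above. The selection rule is an assumption chosen because it reproduces the design set listed: for a supposed Q1 choice, any Q2 alternative with a matching "pro if" wins, and if none has one, any Q2 alternative without a matching "con if" is admissible. The data layout and function names are likewise made up for illustration.

```python
# The alternatives of Q1, and for each Q2 alternative the Q1 choices
# that argue for it (pro_if) and against it (con_if).
Q1 = ["A1", "A2", "A3"]
Q2 = {
    "B1": {"pro_if": {"A1"}, "con_if": {"A2"}},
    "B2": {"pro_if": {"A3"}, "con_if": {"A1"}},
}


def suppose(q1_choice):
    """Return the Q2 alternatives that fit a supposed Q1 choice."""
    favored = [b for b, rule in Q2.items() if q1_choice in rule["pro_if"]]
    if favored:
        return favored
    # Assumed fallback: with no supporting argument, accept anything not argued against.
    return [b for b, rule in Q2.items() if q1_choice not in rule["con_if"]]


def consistent_designs():
    """Enumerate every Q1 choice paired with the Q2 alternatives that fit it."""
    return ["Q1" + a + "+Q2" + b for a in Q1 for b in suppose(a)]


print(suppose("A1"))
print(consistent_designs())
```

A looser rule that accepts every alternative without a matching "con if" would also admit Q1A3+Q2B1, so the exact definition of "consistent" is something the language design would need to settle.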

Conclusion

Using an XML-based discussion language and shared files, it should be possible to engineer an efficient and effective tool for online collaborative discussions (like design discussions), in a very short period of time. Other than the language, the really big limiting factor is the existence of a good editor that is either extensible or targeted for the purpose.
