A Simple Collaboration System
(Proposal)
Author: Eric Armstrong, Feb 2001
Motivation for a Simple Tool
Ultimately, the world needs a highly-interactive, knowledge-based system for
collaborating on large-scale, complex problems. Such a system is described in
general terms in the Project Concept document,
and discussed at length in Requirements for a Collaborative
Design/Discussion/Decision System.
To summarize that goal: We need (and plan to build) a distributed system that
will permit collaboration by a team of people who are focused on a common goal.
The Engelbart "knowledge repository" that results from the collaboration
could be virtual (the product of peer-to-peer interactions) or real (instantiated
in a centralized location). The system will allow users to interact with multiple
repositories and cross-fertilize what they find from one project to another.
The system will "participate" in team conversations, like a child
at a banquet. In the same way that a child is taught, users will whenever possible
empower the system to answer questions that are within its grasp. (For
example, by recategorizing things that the system already "knows".)
As a result of those interactions, the system will evolve, becoming smarter
and more capable over time. (Over time, a special class of knowledge workers
will undoubtedly emerge for that purpose.)
However, the project has a serious chicken-and-egg problem. Many motivated,
interested individuals are at far ends of the continent, and at points between.
Many potentially valuable contributors are located in Japan, Europe, and points
East. But to build a collaborative system collaboratively, we need at least a
rudimentary collaboration system! That is the nature of the "bootstrap"
problem that Doug Engelbart has spent
a lifetime tackling.
So we need something much simpler in the near term -- something that
can be put together with minimal development, using existing tools, but which
is powerful enough to materially advance our collaborative efforts. This document
proposes such a tool, centered around:
- IBIS-style discussions
- XML documents
(These will be explained shortly. First, we'll explore the alternatives we've
investigated.)
Evaluating the Alternatives
We thought for a time that a group of physically co-located individuals centered
here in Silicon Valley could produce an initial system which would allow remote
folks to join subsequent discussions. I'm glad to say that
we are making (slow) progress towards that goal.
However, that alternative has turned out to be less efficacious than originally
hoped, for two reasons:
- Because it is a part-time effort, these folks only come together once a
week, at most, for a couple of hours. That is simply an insufficient amount
of time to make any kind of rapid progress. Real progress is
being made (honest!), but the pace proceeds so glacially that outside
observers sometimes have a hard time discerning it.
- The group lacks the benefit of several keen thinkers who could make substantive
contributions from their far-flung locations. (A few that come to mind: Paul
Fernhout, Frode Hegland, Ken Holman).
The simple fact of the matter is that those of us who are physically co-located
need an online collaboration tool as much as anyone! We are not so physically
adjacent that we can walk down the hall to talk things out. Instead, we have
a few spare moments here and there throughout the day. We need an online system
so we can carry forward the collaboration in the time available, without subjecting
ourselves to additional travel to do it.
So far, the group has experimented with different kinds of information sharing
systems. All have proven unequal to the task, for the reasons noted below:
- Shared Document Systems
- Systems like ZWIKI allow multiple users to update a remote document. Those
systems appear to work fine for an end-product document. If three of us
are writing a paper, for example, and the final result will bear all of
our names, then it doesn't matter who makes which edits. However, in a wide-ranging
investigation like a design discussion, it is imperative to keep track of
who said what. So attribution is a critical missing ingredient for
such systems.
(Understanding the reasons for that fact could well be a study for psychologists.
But experience with such systems seems to indicate that we have an intuitive
understanding of the notion that there is no such thing as a single "truth"
and that, even if there were, no one set of statements would capture it
exactly. In a conversational scenario, then, it seems that we unconsciously
filter and interpret what we read according to who said it. We compensate
for their biases, and fill in gaps with an understanding of their intentions.
When statements are presented on their own, without being attributed to
their author, they seem somehow dry and lifeless.)
- Email Lists
- Passing plain text or HTML documents in email is one way to solve the
attribution problem, since the author of the revised version is clearly
noted at the top of the message, and their comments are distinguished from
the original text. However, only the latest author's comments are
attributed (other attributions are lost as the document is forwarded). More
importantly, after a few versions, the indentation that results from
marking different contributions becomes excessive. (On each revision, the
text is shifted to the right two spaces. After a few revisions, the document
is no longer usable.)
- Topical Chat Systems
- Topical chat systems are great for party conversations, where a single
subject is carried on for a while, and then dropped in favor of another.
Comments are attributed and archived, which is helpful. But there is little
if any structure to such conversations. For a complex, lengthy, multi-faceted
discussion, such systems quickly break down, because they provide no mechanisms
for organizing (and, more importantly, reorganizing) the discussion.
- Word Documents
- Although we haven't actually tried this, it is something to consider.
Word documents would allow revisions to be highlighted. However, attribution
is still an issue, the documents are not editable everywhere, and they are
not web accessible. So maybe it is not something we should consider for
very long...
So, as I originally proposed during Doug Engelbart's year-2000 colloquium,
we need to design and develop a useful collaboration system for online design
discussions -- a system that not only makes it possible to reach decisions,
but which records decisions, and their alternatives, so that downstream
it becomes possible to answer the all-important question, "Why?".
(The question "What?" can be answered by looking at code. But the
question "Why?" is often virtually impossible to answer.)
Such a system would be of enormous benefit to the open source community, for
the very simple reason that while we can share source code and manage bug reports
over the web, there are no tools which make it possible to fruitfully
carry on a design discussion.
Such a system would also be useful for the other kinds of Wicked
Problems that Jeff Conklin and William Weil describe. Anytime an investigative
discussion is required to clarify a situation and resolve issues, anytime a
focused, goal-oriented discussion needs to take place to solve a complex problem,
such a system can play an important role.
Finally, focusing on the issue of online design discussions has the most significant
"bootstrapping" effect. By developing a tool that open source designers
use, we will obtain the benefit of their advice (and ultimately their help)
in improving the system. And by improving our own ability to carry on online
discussions, we both improve our ability to design the target system, and open
up our deliberations for significant contributions by others.
Overview of the Simple System
A simple tool for collaboration can be built around:
- IBIS-style discussions
- XML documents
This section explores those concepts.
The Advantages of IBIS
In The IBIS Manual, Jeff Conklin
gives a superb introduction to the concept of Issue-Based Information Systems
(IBIS -- "eye-biss"). To summarize that paper very briefly:
- Discussions are led by a moderator.
- Every issue begins with one or more questions.
- More questions are added as they become appropriate (either at the topmost
level, or under other questions, thereby forming a hierarchy).
- As possible answers are proposed, they are collected under the question
they purport to answer.
- Pros (arguments for) and cons (arguments against) are listed under each
alternative.
- Additional information is added anywhere it makes sense.
- A decision cannot be reached until all alternatives have been evaluated.
In a totally successful discussion, all questions are answered, and all participants
agree on the answers. One suspects that such halcyon scenarios are probably
not the norm. But those who have experience with such discussions report several
beneficial effects:
- Calm
- The single most noticeable effect is on the participants. Rather than breaking
up into heated, vehement battles, IBIS discussions tend to be calm, rational
affairs. That effect tends to result from the next two characteristics
of IBIS-style discussions.
- Alternatives are Always Allowed
- Because no proposal is ever allowed to stand on its own, there is never
an instance of a "bald assertion" which must be attacked and dragged
down in order to make room for one's own pet proposal or theory. Instead,
the question that the proposal addresses is first adduced. As a result,
a "hook" always exists upon which to hang an alternative. (As a
result, discussions tend to be more cooperative and exploratory, rather
than argumentative and confrontational.)
- All Alternatives are Evaluated
- In the IBIS methodology, a decision is never reached until all
alternatives have been evaluated. Having once gotten an idea into the system,
then, its proponent can relax, calm in the knowledge that eventually it
will get its "day in court", and the arguments in its favor will
be heard.
Note that IBIS-style discussions are usually carried out in a meeting, and
they are led by a moderator. Jeff Conklin took a stab at creating an online
version of such a system with his Graphical IBIS (gIBIS -- "gibb-us").
However, a paper at http://web.uvic.ca/~ckeep/hfl0104.html
(which no longer seems to be present), as well as anecdotal reports, have shown
that users often have problems with automated versions of such systems. The
major issues are:
- Graphical complexity
- If memory serves, this is one of the issues that users mentioned when using
gIBIS. If not, I suspect that it rapidly becomes an issue for any complex
discussion.
- Cognitive overhead
- The difficulty of keeping track of things increased as the system grew.
(This was the core observation of the paper noted above.)
- The need to pre-define a comment category
- For example, one might not know whether one was arguing "for"
or "against" a proposition when starting out to trace the implications.
Indeed, the same implication could be either a point in favor or a point
against, depending on the context.
The Advantages of XML
Using XML as a basis for an IBIS-style discussion language has several advantages.
While the XML structure does not solve the pre-selection problem, it does eliminate
the issue of graphical complexity (by eliminating the graphics). In addition, the
fact that the documents are stored in XML makes it possible to retarget them
for other systems at a later date -- graphical systems, when they are devised,
or perhaps more robust knowledge repositories.
Requirements for the Simple System
There are a number of general requirements and several highly specific editing
requirements that go into making a viable system. These requirements are fairly
easy to meet, using off-the-shelf tools:
- XML->HTML Server
- Some mechanism is needed to translate XML documents into HTML pages and
deliver them to web browsers. That way, documents can be browsed by anyone,
even if editing is restricted to a chosen few. XSLT mechanisms can handle
the translation, and servlets are fairly easy to build, so this requirement
is easily met.
- Source Control
- A CVS archive, for example, makes it possible to ensure that only one
person is editing the document at a time. CVS is not totally ideal, since
it is a plain-text system, but it will do until a good system based on XML-differencing
is devised.
- Notifications
- Given that the documents are shared, and that participants are remotely
located, it is important that each document has a notification list associated
with it, and that each person on that list gets a message when the document
has been changed. This requirement is moderately difficult, but not terribly
so.
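The XML-to-HTML requirement can be sketched in a few lines. The proposal above names XSLT and servlets for this job; the sketch below swaps in plain Python with the standard library's ElementTree, purely to illustrate the shape of the translation. The sample document, the function names `to_html` and `render_page`, and the HTML layout are all assumptions made for illustration; the element names are borrowed from the language proposed later in this document.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical sample document in the proposed discussion language.
SAMPLE = """<query status="open">Which source control system should we use?
  <alternative status="unrated">CVS
    <pro>Widely available</pro>
    <con>Plain-text differencing only</con>
  </alternative>
</query>"""


def to_html(node):
    """Render a discussion node and its descendants as a nested HTML list item."""
    label = escape(node.tag)
    # Only the text that precedes the first child element is shown here.
    text = escape((node.text or "").strip())
    children = "".join(to_html(child) for child in node)
    sublist = f"<ul>{children}</ul>" if children else ""
    return f'<li class="{label}"><b>{label}:</b> {text}{sublist}</li>'


def render_page(xml_text):
    """Wrap the rendered discussion tree in a minimal HTML page."""
    root = ET.fromstring(xml_text)
    return f"<html><body><ul>{to_html(root)}</ul></body></html>"


if __name__ == "__main__":
    print(render_page(SAMPLE))
```

A servlet (or any web handler) would call something like `render_page` on each request, so browsing never requires write access to the XML file.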
The two requirements that pose the most difficulty are the issue-based discussion
language and the editor. Those requirements are discussed in the remaining two
sections.
An XML-based Language for IBIS
Carrying on an IBIS-style discussion in an online XML document requires an
IBIS-equivalent language. This section discusses the required language.
In true IBIS fashion, let's start by asking, "What should the language
look like?". The table below presents one possible proposal, along with
synonyms and other interesting alternatives.
| Proposed Language | Notes & Possible Alternatives |
| <query status=""> | issue, question, ques, quest |
| <alternative status=""> | candidate, claim, option, possibility, proposal, proposition |
| <pro> | for |
| <con> | against |
| <rating value=""> | evaluation |
| <endorse>username | vote, recommend |
| Occur anywhere: | Contents: |
| <info> | A single paragraph (node) |
| <notes> <item>+ | A list containing one or more paragraphs (nodes) |
| Attributes: | Values: |
| query: status= | open, decided |
| alternative: status= | unrated, rated, rejected, selected |
| alternative: rating= | distasteful-1, implausible-2, viable-3, likely-4, favored-5 (or: preferred-5) |
| query: rating_avg= | (average of ratings, as a single-decimal text string) |
Notes:
- The goal in selecting names was to choose the shortest possible intuitive
name. The word "alternative" is the least desirable according to
the shortness criterion. But it has the desirable connotation of "something
to be investigated", unlike a proposal or a claim. Also, when "Question"
is shortened to "Q", it is possible to reinterpret "Q &
A" as "Question & Alternative", rather than "Question
and Answer". That interpretation works especially well in a trouble-shooting
FAQ, where "How do I solve X?" can be answered with a number of
different alternatives. (Were it not for that desirable abbreviation, "option"
would be a more succinct choice.)
- Every node (element) in the discussion must obviously have some text associated
with it (and xHTML tags, as well, which allows links and formatting to be
added to the text). If the Schematron mechanism is used exclusively, that
text can lie directly under the node. But if other schema validation mechanisms
are to be supported, then a <content> element that contains the text
(and inline elements) will need to exist under each of the elements that can
have substructure. (Schematron is already required for decided-issue validation
-- to ensure that an issue is marked "decided" only if no unevaluated
alternatives remain.)
- Those few elements which cannot have substructure (<link> may be the
only one) create an undesirable exception to the principle that every element
has a <content> subelement. In the interests of regularity, such elements
might be defined with a <content> subelement anyway. Or, to prevent
that stupidity, Schematron validation could be required.
- If nodes wind up with <content> subelements, it is worth being clear
that a node "is" its content, while it "has" a list. Conceptually,
we want to think about nodes that have sublists. If a <content> subelement
has to be defined, then it should be an internal implementation detail that
is not exposed to the user. So a user should see: "<query>What
should we do?" even though internally the data may be stored as "<query><content>What
should we do?</content></query>".
- All nodes have a 2-by-2-by-2 cube of attributes:
{content, list} x {created, modified} x {by, date}
For example:
content-created-by="Fred", list-modified-date="xxx".
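To make the attribute cube concrete, here is a minimal sketch that enumerates the attribute names and stamps a node. Crossing the three two-valued axes yields eight names. The helper `stamp` and the ISO 8601 date format are assumptions for illustration; the proposal leaves the date format open.

```python
from datetime import datetime, timezone
from itertools import product
import xml.etree.ElementTree as ET

# The three two-valued axes named in the note on node attributes.
ASPECTS = ("content", "list")
EVENTS = ("created", "modified")
FIELDS = ("by", "date")

# All eight attribute names, e.g. "content-created-by".
ATTRIBUTE_NAMES = ["-".join(parts) for parts in product(ASPECTS, EVENTS, FIELDS)]


def stamp(node, aspect, event, user, when=None):
    """Record who changed a node's content or list, and when.

    Hypothetical helper: the date is stored as an ISO 8601 string,
    which is an assumption rather than part of the proposal.
    """
    when = when or datetime.now(timezone.utc)
    node.set(f"{aspect}-{event}-by", user)
    node.set(f"{aspect}-{event}-date", when.isoformat(timespec="seconds"))


if __name__ == "__main__":
    query = ET.Element("query")
    query.text = "What should we do?"
    stamp(query, "content", "created", "Fred")
    print(ATTRIBUTE_NAMES)
    print(ET.tostring(query, encoding="unicode"))
```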
More Questions
Here are some more questions, and possible answers:
- ?-What to call the language?
- IBDL: Issue-Based Discussion Language
- LIDIA: Language for Investigation and Discussion of Issues and Alternatives
- BIBL (bible): Basic Issue-Based Language
- LIBL (libel): Local Issue-Based Language
- KIBL (kibble): Knowledge-oriented Issue-Based Language
- NIBL (nibble): Nodal Issue-Based Language
- QIBL/QUIBL (quibble): Questioned Issue-Based Language
- ?-other words for the acronyms above?
- ?-other names?
- ?-how to change an argument to "countered"
- Suppose you add an argument <con> against an alternative, and someone
counters that argument. How should the counter-argument and the final result
be stored in the argument hierarchy? Should the counter argument go under
the original argument? Should the <con> element change to some other
element, like <con-countered>? Should an attribute be added: <con
status="countered">? Or should a special <countered>
section be added so countered arguments can be moved to that section and
taken out of the way?
Note the recursion here. There is always an implicit question: "Has
this argument been countered?" Answering that question may well require
the <endorse> and <rating> on counter arguments. And since multiple
counter-arguments are possible, perhaps a <counter> element is needed
that can occur under <pro> or <con> elements. (But then, what
about counter counter-arguments, etc? -- the element structure would be
different at the top of the hierarchy than further down.)
Conclusion: <pro> and <con> tags should be allowed under
<pro> and <con> arguments. A <pro> under an argument indicates
support, while a <con> indicates a counter argument.
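The conclusion above can be illustrated with a small sketch. The sample fragment and the helper names `role` and `outline` are hypothetical; the sketch only shows how a reader (or a display tool) might interpret a <pro> or <con> differently depending on whether it sits under an alternative or under another argument.

```python
import xml.etree.ElementTree as ET

# Hypothetical discussion fragment: a <con> that has itself been countered.
SAMPLE = """<alternative>Store documents in CVS
  <pro>Already installed on most servers</pro>
  <con>Diffs are line-based, not XML-aware
    <con>Line-based diffs are adequate for short text nodes</con>
    <pro>Reformatting a file makes every line look changed</pro>
  </con>
</alternative>"""


def role(parent_tag, tag):
    """Describe an argument relative to its parent node."""
    if parent_tag in ("pro", "con"):
        # Under an argument: <pro> supports it, <con> counters it.
        return "supports" if tag == "pro" else "counters"
    # Directly under an alternative: an ordinary argument for or against.
    return "for" if tag == "pro" else "against"


def outline(node, depth=0):
    """Return indented lines showing each argument and its role."""
    lines = []
    for child in node:
        if child.tag in ("pro", "con"):
            text = (child.text or "").strip()
            lines.append(f"{'  ' * depth}{role(node.tag, child.tag)}: {text}")
            lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    print("\n".join(outline(ET.fromstring(SAMPLE))))
```

Because the same two tags are reused at every level, counter counter-arguments need no additional elements.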
The DTD
Here is the Document Type Definition (DTD) for the language, assuming that
Schematron is required, and that <content> elements are therefore not
needed:
<?xml version='1.0' encoding='ISO-8859-1'?>
<!--
DTD for an issue-based discussion language.
-->
<!-- Questions and their alternatives: text mixed with child nodes. -->
<!ELEMENT query (#PCDATA | alternative | info | notes | query)* >
<!ELEMENT alternative (#PCDATA | pro | con | rating |
endorse | info | notes | query)* >
<!-- Arguments may nest: a pro under an argument supports it, a con counters it. -->
<!ELEMENT pro (#PCDATA | info | notes | pro | con | query)* >
<!ELEMENT con (#PCDATA | info | notes | pro | con | query)* >
<!-- Evaluations and endorsements contain text only. -->
<!ELEMENT rating (#PCDATA)* >
<!ELEMENT endorse (#PCDATA)* >
<!-- Supplementary information; notes hold one or more items. -->
<!ELEMENT info (#PCDATA)* >
<!ELEMENT notes (item)+ >
<!ELEMENT item (#PCDATA)* >
<!ATTLIST query
status (open | decided) #IMPLIED>
<!ATTLIST alternative
status (unrated | rated | rejected | selected) #IMPLIED >
<!ATTLIST rating
value (distasteful | implausible | viable | likely | favored)
#IMPLIED >
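A DTD cannot express the rule that a query may be marked "decided" only when none of its alternatives is still unrated, which is why an assertion-based check is needed alongside it. The sketch below is not Schematron; it is a plain Python stand-in that shows the kind of assertion involved. The function name `decided_errors` is hypothetical, and treating a missing status attribute as "unrated" is an assumption.

```python
import xml.etree.ElementTree as ET


def decided_errors(root):
    """Return a message for each unrated alternative under a decided query."""
    errors = []
    for query in root.iter("query"):
        if query.get("status") != "decided":
            continue
        # Only alternatives directly under this query count toward it.
        for alt in query.findall("alternative"):
            # Assumption: a missing status attribute means "unrated".
            if alt.get("status", "unrated") == "unrated":
                text = (alt.text or "").strip()
                errors.append(f"decided query has unrated alternative: {text}")
    return errors


if __name__ == "__main__":
    doc = ET.fromstring(
        '<query status="decided">Which editor?'
        '<alternative status="unrated">eXtenDE</alternative>'
        '<alternative status="selected">A custom editor</alternative>'
        "</query>"
    )
    for message in decided_errors(doc):
        print(message)
```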
Requirements for an Issue-Based Editor
The one requirement (besides the language) that poses the greatest challenge
is the requirement for a good XML-based editor, mostly because there are some
specific requirements for editing that normal XML editors are unlikely to handle.
Fortunately, Warner Ornstine has been working on an editor for the Extensible
Development Environment Project (eXtenDE)
that may prove to be extensible in the directions we need. That open source
development project is still in its early stages, so we can help define the
architecture that produces extensibility in the editor.
Here are the specific editing requirements that add difficulty to the project:
- Context Display
- When questions are indented under questions, and you're down several levels
deep in the hierarchy, it's hard to determine the context for the particular
question you're looking at. (Was it under the last question, or beside it?)
So the editor must have some mechanism that makes it possible to determine
the context (ancestor hierarchy, and ideally relatives of those ancestors)
at any given point in the tree.
I believe Warner Ornstine has already anticipated this requirement in the
eXtenDE editor, at least in theory. My own idea of how this should be done
is to drop a thin dotted line between the "+" icons that let
you collapse and expand nodes. When the cursor hovers near that line, a
popup could display the item at the top of the line.
- Multi-line
- As described in Design Notes for an XML Editor,
a good structured editor must be capable of displaying multiple-line elements,
and it must also be tree-structured. Most editors do one or the other,
but not both. Or they make you switch modes.
- Blank Sheet of Paper
- One user who reviewed an early version of the eXtenDE editor used this
image to describe the desired behavior for the editor. It's a great image.
Although each element is a distinct object, the appearance of the
objects in the editor should be as though they were on a blank sheet of
paper. In other words, it should not be necessary to select an entry to
begin editing it. Instead, the internal boundaries between elements should
be seamless, so that cursor motions take you naturally from one element
to another.
- Auto-Fill Attributes
- These represent a novel concept in XML editing. But the editor needs to
store the date/time that a node was created or modified, along with the
date/time that the list under it was modified, as attributes of each and
every node. It also needs to store the identification information for the
person making the changes.
Note:
Ideally, it would be possible to note such attributes and elements as "auto-fill"
in the schema. (I somehow doubt that XML-schema has anticipated that need,
but haven't looked closely enough to tell.) The autofill=date attribute
would instruct the editor to store date/time info. The autofill=text attribute
would instruct the editor to prompt the user for the string to use -- and
to save it for future use thereafter.
- Ontology Mapping
- The ideal editor would also make it possible to change the elements used
in the display. For example, although <query> elements would be stored
in the file, the user may well want to display such elements using <Q>,
to conserve display space. The existence of the mapping makes a common interchange
standard easier to swallow, since you don't have to look at it if you don't
like it.
- Multiple Schemas
- The editor must stand ready to validate a document multiple times, using
different schema mechanisms, before it can declare the document to be 100%
valid. For example, the list of autofill-attributes might need to come from
a separate file, given that the autofill specification can't be specified
in a standard schema.
And, while in most cases a hierarchical schema provides more expressive
power, in this particular case, an XML Document Type Definition (DTD) is
the most natural way to express the structure relationships. The reason
for making that statement is the fact that DTD definitions are not
hierarchical. For highly structured data, that typically presents more
problems than it solves. But in this case, the modular reusability of elements
is a big plus.
So, in a DTD, it's easy to specify:
<!ELEMENT query (alternative | info | notes | query)* >
<!ELEMENT alternative (pro | con | rating | endorse | info | notes |
query)* >
Since queries can have queries under them, and alternatives can have queries
under them, deep nested structures can occur, and be validated. In a hierarchical
schema, such recursion is much more difficult to specify.
Finally, the editor needs the ability to validate with the assertion-based
Schematron
schema system. That mechanism achieves two important goals.
1) It allows content (text and inline tags) to be placed in the most natural
position in the element, as described in Design
Notes for an XML Editor. (However, since most schema mechanisms have
no such capabilities, a special <content> element may well be defined
anyway.)
2) It allows sophisticated validation, like ensuring that the status-attribute
for a query is "decided" only if no un-evaluated alternatives
remain.
- Validation Errors, Handling of
- The importance of validating documents brings to mind the question of
how and when the editor validates documents. On-the-fly validation ensures
that the document is always in a legal state. But such restrictions typically
hamper users who are making significant changes.
A better alternative is probably to provide a menu choice for validation,
and to automatically validate when saving a document (while making it easy
to find and correct errors). If validation fails, the user should be asked
if they want to save the document, before doing so. (Since they might
say yes, the editor must be capable of reading in a document that fails
validation.)
- Hierarchy-aware Scripts
- Ratings for an alternative need to be accumulated and averaged among all
those who evaluate it. So the ideal editor will add a scripting capability
that understands hierarchy and can propagate information either upward through
the ancestor chain and/or downward through the descendant tree.
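As a sketch of what a hierarchy-aware script might do, the following Python averages the ratings under each alternative and propagates the result one level up, onto the alternative itself. The numeric scores come from the attribute table (distasteful-1 through favored-5) and the single-decimal string format follows the rating_avg entry. Storing the average on the alternative is an assumption (the table lists rating_avg under query), and `average_ratings` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Numeric scores from the attribute table: distasteful-1 ... favored-5.
SCORES = {"distasteful": 1, "implausible": 2, "viable": 3, "likely": 4, "favored": 5}


def average_ratings(root):
    """Store each alternative's mean rating as a one-decimal text string."""
    for alt in root.iter("alternative"):
        # Collect the scores of the <rating> elements directly under this node.
        values = [SCORES[r.get("value")] for r in alt.findall("rating")
                  if r.get("value") in SCORES]
        if values:
            # Assumption: the average is kept on the alternative itself.
            alt.set("rating_avg", f"{sum(values) / len(values):.1f}")


if __name__ == "__main__":
    doc = ET.fromstring(
        "<query>Which editor?"
        '<alternative>eXtenDE<rating value="viable">Fred</rating>'
        '<rating value="likely">Ann</rating></alternative>'
        "</query>"
    )
    average_ratings(doc)
    print(ET.tostring(doc, encoding="unicode"))
```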
Extensions to the Simple System
Given a basis for collaboration, it's easy to start thinking about extra features
it could have. It is too early to say whether it makes more sense to build a
more sophisticated system, or to extend the simple system. But I'll trot out
the ideas for extensions here, for future reference.
Display Filters
Given that XML data is stored, and that it must be converted to HTML for display
on the Web, it makes sense to think about the "Web-Based Intermediaries"
developed at IBM that Doug Engelbart has favored. WBI filters (or other such
filters) could for example sort alternatives so that the highest-ranking, most
heavily endorsed ideas appear first.
An interactive filter would let you choose which items you want to display.
You could filter out <info> and <notes>, for example, to get a better
outline of the arguments. Or you could eliminate comments by certain individuals.
You might want to put unrated items first in the list (because they have yet
to be visited, and are therefore blocking a decision), or last (so you can focus
on the most heavily rated, most thoroughly evaluated alternatives). Or maybe
alternatives should be ordered alphabetically by name, alphabetically by user,
or randomized to ensure unbiased investigation.
One filter would create the default view. It might operate when the data is
stored, to put things in the order that the group has determined to be the most
generally useful. Other filters might operate when the data is delivered. (Most
important, of course, is the interaction mechanism that lets you specify the
filter's operation.)
In the most ideal of all worlds, the editor would be able to understand and
interact with the filter mechanisms, so you could have the same display when
editing the XML that you can get when viewing the HTML versions.
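A display filter of the kind described above might look like the following sketch. It is plain Python rather than a WBI plug-in, and the names `mean_score` and `filtered_view`, along with the `hide` and `unrated_first` options, are hypothetical. It works on a copy of the tree, removes the element types the reader wants hidden, and orders each query's alternatives by average rating, with unrated items placed first or last.

```python
import copy
import xml.etree.ElementTree as ET

# Numeric scores from the attribute table: distasteful-1 ... favored-5.
SCORES = {"distasteful": 1, "implausible": 2, "viable": 3, "likely": 4, "favored": 5}


def mean_score(alt):
    """Average the <rating> children of an alternative, or None if unrated."""
    values = [SCORES[r.get("value")] for r in alt.findall("rating")
              if r.get("value") in SCORES]
    return sum(values) / len(values) if values else None


def filtered_view(root, hide=("info", "notes"), unrated_first=False):
    """Return a filtered, reordered copy; the stored document is untouched."""
    view = copy.deepcopy(root)
    # Drop hidden element types everywhere in the copy.
    for parent in list(view.iter()):
        for child in list(parent):
            if child.tag in hide:
                parent.remove(child)

    def sort_key(alt):
        score = mean_score(alt)
        if score is None:
            return (0 if unrated_first else 2, 0.0)
        return (1, -score)  # highest-rated alternatives come first

    # Reorder alternatives within the slots they already occupy.
    for query in list(view.iter("query")):
        slots = [i for i, child in enumerate(query) if child.tag == "alternative"]
        ranked = sorted((query[i] for i in slots), key=sort_key)
        for i, alt in zip(slots, ranked):
            query[i] = alt
    return view
```

Calling `filtered_view(doc, unrated_first=True)` would surface the alternatives that are still blocking a decision, while the default puts the best-rated ideas at the top.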
Difference Highlighting
The ability to see what has changed is highly desirable. But it adds several degrees
of complexity:
- Deleted Information in the File
- Maintaining deleted information in the file requires an additional attribute.
And it adds complexity when moving and inserting new information. But it
is necessary to see what has been removed, as well as what has been added.
- Difference Highlighting in the Editor
- In the best of all possible worlds, the editor will make it easy to spot
differences. Several possible mechanisms come to mind, including "modified"
markers in the file, an XmlDiff process against a saved version of the file,
or showing all changes after a given date/time.
(The date/time mechanism may be the only viable choice for shared multiple-user
documents for the simple reason that user "A" will want to clear
the change-highlighting on sections that have not yet been seen by user
"B".)
Note:
For the date/time differencing mechanism to be effective, deleted information
must be stored in the XML file so it can be displayed as a change. The corollary
is that it must be possible to hide such information when differences
are not being displayed.
- Basis for Differencing
- Since each document needs an associated notification list (to inform interested
parties when a document has changed) the last date that the user accessed
the document can be stored in that list. Changes after that date can be
highlighted. (A useful option would be the ability to include or exclude
your own changes). The control file can also keep track of modification
dates and the comments that summarize the changes that were made (as, for
example, in CVS).
Note:
With this mechanism you can turn off all highlighting at once, or you
can leave it on, but there is still no good way to turn off highlighting
as changes are reviewed. Each client would need its own capability to do
so, and ideally that capability would persist between editing sessions.
If some mechanism were added to the language to handle this, it would have
to contain a pointer to a local file that kept track of which changes were
visited and which were not.
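The date/time basis for differencing can be sketched as follows. The attribute names content-modified-date and content-modified-by follow the attribute cube described earlier; the ISO 8601 date format, the function name `changed_since`, and the `exclude_user` option (for leaving out your own changes) are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime


def changed_since(root, last_seen, exclude_user=None):
    """Return the nodes whose content was modified after last_seen."""
    changed = []
    for node in root.iter():
        stamp = node.get("content-modified-date")
        if stamp is None:
            continue  # never modified, or the editor did not record a date
        if exclude_user and node.get("content-modified-by") == exclude_user:
            continue  # optionally skip the reader's own changes
        # Assumption: dates are stored as ISO 8601 strings.
        if datetime.fromisoformat(stamp) > last_seen:
            changed.append(node)
    return changed


if __name__ == "__main__":
    doc = ET.fromstring(
        '<query content-modified-date="2001-02-01T10:00:00" content-modified-by="Fred">'
        "Which editor?"
        '<alternative content-modified-date="2001-02-10T09:00:00" content-modified-by="Ann">'
        "eXtenDE</alternative></query>"
    )
    # The last-access date would come from the document's notification list.
    for node in changed_since(doc, datetime(2001, 2, 5)):
        print(node.tag, (node.text or "").strip())
```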
Automated Implication Engine
A sophisticated query mechanism would be useful. Enabling such a mechanism
requires extending the issue-based discussion language to allow more complex
relationships to be described. For example, in a complex scenario, the answer
to one question impacts the answer to several others. Consider a case like this:
Q1: (query)
  A1: (alternative)
  A2:
  A3:
Q2:
  B1:
    +-if Q1A1 (pro, if A1 is chosen for Q1)
    /-if Q1A2 (con, if A2 is chosen)
  B2:
    +-if Q1A3
    /-if Q1A1
Were the language extended to permit such statements, perhaps with tags like
<proif> and <conif>, it would be possible to:
- Make suppositions
- It should be possible to "suppose" that A1 has been selected,
and derive Q2B1 as a result.
(suppose Q1A1 => Q2B1). Playing with the system and making
different suppositions would make it possible to, in effect, experiment
with different designs.
- Find consistent sets of alternatives
- Given the conditional knowledge embedded in the system, it should be possible
to enumerate and list the set of possible designs, where each design consists
of the set of consistent alternatives. In the example above, the design
set would be
{Q1A1+Q2B1, Q1A2+Q2B2, Q1A3+Q2B2}
A special query tool would undoubtedly be needed for this purpose, but the
beneficial impact on design discussions would be considerable.
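The enumeration of consistent designs can be sketched in a few lines. The selection rule used here is an assumption, since the proposal does not define one: a satisfied con-if condition rules an alternative out, and among the remaining alternatives those with a satisfied pro-if condition are preferred. The data structure and the name `consistent_designs` are hypothetical. Under that rule, the sketch reproduces the design set given for the example.

```python
# Conditions from the Q1/Q2 example: for each Q2 alternative, which Q1
# choices count for it (proif) and which count against it (conif).
Q1 = ["A1", "A2", "A3"]
Q2 = {
    "B1": {"proif": {"A1"}, "conif": {"A2"}},
    "B2": {"proif": {"A3"}, "conif": {"A1"}},
}


def consistent_designs(q1_choices, q2_rules):
    """List each Q1 choice together with the Q2 alternatives it leads to."""
    designs = []
    for a in q1_choices:
        # Rule 1 (assumption): a satisfied conif rules an alternative out.
        allowed = [b for b, rule in q2_rules.items() if a not in rule["conif"]]
        # Rule 2 (assumption): prefer alternatives with a satisfied proif;
        # if none has one, keep everything that was not ruled out.
        favored = [b for b in allowed if a in q2_rules[b]["proif"]]
        for b in favored or allowed:
            designs.append(f"Q1{a}+Q2{b}")
    return designs


if __name__ == "__main__":
    print(consistent_designs(Q1, Q2))
```

Making a supposition is then just the single-choice case: `consistent_designs(["A1"], Q2)` answers "suppose A1 has been selected for Q1".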
Conclusion
Using an XML-based discussion language and shared files, it should be possible
to engineer an efficient and effective tool for online collaborative discussions
(like design discussions), in a very short period of time. Other than the language,
the really big limiting factor is the existence of a good editor that is either
extensible or targeted for the purpose.