Summary
Speculations on what a knowledge repository might consist of, and how it might be used in the context of software design -- especially in an online design discussion.
[1,800 words]
by Eric Armstrong
[abstraction sequences, complex abstractions, simple abstractions],
[CKC, content, context, design discussions, granularity, IBIS, knowledge, knowledge
nuggets, knowledge products, nuggets],
[concrete ratings, ratings, situational ratings]
This piece considers knowledge repositories -- how they might actually work, and how they might interact with a mechanism for carrying on an online design discussion. To keep things manageable, let's focus on a concrete issue in system development: implementing a linked list. (The same kinds of thinking undoubtedly apply in other areas, but focusing on software development has the greatest "bootstrap" effect.)
Note:
Lee Iverson has made a point of distinguishing the "content" (documents), "knowledge", and "context" aspects of a repository-based system. This paper represents insights gained from investigating what really resides in each of those bubbles (or, possibly, layers) -- especially in the "knowledge" arena.
In the knowledge layer, the subject "Linked List" has several useful subcomponents:
- Linked List
--what is a... (verbal description)
--how does one look... (diagrams, animations)
--how does one use... (examples, explanations)
--how good is... (situational ratings)
--how does one implement... (recipe or template)

Notes:
- Of course, there are actually multiple different kinds of linked lists: singly-linked, doubly-linked, straight lists, circular lists, null-pointer terminated, and null-value or special-end-node terminated. But for now, let's keep things simple and assume that "linked list" means a singly-linked list...
- In a recent conversation, Art Freidman reminded me that there are many possible choices for the API to such a list, as well. I'm just going to pretend I didn't hear that...
- In the Knowledge Technologies conference I recently attended, it became clear that Topic Maps constitute an ideal vehicle for linking to and identifying the many useful components of a "topic". (It is not sufficient to merely link to a thing. It is also necessary to know in advance what kind of thing the link points to, so you can select the links you want to follow. Topic Maps fill that role admirably.)
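The point about knowing in advance what kind of thing a link points to can be sketched in a few lines. This is only an illustration: the class names, method names, and the "doc-..." locators are hypothetical, and nothing here is the actual Topic Maps API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is not the Topic Maps API.
class TopicLink {
    final String kind;    // what the target is, e.g. "recipe or template"
    final String target;  // where it lives (a placeholder locator)

    TopicLink(String kind, String target) {
        this.kind = kind;
        this.target = target;
    }
}

class Topic {
    final String name;
    private final List<TopicLink> links = new ArrayList<TopicLink>();

    Topic(String name) {
        this.name = name;
    }

    // Attach a typed link and return this topic so calls can be chained.
    Topic link(String kind, String target) {
        links.add(new TopicLink(kind, target));
        return this;
    }

    // Select links by kind before following any of them.
    List<String> targetsOfKind(String kind) {
        List<String> result = new ArrayList<String>();
        for (TopicLink l : links) {
            if (l.kind.equals(kind)) {
                result.add(l.target);
            }
        }
        return result;
    }
}
```

A "Linked List" topic built this way could carry one link per subcomponent listed above, and a reader who only wants the recipe would ask for `targetsOfKind("recipe or template")` without visiting the tutorial material.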
The first three are the kind of things you would expect in a good tutorial. They represent the first thing you would go to if you got the answer, "What you need is a linked list" in response to a question you asked on some interactive forum. In fact, they might well simply point to a section in one of Donald Knuth's books, for the clearest possible illustrations and explanations.
The last two subcomponents, situational ratings and templates, have some interesting characteristics and implications that are worth exploring.
Timeout: What is Knowledge?
"Knowledge" in such a system can take several forms. The following list probably is not comprehensive, but it's enough to get started:
Returning now to your regularly scheduled programming, we'll get back to the topic of situational ratings and take a deeper look at templates...
At the knowledge layer, ratings are necessarily provisional. Thus, a linked list is good if you can afford the extra space, don't mind a little extra overhead when you're looping, and you need to do a lot of inserts and deletes. On the other hand, for a fixed list, an array is typically going to carry less overhead. (But that evaluation, too, is situational -- in Lisp, a list is the *only* way to go.) But if at the knowledge layer ratings are provisional, for any specific project they are fairly concrete. One might argue, based on the characteristics of the project one is working on, that a linked list is appropriate. To give that argument some weight, one would reference the "knowledge nugget" that was stored on the subject of Linked Lists.
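To make the trade-off behind such a rating concrete, here is a minimal sketch in Java. The names `ListNode` and `InsertCost` are made up for illustration, and the array version assumes there is at least one spare slot at the end.

```java
// Hypothetical names; a sketch of why the rating depends on the situation.
class ListNode {
    int value;
    ListNode next;  // the extra space: one reference per element

    ListNode(int value, ListNode next) {
        this.value = value;
        this.next = next;
    }
}

class InsertCost {
    // Linked list: inserting after a known node rewires one reference,
    // no matter how long the list is.
    static void insertAfter(ListNode node, int value) {
        node.next = new ListNode(value, node.next);
    }

    // Array: inserting at an index shifts every later element one slot right.
    // Returns the number of elements moved.
    // Assumes array.length > size, i.e. there is room for one more element.
    static int insertAt(int[] array, int size, int index, int value) {
        System.arraycopy(array, index, array, index + 1, size - index);
        array[index] = value;
        return size - index;
    }
}
```

The linked version pays for its cheap inserts with a reference per element and pointer-chasing during traversal, which is why the same structure can rate well for a constantly changing list and poorly for a fixed one.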
In other words, the design discussion (presumably an IBIS-style discussion carried on within the scope of the repository) defines the *context* within which the knowledge is used. (The knowledge, meanwhile, rests on a foundation of content. More on that in a bit.) So, in a design discussion, it would be possible to say, "I think we should use a linked list" and cite as rationale the user specs that say items will constantly be added and deleted, along with the "knowledge nugget" that gives Linked Lists a high rating for such purposes. The recipient of your wisdom (who may never have heard the term) can then get a tutorial on the subject from the knowledge base.
Now consider a "recipe" as a collection of steps, or an ordered set. The recipe, or template, for a linked list algorithm will (to qualify as knowledge) be very abstract. If an actual implementation is available, for example, in legacy assembler code, then the recipe might well contain a link to that implementation. But the recipe itself would look something like the implementation comments extracted from the source code.
Now, a full implementation of a concept like Linked List is great, if one exists. The question, "How do I implement a linked list in language X?" can be answered with a pointer to the implementation. But what if you are working on a project in a new language? (As we'll see, idioms hold the key.)
It now makes sense to introduce the concept of an "idiom". An *idiom* is the syntactic mechanism for achieving a task in a specific language. For example, the "loop idiom" breaks down into for-loops, while-loops, and until-loops, each with situational ratings. For each loop, the specific syntax used in the C language constitutes the idiom for that concept in that language.
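As a small illustration of the loop idiom, here is the same task written three ways. The sketch uses Java rather than C purely for convenience, and since Java has no until-loop, do-while stands in as the nearest form (an assumption, not part of the breakdown above). The class and method names are hypothetical.

```java
// Hypothetical names; three spellings of one concept, "visit every element".
class LoopIdioms {
    // for-loop: the counter's setup, test, and step sit on one line.
    static int sumWithFor(int[] xs) {
        int total = 0;
        for (int i = 0; i < xs.length; i++) {
            total += xs[i];
        }
        return total;
    }

    // while-loop: the test comes first, so an empty array is handled for free.
    static int sumWithWhile(int[] xs) {
        int total = 0;
        int i = 0;
        while (i < xs.length) {
            total += xs[i];
            i++;
        }
        return total;
    }

    // do-while: the body runs once before the test, so the empty case
    // has to be guarded explicitly.
    static int sumWithDoWhile(int[] xs) {
        if (xs.length == 0) {
            return 0;
        }
        int total = 0;
        int i = 0;
        do {
            total += xs[i];
            i++;
        } while (i < xs.length);
        return total;
    }
}
```

The extra guard in the last version is the kind of detail a situational rating would record: a test-last loop is a poor fit whenever the collection may be empty.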
Now, given a template T, consisting of an ordered set {s1, s2, ...} of steps, and I, a collection of idioms {i1, i2, ...} that can be expressed in the language, the expression T x I produces a *knowledge product* -- literally, the product of two different kinds of knowledge stored in the repository. In this case, the knowledge product is
KP = T x I
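One possible reading of T x I, sketched in Java: walk the steps of the template in order and substitute the idiom the target language supplies for each. That reading is an assumption, as are all the names here, the wording of the steps, and the `Node`, `prev`, and `value` identifiers that appear inside the idiom strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names; one possible reading of KP = T x I.
class KnowledgeProduct {

    // For each step of template T, in order, substitute the idiom that I
    // supplies for that step in the target language.
    static List<String> instantiate(List<String> template, Map<String, String> idioms) {
        List<String> product = new ArrayList<String>();
        for (String step : template) {
            String idiom = idioms.get(step);
            // A step with no idiom is left visible rather than guessed at.
            product.add(idiom != null ? idiom : "/* no idiom for: " + step + " */");
        }
        return product;
    }

    // Example: the "insert after a node" part of a linked-list recipe,
    // combined with Java idioms for each step.
    static List<String> insertAfterInJava() {
        List<String> template = Arrays.asList(
            "create a node holding the value",
            "point the new node at the successor",
            "point the predecessor at the new node");

        Map<String, String> javaIdioms = new LinkedHashMap<String, String>();
        javaIdioms.put("create a node holding the value", "Node n = new Node(value);");
        javaIdioms.put("point the new node at the successor", "n.next = prev.next;");
        javaIdioms.put("point the predecessor at the new node", "prev.next = n;");

        return instantiate(template, javaIdioms);
    }
}
```

Swapping in a different idiom map for the same template would yield the same recipe in another language, and any step the new language cannot express shows up as a visible gap rather than a silent omission.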
To return again to the Content-Knowledge-Context (CKC) picture, nuggets in the knowledge layer must (or should) have fine-grained links to items in the content layer.
Recall that the template for a linked list is only one nugget of knowledge stored for that concept. Related kinds of information are frequently important to producing an answer.
For example, should I ask "How do I implement a linked list in Java?", the Linked List topic *header* should link directly to the Java idiom:
new java.util.LinkedList()
In this case, no template-instantiation is needed! For a language like C, on the other hand, in which many implementations exist with different APIs and performance characteristics, multiple responses could be returned, with situational ratings for each. As another example, should you ask "How do I implement a linked list in Cobol?", a human developer might very well respond with the questions, "Are you sure you want to do that?", "What is it you're trying to do?", "Is Cobol the right language for the job?", and "Do you absolutely have to use Cobol and, if so, could you consider using another structure that would be more suitable for that language?"
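For the Java case, the ready-made class already covers the "constantly added and deleted" scenario. A minimal usage sketch follows; the wrapper class and method names are hypothetical, and the `<String>` type parameter is an addition for illustration that was not part of the expression shown above.

```java
import java.util.LinkedList;
import java.util.List;

// Hypothetical wrapper; only java.util.LinkedList itself is the real idiom.
class ReadyMadeList {
    static List<String> demo() {
        LinkedList<String> items = new LinkedList<String>();
        items.add("b");        // append at the tail
        items.addFirst("a");   // insert at the head
        items.addLast("d");    // append at the tail again
        items.add(2, "c");     // insert in the middle, before "d"
        items.removeFirst();   // delete the head, "a"
        return items;          // remaining order: b, c, d
    }
}
```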
These are questions that a knowledge repository will probably not be smart enough to ask any time soon. If it *were* able to do that, so much the better! On the other hand, the knowledge repository *should* make it possible to deduce the implementation in Cobol, and put it forward in an online design discussion. Other members of the discussion should then have access to reference material in the repository to support arguments for why the performance would suck, why another data structure should be used, why a different language should be used, etc. In this sense, the knowledge base is a partner to and support for the discussion, which is carried on at the higher level, in the context of the discussion.
Note:
It turns out that Topic Maps do indeed carry the capacity for at least some forms of query classification. Because topics can have multiple names, and because the names can be attached to other topics to define the scope of that name (as well as the scope of the topic), a query system can at least be smart enough to find out what you mean by a given term. So for example if you ask for a "program guide", the system can ask whether you want information pertaining to software development, marketing strategies, or the latest theatre release.
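The scoped-name idea can be sketched as a small lookup table. Everything here is hypothetical (class, methods, scope labels, and topic identifiers) and it is not the Topic Maps API; it only shows how one name under several scopes gives a query system something to ask about.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names; a toy index of names qualified by scope.
class ScopedNames {
    // name -> (scope -> topic identifier)
    private final Map<String, Map<String, String>> index =
        new LinkedHashMap<String, Map<String, String>>();

    void addName(String name, String scope, String topic) {
        Map<String, String> byScope = index.get(name);
        if (byScope == null) {
            byScope = new LinkedHashMap<String, String>();
            index.put(name, byScope);
        }
        byScope.put(scope, topic);
    }

    // Scopes in which the name is defined, in insertion order.
    // More than one entry means the system should ask which is meant.
    List<String> scopesFor(String name) {
        Map<String, String> byScope = index.get(name);
        return byScope == null
            ? new ArrayList<String>()
            : new ArrayList<String>(byScope.keySet());
    }

    // The topic for a name once the scope is known, or null if there is none.
    String resolve(String name, String scope) {
        Map<String, String> byScope = index.get(name);
        return byScope == null ? null : byScope.get(scope);
    }
}
```

Registering "program guide" under three scopes would make `scopesFor("program guide")` return all three, which is the cue to ask the clarifying question before calling `resolve`.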
www.TreeLight.com
Copyright © 2000
by Eric Armstrong. All rights reserved.