The (Re)Design of GEEK This document describes the (re)design of GEEK. First design- 17-Jul-1998. This revision: 14-Mar-2000 The Big Picture GEEK is the Gnu Epistomological Evaluation Kernel - a set of routines that provide evaluation and consistency propagation of epistomological object networks, both static networks and dynamically extended networks. GEEK is targeted toward any application where logic, mathematics, or domain knowledge is combined with (possibly incomplete) ground truth, to provide both higher- and lower-level information than was present in the original database and ground truth. In short, GEEK exhibits some of the properties of a thinking intellect engaged in rigorous analytical thought in some reasonably well-known domain. GEEK systems define a network of objects, which have values and relationships. The GEEK engine then walks this network, attempting to achieve a consistent (if not harmonious) state for all of the objects in the system. GEEK is similar to KnowledgeCraft and Rock in that it is an object-oriented, rather than procedurally-oriented, knowledge representation. However, it is written in pure ANSI C for ease of integration with other programming languages. GEEK is different than LISP for AI in that it uses consistency as a leading principle; GEEK is different from Prolog in that consistency is not mandatory. GEEK is different than OPS5 in that a relation is not refractory but a continuing requirement. GEEK is perfectly capable of computing with statements about statements, and as such, is subject to the limits of Godel's theorem, as well as Turing's. ------ Licensing, Legal Issues, and Business Model License: this design, and all derivative works, is licensed under the Gnu Public License, version 2 or later. Full details can be found at Free Software Foundation web site at www.fsf.org . Rather than simply publishing the design for GEEK and then GPLing an implementation, I am licensing the design itself under the GPL and including all implementations of the engine as derivative works. This means that: 1) ALL compatible implementations of this design ARE "derivative works" and hence are subject to the GPL distribution requirements, including source redistribution rights. and 2) additional knowledge bases and domain databases are NOT "derivative works" and so may be licensed and destributed in any way desired. I (the author and copyright holder) want to make it very clear that although the design, kernel and supplied individual evaluation functions are themselves licensed under the GPL, there is _no_ requirement that any databases, knowledge bases, or external programs written to use the kernel be licensed under the GPL, beyond the requirements in the GPL itself for linked static executable images. Thus, a strong business model can be made for the semicustom knowledge-base engineering model for various industries. Having been involved in the business of Expert Systems for more years than I care to think of, I can assure the reader that "the money isn't in the evaluator kernel, it's in the per-customer knowledge base.". To my knowledge, nobody has ever made a profit on an ES kernel, but several companies have made very good money on building and maintaining the customer knowledge bases, and having an ES kernel is necessary to do that. Trust me on this. Or not- do a business case study and see for yourself. ----- The Basic Pieces GEEK has only three important conceptual items, and a lot of window dressing. The three important items are Objects (named trees of attribute-value-flag tuples) Functions (procedures) Conventions (mutually agreed-on ways of representing knowledge) GEEK objects have names and a lot of associated attribute-value sets (the same attribute can have multiple values). There are also flags associated with each set, so the tuples are really AVF tuples. The AVF tuples are kept in a big multi-level tree for fast access, and each object is really just the root of one such tree. GEEK functions are called whenever an object needs to do something. Functions are not wired to any particular object; rather they are called "on" an object and ask that object various things in order to set themselves up. In this sense, GEEK objects are not "member functions". When GEEK is in use, the usual pattern of execution is that the UI or some other runtime toplevel system (which will usually be a separate program) will tell the GEEK engine to do something - usually to set, unset, create, destroy, connect, disconnect or otherwise alter some value in the epistomological base. The engine will do so, but in the process of doing so a number of functions will need to be called to change other values in the epistomological base. These functions, called _propagator functions_, are responsible for doing nearly _everything_ in GEEK. Unlike most functions in procedural languages, a propagator function does not usually return the result of an evaluation; it returns the _status_ of the evaluation ("OK - updated", "OK - no change needed" and "Error- that would be a contradiction" are examples of return values from propagator functions. In the meantime, each propagator function may well call _other_ propagator functions, to propagate the state change out to the rest of the system. To make life easier for all concerned, every propagator function that a particular object will need to call to propagate a new value is listed in the multi-level tree as a value for the multi-valued attribute "on-propagate". An object will often have a top-level "propagate-me" function which uses local knowledge to optimize propagation speed. It is recommended that the "propagate-me" function be used as the interface to activate the "on-propagate" methods. Additionally, there are several other functions that an object may have. Because they aren't tagged "on-propagate", they aren't called at propagation time, but at other times (such as object creation, destruction, instantiation, etc.) We'll go into these functions as they come up. Lastly, the propagation of values from one object to the next is eased by the "dependent" tagged attributes. Each dependent attribute points to another GEEK object that in some way is affected by the value(s) of the current object. Thus, updating an object is simply the act of calling all of the "on-propagate" functions in the object, followed by calling the propagator functions of each object in the "dependent" tags, recursively. ----- The "No Central Brain" Principle Now it can be seen how GEEK works - whenever a value changes, call the propagate-me function, which calls the appropriate per-object on-propagate functions. IF they all return a successful indication, then this object itself should return a successful propagation. Thus, there is _no_ "central brain" in GEEK, there are only a bunch of objects that hold values, a library of propagator functions, and a set of conventions that the propagator functions can assume all objects will adhere to. Because there is no central brain in GEEK, systems with chaotic elements or systems exhibiting emergent behaviors can be modeled as effetively as those that behave strictly as classical linear systems. Similarly, GEEK provides an excellent testbed for the study of emergent behaviors. Further, whenever a new kind of relationship is to be used, it is not necessary to redo the entire GEEK engine; rather it is only necessary to add the additional propagator relationship to the library of available propagators. A downside of this "no central brain" principle is that global transformations do not exist in GEEK. Although it would be possible to write an FFT or a global optimizer to run against a GEEK knowledge representation, there is no such function in the current design. ----- Dealing with Contradictions In some circumstances, a propagator function may be unable to succeed, and may be unable to change the local value to the desired state. In that case, the object must un-do it's changes, repropagate the undo, and itself return the failure. To make life simpler, an object's functions can change the value attributes of an object, or change the values of any dependent object. However, whenever an object gets it's value changed, the changer is responsible for making sure that the "on-propagate" functions get called, and for monitoring the return status of "OK" versus "Failure" and rolling back failures. These details are usually taken care of by the propagate-me function. It _is_ necessary for a propagate-me function to check to see if the value is actually different, and only activate the on-propagate functions if there is a real change. Indeed, this is how we avoid an infinite loop of : A lists B as a dependent B lists C as a dependent C lists A as a dependent Since the value of A is either mutually consistent with the value set by the B-C chain, or inconsistent, then one of two situations will occur: either A will be revisited with a consistent value attempt, or it will be revisited with an inconsistent value. In the first case, the propagation will cease immediately with "OK, no change", and in the latter with "Failure - contradiction." ----- Propagation, Contradiction and Propagate-Me Implementation Details: Here's a pseudocode fragment that describes how propagation and contradiction backout work. Note that this is the general system for all propagations, not just logical propagation. propagate-me: if new value is the same as the previously existing value: { return "OK, no change"; } if the new value is different than the old value: { create a record of the old values; append the old and new values in the back-out log; set the new values; fire off all of the propagation functions on the object; If all of the propagators returned "OK" { fire off "on-propagate" for each listed dependent ; return "OK - value updated"; } otherwise { set the value back to the stored old value; roll back the backout log pointer; return "Failure - contradiction"; } [... never get here...] } Note that it's not necessary to examine the "OK - no change" versus "OK - update successful", because either set-a-value immediately finds no change, or at least one value is changed, in which case the proper return value is "OK - update successful" The calling sequence for setting a value by propagate-me is: propagate-me ( *GEEKobject, long new-stickiness, void * AVF-tree NULL ) Implementation Note: we use a list of disembodied AVF nodes to pass values to be set, rather than a sequential calling sequence of unknowable length. We can then pass an entire (skeletal) template of an object to another object as an argument to propagate-me and all of the values in the originating object will be copied to the new object. Implementation Note: many places in this design could factor out a function-on-a-function which applies a function to every node of an AVF tree. We should write this function early on in the implementation and then use it as our generic tree-walker function. Proposed function-on-a-function to run a function against every attribute of an AVF tree that matches a particular wildcard: avf-apply ( *GEEKobject, ----- Objects in GEEK The base unit of knowledge in GEEK is the object. Objects act as the containers that hold values and functions, and are self-identifying. Even the relationship between A, B, and C in the logical statement A AND B = C has it's own object, so a total of four objects are needed to represent A AND B = C . Here are the details of implementing a GEEK object: typedef GEEKOBJECT struct { char *name; // SOI-wide unique identifier long soi; // identifies the SOI long lvalue; // value in N-level logic float fvalue; // value in a floating-point realm GEEKlist * slotlist; // list of properties of this object long flags; // a bunch of flags to do useful things. long default_logical; // preferred logical value long default_float; // preferred floating value long stickiness; // what resistance to change should this have? }; Objects might have (or might have a NULL): GEEKobject * archetype; // pointer to archetype, if any GEEKlist * function_data; // local data for the function to use GEEKlist * dependent_list; // list of dependent objects GEEKlist * justification_list; // list of justifications At any point, the slots above can be migrated between direct existence on the object as declared or as virtual slots on the slotlist. Since a slot is always accessed by an accessor function in the GEEK library, it's no problem to migrate slots between the explicit and the implicit. For example, the pref_logical value might be stored on the slotlist, or stored as a genuine slot; it really doesn't matter. Implementation Note: it may in fact be faster to use the slotlist to access all values than to use a big CASE statement, because the case statement is sequential search and the slotlist is a binary tree (with log2 (n) search time). Note that there's no explicit function on a GEEK object; functions are stored on the slotlist (more on the slotlist below). As a shortcut, the geek kernel provides the function int execute-by-name (*geek_object, char *func_name) which executes each of the functions in the object's slotlist that happen to be stored under the name func_name. Design Note: if we store the slotlist lists as trivalent trees (less-than, equal-to, greater-than) then what kind of algorithm can dynamically maximize speed of reference by node migration? Maybe something like AVL or node-stepping might make it more efficient to find the desired slotlist entries.) Design Note: the calling sequence for all propagator functions in GEEK is this: int propagator ( *geek_object, *geeklist_my_entry); In this way, each GEEK propagator called gets a pointer to the overall object, and a pointer to it's own argument list. Design Issue: in this calling sequence, the propagator does not know what changed; perhaps that itself should be considered, in which case every "on-propagate" function can have the same calling sequence as "propagate-me". This is a point to consider in favor of having "propagate-me" get it's update values as a list of disembodied AVF nodes rather than as sequential calling-sequence values. ----- GEEKlists - What Are They? A GEEKlist is a tagged data structure that implements multiple values in a log-N binary tree. Each GEEKlist cell is trivalent, with a pointer to a char* tag, a pointer to a datum, a flag word, and the pointers to the equal-to, before, and after binary subtrees. For the curious, here's the definition. typedef struct a_GEEKlist { char *tag; void *datum; long flag; struct GEEKlist *before; struct GEEKlist *equal; struct GEEKlist *after; } GEEKlist; The flag word is meant for future expansion; it really isn't needed since BIBOP will eventually make all objects self-identifying. But at first, I suspect we'll cheat and use some bits of flag to indicate object typing. Note 1: Since it is expected that GEEK will use a BIBOP method of object allocation, it is not necessary to have a tag word associated with each value. Instead, the BIBOP allocator can be handed the pointer and return a type indication. Note 2: BIBOP stands for Big Bag Of Pages. The base concept is that for each type of object one might want to store and maintain zero-overhead typing information on, one allocates a range of pages just for that object type. For example, one might allocate a range of pages for GEEKlist cells, another for strings, etc. Whenever the object-identifier function is called, it can merely see what range of pages the passed object resides in, and from that determine type. ----- GEEKlists Can Form Trees Note that the datum of a GEEKlist itself can be another GEEKlist, so it is entirely possible to have a data structure that's a tree. We use the Unix file system syntax to denote this. For example: FOO/FUGGA coffee means: in the list called "slotlist" find the element with the tag "FOO" then, in that element's list, find the element tagged "FUGGA" and set that to the object "coffee". which would be expressed as: slotlist --> gcell1.tag --> "FOO" gcell1.data --> gcell2.tag --> "FUGGA" gcell1.before.. gcell2.data --> coffee_object gcell2.before--> ... Important Thing To Remember: the subtrees in the GEEKlists are _not_ the trees of binary structure in the trivalent tree. The data structuring above is what's visible to user code, the underlying structure (using .before, .after, and .equal) is seen only by the accessor functions. Second Important Thing To Remember: an object pointed to in a GEEK cell is really "pointed to", as in a machine-level pointer sense (so the C syntax of "*" and "&" are appropriate). This is an important speedup at runtime, but has strong implications on load/save and garbage collection. ----- Forward References of Object Names In the above example, we had not yet defined the object "coffee". This is called a "forward reference to an undefined identifier". It might or might not be considered an error; some language designers (such as Wirth) would say that it is an error. Others (such as myself) claim that it damages the readability of the source file to have the first thing in the file be the lowest-level description of the system, eventually working up to the highest level as the very last thing in the file. Design Decision: Object references that might be forward references or might be errors are held in abeyance until file loading is complete. Implementation Note: this can be done by having the pointer point to string space "temporarily". If the string is referenced, the hash table is searched for a matching object name, and if found, the string pointer is replaced by the address of the object; if not found, it's an UNDEFINED OBJECT ERROR. Implementation Note 2: to provide error-checking, undefined forward references can be kept on a lookaside list and removed as the objects are defined. If any remain after all file loading is complete, the appropriate error message is issued. ----- Multiple Values With The Same Tag As denoted above, a slot can hold multiple values with precisely the same tag string, and without some restriction or syntax around this, there's no manageable way to access elements multiple elements with the same tag. Therefore, we will introduce the '#' syntax which is similar in concept to a "gensym" but retains tagging information. A '#' name contains a alphanumeric tag, the character '#', and an integer, all concatenated together into a null-terminated string. The numbering starts at 0 and continues upward, with leading zeroes suppressed. Thus, the first datum tagged "foo" is actually "foo#0", the second is "foo#1", and so on. The '#' syntax is also used to number multiple objects created as instances of an archetype object (see below). See below, under "Archetype Objects" for more details). ----- The System Of Interest (SOI) There is a "root" object in any GEEK system. That root is the SOI, or System Of Interest object. There can be more than one SOI object, but only one at a time can control the context of the GEEK library. The SOI object is like any other GEEK object (indeed, the SOI usually _is_ a GEEK object). The one promotion that the SOI has is that the SOI is the object that is referred to whenever a syntactic reference is made to "the root". Although only one SOI is active at any one time, there is a SOI stack that keeps track of which SOIs were active before the current SOI. The previous SOIs can be returned to, in sequence. (this SOI stack may in fact be implemented as local values on the hardware call/return stack) A very powerful aspect of this nesting of SOIs is that the context of a search or configuration can be constrained to the few objects or archetypes that might actually be useful to that computation, while leaving the "outside" computation intact and in a state of suspension. For details on this, see below on the activate_soi() and deactivate_soi(). The SOI carries the "catalog" of available resources which is used by some of the HAS/NEEDS propagation functions, and the catalog of available plugs, jacks, and placks. [ Brain-bending note: an archetype can be shared among more than one SOI. Hopefully, an instantiation object can also be shared, but that has some very interesting aspects about how SOIs might need to be automatically activated and deactivated. Perhaps the real meaning of an SOI should be narrowed to be "a set of objects and archetypes that are implicitly connected in some nonexplicit way, such as availability for search or connection. Consider this to be "open design issue" ] ----- Syntactic Sugar Department At this point, we need to start describing some example representations in GEEK. To do that, we need to introduce some syntax. Note that this is NOT the only possible textual description of a GEEK environment; other styles of description are encouraged. (In particular, a graphical environment would be very nifty. Yes, that's a hint. :) ) Syntactic Sugar Department: GEEK syntax for the current implementation of the interpreter frontend is as follows: Line length input buffer is limited by implementation, but will be at least 1024 characters. Indentation is _optional_. Recommended indents are to one tab per level of structure indirection. The statement separator is the semicolon ";" or a newline The item terminator is any of the following: 1) any whitespace ( space, tab, or newline ) 2) semicolon IF the first item in a statement is a recognized keyword, then it _always_ represents that keyword. Keywords are: archetype - starts an archetype definition object - starts an object definition IF the first item in a statement is a recognized shortcut, then it's that shortcut. Shortcuts are: soi - starts an object def'n, and sets the SOI object - starts creation of a new object archetype - starts creation of a new archetype function - adds a tag attribute and function pointer propagator - adds the datum tagged as a propagator function to the slotlist assert - adds the datum tagged as a propagator to the slotlist, sets the value to "TRUE", and stickyness to STUCK, and sets the stage for additional data to be added as slot-within-slot data dependent - adds datum tagged "dependent" to the slotlist as well as a reciprocal "dependent" link. IF the first item in a statement is not a recognized keyword, then the first item is taken as a slot reference, relative to the most recently declared object. (that is, declared as "object ".) Thus, it's _optional_ but not required to wrap all of the slots of an object inside "{" and "}" as described below. ----- Syntactic Sugar Department: Slotnames have syntax as well: slotname strings use the following syntax; wildcards _are_ allowed on any but the first reference (the first reference defines the slot, and it makes no sense to define a slot name that contains a wildcard. Well, at least it makes no sense to _me_.). IF it _starts_ with a '/' , then it's a global slot reference (that is, relative to the current SOI) and the first thing after the '/' should be the name of an object or the name of a slot in the current SOI) IF it matches a local slot reference tag such as "foo", it refers to the first item with that tag (e.g. "foo" is really "foo#0" ) IF it contains a '#' followed by an integer N, then it's '#' syntax and refers to the Nth instance that matches the tag in full. IF it contains a '#' followed by NO integer, it refers to the first encountered value of with a matching tag (essentially wildcarding on the "#N"). IF it contains a '#last' it refers to the _last_ value with a matching tag. Otherwise, it's a new local slot (which might include use of the "'/' notation to get to a subslot.) IF it _starts_ with a '{', then it's the start of a group of subslot references, all referring to the slot created or referenced most recently. To end the subslot group, use a '}'. It _is_ permitted to nest subslot groups. Implementation note: this implies that there's a slot-context stack in the interpreter. Syntactic Sugar Department: if a statement can accept any of an object name, a slot name, a numeric value, or a string, then the followng default is taken: IF it's a valid floating point number (E notation is allowed), then it's a floating point number IF it's a valid integer, then it's an integer IF it's a quoted string, then it's a null-terminated string IF it's an object's name, then it's a pointer to that object IF it's a valid slot reference, it's the value of that slot. IF it contains embedded "/" characters, then it's a subslot reference, and is the value of the subslot pointed to. IF it starts with an '&' followed by a valid slot reference, it's a "transparent forwarding pointer" (TFP) and will hereafter have the value of whatever the valid slot reference had, for both reading and writing. (what's actually stored is the string "&whatever", so there's an additional level of interpretation present whenever a TFP is used. Aiming a TFP at a slot that contains an object can have very large effects.) ELSE it's a "pending pointer to an object". (you _must_ quote strings!) [OPEN ISSUE] [ how do we address an object in a different SOI?] Note the "transparent forwarding pointer" clause above. This implies that if slot A forwards to slot B, and slot B forwards to slot C, then A has the effective value of whatever slot C contains. This can be very powerful when such forwarding slots are placed on the root object. [ Design Note: this particular mapping is intentional as it is both familiar to Unix programmers as well as directly connectable to file systems such as ReiserFS, which is optimized for handling small files efficiently. ] This of course begs the question - what is the meaning of ".."? In the Unix FS, ".." refers to the diretory "one level up", but since the SOI is always the immediate "one level up" of each object and architect in an SOI, the ".." can only mean "in the SOI." But that means the same as "/", that being the root. How then do we access previous (i.e. "outer" SOIs?). Can we co-opt the meaning of ".." to indicate the previous SOI, as in "/../foo" to mean the object named "foo" in the SOI previous to the current SOI? Something to think about, leave open. Design Note 2: given the above, it's entirely reasonable to have a SOI be just another kind of object... and/or allow nested SOIs. Seriously funky this is... but maybe very useful for hypothesis-based engines. (n.b. indeed, this is the suggested implementation method. )] ----- Relationship Objects In GEEK, "relationship" objects are objects of 'first class', like datum objects. For example, the AND of A and B being C could be described and "enforced" by: object this_AND function { propagator LOGICAL_AND antecedent A antecedent B consequent C lvalue TRUE stickiness STUCK } ... which says that the states of A, B, and C are: 1) linked such that the statement "A AND B = C", 2) the truth value of the above is TRUE, all the time, 3) and this truth value is STUCK in the true state, so no propagator function will try to set it FALSE. This is implemented by adding "this_AND" to the dependent lists of each of A, B, and C. Whenever A, B, or C have their values changed, this_AND is notified and runs it's propagator function (in this case, "LOGICAL_AND"). In turn, LOGICAL_AND checks each of the inputs A, B, and C (it's antecedents and consequents) and if necessary, alters them, thereby triggering yet more local propagators (if any of A, B, or C are altered). Several possibilities can arise: Say A, B, and C are all unknown. Then if A is set true, no further propagations can arise and so LOGICAL_AND can return OK without other evaluations. If A, B, and C are all unknown, and A is set FALSE, then we don't need to know B to know that C must also be FALSE. Thus, we can set C FALSE and call C's propagator functions to cause further propagations (if needed) from the C's dependent list. Note that one of those propagations is right back to "this_AND" which can immediately confirm "OK - no change" and return back to C. Optimization - figure out some way for an object to not fire back on all members of the dependent-objects list. Not all that important, since worst case is a factor of two in callbacks and this back-and-forth negotiation might actually be useful in some problem-solving strategies. If A is already true, and B is set true, then C is set to true and the C "set-a-value" function is called, to propagate the new value of C by checking through C's list of dependency objects. Finally, if C is set false and one of A or B is already known to be true, then the other of A or B must be false and we again propagate this state. ----- Basic Logical Operations The GEEK library supports the following basic logical operators: AND OR NAND NOR XOR IMPLIES NEGATION N_OF_M TRUTH_TABLE XOR is defined as the arithmetic-mod-2 XOR, so with ten inputs, XOR's output is logically TRUE if 1, 3, 5, 7, or 9 of the inputs are TRUE. N_OF_M is a parameterized operator; it allows two parameters of the minimum and maximum number of elements that are set to true. For example, N_OF_M (min 2, max 5) of A, B, C... Z would be TRUE if between 2 and 5 of A through Z were TRUE and the rest were FALSE. It's common to say N_OF_M (min 1, max 1) to indicate that exactly _one_ of the arguments must be TRUE and the rest must be false. Setting any one of the arguments TRUE will force all others false, and if all but one of the arguments is set FALSE then the last will be forced true. TRUTH-TABLE is also parameterized, and has complete flexibility in respect to functionality. The first parameter of a TRUTH-TABLE is how many arguments exist; the second parameter is a comma-separated, coded string of the logical state outputs of the relationship for that input set, starting at [0,0,...0] (all false), [0,0,...1] (last variable true, all else false), up to [1,1,...1] (all true), coded as T, F, and U, with certainty optionally coded with a prefix of ++, +, -, and -- to indicate output stickiness (stickiness is described below). For example, a truth table relation that implements the "XOR" function over two variables would be written: assert TRUTH_TABLE { parameters 2 table "F,T,T,F" } Design Option: we can allow the truth table to further encode input certainty and stickiness as well as the logic values, i.e. a truth table for one variable's being stuck: STUCK TRUE --> TRUE STICKY TRUE --> FALSE TRUE --> FALSE UNKNOWN --> FALSE STICKY FALSE --> FALSE STUCK FALSE --> TRUE Design Decision: Later, dude. For now TRUTH-TABLE looks only at logical inputs and takes the union of possible inputs as the outputs. ----- Stickiness and Propagation Stickiness is used to determine which value or values is altered by a logical propagation function. Positive values of stickiness correspond to inhibition of change during propagation; negative values indicate an increased propensity to change. A zero value of stickiness indicates a neutrality to change. Several predefined values are available : S_FREE - maximum freedom to change. Represented as the most negative long S_LOOSE - somewhat less freedom to change. Represented as S_FREE/2 . S_NORMAL - slightly resistant to change. Represented as zero. S_STICKY - quite resistant to change. Represented as S_STUCK/2 . S_STUCK - minimum freedom to change. Represented as the most positive long It is a design suggestion that propagation functions examine the stickiness of antecedents and consequents in order to choose which variables are most suitable for alteration. Since stickiness is defined (for now, at least) as a long, at least 2^32 different values and gradations of stickiness are available. ----- Figure-6ing the database - or "epistomological ouroboros" One danger is that it is possible to have a system such as A or B = C C implies D D implies B In this situation, if A is set TRUE, then B, C, and D are all set to true as well. However, when A is set back to UNKNOWN (or even to FALSE) then D still says that B is true, and B says that C is true, and C says that D is true... so B, C, and D all remain TRUE even though there is no real logical basis for this. The database of values will remain wedged in this state until some reset is performed. There are several ways to deal with such situations: 1) basis states: each logical value carries pointers to the set of basic values that "cause" that value to be so. When one of the values changes, each state is reset to UNKNOWN and then allowed to repropagate. 2) epistomological loops such as the above are just bad design... shoot anyone who is so foolish to think that they are useful. 3) detect such loops and issue an error message 4) detect such loops and insert a dependency-breaker. This de-circuits the loop gracefully. our example above: A or B = C C implies D D implies B can be rewritten (adding " and A" to the last statement as a dependency-breaker): A or B = C C implies D D and A implies B so the logical implication loop is broken at the reconnect. Design Decision: For now, we'll ignore the problem, because any of the four solutions is more than adequate and any of them can be retrofitted later. ----- Arithmetic relationships GEEK allows arithmetical relationships as well as logical relationships. One difference is that in general, an arithmetical relationship cannot provide an output until all inputs are known, and vice versa. For example (note that antecedents supplies a list of multiple antecedents on one line): object this_totalizer assert SUMMATION antecedents A B C D consequent MY_SUM will set the floating-point value MY_SUM to the sum of A, B, C, and D. Likewise, changing D will attempt to alter the values of A, B, C and MY_SUM to accomodate the new value. Because GEEK does not do global constraint checking (yet - this is a known hole in the design) arithmetical operations are useful more in terms of keeping totals than in terms of solving problems of irregular structure. This isn't to say it won't work (stickiness and recursive propagation can do a lot) but it's less efficient than a purpose-built solver for such problems - IF such a solution is known and tractable. Otherwise, the GEEK local-computation model may provide a workable solution to an otherwise difficult problem. For example, it would be fairly trivial to write the propagation functions for a finite-element structural analysis program to fit into the GEEK knowledge representation format. The supported arithmetic relationships in GEEK are SUM DIFFERENCE PRODUCT QUOTIENT REMAINDER MOD MAX MIN ----- Contagion of Stickiness As you might note, when trying to find a set of variables to satisfy a SOI, it is often the case that it can be determined that a variable will have stickiness of at least the minimum of any of it's antecedent relationships. This stickiness carries forward and can often allow the GEEK propagation functions to avoid explorations of the parts of the solution space where a solution can't exist because of outside constraints. It also indicates that when a propagation function is written, it should update not only the value of an object, but also the stickiness. This updating should happen both on propagation of a truth value and on _retraction_ of that truth value if needed. TRUTH_TABLE relationships are an exception to this determination of contagion of stickiness, as the output of a truth-table can be radically different in stickiness than the inputs. ----- Contagion of Certainty Certainty is different than stickiness. Certainty is used when dealing with "noisy" inputs, where in English we would like to say something like "I am pretty sure this is TRUE, but there's a chance it's really FALSE." This isn't full fuzzy logic (that's a reasonable extension) but it's a start. Fuzzy logic can be added later but it will require awareness in all of the propagator functions. For now however certainty is pretty much unused. Design Decision: don't use certainty at all for now, in anticipation of adding fuzzy logic later. ------ Equivalence between Logical and Float Values In GEEK, an object has both a logical value and a floating-point value. The mapping between the two will eventually be determinable by the user, but for now the mapping will be: if > 0.0 then TRUE if <= 0.0 then FALSE if TRUE then 1.0 else 0.0 Note that this setting has to take place _before_ the dependent object list is checked, because we don't know if a dependent object uses the logical or the float form. Design Extension: put a per-object function on each object to permit arbitrary mapping between the logical, float, and private_data forms of value. The function names CTB (Convert To Boolean) and CTL (Convert To Real) are hereby reserved for this conversion. ----- Stickiness of Logical Relations In all the relation examples we've seen so far, the logical "value" of a logical relation is STUCK TRUE. This means that the logical condition is required to be satisfied at all times. This is not always a desired case. Indeed, it is useful to have logical relations whose output is not a side effect to other values, but rather the result of the query "does relationship X hold for the arguments given". Note that for the simple relationship "A and B = C" then there is no difference between using a STUCK TRUE relation object and a FREE value object C versus a FREE relationship and a STUCK TRUE constant for C. Note also that if it cannot be determined what the logical value of an argument is, it is entirely possible that the logical value of a relation-bearing object may be FREE UNKNOWN. ----- Single-Valued Nature Of Objects As will soon be seen, a GEEK object may have numerous attributes, and each attribute may have multiple values, but certain attributes are, by convention, not beset with multiple values. Chief among these are the "lvalue" and "fvalue" attributes, which respectively hold the logical value of an object, and it's floating-point equivalent. (of course, if you _do_ put multiple output values on a GEEK object, it's up to you to make some sense of the results you get.) ----- Archetype objects An archetype object is an object that isn't really there - it exists only as a model which can be easily instantiated as needed. More than one instance of an archetype object can be created (and often is). Archetype objects are denoted as "archetype" instead of object. They are called into existence as other objects are created, either explicitly or to satisfy relationship requirements. The instantiation of an archetype object itself can cascade the instantiation of other archetype objects as well. When an instantiation is created, it is an exact clone of the archetype object except that the name carries an appended index, specifically #n. Thus, a ping-pong-ball archetype is: archetype ping-pong-ball ... and the third instance of a ping-pong-ball created has the name: object ping-pong-ball#3 For this reason, it is recommended that the '#' sign not be used inside an object name or name wildcard unless you really know what you're doing. Design Decision: issue a warning whenever an object or prototype name with a '#' is encountered. ----- Archetypes and Inheritance Archetypes are objects which themselves are instantiable repeatedly; however this does not imply that the properties of an archetype are inheritable. Inheritance of behavior (slots, functions, etc) is provided by the "inherit" statement. Inherit statements are load-time directives to insert the behaviors of an object into another object. Changing the inherited object after the fact does _not_ change the behaviors of objects already created; those objects are complete unto themselves. "Inherit" would be equivalent to a cut-paste operation except that the class hierarchy "knows" about it. For example (this isn't quite legal syntax, and it ignores the reality of "iced coffee", but bear with me) archetype hot-beverage assert temperature > 150 archetype coffee temperature 170 inherit hot-beverage assert contains-caffiene TRUE archetype expresso inherit coffee assert strength > 500 Additionally, instances of "coffee" and "expresso" are valid objects to match a "forall" and "instanceof" class restriction (see below for an explanation of "forall" and "instanceof") Implementation Detail: the reason to always do cut/paste when creating an object with inheritance is to avoid a figure-six in the inheritance tree. For example, if we used the alternative (linking pointers) then if someone were to use inheritance "wierdly" (say, A inherits B, B inherits C, and C inherits A) then an attempt to actually READ the slots of A would cause an infinite loop around the slotlists.) ----- Wildcarding on object names, ancestry, and properties If a relationship object contains wildcards (expressed as "*"s in the dependency list, then the relation itself must be an archetype object. [ This is because a particular relationship object is always pointed to and from by object* pointers, so there isn't any level of interpretation to slow GEEK down [excepting of course a transparent forwarding pointer, if one exists ] ]. When the relationship object is instantiated (which can happen either manually or automatically), the wildcards are removed and replaced by pointers to the actual object in question. The wildcards can wildcard on the archetype name, on the properties list contents, or on the relationship the object has with other objects. ----- Instantiation Functions When an archetype is created, it automatically gets a function tagged "instantiate-me". Like it's cousin "propagate-me", the "instantiate-me" function runs all of the "on-instantiate" functions belonging to the archetype. If special initializations are needed, they can either be tagged "on-instantiate", or if the initialization is complicated enough, "instantiate-me" can be replaced with a more appropriate function. Individual on-instantiate functions are responsible for setting up relations that are inherent in the object, and for triggering instantiation of auxilliary objects (see below). The instantiate-me function and on-instantiate subfunctions have an inverse as well- the "delete-me" top function and "on-delete" sub functions which are executed when an archetype object is deleted. [Question: should the delete-me functions be required to do unlinking of mutual dependencies, or should we do that in some other way? If we do it by lazy checking, we have to incur overhead of existence-checking on each propagation. If we remove dependencies of deleted objects actively, then we can speed up propagation by assuming that every listed dependency object really does exist. This isn't to say we shouldn't have a magic number in the object flags to trap activations on a deleted object for erroroneous access, but that one test can be localized and turned on/off as the application requires. ] ----- Linkage Objects A major part of GEEK is that objects can be related to each other in defined ways. Like logical relationships, linkage relations connect multiple objects. Linkage relation itself can be reinforcing or inhibitory to other linkage relations. Linkage relation objects can create and place instances of archetype objects in relation to other objects. The basic functionality we desire is to support the following type of situations: A: Connection associations * An object of type A always has a second object of type B associated with it. (and may be empowered to create that object if needed, and/or delete it when no longer needed.) * An object of type A may optionally have an object of type B associated with it. * An object of type A always has Q, R, S, ... objects of type B (this is the generalization of the first two cases) * Objects of type B may be archetypes or superclasses, in which case instances of any subclass may be suitable. * The superclass/archetype object may carry a function which encapsulates a priority or other method to choose which subclass is most appropriate to instantiate. B: Production/consumption relationships * An object of type A produces (up to) X units of some named resource * An object of type A can act as a conduit for (up to) X units of some named resource * An object of type A requires (and consumes) X units of some named resource * An object may be the producer or consumer of more than one named resource; names of resources are important and not confused. C: Exclusivity relationships * An object of type A may have a relationship with an object of type B only IFF there is no object of type C already associated with the B. D: Bundling of relations * An object may require that multiple relations all be satisfied by association to the same object. For example, an ethernet PCI card must have it's power, PCI interface, and bulkhead area all supplied by the same computer motherboard. ----- Representation of Connection, Production, and Exclusivity The model we propose to implement the CPE relationships is based on a plug-and-jack model. Of course, these are propagation functions just like AND and OR, and the functional propagation works in the same way. The model is as follows: each object may have zero or more plugs and jacks available. Plugs and jacks represent connectors of opposite gender; a plug can connect to a jack, but a plug cannot connect to another plug, nor a jack to a jack. A kind of connector called a "plack" fits another plack, and allows a non-chiral, homogeneous connections between peers. A good example of a plack is a knot between two ropes. The "truth state" of a plug or jack relationship determines whether the plug or jack must be used, can be used, or must not be used. STUCK TRUE implies that the plug must be used, while STUCK FALSE means it must not be used. Intermediate logical states describe the condition under which the plug or jack may be used. Because this logical state is shared with every other propagator on the object, the plugs and sockets of an object can be thought of as "ganged together"; they connect and disconnect from the rest of the SOI as a unit. When it is not desired to have this "as a unit" connection, the conventional representation is to have an ancilliary object which carries each set of plug and jack functions represent each ganged connection; the connection between the ancilliary object and the primary object is a connection object of type "hard_link" which passes dependency information but is not dependent on the states of either object. ----- A Simple Example of Connection As an example, an Ethernet card must be plugged into a PCI slot (assume that this is not an ISA ethernet card). Thus, the Ethernet card object has an ancilliary plug object that is asserted (i.e. "STUCK TRUE"). The motherboard, on the other hand, has three PCI slots, of which any _can_ be used but none _must_ be. Thus, these PCI slots are not ganged together (and so must not be all objects of the motherboard directly). Each of these PCI slots is represented by an ancilliary object, with logical state LOOSE UNKNOWN. Note that the ancilliary object cannot be connected to the primary via the standard plug/jack connection, because that would still be a "ganged connection" at the primary object's end. So, we set up the connection between primary and ancilliary objects with an on-instantiate function, and specify the connection function as "hard_link", which does not use the logical value of the ancilliary object to maintain the linkage between primary and ancilliary. Design Option: "hard link" may want the ability to pass specified resources in and out of an object. (note: we hereby add a shortcut for create_and_hard_link that's more euphonous, something like "auxilliary" would be easier to type and more apropos with respect to the conventions of GEEK) Here's the example of the motherboard, PCI slots, and Ethernet card: archetype PCI_slot jack standard_pci_interface lstate UNKNOWN sticky LOOSE archetype motherboard on-instantiate auxilliary PCI_slot on-instantiate auxilliary PCI_slot on-instantiate auxilliary PCI_slot object my_motherboard inherit motherboard object pci_ethernet_card assert plug standard_pci_interface When this is loaded, first the archetype for PCI_slot is created, then the motherboard. As the motherboard is instantiated, it instantiates three PCI_slot objects (named PCI_slot#0, PCI_slot#1, and PCI_slot#2 respectively). Each of these objects is hard-linked to the motherboard, so they show on each other's dependency lists but the relationship is not altered by changes in logical state. Each of these PCI_slot objects carries a single PCI_slot_interface, of chirality "jack", with state LOOSE UNKNOWN. Thus, the slot can be used or not, without causing any contradictions. When the pci_ethernet_card object is instantiated, it has a plug function asserted, it too has a PCI_slot_interface, but this propagator is 'asserted', meaning that the logical value and stickiness are forced to state STUCK TRUE. Thus, for contradiction to be avoided, the plug must be connected to a socket of the appropriate interface type. Since there is only one way for this interface requirement to be satisfied, the pci_ethernet_card must be connected to one of the auxilliary PCI_slot objects. ----- Implementation Of Basic Connections The implementation of connectins in GEEK follows the standard GEEK conventions of each object being self-controlling and that there is, as usual, "no central brain". Each object performs only locally-referenced observations and actions, and causes other objects to make their own accommodations to overall state In terms of efficient implementation, the SOI is used as a clearinghouse of available interfaces. More specifically, the SOI contains looksaside lists of each kind of plug, jack, or other resource that might be required by an object requesting or requiring that resource. More specifically, available connections in the SOI are stored as multiple values under the tag "available_connections", with subtrees labeled with each kind of available connection and carrying the value of the ancilliary object offering that value. For example, the above PCI_slot objects would, on instantiation, add the following values under the "available_connections" tag of the SOI, prefaced with the type of connection required (e.g. plug, jack, plack, resource, etc) available_connections -> jack:standard_pci_interface : PCI_slot#0 and after all three PCI_slot objects had been instantiated: available_connections -> jack:standard_pci_interface : PCI_slot#0 -> jack:standard_pci_interface : PCI_slot#1 -> jack:standard_pci_interface : PCI_slot#2 When the pci_ethernet_card object instantiates, the instantiate_me function attempts to propagate the plug propagator function. The plug propagator function first attempts to find an already instantiated, by inspecting the SOI's available-connection list. The plug function looks for something that matches it- specifically a resource tagged "jack:pci_standard_interface". It finds this on the SOI's connections list, and finds that the object that can satisfy this requirement is PCI_slot#0. The plug function then sets itself to point to the PCI_slot#0 object, specifically, sets plug -> connected-to : PCI_slot#0 and tells the PCI_slot#0 object's connection to point to the pci_ethernet_card object, to set the logical value of the PCI_slot#0 object to STUCK TRUE, and then tells the PCI_slot#0 object to propagate itself. (this again is the standard paradigm of GEEK propagation- telling the object what to change, and to propagate those changes; the object itself knows what to do). More specifically, the pci_ethernet_card passes the PCI_slot#0 object a few AVF packets to change as follows: { attribute: jack/connected-to value: pci_ethernet_card flags: } { attribute: dependent value: pci_ethernet_card flags: APPEND } The PCI_slot#0 object saves the old value of connected-to (which is "no previous value" and then proceeds to propagate itself via the propagate-me function. The PCI_slot#0 object has only one function on it's propagate_me list, and that's the "jack" function that controls whether the object should be on the available_connections list of the SOI. In this case, the jack function sees that it's connected to the pci_ethernet_card object and thus the PCI_slot#0 object should _not_ be listed on the available connections list of the SOI. So, the jack function takes itself off of the available_connections SOI list and returns "OK - updated" to the PCI_slot#0 propagate_me function. The PCI_slot#0 propagate_me function then finds no further on-propagate functions to be called and so returns with success to pci_ethernet_card "plug" function. The "plug" function then returns success to the pci_ethernet_card propagate_me function, which then continues calling the on-propagate functions of the pci_ethernet_card. There being no further on-propagate functions on the pci_ethernet_card, the propagate_me function returns success to the card's instantiate_me function, which then returns success. Thus, the card is _automatically_ connected to the first slot of the PCI bus simply by the processes of instantiate_me, while adhering to the basic principle of "no central brain". ----- Instantiation Of Missing Resources In the above example, we were careful to instantiate the PCI_slot auxilliary objects before we instantiated the ethernet_pci_card, so when the card looked on the available_connections list, it would find a suitable pci interface. If we had not done it in this order, (say, we instantiated the ethernet_pci_card first), the ethernet_pci_card would have attempted to instantiate but the instantiate_me function would have called the "plug" on-instantiate function, which would have looked on the available_connections list, found nothing, failed to instantiate successfully, backed out it's changes (including it's existence) and returned with a failure status. There is, of course, a third ordering of possibilities, where the archetype objects of the PCI_slot and motherboard are defined first, and then the ethernet_PCI_card is instantiated. In this case, there is still nothing on the available_connections list to serve the card's need, but the PCI_slot archetype still places it's interface on another list in the SOI, the "possible_connections" list. The ethernet_PCI_card first looks on the "available_connections" list, as before, and finds nothing usable. It then looks on the possible_connections list and finds the archetype for a PCI_slot object. It instantiates a PCI_slot object, and tells the object to initialize itself. The PCI_slot object does so, and immediately encounters the on_instantiate auxilliary-object hard link, and thus must instantiate the motherboard archetype. With the motherboard object and it's PCI_slot auxilliary objects created, the ethernet_PCI_card can now establish it's required plug/jack connections. ----- Extension of Connectionism to Partially Expendable Resources In the case of systems where the total amount of an available resource is what matters rather than specific connections between objects, then the relevant on-propagate functions are "produce" and "consume" rather than "plug" and "jack". Within this domain of production/consumption, there are two separate sub-domains: a model where the resource is shared throughout the entire problem implicitly, versus a model where the resource must be explicitly carried from the producer through zero or more intermediate steps, to the consumer. In the case of the former, where sharing of resources is implict and no special connection required, the modifications to the prior plug/jack system are simple; a producer doesn't remove itself when in use, it merely decreases the available amount of resource. The corresponding function names are "produce" and "consume". More explicitly, each consumer tells the producer to add an attribute named "consume", for some number of units of whatever the producer produces. The producer object keeps these consumers as multiple values on it's attribute list, and returns "success" as long as it has at least enough production to satisfy all consumers. As an example, let's look at how the alternator in a truck can be verified to be adequately sized for all possible load situations. object alternator assert produce { resource amps amount 100 } object ignition assert consume { resource amps amount 4 } object antilock_brakes assert consume { resource amps amount 50 } object big_headlights assert consume { resource amps amount 44 } When "alternator" becomes instantiated, it places an available resource on the SOI that looks like this (remember, a resource lookaside table entry merely says what object might be able to supply some resource, not how much directly.): available_connections -> resource:amps : my_alternator When the ignition system is instantiated, it must consume 4 amps. So, "consume" goes to the available_connections list on the SOI, and finds that indeed there is a supply of the resources "amps", and that the object alternator might be a viable supply. The "consume" function adds a pointer to the my_alternator object as the source of the resource "amps". The ignition system now is of the opinion that it has enough amps (i.e. that it is self-consistent), but it must still tell the alternator that some part of it's output is being used. The "consume" function then packages up a set of changes to the alternator object, in this case adding the ignition system as a dependent object, and adding a consumption of four amps going to the ignition object (the { FLAGS: ADD } below denotes that the particular AVF tree node carries the flag that this node is to be _added_ as a multiple-attribute value, rather than replacing the current value or being appended as a multiply-valued slot.) The actual propagate_me AVF packet set looks like this: { attribute: amps/consumer value: ignition flags: APPEND } { attribute: amps/consumed value: 4 flags: APPEND } and starts the propagate-me function of the alternator running. The alternator runs it's on-propagate functions, which include the assertion of being a producer of "amps", as well as the assertion of having four amps consumed. The "consumed" tag is inspected by the "produce" function, and in this case "produce" finds one consumer, which uses four amps leaving 96 amps available. alternator->produce then determines that there are still amps available, and so leaves itself on the SOI's available_resources lookaside list. (Implementation note- perhaps it's easier to just always put it on, and verify that we don't have too many versions). This process then repeats for the antilock_brakes and the big_headlights objects, where the available amps decreases from 96 to 46 to 2, respectively. ----- The Available, Possible, and Requested Lookaside Lists The SOI actually carries three lookaside lists. All three lists are used to speed up the search for connections in linkage relationships. The three lists are: available_connections: connections to already-instantiated objects that are available for use possible_connections: connections that would exist if an archetype was to become instantiated (or instantiated again). requested_connections: connections that correspond to an outstanding "request to connect" that has not yet been satisfied, but on an object that does not require the connection to be fulfilled (otherwise it would be a contradiction). All three of these lists are used as "accellerators". The format of all three lookaside lists is the same: a GEEKlist with the attribute name formed as T:N and the value being the object or archetype that offers that resource. The T:N format can currently have values of "plug", "jack", "plack" (a plack connects to another plack, like plug/jack), "produces", "consumes", "fallback", "generate" and "expend". The meanings of the functions that you haven't seen yet are explained below. ----- When A Producer Doesn't Produce Enough When the producer finally runs out of resource, it can do oneof several things: 1) it can be stupid, and simply return "FAILURE - insufficient resource" (this is the default behavior if no other is specified) 2) it can de-instantiate itself and instantiate the next producer in the possible_connections list that is listed as "produces:" for that resource. This means that each of the previously supported consumers must be notified of the change and be allowed to re-propagate. Hopefully, this next producer in line will have enough of the resource to satisfy all of the needs, but there is no gaurantee of that. (this is smarter, but can lead to "exponentially long" propagation times, as each producer is sequentially instantiated, found inadequate, and repropagated) 3) it can execute a domain-specific fallback procedure. ----- Domain-specific Selection and Fallback Procedures Selection and Fallback procedures allow the application to invoke a special-case procedure when propagation fails. A fallback procedure is a procedure that attempts to salvage a failed propagation by using domain-specific knowledge, while a selection procedure is invoked when a propagation has domain-specific knowledge that can be used "ahead of time" to prevent excessive propagation. Not every failed propagation situation is salvageable by fallback. For example, if A AND B = C and we have A == true and C == FALSE, we know that we can't salvage a request to make B true except by changing either A or C. However, we may be in a problem domain where it _is_ possible to save a failed propagation- in fact, fixing the propagation is the easiest way to describe our desired behavior. For example, say we are putting appliances in a house, and we find that the 50-amp fusebox is no longer adequate. Instead of saying the house is unusable (i.e. "FAILURE" ) it would be better to specify the use of a better fusebox. A fallback procedure allows us to do precisely this sort of thing. Whenever an object would return failure from a call to propagate-me or instantiate-me, the fallback procedures on that object are executed, each in turn, starting with on_fallback#0 and working upward. After each fallback is executed, the propagate-me or instantiate-me routines are retried. For the above example, the fallback routine for a power resource may be to search for the next larger object that generates the resource power, and, if that exists, to instantiate that new power resource, and then to remove itself from the available resources list, "break" each connection to each of it's supplied objects, and have that object go looking for another generator of that resource. The objects will find the newly instantiated generator on the SOI's available-resource list, and connect to that object. When all of the users of the obsolete resource have been relocated to the new generator, the old generator can be deleted from the network as the final act of it's fallback routine. It's important to remember that fallback procedures are never required, and are never automatically generated. ----- Selection and Fallback Procedures on Objects versus Resources Each object can itself specify a selector or fallback procedure, but in general entire subclasses or entire resources will use the same fallback procedure. This can be done either by inheritance of the fallback procedure from some common ancestor, or by having each object specify a particular fallback. Fallback procedures usually exist on the resource itself. In that case it is the responsibility of the object that is about to report a failure to use a requesting function that is smart enough to activate the resource's fallback. The actual storage setup is shown below. Using the previous example of available PCI slots on a 3-slot motherboard: available_connections -> jack:standard_pci_interface : PCI_slot#0 -> jack:standard_pci_interface : PCI_slot#1 -> jack:standard_pci_interface : PCI_slot#2 are available connections, and adding a fallback would give available_connections -> jack:standard_pci_interface : PCI_slot#0 -> jack:standard_pci_interface : PCI_slot#1 -> jack:standard_pci_interface : PCI_slot#2 -> fallback:standard_pci_interface : insufficient-PCI-slots-fallback-proc In this way requesters that are smart enough to look for a fallback procedure can find it, and requester functions that don't know how to deal with fallbacks never see the fallback. ----- Connection-Oriented Resources As seen above, "plug/jack/plack" connections deal with connections as unitary resources, and "produce/consume" deal with resources that are freely shared. A third kind of connection is required, where a connection itself carries a resource which is limited in magnitude. This kind of connection is controlled by the propagator functions known as generate/distribute/expend relationships. In this kind of relationship, one or more sources generate a resource, which is then transferred via other objects in an unbroken chain to one or more objects which expend the resource. In order to track this unbroken relation from generator through distributors to expenders, the following propagate-me functions are used: generate resource-name amount distribute-from resource-name input-limit "input-formula" distribute-to resource-name output-limit "output-formula" expend resource-name amount The formulas are a simple algebraic formula that states how much of a particular output resource is produced from the inputs, or required to do so. By requiring both input- and output-formulas, we avoid the issue of inverting a possibly-nonlinear requirement formula. [ since algebraic evaluation is not precisely trivial, we should probably use something like a callable interface to a desk calculator for the initial version, rather than writing one for ourselves. ] Like other resources, "generate" and "distribute-to" relations are put on the possible_connections resource list until the object itself is instantiated, and are moved to the available_connections list when the object actually is instatiated. Similarly to producer/consumer, here's the pseudocode for an expend function: expend: while (suppliers of resource exist on available_resources) create AVF list to tell the supplier object to supply us; pass the AVF list to the supplier object; tell the supplier object to execute "propagate-me"; if ( successful propagation ) return whatever success return value we got; ;; the targeted supplier failed, try another if any ;; other suppliers exist next supplier; if (a domain-specific fallback procedure exists) execute the fallback procedure if (success) return; while (suppliers of the resource exist on possible_resources) instantiate one of the possible resources create AVF list to tell the supplier object to supply us; pass the AVF list to the supplier object; tell the supplier object to execute "propagate-me"; if (Z = success) return Z (whatever success return value we got ); ;; the targeted supplier failed, try another if any ;; other suppliers exist else delete the instantiated resource next supplier; Fall thru: return FAILURE - unable to supply resource ----- Classes and Groups A GEEK class is the set of things descending from a common archetype. As such, all members of a class are either subclass archetypes, or instantiations of those archetypes. All objects in a GEEK class are in some way geneologically related. A GEEK group, on the other hand, is a collection of objects by descriptive rather than geneological means. For example, a GEEK group might be "the set of all objects which carry the attribute FOO, set to value BAR, and also carry the attribute BAZ". Once a group is formed, it is automatically updated whenever an object gets a value set or changed. This is done by checking the outstanding list of group definitions against the new object. Design Question: because _all_ objects should check the group definitions on every attribute change, the "propagate_me" function might be the the caller of this group update. Alternative: for objects that we can anticipate will never be members of a group, this is just wasted CPU cycles. Decision: for now, an "on_propagate" function named "group_membership_check" will check group membership, and we'll work on whether we can insert or delete this function automagically. Design Issue: an object entering and leaving a group can itself can cause a contradiction- so we need to be able to roll back the alteration that led to entry/exit of a group. If we use the above method of group membership checking as an on_propagate function, then there's no problem with this; contradictions due to group membership violation are dealt with the same way any other contradition is dealt with. ----- Definition Of A Group The definition of a group looks like most other GEEK objects, but the "group" function is treated like a resource and so sits on the SOI in it's own lookaside list. (it does not share this lookaside list with the other resources, for speed reasons) The lookaside list for groups is sorted on the tested attributes of each group, and there's an entry for each tested attribute of each group. The value of the entry is the group object itself, so that when an attribute matches, the group object can test the altered object by whatever means desired to see if it should become a member of the group. In our example above, we have object my_group assert filter { attr_name "foo" attr_value "bar" } assert filter { attr_name "baz" } In the group_filters lookaside list in the SOI, this would generate two entries: group_filters --> foo : my_group --> baz : my_group both of these entries point back to the my_group object so that the my_group object can perform whatever tests are desired. If we then instantiate: object my_test_object foo bar baz "this isn't tested and doesn't matter" then the lookaside list is searched first for an attribute named "foo". This attribute is found and leads to the object my_group. Object my_group is then told to add my_test_object as a member to itself. The AVF node passed is handed off to my_group's propagate_me function, which then calls each of the on-propagate functions as usual. In this case, the AVF node handed off is: { attribute: member value: my_test_object flags: APPEND, REPLACE } Here, the only on-propagate functions are the two filter functions, so the on-propagate functions can be "smart" and not reverify each prior member of my_group. Instead, the filter function can simply test the AVF "value slot", obtain my_test_object, and (in this case) return a SUCCESS, so my_group's propagate_me function allows the AFV ADD modification to stand, and returns SUCCESS - MEMBER ADDED. The other on-propagate functions must run to conmpletion, but those should be relatively fast (or, more likely, nonexistent) on a group object. Nota Bene: consider what would have happened if the test object did not meet the filter requirements. In this case, it would return "SUCCESS - MEMBER NOT ADDED" to the caller- in this case, group_membership_check. Note that this is a SUCCESS code, NOT a failure per se, the SOI is still _consistent_, but the object is not a member of the group. Now, consider the situation where we modify the value of my_test_object to have a bar value "sorry". In this case, the dependents of the my_test_object must be notified. The on_propagate function group_membership_check is invoked as part of test_objects' propagate_me execution, and sends the group object the following AVF for propagation: { attribute: member value: my_test_object flags: APPEND, REPLACE } The group runs it's propagate_me function, which runs the filter functions, which find that my_test_object is no longer appropriately a member of my_group and should be removed. The first filter function that fails removes my_test_object from the attribute "member", and returns a "SUCCESS - MEMBER NOT ADDED". This is where things get a bit hairy- (or rather- we may need a second propagator function) - because here "member not added" should _not_ allow the add to proceed. As an obvious way out, we can: 1) special-case out some SUCCESS codes and not others in propagate_me or 2) have a mid-level propagator function that is called by propagate_me, and encapsulates the special case of multiple filters or 3) allow a filter function to return a FAILURE - OBJECT NOT ADDED, and special-case this situation on the group_membership_check end (that is, from the object applying for membership in the group). Group_membership_check can check for this special situation and return a "success" state even if the object fails to obtain any group membership. (this is the recommended method) Note: putting a filter on "lvalue" or "fvalue" can really slow down propagation in the GEEK system, at least in whatever SOI has this filter. However, it's also a good way to "hook" the changing of an attribute to affect external systems. As usual, the programmer must be aware of the engineering tradeoffs. ----- Automatic and Nonlocal Relationships It is possible to automatically create relationship objects on the creation or modification of the appropriate objects in the SOI. For example, a relationship archetype concerning nonlocal interactions could automatically apply themselves whenever a particular set of (otherwise unconnected) objects with particular state occur. These relationship archetypes are brought into existence by successive operations of filter functions, the same functions as are used in determination of group membership. Indeed, it is both appropriate and recommended to use membership in a group or class as filter, to centralize the membership issues and improve the modularity of the software. For example, one might want to express the nonlocal relationship that one cannot have two interface cards with the same hardware address. One obvious way to solve this is to use plugs and jacks on all possible hardware addresses (which will work, but can be traumatic if the set of hardware addresses is large). Alternatively, we can use nonlocal relationships. Assuming we have pre-defined the multiple inheritance class "i/o_card", then: archetype bus-io-address-space-exclusion assert filter { is-a i/o_card } assert filter { is-a i/o_card } assert instantiate-me To see how this works, let's step through the creation of an i/o card. First, some routine (possibly in the UI) instantiates the card itself. The i/o card immediately attempts to propagate itself, by calling the propagate-me function. This probably will involve some plug/jack and probably produce/consume propagation, (which we've stepped through above) but will also include the activation of the filter functions that live on the SOIs lookaside list. If it happens that i/o_card is a group, then one of those SOI-included filter functions is the group membership function group_membership_check , which dutifully adds the newly created i/o_card to the group i/o_card. If it happens that i/o_card is a class, then there is an implied filter function on the SOI that checks the inheritance hierarchy of the new i/o card and finds that, indeed, it is a member of the class i/o_card (and incidentally, of every superclass of i/o_card) and so adds it to the i/o_card archetype as an attribute under "member". Major Design Issue: if the membership in a class is stored on the archetype, then copying the archetype during instantiation also copies the list of all previous members. This is bogus. How do we fix it? Options: 1) Create a holding object, per-class. 2) Put the objects on the archetype, but somehow flag the membership attributes as "non-copy". (the ability to flag attributes as "no-copy" might be useful in other ways as well. OTOH - the strictly correct implementation of this would imply that the entire archetype would need to be repropagated on each member instantiation- which would be a high overhead design). 3) Put the membership information in the SOI somewhere, perhaps in multiple values off of the object AVF list. In our case we find that the assert filter is true- that is, that we are a member of class i/o_card. In this case, the "instantiate-me" assertion is fired, just as if it were a normal propagator. In fact, "instantiate-me" is not a normal propagator at all, but a function that merely masquerades as a propagator while actually executing for side effect- specifically, of creating a relationship on the fly.