A dataset in the False Hamster datasystem consists of a series of "statements", which are so called because each of them states one fact. Each statement contains the minimum information necessary to state one fact and does not state multiple facts.
A statement is divided into a property, one or more entities and a value. The property is the type of information being stated. The entities are the things the information is about. The value is the information itself.
An example statement in English might be "Tim has brown hair". In this case the property is hair colour, the entity is Tim and the value is brown.
Statements are implemented in Scheme by a list where the first element is the property, the last element is the value and those in between are the entities.
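As a small sketch of this layout, the English example above could be written as the list below. The symbols hair-colour, tim and brown, and the three accessor procedures, are illustrative names chosen for this example; they are not taken from (hamster datasystem).

```scheme
;; "Tim has brown hair" as a statement: property first, value last,
;; entities in between. All names here are illustrative assumptions.
(define example-statement '(hair-colour tim brown))

;; Hypothetical accessors written with ordinary list operations.
(define (statement-property statement)
  (car statement))

(define (statement-value statement)
  (car (reverse statement)))

(define (statement-entities statement)
  ;; Drop the property from the front, then the value from the end.
  (reverse (cdr (reverse (cdr statement)))))
```

With these definitions, (statement-entities example-statement) gives the one-element list (tim), since this statement has a single entity.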
To be of any use to a computer system, all statements regarding the same type of information, that is, all statements with the same property, must share the same structure. Such a standard structure is called a form.
All statements explicitly contain all of the information necessary to completely define the statement by themselves. No information is assumed or defined elsewhere. For instance the particular dataset containing a statement does not imply any information about the actual statement: all statements are complete and absolute.
Objects in this system generally fall into two categories: physical and logical objects. A physical object is one where the object itself means something, such as a number or string, whose actual value is meaningful and can be computed with. A logical object is one whose only meaning is to provide a unique identifier.
The False Hamster datasystem is implemented by the software component (hamster datasystem), which contains functions for querying, altering, loading and saving False Hamster data. Altering a dataset can often be done with ordinary Scheme functions instead, as a dataset is just an ordinary list of statements.
Some forms for very general purpose information would include the forms for the names and types of logical objects. They look like this:
(name object name)
(type object type)
name presently has to be a string. type is itself a logical object of type type.
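A minimal sketch of a dataset using these two forms might look like the following. Representing logical objects as symbols is an assumption made for readability, and name-of is a hypothetical helper written with plain list operations, not a procedure from (hamster datasystem).

```scheme
;; Logical objects (tim, person) are represented by symbols here,
;; which is an assumption of this sketch.
(define example-dataset
  '((name tim "Tim")
    (type tim person)
    (name person "Person")
    (type person type)))

;; Hypothetical helper: find the name string stated for an object,
;; or #f if the dataset contains no name statement for it.
(define (name-of object dataset)
  (cond ((null? dataset) #f)
        ((and (eq? (car (car dataset)) 'name)
              (eq? (cadr (car dataset)) object))
         (caddr (car dataset)))
        (else (name-of object (cdr dataset)))))
```

Here (name-of 'tim example-dataset) returns the string "Tim", and an object with no name statement gives #f.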
(match tournament round players)
This defines one match in the tournament and states which players are involved. players is a list containing all players involved in the match. round is a number. tournament is a logical object which, as stated above, could be represented by anything.
(bye tournament round player)
This states that player is not playing in round round of tournament. player is a logical object.
(score tournament player round score)
This states the score which player got in round of tournament. This represents the score gained in a single match.
(attends tournament player)
This states that player is participating in tournament.
(norm-players tournament number)
This states the standard or preferred number of players per match in tournament. This is used to configure the match-making process which can accommodate different numbers of players per match in order to cope with different types of game.
(min-players tournament number)
This states the minimum number of players per match in tournament.
(max-players tournament number)
This states the maximum number of players per match in tournament.
For example in a 2-player wargame all three of min-players, max-players and norm-players would be 2. In a Shadowfist tournament (a collectible card game usually played with 3 or 4 players) min-players might be 3, max-players might be 4, and norm-players could be 3 or 4 depending on preference.
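To make the tournament forms concrete, here is a sketch of a small dataset for a Shadowfist-style tournament. The tournament and player symbols are invented for the example, and match-size-allowed? is a hypothetical check built from ordinary Scheme list functions. It assumes the dataset describes a single tournament, since assq simply finds the first statement with a given property.

```scheme
;; Illustrative dataset: spring-open, alice, bob, carol and dave are
;; made-up logical objects represented as symbols.
(define shadowfist-dataset
  '((attends spring-open alice)
    (attends spring-open bob)
    (attends spring-open carol)
    (attends spring-open dave)
    (min-players spring-open 3)
    (max-players spring-open 4)
    (norm-players spring-open 4)
    (match spring-open 1 (alice bob carol))
    (bye spring-open 1 dave)
    (score spring-open alice 1 5)))

;; Hypothetical check: is the number of players in a match statement
;; within the min-players and max-players limits in the dataset?
(define (match-size-allowed? match-statement dataset)
  (let ((size (length (cadddr match-statement)))
        (smallest (caddr (assq 'min-players dataset)))
        (largest (caddr (assq 'max-players dataset))))
    (and (>= size smallest) (<= size largest))))
```

The three-player match in the dataset passes this check, while a two-player match such as (match spring-open 2 (alice bob)) would not.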
The remaining forms in the tournament program configure how the match making process actually works. Two types of functions are involved: raters and cachers. Raters rate the preferability of a potential match based on a particular criterion. Cachers are just there to speed things up; they read the necessary information for a rater so that it only has to be done once.
In the case of the rater form, the order in which the statements occur in the dataset is significant. The match maker will use the functions in the order given, which means that the first function has the most significance. In my standard usage of this system I give the function for avoiding repeated matches preference over the function for preferring matches between players with similar total scores so far in the tournament.
(rater rater)
This states that rater is being used by the match maker to decide which potential match is the most appropriate.
(cacher rater cacher)
This states that the cacher for rater is cacher.
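The ordering described above can be sketched as follows. The rater and cacher names are hypothetical symbols standing in for the real functions, and raters-in-order is an illustrative helper rather than part of the tournament program.

```scheme
;; Hypothetical configuration: avoid-repeats is listed first, so it
;; would carry the most significance in match making.
(define matchmaking-config
  '((rater avoid-repeats)
    (rater similar-scores)
    (cacher avoid-repeats cache-previous-matches)
    (cacher similar-scores cache-total-scores)))

;; Collect the raters in the order their statements occur.
(define (raters-in-order dataset)
  (cond ((null? dataset) '())
        ((eq? (car (car dataset)) 'rater)
         (cons (cadr (car dataset))
               (raters-in-order (cdr dataset))))
        (else (raters-in-order (cdr dataset)))))
```

Swapping the first two statements in matchmaking-config would reverse the priority of the two raters.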
Querying is done by the get function, which has many variants created using specify from misc-utils. Generally one of the shorthand variants is used rather than get itself.
The actual code for get is tiny as nearly all the work is done by the (hamster mmatch) pattern matcher. The work not done by (hamster mmatch) is mostly done by fairly ordinary Scheme mapping functions.
Generally, when get is used, a pattern is supplied to it, that pattern is matched against every statement in the dataset, and something is done with the values returned from the pattern matches. Precisely how this is co-ordinated depends on the Scheme mapping function used to control the operation. Using get with the misc-utils function first-true, as in the get variants get1 and sget1, matches statements in turn until a match returns a true value, at which point get returns that value. If the whole dataset is matched against without any match returning a true value, get returns #f.
If get is used with the misc-utils function filter-map as in the get variants getm and sgetm, all statements in the dataset are matched against and get returns a list of all the true match results.
get can also be used to selectively delete statements by using the misc-utils function delf! to control the operation as in the get variants delm! and sdelm!. In this case get deletes every statement for which the pattern match returns true.
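The three styles of control described above can be sketched with stand-in procedures. These are not the misc-utils definitions of first-true, filter-map or delf!; they are simplified versions written for this example, and the deleting one is non-destructive, unlike delf!. The matcher hair-colour-of is a toy replacement for a real pattern match.

```scheme
;; Stand-in for first-true: return the first true result, or #f.
(define (sketch-first-true proc lst)
  (cond ((null? lst) #f)
        ((proc (car lst)))   ; a clause with only a test returns its value
        (else (sketch-first-true proc (cdr lst)))))

;; Stand-in for filter-map: collect every true result.
(define (sketch-filter-map proc lst)
  (if (null? lst)
      '()
      (let ((result (proc (car lst))))
        (if result
            (cons result (sketch-filter-map proc (cdr lst)))
            (sketch-filter-map proc (cdr lst))))))

;; Non-destructive stand-in for delf!: keep only the statements
;; for which proc returns #f.
(define (sketch-remove-matching proc lst)
  (cond ((null? lst) '())
        ((proc (car lst)) (sketch-remove-matching proc (cdr lst)))
        (else (cons (car lst)
                    (sketch-remove-matching proc (cdr lst))))))

;; Toy matcher: the value of a hair-colour statement, otherwise #f.
(define (hair-colour-of statement)
  (and (eq? (car statement) 'hair-colour)
       (caddr statement)))

(define people
  '((hair-colour tim brown)
    (name tim "Tim")
    (hair-colour ann red)))

;; (sketch-first-true hair-colour-of people)      => brown
;; (sketch-filter-map hair-colour-of people)      => (brown red)
;; (sketch-remove-matching hair-colour-of people) => ((name tim "Tim"))
```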
Because (hamster mmatch) is extensible, the pattern language used to form the queries can be extended without limit. One of the arguments to "get" is a list of (hamster mmatch) match-procedures, and this list defines the pattern language. The get variants beginning with the letter "s" use the standard (hamster mmatch) "stdmp" list, but this is not actually optimal in most cases. Since the list of match-procedures has a big effect on how fast get runs, if you are working in an environment where speed matters it is best to use a list of only the match-procedures you are actually using.
In interpreted Scheme environments speed does seem to be an issue with get. I have done some work on a "compiled" version of (hamster mmatch) which creates a procedure from the pattern which can then be applied to any number of objects. This seems to work between 3 and 80 times faster depending on how the time is measured, but this version of (hamster mmatch) is not nearly as complete yet and is more complex. Short match-procedure lists, small datasets, and caching information to reduce unnecessary calls all help.
The actual functions involved look like this:
(get collate mpl pattern dataset)
collate is the Scheme mapping function controlling the operation. Its arguments must be (proc list) where proc is a procedure of one argument and list is a list. collate will be called with (pmmatch mpl pattern) as its first argument and dataset as its second. This means that proc will be a procedure which takes a statement as its only argument and matches the pattern pattern against that statement according to the functionality represented by the match-procedure list mpl.
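The shape of this call can be sketched as below. toy-pmmatch is a stand-in written for this example and does not reproduce the behaviour of pmmatch from (hamster mmatch): it ignores mpl entirely, treats the symbol ? as a wildcard (an invented convention), and returns the whole statement on a successful match. sketch-get is likewise only an illustration of how collate receives its two arguments.

```scheme
;; Toy stand-in for pmmatch: builds a one-argument procedure that
;; matches pattern against a statement. mpl is accepted but ignored.
(define (toy-pmmatch mpl pattern)
  (lambda (statement)
    (let loop ((p pattern) (s statement))
      (cond ((and (null? p) (null? s)) statement)
            ((or (null? p) (null? s)) #f)
            ((or (eq? (car p) '?) (equal? (car p) (car s)))
             (loop (cdr p) (cdr s)))
            (else #f)))))

;; collate is called with the matching procedure and the dataset.
(define (sketch-get collate mpl pattern dataset)
  (collate (toy-pmmatch mpl pattern) dataset))

(define hair-dataset
  '((hair-colour tim brown)
    (hair-colour ann red)))

;; Any procedure taking (proc list) can act as collate, including map:
;; (sketch-get map '() '(hair-colour ? brown) hair-dataset)
;;   => ((hair-colour tim brown) #f)
```

Passing map as collate shows the raw result of every match, including the #f from the statement that did not match.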
(sget collate pattern dataset)
This functions as get does but with the mpl argument fixed as stdmp. This means that sget will use the standard (hamster mmatch) pattern matching functionality.
(get1 mpl pattern dataset)
get1 matches pattern against each statement in dataset according to the pattern matching functionality of mpl until one of these matches returns a true value. If this happens, that true value is returned. If the entire dataset is matched against without a true value being returned, #f is returned.
(getm mpl pattern dataset)
getm matches pattern against every statement in dataset according to the pattern matching functionality of mpl and returns a list containing all of the true results from these pattern matches.
(sget1 pattern dataset)
sget1 is the same as get1 except that it always uses stdmp as its matching functionality.
(sgetm pattern dataset)
sgetm is the same as getm except that it always uses stdmp as its matching functionality.
(delm! mpl pattern dataset)
delm! deletes every statement in dataset which matches pattern according to the functionality of mpl.
(sdelm! pattern dataset)
sdelm! is like delm! except that it always uses stdmp as its matching functionality.
(fhset! new-statement dataset)
fhset! replaces an existing statement with one which overrides and overwrites it. A new statement is supplied as an argument and any existing statements which conflict with it are deleted. Statements with the same property and entities are considered to conflict with each other for this purpose, i.e. statements which are the same except for the last element.
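The conflict rule can be sketched with a non-destructive procedure. sketch-fhset is a hypothetical illustration, not the definition of fhset!: it returns a new list instead of modifying the dataset, and placing the new statement at the end is an assumption of this sketch.

```scheme
;; Everything except the last element: the property and the entities.
(define (statement-key statement)
  (reverse (cdr (reverse statement))))

;; Drop every statement whose property and entities equal those of
;; new-statement, then add new-statement at the end of the result.
(define (sketch-fhset new-statement dataset)
  (let ((key (statement-key new-statement)))
    (let loop ((rest dataset) (kept '()))
      (cond ((null? rest) (reverse (cons new-statement kept)))
            ((equal? (statement-key (car rest)) key)
             (loop (cdr rest) kept))
            (else (loop (cdr rest) (cons (car rest) kept)))))))

;; (sketch-fhset '(hair-colour tim grey)
;;               '((hair-colour tim brown) (name tim "Tim")))
;;   => ((name tim "Tim") (hair-colour tim grey))
```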
This documentation is woefully incomplete and (hamster datasystem) can do a lot more than this which is not yet documented.