sister (X, Y):- female (X), parent (Z, X), parent (Z, Y).
In English, this says that one person is the sister of another if the first
person is female and the two people share a parent. There are several
problems with this definition that derive from Prolog itself and not from my
interpreter.
The first problem is that Shaked will be shown as her own sister (as will
all the other names defined as females that have a parent)! This is
because X binds to Shaked as does Y. Nowhere was it said that X has to
have a different value than Y. This is easily solved by adding another
term to the rule, using the intrinsic operator \= to signify inequality:
sister (X, Y):- female (X), parent (Z, X), parent (Z, Y), X \= Y.
Whilst this solves the problem in Prolog, it opens up a can of worms in my
implementation. Until now, every term has been in the form functor (any
arguments); this can be described as prefix notation. Unlike this, the
inequality relation as an infix statement, where the functor (or operator)
\= appears between the two operands. This causes one routine in the parser
to fault, as no parentheses are in this term. So the procedure ParseLine has
to be extended to allow for infix expressions that effectively turns that
term into the canonical representation \= (X, Y).
But more importantly, the ResolveGoals function has to know how to resolve
such a term; obviously \= is not going to match any term in the database so
first a check must be made to see if the current goal contains the
inequality operator, and if it does, ResolveGoals should succeed if the
operands - actually, the values of the operands - are equal (and fail if
not). Otherwise, the current goal should be matched against the terms in the
database.
Once that problem is out of the way, the query returns four solutions
(the actual order of the solutions depends on the order in which the terms
are stored in the database):
Solution 1: X = shaked Y = lior Solution 2: X = lior Y = shaked Solution 3: X = netta Y = nir Solution 4: X = netta Y = nir
The first two solutions are as expected, even though as humans we would
regard the second solution as redundant. Ths requires us to know that both
Lior and Shaked are girls, or at least are represented as females in the
database; both names are unisexual in Hebrew so in other databases, Shaked
(female) could be the sister of Lior (male) but not the other way around,
or Lior (female) could be the sister of Shaked (male) - or maybe not at
all, if both names are defined as male.
But what about solutions 3 and 4? This had me scratching my head for some
time as to what could be the cause of this duplication as the interpreter is
perfectly correct in displaying two solutions. After a while, I understood
what was happening: in the first case, the facts parent (noam, netta) and
parent (noam, nir) were being matched, whereas in the second case, the facts
parent (sarit. netta) and parent (sarit, nir) were being matched.
As CoPilot puts it, accept that
duplicates reflect multiple derivations.
This is actually faithful to Prolog’s semantics: each solution reflects a distinct proof path. If you want to show provenance, you could annotate which parent led to the match.
This is actually faithful to Prolog’s semantics: each solution reflects a distinct proof path. If you want to show provenance, you could annotate which parent led to the match.
One minor piece of programming that came from me and not from CoPilot was the
addition of a crude profiler: this counts how many times ResolveGoals is
called and how many times Unify is called. These data show how hard CProlog
is working and hints at the efficiency of the rules. I wrote a few
paragraphs back that the actual order of the solutions depends on the
order in which the terms are stored in the database; the order of the
solutions also depends on the order of the terms in the query.
But more importantly, a distinction has to be made between the logical meaning of a query and its practical
(procedural) meaning. The two rules that I show below have the same
logical meaning, but one is more efficient:
sister (X, Y):- female (X), parent (Z, X), parent (Z, Y), X \= Y.sister (X, Y):- parent (Z, X), parent (Z, Y), female (X), X \= Y.
The first rule requires 48 calls to ResolveGoals and 346 calls to Unify,
whereas the second requires 68 calls to ResolveGoals and 548 calls to
Unify (the actual numbers depend on the number of terms in the database
and how many match; if the queries are run agains the same database then
the profiler information will show the relative efficiency). Obviously the
first rule is more efficient; why is this so? This requires real-world
knowledge: one would expect that 50% of the people mentioned in the
database are defined as female and 50% are defined as male, so putting the
female term first should exclude quite a few facts. The second rule checks
the gender of the parent only after finding matching terms for parent, so
one assumes that many facts with a male parent will be checked before the
gender check, at which time Prolog discovers that it's been wasting its
time.
This is a heuristic that I use frequently in SQL: order the terms in terms
of their specifity. In other words, if a term will ignore 50% of the data,
then it's better to put this before a term that will ignore only 25% of the
data.
As it happened, because the 'parent' clauses about Netta came before those
about Shaked, the solutions were displayed in reverse order. This is merely
a curiosity and not imporant.
Two more very subtle problems arose. Here's the first:
?- father (X, Y). One of the solutions will be X = noam, Y = netta ?- father (noam, X). Did not return any solutions ?- father (noam, Y). Returned solutions as expected.
This is clearly the case of what might be called the variable renaming
bug. Putting it briefly, the X in the rule father (X, Y) should not
be the same as the X in the query father (noam, X). That wasn't too
difficult to fix.
I wanted to check wheter recursive rules work. Here is a simple
example
ancestor (X, Y):- parent (X, Y). ancestor (X, Y):- parent (X, Z), ancestor (Z, Y).
Thus I am the ancestor of any of my children, but I am also the ancestor
of their children and will be of their children's children (the oldest is
only 9 so there are no great-grandchildren!). Again, if I simply queried ?- ancestor (X,Y), then all the correct solutions would be presented (although at first
there was a problem that might be called 'environmental leak'), but if I
queried
?- ancestor (noam, X), there would be presented five solutions where in all of them X = noam.
The number of solutions is correct (two children, three
grandchildren) but of course the displayed answer is wrong. This
took quite a time and several iterations of code before it was
fixed. Part of the problem was the hack that CP had added to prevent the same variable appearing more than once.
I am stopping development here, as the interpreter now does everything that
I want it to - which isn't actually very much. I suspect that I am much more
interested in learning how to write the interpreter than actually
use it!
As I wrote previously, [it's] more interesting is that CP didn't really 'understand' at first what was
necessary. I could compare the development process as someone (for example,
the OP) asking for something complicated from me; I would create a first
version and test it to a certain extent. Then she would say, yes but it
doesn't do something, so I would have to rewrite part of the program in
order that it could do that specific something. Yes, but it doesn't do
something else ... so on and on. The complete solution was arrived at after
time - this is called the spiral methodology of programming. In this
respect, CP was 'human': we started off with a short program that was
continually expanded as more and more functionality was required. This
interpreter get quite complicated - and only implements a subset of pure
Prolog. For example, the whole topic of lists is not
included.
This day in blog history:
Title | Tags | ||
---|---|---|---|
413 | Watching the weight / 2 | Diet, Acupuncture | |
764 | Continuing to watch the weight | TV series, Health, Walking | |
981 | Pain whilst walking | Walking | |
1080 | Hersonissos | Holiday, Greece | |
1263 | Cosmology | Personal, Gateway, Non-fiction books | |
1345 | Simple baked salmon | Cooking |
No comments:
Post a Comment