Monday, October 06, 2025

Correcting CoPilot Prolog

I wrote yesterday about getting CoPilot to write an interpreter for a simple subset of Prolog in Pascal. There were only one or two syntax errors, so I was quickly able to fix them and get the interpreter running. Once I did, though, it quickly became clear that there were problems with the code, probably in the parsing of facts and rules, so I added a 'show' command that would display the facts and rules as they had been parsed.

This produced a great deal of rubbish. After tracking this line by line, I found the reason. For example, when the interpreter parses the fact 'parent (noam, netta)', a record of type TPredicate is created; such a record has a name field (this case, parent) and an args field, which is a dynamic array of terms, where each term consists of a kind (either an atom or a variable) and a value. For the current fact, the data will be 
name: parent args[0].type : tkAtom args[0].value : noam args[1].type : tkAtom args[1].value : netta

The fact has been parsed correctly, but when that fact is added to the global list of facts, something goes wrong. Eventually I tracked this down to the statement 

Move(Args[0], Result.Args[0], Length(Args) * SizeOf(TTerm));
Too much data was being added which is why the 'show' command produced a great deal of rubbish. I replaced that line with the following
for i:= 0 to high (args) do result.args[i]:= args[i];

Moving onto rules, I saw that the same problem existed regarding the use of 'move'. This was easily corrected, but I could see that the parsing was going haywire. If, for example, one has the rule
grandparent (X, Z):- parent (X, Y), parent (Y, Z).
The 'head', i.e. grandparent (X, Z) is parsed correctly, but the brackets cause the parsing of the 'tail' as per CoPilot to return wrong values. CoPilot uses a tstringlist to store the various tokens, where the comma is used as a delimiter. As a result, the following tokens are returned
parent (X Y parent (Y
Z
if not worse. This is clearly wrong: the parser should initially create two terms, parent (X, Y) and parent (Y, Z), then use exisiting code to parse each term correctly. After asking, CoPilot produced code that handles multiple terms in a rule body correctly. Showing data that has been loaded produces the correct output. It would have been better had CoPilot create a recursive descent parser from the beginning instead of something more ad hoc.

Unfortunately, running a query such as grandparent (noam, Y) does not return the correct results, so now I'll have debug the query part of the code. That's the most interesting part. 

To start with, the code does not backtrack, i.e. find multiple solutions. Well, it does but only the last solution is displayed. This is because the local environment that contains the values for the various variables overwrites the global environment, so even if several solutions are found, only the final solution was saved in the global environment and then displayed. Fixed.

And in fixing that part of finding multiple solutions for matching facts, I've inadvertently caused problems for finding solutions to rules. For a rule, first a possible solution should be found for the first term (in the case above, parent (X, Y)), then use those values for the second term and possibly more. Saving all the values for the first term and then trying to solve the second term ruins everything.


This day in blog history:

Blog #Date TitleTags
41206/10/2011Rest in peaceRIP
98006/10/2016Health updateHealth, Theanine, Donating blood
153306/10/2022Weight and bp confirmedHealth
167306/10/2023Belated discovery in Delphi: defining indexes on calculated fieldsDelphi, ClientDataSet

Sunday, October 05, 2025

CoPilot writes a Prolog interpreter!

The other day I was idly leafing through a book entitled "Computer Science from Scratch" by David Kopec; amongst other things it promises to show one how to build interpreters. All code examples are written in Python, which is how code is shown these days, but for an old codger like me, so much seems to be hidden away, making comprehension difficult. 

The second chapter describes writing an interpreter for Tiny Basic, a language that was often implemented on the 8-bit home computers of the late 1970s/early 1980s. This subset of the language is very simple: for example, there can be at most 26 variables that each have a one letter name (i.e. A, B, C); this translates into a simple 26 member array of integers (the only data type allowed).

When walking the dog afterwards, I contemplated writing a similar interpreter in Pascal. This shouldn't be too hard, I thought, although I missed in the discussion how GOTO and GOSUB were implemented (this turned out to be not particularly difficult). One problem with BASIC is that arithmetic is written using what is called 'infix notation' - the operator is inbetween the operands. For example 2 + 3 is a simple expression with infix notation: 2 and 3 are the operands and + is the operator. Some languages use prefix notation, e.g. Lisp, where the expression would be written as + 2 3, and some languages use postfix notation, e.g. Forth, where the expression would be written as 2 3 +. It's relatively easy to evaluate postfix notation by means of a stack.

Then I thought that maybe I could ask CoPilot to write a simple infix expression parser in Pascal. No problem, I was told; almost immediately code was shown that first turned the infix notation into postfix notation then showed how to evaluate this expression. Impressed, I then asked CoPilot to write a Tiny Basic interpreter in Pascal - this went through several iterations as I asked for more and more functionality. The code seemed to have correct Pascal syntax, although when I tested the most simple version, I discovered several minor mistakes that will have to be corrected throughout the versions.

Later on in bed, I thought about these capabilities and wondered whether CoPilot could write a simple Lisp interpreter or even (gasp) a Prolog interpreter. As a result of these thoughts, I found it very difficult to fall asleep.

This morning, in a free moment, I asked CoPilot to write a simple Lisp interpreter that too went through a few iterations (for example, car, cdr and cons were missing at the beginning). No lists were used at all in the implementation, only dynamic arrays. I have yet to test the final version, although to be honest, I'm not too interested in it.

If CoPilot can write a simple Lisp interpreter, then can it write a simple Prolog interpreter? Indeed it can! This too went through a few iterations as more functionality, especially backtracking which is the raison d'etre of Prolog, was added. Again, no lists or stacks using pointers: everything is implemented by means of dynamic arrays. This is a totally different way of implementing the language; no doubt I will spend no small amount of time single stepping though the code in order to understand how it works.



This day in blog history:

Blog #Date TitleTags
1605/10/2005SudokoProgramming
20305/10/2009Old computer in new caseComputer
41105/10/2011Firebird DB management tool (3) - Bells and whistlesProgramming, SQL, dbExpress
97905/10/2016DBA: Entering the final third of the doctorateDBA
107905/10/2017A kibbutz dayJewish holidays, Obituary, Kibbutz
117805/10/2018Geoff Emerick, 1945-2018Obituary, Beatles
134205/10/2020Continuing the development of the Priority Cross Referencer (PrioXRef)Programming
142805/10/2021Continuing the BP sagaHealth, CPAP, Blood pressure, Aldosterone
167205/10/2023The neuroendocrine system and my blood testsHealth, Nutrition
183405/10/2024The swimming season finished rather abruptlySwimming