I've been working on this program for several days, adding more and more definitions. This morning I thought I had reached saturation point, so I decided to write a new, more streamlined tokeniser to replace the ad hoc structure that exists at the moment. In doing so, I discovered that I also have to rewrite a fair amount of the lexical analyser, which was equally ad hoc.
One advantage that the program has is that it is intended to work on procedures that have already passed the internal syntax checker and so there is no need to check all kinds of pathological structures.
This is going very slowly: I spent a few hours working solely on handling variables and their initialisation. I am concentrating on the following three statements:
:A = 10;
:A = :B = 10;
:A = (:B = 1 ? 2 : 3);
The first statement should be easy to handle, but it required a change in the new 'GetChar' routine. The first version of this routine, together with the new tokeniser, returned four tokens for the first statement, namely :A, =, 10 and ;. This makes parsing unnecessarily difficult: one token of lookahead should be enough, but the routine needed to look at the following two tokens. So I changed the routine so that there are now two tokens (:A and 10), and the equals sign is 'seen' while handling :A. The next token still has to be retrieved. If its first character is a digit, then the identifier is being initialised and everything is good (this is the first statement). But if its first character is a colon (as in the second statement), then as well as handling :A, the following token (:B) has to be put back into the input stream.
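The idea of folding the equals sign into the identifier token and pushing a token back can be sketched roughly as follows. This is only an illustration in Python, not the actual program: the names Token, Tokeniser, next_token, push_back and initialised_variables are all made up for the sketch, and it assumes that the semicolon stays a token of its own.

```python
import re
from dataclasses import dataclass

# Hypothetical token shape: the '=' is not a token of its own but a flag
# carried by the identifier that precedes it.
@dataclass
class Token:
    text: str
    assigned: bool = False  # True when an '=' directly follows the identifier

# Identifiers (:NAME), integer literals, '=' and ';' only (statements 1 and 2).
RAW_RE = re.compile(r":[A-Za-z_]\w*|\d+|[=;]")

class Tokeniser:
    def __init__(self, source):
        self.raw = RAW_RE.findall(source)
        self.pos = 0
        self.pushed = []  # tokens that were put back into the input stream

    def next_token(self):
        if self.pushed:
            return self.pushed.pop()
        if self.pos >= len(self.raw):
            return None
        text = self.raw[self.pos]
        self.pos += 1
        # Swallow a following '=' so the parser sees it while handling the identifier.
        if text.startswith(":") and self.pos < len(self.raw) and self.raw[self.pos] == "=":
            self.pos += 1
            return Token(text, assigned=True)
        return Token(text)

    def push_back(self, token):
        self.pushed.append(token)

def initialised_variables(source):
    """Return (name, literal) pairs; literal is None when the value comes from
    whatever follows, as for :A in ':A = :B = 10;'."""
    tokeniser = Tokeniser(source)
    found = []
    while (token := tokeniser.next_token()) is not None:
        if not token.assigned:
            continue
        following = tokeniser.next_token()
        if following is not None and following.text[0].isdigit():
            found.append((token.text, following.text))
        else:
            found.append((token.text, None))
            if following is not None:
                tokeniser.push_back(following)  # let the main loop handle :B itself
    return found
```

With this sketch, the first statement yields a single initialisation of :A with 10, while the second yields :A (value not yet known) followed by :B with 10, because :B is put back and then handled in its own right.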
Tomorrow I will handle the third statement. I think that the key to handling this properly is the opening bracket before :B; this will indicate that some form of expression is about to appear. :B will be referenced but it won't be marked as an initialisation.
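One possible reading of that plan, sketched in Python purely as a guess at how it might work: track the bracket depth, and treat an identifier followed by an equals sign as an initialisation only when it appears outside any brackets. The function name classify and the token pattern are assumptions made for this sketch, not part of the real program.

```python
import re

# Identifiers (:NAME), integer literals and single-character symbols.
# The identifier alternative comes first so that ':B' is not split into ':' and 'B'.
TOKEN_RE = re.compile(r":[A-Za-z_]\w*|\d+|[=;()?:]")

def classify(statement):
    """Split the variables of one statement into (initialised, referenced)."""
    tokens = TOKEN_RE.findall(statement)
    depth = 0  # current bracket nesting level
    initialised, referenced = [], []
    for index, token in enumerate(tokens):
        if token == "(":
            depth += 1  # an expression is about to appear
        elif token == ")":
            depth -= 1
        elif len(token) > 1 and token.startswith(":"):
            followed_by_equals = index + 1 < len(tokens) and tokens[index + 1] == "="
            if depth == 0 and followed_by_equals:
                initialised.append(token)
            else:
                # Inside brackets the '=' is read as part of an expression,
                # so the variable is only referenced.
                referenced.append(token)
    return initialised, referenced
```

For the third statement this marks :A as initialised and :B as merely referenced, while the second statement (no brackets) still marks both :A and :B as initialised.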