Friday, October 09, 2020

Converting a Delphi 7 semi-unicode program to Delphi 10

Possibly the simplest exam in the Occupational Psychologist's armoury is a program that requires the user to list 20 phrases that describe herself ('I am ... ') and 20 phrases that do not describe herself ('I am not ...'). The program displays itself in four languages - Hebrew, English, French and Russian - but until recently, the text that the user entered could be saved only in Hebrew or English. We bought a 'Russian' keyboard and added Russian to one computer, in the naive hope that words that the user types in Russian would appear as such in the output file (an INI file).

Extremely naive: the output file consisted of question marks where there should have been Russian. After researching the topic, I found that the output file should be opened as follows:

Old style: datafile:= TIniFile.create (ffn);
New style: datafile:= TMemIniFile.create (ffn, TEncoding.UTF8);
The 'TEncoding.UTF8' is the magic invocation required to create an INI file with Russian (or other unicode) characters; this also requires using TMemIniFile and a modern version of Delphi. Painstakingly I created a version of the exam in Delphi 10 (Seattle), replacing all the Hebrew characters that had been inserted directly into the program with either 'updated' Hebrew or appropriate resource strings. I eventually managed to test this program on the 'Russian computer', and indeed the INI file contains Russian characters.
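
For anyone wanting to try the 'new style' in isolation, here is a minimal console sketch of writing a Russian phrase to a UTF-8 INI file and reading it back. It is not the exam's actual code: the helper names (SavePhrase, LoadPhrase), the 'Phrases' section, the 'am1' key and the file name are all invented for illustration. The Russian word is written as character codes so that the sketch does not depend on how the source file itself is saved.

```pascal
program IniUtf8Sketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.IOUtils, System.IniFiles;

// Hypothetical helper: stores one phrase in a UTF-8 encoded INI file.
procedure SavePhrase(const FileName, Key, Phrase: string);
var
  DataFile: TMemIniFile;
begin
  DataFile := TMemIniFile.Create(FileName, TEncoding.UTF8);
  try
    DataFile.WriteString('Phrases', Key, Phrase);
    // TMemIniFile keeps its changes in memory until UpdateFile is called.
    DataFile.UpdateFile;
  finally
    DataFile.Free;
  end;
end;

// Hypothetical helper: reads the phrase back using the same encoding.
function LoadPhrase(const FileName, Key: string): string;
var
  DataFile: TMemIniFile;
begin
  DataFile := TMemIniFile.Create(FileName, TEncoding.UTF8);
  try
    Result := DataFile.ReadString('Phrases', Key, '');
  finally
    DataFile.Free;
  end;
end;

const
  // A Russian greeting spelled out as UTF-16 character codes.
  RussianWord = #$041F#$0440#$0438#$0432#$0435#$0442;
var
  FileName: string;
begin
  // The file name is an assumption; any writable location will do.
  FileName := TPath.Combine(TPath.GetTempPath, 'phrases_sketch.ini');
  SavePhrase(FileName, 'am1', RussianWord);
  if LoadPhrase(FileName, 'am1') = RussianWord then
    Writeln('Round trip preserved the Russian text')
  else
    Writeln('Round trip lost the Russian text');
end.
```

The point to remember when moving from TIniFile is the call to UpdateFile: without it, nothing that was written to the TMemIniFile reaches the disk.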

At the same time I have been trying to create a program that will successfully read a database file. I have come to the reluctant conclusion that there is a bug with the ClientDataSet component that prevents me from using it. But the TSQLQuery component works correctly: I remembered that I wrote one or two programs using a TListView instead of a TDBGrid, where the list view is populated by a TSQLQuery, so I can use this alternative approach. But that wasn't the real problem....

It turns out that the 'surname' and 'forename' fields in the database table containing the examinee details were defined with the wrong character set - WIN1251, Russian - instead of WIN1255, Hebrew. I tried several times to convert from one to the other but with no success. Eventually I decided to delete these fields from the database, redefine them as WIN1255 then read in all the original data files in order to populate the name fields. This worked well, although there are about 15 entries in the database with identity numbers but no names. I can copy these from a working version of the database.

Going back to Delphi 10, every attempt to read this new database file failed. I'm not sure what the real reason is, but I discovered that I could access the database via my program only after I moved it to a directory that does not descend from 'users'. In the program that is to read this database, I replaced the dbGrid with a list view, and after dealing with the required method of inserting data into the list view, I could finally see data - but the Hebrew was not displaying correctly (each character appeared as a black diamond). I had researched this topic earlier this morning and had found the solution: the Hebrew fields should be accessed in queries in the following manner: cast (surname as varchar (16) character set unicode_fss). This method did indeed give me readable Hebrew text. Below is a screen shot of the program.
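
Put together, the cast and the list view population described above might look roughly like the following unit. This is a sketch under assumptions rather than the program's real code: the table name 'examinees', the 'id' column, the width of the forename cast and both routine names are hypothetical; only the surname and forename fields, the cast on surname and the TSQLQuery/TListView pairing come from the description above. It also assumes that the query is already attached to an open connection and that the list view is in report style with three columns.

```pascal
unit ExamineeListSketch;

interface

uses
  Data.DB, Data.SqlExpr, Vcl.ComCtrls;

function NameQuerySQL: string;
procedure FillExamineeList(Query: TSQLQuery; List: TListView);

implementation

// Builds the select statement; each name field is cast to unicode_fss
// so that the Hebrew text arrives in a form that Delphi 10 can display.
function NameQuerySQL: string;
begin
  // 'examinees' and 'id' are hypothetical names used for illustration.
  Result :=
    'select id, ' +
    'cast (surname as varchar (16) character set unicode_fss) as surname, ' +
    'cast (forename as varchar (16) character set unicode_fss) as forename ' +
    'from examinees order by id';
end;

// Runs the query and copies each row into the list view:
// the id becomes the caption, the two names become sub-items.
procedure FillExamineeList(Query: TSQLQuery; List: TListView);
var
  Item: TListItem;
begin
  Query.Close;
  Query.SQL.Text := NameQuerySQL;
  Query.Open;
  List.Items.BeginUpdate;
  try
    List.Items.Clear;
    while not Query.Eof do
    begin
      Item := List.Items.Add;
      Item.Caption := Query.FieldByName('id').AsString;
      Item.SubItems.Add(Query.FieldByName('surname').AsString);
      Item.SubItems.Add(Query.FieldByName('forename').AsString);
      Query.Next;
    end;
  finally
    List.Items.EndUpdate;
    Query.Close;
  end;
end;

end.
```

A form would call FillExamineeList with its own query and list view components, for example from the OnShow event.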


What's to come? I had removed all the functionality of the results program so that I could concentrate on displaying the data. Now I will restore all the functionality, which hopefully will not require many changes. After that, I can concentrate on reading an INI file with Russian characters, inserting the data into the database and then displaying it. Fortunately, the data (i.e. the phrases that each examinee wrote) appear to be coded as WIN1255 so I won't have to reinsert the data. As the data is sent directly to Word for output, the 'cast' trick may not be required. 

Of course, I will have to see what happens with Russian characters: I may have to add a new field to the table in UTF-8 coding to store the Russian properly.
