Saturday, April 10, 2010

Speaking the statements

My last blog, from a few days ago, hinted that the next entry (ie this one) would be about programming. Shortly after writing those lines, I went to bed and woke up the next morning feeling extremely weird. It turns out that I had a temperature of 38.5 degrees, which is why my head felt like it was full of cotton wool. Fortunately, it was only a mild case of flu, but it had the effect of taking me out of the loop for a few days. Wonderdog was giving me strange looks all this time, implying that I could take her for a ramble in the fields instead of lying around and doing nothing, but I think I'm back on the case now.

As I have written before, my occupational psychologist's flagship exam consists of 400 statements: each person has to mark whether they agree or not with the statements (eg "I usually feel nervous and ill at ease at a formal dance or party", "I have at one time or another in my life tried my hand at writing poetry"). The theory behind the exam is that each statement belongs to one or more scales, and agreeing or disagreeing with a statement either increases or decreases one's affiliation to that scale. From these scales is built one aspect of a person's psychological profile.

Completing this exam takes about 40 minutes, making it harder than one might expect. The two statements which I quoted above make it fairly easy to decide whether to agree or not, but some are more complicated, generally revolving around moral issues. Some tend to be ambiguous, so it's difficult to decide if one agrees or not. Anyway, the psychologist suggested that people with learning disabilities might have difficulty in reading and understanding the statements, let alone answering them. Thus was born the idea of the exam speaking the statements.

Well, of course, the exam itself, being a computer program, can't talk (although 20 years ago I spent much time with a blind person, a speech synthesizer and text-to-speech software which was years ahead of its time). So in January I spent two evenings recording a well-spoken lady reading each of the 400 statements. Since then, I have been editing those recordings into a series of 400 wave files, each file containing one spoken statement. This is a very tedious process, which is why it took me so long to do. During the Passover holiday, I decided that I would edit one big file (5 minutes, about 35 statements) a day; I actually managed to keep to this schedule and finished creating all the files.

Then came the fun of getting these files into a Delphi program. This is actually a lot easier than it sounds. First, I needed to declare a resource file listing all of the wave files and their statement numbers. This is a text file (rc) which looks like this:
q1 WAVE q1.wav
q10 WAVE q10.wav
q101 WAVE q101.wav
q102 WAVE q102.wav
q103 WAVE q103.wav
q104 WAVE q104.wav
q105 WAVE q105.wav
There are 403 similar lines. This text file is then passed to the resource compiler, which produces a compiled resource file (res). As there are many resources used and they're all large, this file is about 40 megabytes in size (whilst today this may be considered small, I recall that the first hard drive I ever bought for my XT-compatible computer in the late 80s had a 20MB capacity!). Linking this file into the Delphi executable is trivial; accessing each wave file within the resource is also fairly simple, with the use of the following procedure:

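Procedure TMivhan.PlaySound (n: integer);
// Plays the wave resource named 'q<n>' which is linked into the executable.
// SndPlaySound is declared in the MMSystem unit.
var
 hFind, hRes: THandle;
 song: pchar;
 question: string[6];

begin
 question:= 'q' + inttostr (n) + #0;   // short string with an explicit trailing zero
 hFind:= FindResource (HInstance, @question[1], 'Wave');
 if hFind <> 0 then
  begin
   hRes:= LoadResource (hInstance, hFind);
   if hRes <> 0 then
    begin
     song:= LockResource (hRes);   // pointer to the wave data in memory
     if assigned (song)
      then SndPlaySound (song, sndflags);
     unlockResource (hres);
     FreeResource (hRes)   // takes the handle from LoadResource, not the one from FindResource
    end
  end
end;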
The procedure is called with a statement number (eg 1). The first line in the procedure creates a string variable, prefixing the statement number with 'q', so that the string will match the name of the wave file in the resource file (that's the "q1" in the line "q1 WAVE q1.wav"). The trailing zero is needed because the string is passed to a Windows API function (FindResource), which expects a zero-terminated string. "@question[1]" is the address of the first character of the string, so it skips the length byte at the beginning of the Delphi short string and points at a zero-terminated string - this is a time-honoured method of converting Delphi strings to C strings, as needed by Windows. The rest of the procedure deals with extracting the required resource from the executable file and 'playing' it. This is very much black box code and needn't be explained (and in case one asks, I took this code from an excellent article on the subject).
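For comparison, the same name conversion can be done with an ordinary (long) Delphi string and a PChar cast, since long strings already carry a terminating zero. This is only a sketch and not the code used in the exam program: the function StatementExists is a hypothetical helper, the unit names are those of classic Delphi, and the program assumes that a resource file like the one above has been linked in (without it, the lookup simply fails).

```delphi
program FindStatement;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Hypothetical helper: returns True if a WAVE resource named 'q<n>'
// is linked into this executable.
function StatementExists (n: integer): boolean;
var
  ResName: string;
begin
  ResName:= 'q' + IntToStr (n);   // long string: no manual #0 required
  // PChar (ResName) yields a pointer to zero-terminated character data
  Result:= FindResource (HInstance, PChar (ResName), 'WAVE') <> 0;
end;

begin
  if StatementExists (1)
    then Writeln ('q1 found')
    else Writeln ('q1 is not linked into this executable');
end.
```

The short string and "@question[1]" in the procedure above achieve the same thing by hand; the cast merely lets the compiler take care of the terminator.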

A note about the parameters passed to the Windows procedure SndPlaySound: the first parameter is a pointer to the wave data in memory (without the snd_Memory flag it would instead be the name of a wave file), and the second parameter controls how the sound should be played. Earlier in the program, the global variable 'sndflags' was set to the value 'snd_Async or snd_Memory': play the sound asynchronously from memory. This is generally the way that one wants the sounds to be played, so why did I use a variable instead of using the constant values each time, as the article does?

The answer has nothing to do with my procedure and everything to do with the program. At the beginning, the instructions on how to use the program are read out, via this procedure. I discovered that sounding these instructions asynchronously and then sounding the first displayed statement asynchronously caused the instructions not to be 'sounded' at all. The reason is that should the procedure be called whilst a wave file is being played asynchronously, the second wave file will cause the termination of the first wave file. I thus caused the instructions to be 'sounded' synchronously - the user can't do anything with the program during this time - and only afterwards are the statements 'sounded' asynchronously. Instead of putting a conditional statement in the PlaySound procedure to define the flags, I used a global variable which is set to 'snd_Sync or snd_Memory' before reading the instructions, and then permanently to 'snd_Async or snd_Memory'.
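The switching of the flags can be sketched roughly as follows. This is an illustration under assumptions rather than the program's actual code: PlayStatement is a stand-in which only reports the mode instead of calling SndPlaySound, and InstructionsId is a hypothetical number for the instructions recording.

```delphi
program FlagSwitch;
{$APPTYPE CONSOLE}

uses
  MMSystem;   // declares SND_SYNC, SND_ASYNC and SND_MEMORY

const
  InstructionsId = 401;   // hypothetical: the real number is not given

var
  sndflags: cardinal;     // global, consulted on every call

// Stand-in for the PlaySound procedure: reports the mode only.
procedure PlayStatement (n: integer);
begin
  // SND_SYNC is zero, so the test has to look for the SND_ASYNC bit
  if (sndflags and SND_ASYNC) <> 0
    then Writeln ('statement ', n, ': asynchronous')
    else Writeln ('statement ', n, ': synchronous');
end;

begin
  sndflags:= SND_SYNC or SND_MEMORY;    // block until the instructions finish
  PlayStatement (InstructionsId);
  sndflags:= SND_ASYNC or SND_MEMORY;   // from now on, return immediately
  PlayStatement (1);
end.
```

Keeping the decision in one global variable means that the playing procedure itself never needs to know whether it is reading the instructions or a statement.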
