Friday, August 17, 2018

Using indexes

One of my "rules of life" is never to program after 9pm, as I won't sleep well afterwards. Last night I broke that rule and indeed did not sleep well.

Earlier in the evening, I had been working on a problem in the "ERP" program which I wrote for the Occupational Psychologist. One module is concerned with what are called in English 'reconciliations', although a better name might be 'match ups': these record which bill has been paid for by which receipt. There was a problem with one reconciliation for a customer, but I had difficulty in finding this problematic reconciliation.

After walking the dog (at nearly 9pm), it occurred to me that I could write a report which listed all the dubious reconciliations: these would have one or more bills and one or more receipts sharing the same reconciliation number. The query needed was not difficult to write
select bills.recon, count (*) from bills inner join receipts on receipts.recon = bills.recon where bills.customer = :p1 and bills.curdate >= :p2 and bills.recon > 0 group by bills.recon having count (*) > 2;

The only unusual part of that query is the final line: the 'having' clause allows one to compare a calculated value with an ordinal value. The ':p1' and ':p2' variables are parameters to the query.

I ran this query with the appropriate values for the variables and waited ... and waited ... and I thought at first that I had written a query which causes the database manager to crash. But no, after a few minutes - which is comparatively a lifetime - I got results: there was a dubious reconciliation. As I already had determined the same reconciliation manually, I was pleased that the query agreed with me, but I was more concerned with the performance of the query.

Adding a few indexes to the database should help - one on bills.customer (strangely, this was missing as it would help many queries), one on bills.recon and one on receipts.recon (these last two indexes wouldn't have much general use) - and indeed they did. After adding them, the time required for the query was reduced from about 2:15 minutes to 0.15 seconds, which makes the second query 900 times faster than the first! This speed increase was not a function of having the data cached, as I ran the first query a few times to see whether the long time was repeatable (and it was).

Theoretically, there should be an index for every foreign key in a table, but sometimes the value of doing so (faster queries) is outweighed by the cost of the index (slower inserting and updating).  The index on 'bills.customer' should have been there from the start as there are many queries which will benefit from this. The indexes on the 'recon' fields are less important but make a definite improvement here.

And to sign off: yes, I know that index is a Latin word, and as such its plural should be indices. But the original SQL programmers presumably did not have a classical education and so thought that the plural of index is indexes. It makes me shudder every time I write that false plural.

Saturday, July 28, 2018

100 years

Today would have been my mother's 100th birthday, had she lived that long.

1918 seems so very far away from 2018 in terms of changes. Even when I was a child, in the 1960s, 1918 seemed very far away. The most obvious agent of change was the second world war: science and technology changed to meet new demands, but also the semi-rigid class structure of pre-war Britain began crumbling. Whilst  there were many changes between an urban childhood in 1818 and 1918 - running water, indoor plumbing and such like - a child would still have occupied himself with similar games and hobbies. Today's children wouldn't know what to do with themselves if they were sent back one hundred years.

Huge advances in medicine have occurred in the last one hundred years; modern hygiene had become accepted by 1918, but antibiotics were still to come. Life expectancy was in the mid-50s; now it is in the early 80s. My father will soon be 96. But despite these advances, the human body is still the same and wasn't built to last so long. If previously major causes of death had been infectious diseases or malnutrition, today people die of cancer and heart failure. No one had Altzheimer's disease because no one lived long enough to suffer degenerative diseases.

I'm sorry for the sad tone of this blog. I think that it would have been written differently had I written it yesterday. Today my father came for his weekly visit and there seemed to be a massive deterioration in his condition; he seems to be wasting away, losing the will to live. It may be just the very hot weather of the past week which is causing this, but it may be permanent.