Sunday, August 02, 2009

Speed IV

Honestly, Ma, I tried as hard as possible using threads, but there was always some problem. So I dropped using threads and tried to improve my code algorithmically.

The problematic module is basically composed of four sections: an initialisation section, and three different queries. Most of the time was being spent executing the second query, so I concentrated my efforts here. The first optimisation was to note that data from one table was being read several times in several different queries; although it goes against the grain, it proved quicker to extract that data with a simple query, store the data in an array in the program code, and then simplify the other queries which would look up the missing value in the array. This speeded up the execution, but still not enough.

Then I went down to the metal. There was a query which basically was returning the same data again and again which would then be sent to an expensive operation. I had implemented a check before the expensive operation several months ago to make sure that the same data wasn't being sent, but now it was clear that the check was occuring too late. I was checking after the query had returned a dataset instead of checking before accessing data. Once I had repositioned and recoded the check, the program ran much faster.

So basically I've spent two weeks running around, learning about threads and emitting HTML code, when in fact I could have saved all that effort by concentrating on improving one query.

I read somewhere in this blog that SQL joins slow down a program and should be avoided at all costs. This sounds like shooting oneself in the foot, because if there are no joins, then all the power of SQL is lost. Well, not quite. Although it's messy, it makes no sense to extract the same data continually from the server; better to extract it once, save in in RAM and use lookup statements in code.

No comments: