Friday, August 26, 2011

Implicit vs explicit joins (SQL)/2

Following up my recent blog on the subject, I thought that I would look at the SQL statements in one small program of mine. About 50% of the queries were based on one table only, so of course there was no need to change them. Of the other 50%, about half were simple to change; in the remaining 25%, I had cleverly put the limiting clause before the connecting clause. Correcting these queries was simple, of course, and I did become aware of the organising power of the explicit syntax.

I had to write a new report for the OP's management program, so this was a chance to use the new syntax from day one. As luck would have it, this query was somewhat more complex than the usual type of query, involving two joins to the same table (in other words, I had to use the same table twice in order to display different values). Here I discovered my first 'gotcha': if the same table is being used twice, it has to be aliased both times. Initially, I was only aliasing the table once, but when I saw that the returned results were not as they should be, I added the extra alias. This probably explains why I see the frequent use of aliasing in answers about SQL on Stack Overflow: people have been taught to alias everything to prevent problems. And I thought that they were just being lazy.

Here is the new query in all its glory:
select dockets.id, c1.name, dockets.opendate, c2.name as xname, xtracust.curdate
from dockets inner join customers c1
on dockets.customer = c1.id
inner join xtracust
on dockets.id = xtracust.docket
inner join customers c2
on xtracust.customer = c2.id
where xtracust.xtra = 1
Probably all the (Firebird) world knows this already, but here is another tidbit which I picked up this morning: in order to limit the number of rows being returned, one uses the non-standard syntax select first 1 [skip n] .... I'm not aware of having needed this (otherwise I would already have known this), but an extra piece of knowledge is always good to know. I sometimes have to check whether a query returns a dataset, but this is not the same thing. It might be useful in programs where I display all the records (but not all the fields) from a dataset on program startup; normally I try to limit this by date, but it might be easier to display the first 50 records and then require the user to click on a button if she wants more records displayed.

No comments: