Sunday, December 20, 2015

Zero values in Priority tables - a blessing and a curse

I am writing this for various reasons: to document for myself a quirk in Priority, to prepare an article for the Priority Users Internet Forum, and as a way of getting myself back into the habit of writing. If you don't use Priority, then this post will have little value.

In SQL databases, there is a way of distinguishing when a field has a value (even if that value is zero) and when it is not known whether a field has a value (there may be a value for what the field represents but it hasn't been entered into the database) – NULL. Handling null generally is problematic for programmers and can frequently lead to incomplete results. Two reports which ostensibly work on the same data can give different results as a result of null fields.

The designers of Priority decided to skip this minefield by removing the keyword 'null' from their SQL subset. A null value is represented by zero, which for 99% of the time is what one would naively expect. This solves a great deal of problems but ironically creates a problem, which is what I want to write about here.

SQL tables generally have one or more foreign keys – these are values which point to other tables. A simple table – such as supplier type – will have at a minimum three fields:
  1. 'id': in Priority-speak, an 'A' key; not displayed
  2. 'code': in Priority-speak, a 'U' key, displayed
  3. 'des': the description of the type, generally not indexed
This table has no foreign key, but its 'id' field will be a foreign key in other tables, for example, suppliers: a supplier may have defined a supplier type, in which case its 'suptype' field will contain the appropriate id value from the supplier types table. A supplier for which no supplier type has been defined will have zero in its 'suptype' field; this is allowed as it is not compulsory (or in Priority-speak, mandatory) to define a value for the 'suptype' field.

In a simple report which lists suppliers by their supplier type (and ordered by type), first will appear all the suppliers which have no supplier type, followed by those that do. This is good as such a minimal report would allow us to see which suppliers do not have a defined supplier type (the fact that we can establish this by means of a query within the suppliers' screen is beside the point). We can also easily add the sum of purchase orders for each supplier within a given time period. Should we wish to see the sum of purchase orders for a given supplier type, we simply make the 'supplier type' a parameter. If there were a code '100' and we wanted to know the sum of purchase orders for all suppliers whose supplier type is '100' within a  given period, we would simply enter that value for the parameter; no suppliers without a defined supplier type would appear.

The problems begin when we use a 'step' procedure to calculate data which are later displayed in a report. The procedure receives as a parameter a list of all supplier types that we wish to include in the report; we use this linked table of types in the query which selects the suppliers to be examined. If we want to see data for all the supplier types, then there is no problem, but should we wish to see data for only one supplier type, we will be astounded to discover that the final report contains data not only for the chosen supplier type but also for all the suppliers without a supplier type!

This is because each table – even linked tables passed to procedures – contain a dummy zero tuple. This is needed so that queries which link a supplier without a supplier type to the supplier types table will not fail (the zero in the supplier's 'subtype' field matches the zero in the supplier type table).

The standard way of writing such a query would be
SELECT ... FROM SUPPLIERS, SUPTYPES WHERE SUPPLIERS.SUPTYPE = SUPTYPES.SUPTYPE AND ...
The above is fine if we want all the suptypes, but is not so good when we only want a subset. Trying to correct the 'zero problem' by writing the below query will cause problems when we want all the suppliers – including those without a subtype.
SELECT ...
FROM SUPPLIERS, SUPTYPES WHERE SUPPLIERS.SUPTYPE = SUPTYPES.SUPTYPE AND SUPPLIERS.SUPTYPE > 0 ...
[Actually, there is a subtle definition problem with the word 'all' in the previous sentence: in certain cases, use of '*' will return only records which have a non-zero value in the given field, whereas passing an empty string would return all records. Sometimes this can be used as a solution to the 'zero problem'.]

So: sometimes we need an extra clause in the query, but of course, we can't predict when such a clause is needed.

The way which I have found to solve this problem is to calculate how many records there are in the suptypes table before it is linked as a procedure, then do the same after the table has been linked. If the results are the same then the user wanted everything, but if they differ, then suppliers without a suptype should not be included. This is fairly easy to do in a cursor based procedure  (the statements required to solve the zero problem are displayed in bold font below) …
:LINKED = :UNLINKED = :SUP = :TYP = 0; SELECT COUNT (*) INTO :UNLINKED FROM SUPTYPES; LINK SUPPLIERS TO :$.SUP; ERRMSG 1 WHERE :RETVAL <= 0; LINK SUPTYPES TO :$.TYP; ERRMSG 1 WHERE :RETVAL <= 0; SELECT COUNT (*) INTO :LINKED FROM SUPTYPES; DECLARE CUR CURSOR FOR SELECT SUPPLIERS.SUP, SUPPLIERS.SUPTYPE FROM SUPPLIERS, SUPTYPES WHERE SUPPLIERS.SUPTYPE = SUPTYPES.SUPTYPE; GOTO 300 WHERE :RETVAL <= 0; LABEL 100; FETCH CUR INTO :SUP, :TYP; GOTO 200 WHERE :RETVAL <= 0;> LOOP 100 WHERE (:LINKED <> :UNLIKED) AND (:TYP = 0); ...
This is more difficult to perform with the INSERT INTO/SELECT syntax:
:LINKED = :UNLINKED = :SUP = :TYP = 0; SELECT COUNT (*) INTO :UNLINKED FROM SUPTYPES; LINK SUPPLIERS TO :$.SUP; ERRMSG 1 WHERE :RETVAL <= 0; LINK SUPTYPES TO :$.TYP; ERRMSG 1 WHERE :RETVAL <= 0; SELECT COUNT (*) INTO :LINKED FROM SUPTYPES; INSERT INTO STACK4 (KEY, REALDATA) SELECT SUPPLIERS.SUP, SUM (PORDERS.DISPRICE) FROM SUPPLIERS, SUPTYPES, PORDERS WHERE SUPPLIERS.SUPTYPE = SUPTYPES.SUPTYPE AND SUPPLIERS.SUP = PORDERS.SUP AND PORDERS.CURDATE BETWEEN :$.FDT AND :$.TDT AND SUPPLIERS.SUP > 0 AND SUPPLIERS.SUPTYPE = (((:LINKED <> :UNLINKED) AND (SUPPLIERS.SUPTYPE = 0)) ? -1 : SUPPLIERS.SUPTYPE) ...
One advantage of using a cursor is that suppliers who have no purchase orders within the given time period can be included in the record set, although such a left join (as it is known in standard SQL) can be achieved by adding a question mark after the word PORDERS in the FROM clause. Of course, using a cursor enables further calculations which can't be squeezed into the INSERT INTO/SELECT type.

No comments: