Friday, October 23, 2020

Overcoming the 'leading zero' problem in Excel

I see that I haven't written on the topic of Excel and automation for two years - I must be doing something right. In the management program that I wrote (and always extending) for the OP, there is frequently a need to output data to Excel. In order to facilitate this, I wrote a few standard routines that extract data from some form of query (this could be a clientdataset, an SQL query or a 'simple' query, hence the need for a few routines): these extract the data and output a CSV file along with column headings. Another routine is then called that inputs the CSV file into an Excel spreadsheet and displays it.

There is a well known problem with Excel - fields that appear to be numeric but are actually alphanumeric have leading zeroes truncated, as Excel 'thinks' that the field is numeric. In the OP's world, this problem applies to identity card numbers: many (including mine) begin with a zero (mine begins 0176), and when a csv file with this number is inserted into a worksheet, the field displays 176. Until now this hasn't been a problem, but a new (and somewhat nasty and patronising psychologist/analyst) has been working on data output from the management program and it annoys him that the leading zeros from identity numbers are being dropped.

My first solution was something similar to what Priority does: surround the field with quotation marks. I innocently thought that Excel would not display these marks but that's not the case. My 'solution' probably annoyed this psychologist even more as now he had to contend with strings like "0176". I had to find another solution.

Theoretically the problem could be solved easily by formatting the column that would hold the identity numbers, but this is a chicken and egg problem: formatting the data after it has been read in doesn't help as the leading zero has already been stripped, and formatting the column before reading in the file doesn't help because a new sheet is always created with the data. I thought that I might be able to hack the problem by including a non-displayable character before the string but this didn't work either.

Excel does have a method for inserting data into an existing spreadsheet - I used to see this frequently when trying to input text files - via a wizard that helps the user to insert the file correctly, handling character sets and delimiters with ablomb. I recorded the process as a macro but there were too many problematic statements for me to translate into the required Delphi format so I discarded this approach.

Then I remembered that I already have implemented a solution - the varArray approach. This allows one to create a sheet, format its columns then input the data in an 'underhand' manner. I adapted the code that I already have in one module to the module where I need it, and lo and behold, the leading zero problem had been solved.

I then removed the new code from the module and inserted it into the same general utility module that contains the other Excel routines. This required a few minimal changes and now there is an alternative method for exporting to Excel that both solves the leading zero problem and doesn't require an intermediate file.

After I reported on this improvement, the annoying psychologist condescendingly wrote to me to make sure that nothing else has been affected by this change. Obviously he's never heard of modular code.

No comments: