Saturday, October 5, 2013

DBA Article: Documentation

A while back, I wrote a series of articles for beginning DBAs.  Not on the technical side of how to handle SQL Server, but on the subject of “things I wish I’d known when I was starting out”.

Here’s the first:

DBA Tips & Tricks: Document “Why” Not “What”

One thing I find in reviewing and refactoring older database code, that makes it much more difficult to figure out what to fix and how, is the documentation provided.

We’ve all run into situations where there simply isn’t any documentation, or it consists of “--TODO: Document this section” or something equally unhelpful.

Many DBAs and developers will even claim that “good code documents itself” as a justification for never getting around to documenting it. Sometimes this is seen as lazy, sometimes it’s seen as clever. What it really amounts to, in my opinion, is a complete misunderstanding of what documentation is about.

Here’s an example of what some might call “well-documented code”:


--smalldatetime fix
ALTER FUNCTION [dbo].[fDateFix] (@dateval DATETIME)
RETURNS SMALLDATETIME
AS
    BEGIN
   
-- Declare Variables 
       
DECLARE @sdate SMALLDATETIME
       
DECLARE @edate SMALLDATETIME
       
DECLARE @newdate SMALLDATETIME
  
   
-- Set values
       
SET @sdate = CAST('1900-01-01' AS SMALLDATETIME);
       
SET @edate = CAST('2079-06-06' AS SMALLDATETIME);
  
   
-- Check value of input, correct if needed
       
IF @dateval IS NOT NULL
           
BEGIN
                IF
@dateval < @sdate
                   
BEGIN
                        SET
@newdate = @sdate
                   
END
                ELSE
                    IF
@dateval > @edate
                       
BEGIN
                            SET
@newdate = @edate
                       
END
                    ELSE
                        BEGIN
                            SET
@newdate = @dateval
                       
END
            END
        ELSE
            BEGIN
                SET
@newdate = NULL
           
END
 
   
-- Return corrected date         
       
RETURN @newdate
   
END

Each section says what it does. It clearly states that the function has the purpose of “smalldatetime fix”. The first part declares the variables, the next part assigns values to some of them, and so on.

This is a clear example of where “the code documents itself” is both true, and completely and utterly useless!

Why is it necessary to “fix” a smalldatetime? What are those “magical number” type dates that those variables are assigned to? Why does this function exist?

How about this, instead:

/*
   Purpose:    This takes DateTime input, checks if it is out-of-range
               for datatype SmallDateTime, and sets to the closest
               allowed value for that datatype if so.

               Use when tables are designed with SmallDateTime and inputs
               are not policed (free-form) in applications.
*/
ALTER FUNCTION [dbo].[fDateFix] (@dateval DATETIME)
RETURNS SMALLDATETIME
AS
    BEGIN
        DECLARE
@sdate SMALLDATETIME
       
DECLARE @edate SMALLDATETIME
       
DECLARE @newdate SMALLDATETIME
  
       
SET @sdate = CAST('1900-01-01' AS SMALLDATETIME); -- Minimum allowed SmallDateTime value
       
SET @edate = CAST('2079-06-06' AS SMALLDATETIME); -- Maximum allowed SmallDateTime value
  
       -- Check value of input, correct if needed
       
IF @dateval IS NOT NULL
           
BEGIN
                IF
@dateval < @sdate
                   
BEGIN
                        SET
@newdate = @sdate
                   
END
                ELSE
                    IF
@dateval > @edate
                       
BEGIN
                            SET
@newdate = @edate
                       
END
                    ELSE
                        BEGIN
                            SET
@newdate = @dateval
                       
END
            END
        ELSE
            BEGIN
                SET
@newdate = NULL
           
END
 
        RETURN
@newdate
   
END

(Note, this function is one I copied out of a third-party application and is not intended to be an example of good code. It could obviously be improved by a blind man with a sledgehammer. It is used here simply to show the difference in documentation styles.)

Note that anyone reviewing the “what it does” version gets no help at all from the “documentation”. “Oh really? So this part declares variables? I would never have guessed that!”, isn’t something that one says without sarcasm.

But the “why was this written” documentation says things the code itself can’t say, and is useful to someone reviewing the code at a later time. It raises questions like, “Is there a comparable function that corrects from DateTime2 to DateTime? If so, why isn’t it a single function with the output type determined via a parameter?” And, “Why don’t they just police this in the application? Shouldn’t this be handled closer to the user, so they know that we’re just going to arbitrarily change their input?” And so on. Very useful to someone doing a refactor.

Documenting why changes were made, in the DDL scripts that implement the changes, can allow for a chronological study of either source control or a DDL log. Imagine the different between logging “Updates the table” type comments, vs “Per meeting with John Doe and Sue Smith on 1 Jan 2013, changed business rules to use UpdateDate column for the record instead of AuthorizationDate”. Suddenly, code changes that might otherwise be mysterious a year from now, are quite clear.

Keep this in mind when writing and reviewing code. In later years, you’ll be glad if you implement it.

If you have a good project ticketing system, “why” documentation can be as simple as “Purpose: See ticket DB-1103, notes from 3 May 13”. It adds a level of work to reading the documentation, since it’s not contained in the code, but allows for much richer “why” where that’s needed.

No comments:

Post a Comment