IncompleteThoughts: DBA Article: Documentation

A while back, I wrote a series of articles for beginning DBAs. Not on the technical side of how to handle SQL Server, but on the subject of “things I wish I’d known when I was starting out”.

Here’s the first:

DBA Tips & Tricks: Document “Why” Not “What”

One thing I find in reviewing and refactoring older database code, that makes it much more difficult to figure out what to fix and how, is the documentation provided.

We’ve all run into situations where there simply isn’t any documentation, or it consists of “--TODO: Document this section” or something equally unhelpful.

Many DBAs and developers will even claim that “good code documents itself” as a justification for never getting around to documenting it. Sometimes this is seen as lazy, sometimes it’s seen as clever. What it really amounts to, in my opinion, is a complete misunderstanding of what documentation is about.

Here’s an example of what some might call “well-documented code”:


--smalldatetime fix
ALTER FUNCTION [dbo].[fDateFix] (@dateval DATETIME)
RETURNS SMALLDATETIME
AS 
    BEGIN
    -- Declare Variables  
        DECLARE @sdate SMALLDATETIME
        DECLARE @edate SMALLDATETIME
        DECLARE @newdate SMALLDATETIME
   
    -- Set values
        SET @sdate = CAST('1900-01-01' AS SMALLDATETIME); 
        SET @edate = CAST('2079-06-06' AS SMALLDATETIME); 
   
    -- Check value of input, correct if needed
        IF @dateval IS NOT NULL 
            BEGIN
                IF @dateval < @sdate 
                    BEGIN
                        SET @newdate = @sdate
                    END
                ELSE 
                    IF @dateval > @edate 
                        BEGIN
                            SET @newdate = @edate
                        END
                    ELSE 
                        BEGIN
                            SET @newdate = @dateval
                        END
            END
        ELSE 
            BEGIN
                SET @newdate = NULL 
            END
  
    -- Return corrected date          
        RETURN @newdate
    END

Each section says what it does. It clearly states that the function has the purpose of “smalldatetime fix”. The first part declares the variables, the next part assigns values to some of them, and so on.

This is a clear example of where “the code documents itself” is both true, and completely and utterly useless!

Why is it necessary to “fix” a smalldatetime? What are those “magical number” type dates that those variables are assigned to? Why does this function exist?

How about this, instead:

/*
   Purpose:    This takes DateTime input, checks if it is out-of-range
               for datatype SmallDateTime, and sets to the closest
               allowed value for that datatype if so.

               Use when tables are designed with SmallDateTime and inputs
               are not policed (free-form) in applications.
*/
ALTER FUNCTION [dbo].[fDateFix] (@dateval DATETIME)
RETURNS SMALLDATETIME
AS 
    BEGIN
        DECLARE @sdate SMALLDATETIME
        DECLARE @edate SMALLDATETIME
        DECLARE @newdate SMALLDATETIME
   
        SET @sdate = CAST('1900-01-01' AS SMALLDATETIME); -- Minimum allowed SmallDateTime value
        SET @edate = CAST('2079-06-06' AS SMALLDATETIME); -- Maximum allowed SmallDateTime value
   
       -- Check value of input, correct if needed
        IF @dateval IS NOT NULL 
            BEGIN
                IF @dateval < @sdate 
                    BEGIN
                        SET @newdate = @sdate
                    END
                ELSE 
                    IF @dateval > @edate 
                        BEGIN
                            SET @newdate = @edate
                        END
                    ELSE 
                        BEGIN
                            SET @newdate = @dateval
                        END
            END
        ELSE 
            BEGIN
                SET @newdate = NULL 
            END
  
        RETURN @newdate
    END

(Note, this function is one I copied out of a third-party application and is not intended to be an example of good code. It could obviously be improved by a blind man with a sledgehammer. It is used here simply to show the difference in documentation styles.)

Note that anyone reviewing the “what it does” version gets no help at all from the “documentation”. “Oh really? So this part declares variables? I would never have guessed that!”, isn’t something that one says without sarcasm.

But the “why was this written” documentation says things the code itself can’t say, and is useful to someone reviewing the code at a later time. It raises questions like, “Is there a comparable function that corrects from DateTime2 to DateTime? If so, why isn’t it a single function with the output type determined via a parameter?” And, “Why don’t they just police this in the application? Shouldn’t this be handled closer to the user, so they know that we’re just going to arbitrarily change their input?” And so on. Very useful to someone doing a refactor.

Documenting why changes were made, in the DDL scripts that implement the changes, can allow for a chronological study of either source control or a DDL log. Imagine the different between logging “Updates the table” type comments, vs “Per meeting with John Doe and Sue Smith on 1 Jan 2013, changed business rules to use UpdateDate column for the record instead of AuthorizationDate”. Suddenly, code changes that might otherwise be mysterious a year from now, are quite clear.

Keep this in mind when writing and reviewing code. In later years, you’ll be glad if you implement it.

If you have a good project ticketing system, “why” documentation can be as simple as “Purpose: See ticket DB-1103, notes from 3 May 13”. It adds a level of work to reading the documentation, since it’s not contained in the code, but allows for much richer “why” where that’s needed.

IncompleteThoughts

Saturday, October 5, 2013

DBA Article: Documentation

No comments:

Post a Comment