A while back, I wrote a series of articles for beginning DBAs. Not on the technical side of how to handle SQL Server, but on the subject of “things I wish I’d known when I was starting out”.
Here’s the first:
DBA Tips & Tricks: Document “Why” Not “What”
One thing I find in reviewing and refactoring older database code, that makes it much more difficult to figure out what to fix and how, is the documentation provided.
We’ve all run into situations where there simply isn’t any documentation, or it consists of “--TODO: Document this section” or something equally unhelpful.
Many DBAs and developers will even claim that “good code documents itself” as a justification for never getting around to documenting it. Sometimes this is seen as lazy, sometimes it’s seen as clever. What it really amounts to, in my opinion, is a complete misunderstanding of what documentation is about.
Here’s an example of what some might call “well-documented code”:
--smalldatetime fix
ALTER FUNCTION [dbo].[fDateFix] (@dateval DATETIME)
RETURNS SMALLDATETIME
AS
BEGIN
-- Declare Variables
DECLARE @sdate SMALLDATETIME
DECLARE @edate SMALLDATETIME
DECLARE @newdate SMALLDATETIME
-- Set values
SET @sdate = CAST('1900-01-01' AS SMALLDATETIME);
SET @edate = CAST('2079-06-06' AS SMALLDATETIME);
-- Check value of input, correct if needed
IF @dateval IS NOT NULL
BEGIN
IF @dateval < @sdate
BEGIN
SET @newdate = @sdate
END
ELSE
IF @dateval > @edate
BEGIN
SET @newdate = @edate
END
ELSE
BEGIN
SET @newdate = @dateval
END
END
ELSE
BEGIN
SET @newdate = NULL
END
-- Return corrected date
RETURN @newdate
END
Each section says what it does. It clearly states that the function has the purpose of “smalldatetime fix”. The first part declares the variables, the next part assigns values to some of them, and so on.
This is a clear example of where “the code documents itself” is both true, and completely and utterly useless!
Why is it necessary to “fix” a smalldatetime? What are those “magical number” type dates that those variables are assigned to? Why does this function exist?
How about this, instead:
/*
Purpose: This takes DateTime input, checks if it is out-of-range
for datatype SmallDateTime, and sets to the closest
allowed value for that datatype if so.
Use when tables are designed with SmallDateTime and inputs
are not policed (free-form) in applications.
*/
ALTER FUNCTION [dbo].[fDateFix] (@dateval DATETIME)
RETURNS SMALLDATETIME
AS
BEGIN
DECLARE @sdate SMALLDATETIME
DECLARE @edate SMALLDATETIME
DECLARE @newdate SMALLDATETIME
SET @sdate = CAST('1900-01-01' AS SMALLDATETIME); -- Minimum allowed SmallDateTime value
SET @edate = CAST('2079-06-06' AS SMALLDATETIME); -- Maximum allowed SmallDateTime value
-- Check value of input, correct if needed
IF @dateval IS NOT NULL
BEGIN
IF @dateval < @sdate
BEGIN
SET @newdate = @sdate
END
ELSE
IF @dateval > @edate
BEGIN
SET @newdate = @edate
END
ELSE
BEGIN
SET @newdate = @dateval
END
END
ELSE
BEGIN
SET @newdate = NULL
END
RETURN @newdate
END
(Note, this function is one I copied out of a third-party application and is not intended to be an example of good code. It could obviously be improved by a blind man with a sledgehammer. It is used here simply to show the difference in documentation styles.)
Note that anyone reviewing the “what it does” version gets no help at all from the “documentation”. “Oh really? So this part declares variables? I would never have guessed that!”, isn’t something that one says without sarcasm.
But the “why was this written” documentation says things the code itself can’t say, and is useful to someone reviewing the code at a later time. It raises questions like, “Is there a comparable function that corrects from DateTime2 to DateTime? If so, why isn’t it a single function with the output type determined via a parameter?” And, “Why don’t they just police this in the application? Shouldn’t this be handled closer to the user, so they know that we’re just going to arbitrarily change their input?” And so on. Very useful to someone doing a refactor.
Documenting why changes were made, in the DDL scripts that implement the changes, can allow for a chronological study of either source control or a DDL log. Imagine the different between logging “Updates the table” type comments, vs “Per meeting with John Doe and Sue Smith on 1 Jan 2013, changed business rules to use UpdateDate column for the record instead of AuthorizationDate”. Suddenly, code changes that might otherwise be mysterious a year from now, are quite clear.
Keep this in mind when writing and reviewing code. In later years, you’ll be glad if you implement it.
If you have a good project ticketing system, “why” documentation can be as simple as “Purpose: See ticket DB-1103, notes from 3 May 13”. It adds a level of work to reading the documentation, since it’s not contained in the code, but allows for much richer “why” where that’s needed.
No comments:
Post a Comment