Tuesday, August 21, 2012

Artful Recovery


Error handling is not rocket science and yet most developers get this so far wrong as to be embarrassing. The real purpose of error handling is threefold: to limit the impact of unanticipated interruption on processing overall, to provide the user with a graceful recovery so they may continue work, and to allow a programmer to determine and correct the underlying cause of the problem.

To accomplish this in modern object-oriented languages you need to bubble and persist your errors. Exceptions can happen at any time and nearly anywhere: the network could go down; a SQL log file can run out of space; a server can crash. Knowing this, you should never assume that your good intentions (say, to update a data record) will proceed unperturbed.

“Try-catch” all data access routines as well as any methods you are calling from another class or library. Think ahead and include code in your catch clause to clean up whatever your own class has started. Then, depending on what invokes your class, either rethrow the entire exception or persist it so that you can retrieve it later.
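Here is a minimal sketch of that advice. The `CustomerStore` class, its members, and the `saveRecord` delegate are all hypothetical names invented for illustration; the delegate simply stands in for whatever data access routine you are really calling.

```csharp
using System;

// Hypothetical class for illustration only.
public class CustomerStore
{
    // Exception kept for callers that cannot receive a throw.
    public Exception LastError { get; private set; }

    // True while this class has an update under way that it must clean up itself.
    public bool UpdateInProgress { get; private set; }

    // saveRecord stands in for the real data access routine.
    // rethrow = true  : bubble the exception up to the caller.
    // rethrow = false : persist it in LastError and report failure.
    public bool Update(Action saveRecord, bool rethrow)
    {
        LastError = null;
        UpdateInProgress = true;
        try
        {
            saveRecord();
            UpdateInProgress = false;
            return true;
        }
        catch (Exception ex)
        {
            // Resolve the situation this class initiated.
            UpdateInProgress = false;
            // Persist the exception so it can be retrieved later.
            LastError = ex;
            if (rethrow)
            {
                throw; // a bare "throw" keeps the original stack trace
            }
            return false;
        }
    }
}
```

A caller that can handle exceptions passes `rethrow: true`; one that cannot (a service boundary, say) passes `false` and inspects `LastError` afterwards.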

Of course, in your outermost class you need to decide how to notify the user of the exception. If your application is all client-side, create an error notification form that supplies both a plain-language description of the error and a “details” button that reveals the full technical error code and stack dump, including the dead module’s version and session information.
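The form itself depends on your UI toolkit, but the two pieces of text it shows can be sketched independently. The `ErrorReport` class, its method names, and the `sessionId` parameter below are assumptions made up for this example, not part of any framework.

```csharp
using System;
using System.Reflection;
using System.Text;

// Hypothetical helper: a real form would show UserMessage in a label and
// reveal Details when the user clicks the "details" button.
public static class ErrorReport
{
    // Plain-language text for the main part of the form.
    public static string UserMessage(string whatFailed)
    {
        return "Sorry, we could not " + whatFailed + ". Please try again or contact support.";
    }

    // Full technical text for tech support.
    public static string Details(Exception ex, string sessionId)
    {
        // Name and version of the assembly that contains this helper.
        AssemblyName module = Assembly.GetExecutingAssembly().GetName();
        var sb = new StringBuilder();
        sb.AppendLine("Module:  " + module.Name + " " + module.Version);
        sb.AppendLine("Session: " + sessionId);
        // Exception.ToString() includes the type, message, and stack trace.
        sb.AppendLine(ex.ToString());
        return sb.ToString();
    }
}
```

Keeping the text generation apart from the form means the same details can also go to a log or an email without touching the UI.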

The following pattern separates system-type errors (the query crashed) from errors in business logic. This is a perfectly reasonable way to handle errors, as system level errors might indicate a more severe situation that may merit immediate technical assistance, whereas business errors may simply be a missing or mistyped parameter to a function call.

Two key elements create this approach. The first is that you should have all your modules that perform core business-logic implement a commonly defined custom Error object. Here is an example of one I use:

using System.Reflection;

public struct MyErrorObject
{
    public string ErrAgentName;   // location of the calling assembly
    public string ErrAgentVsn;    // version of the calling assembly
    public string ErrDescription;
    public int ErrSeverity;       // 1 = warning, 2 = fatal
}

// The members below live inside each business-logic class.
public MyErrorObject ThisError;

public void SetErrObject(string ErrorReason, string WarnOrFatal)
{
    ThisError = new MyErrorObject();
    ThisError.ErrAgentName = Assembly.GetCallingAssembly().CodeBase;
    ThisError.ErrAgentVsn = Assembly.GetCallingAssembly().GetName().Version.ToString();
    ThisError.ErrDescription = ErrorReason;
    if (WarnOrFatal == "Fatal")
    {
        ThisError.ErrSeverity = 2;
    }
    else
    {
        ThisError.ErrSeverity = 1;
    }
}

Notice that I define a severity level that allows you, in the calling parent, to decide if you may continue processing with just a warning. Also notice that the SetErrObject logic sets version information using reflection to allow for more detailed debugging by tech support.

Second, methods in classes that implement this object should return a boolean to indicate success or failure. A typical method would then look like this:

public bool ValidateParameters()
{
    if (userid.Length < 3)
    {
        string invalidUser = "User ID " + userid + " is Invalid";
        this.SetErrObject(invalidUser, "Fatal");
        return false;
    }
    // … more validation stuff follows
    return true;
}

And the outer call to run the whole thing would look like:

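try
{
    if (!ValidateParameters())
    {
        // Business-logic failure: stop only when it is fatal (severity 2).
        if (ThisError.ErrSeverity > 1)
        {
            return false;
        }
    }
}
catch (Exception ex)
{
    // System-level failure: bubble it up, keeping the original as InnerException.
    throw new Exception("ValidateParameters Failed!", ex);
}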

Notice that we still try-catch the method and bubble up errors along both pathways: any exception from the called method gets thrown upward in the catch clause, and if the error was a business logic failure we return “false” to the user interface.

If your application is running on a server, have the top module send an appropriate email notification to tech support. If your app is a stateless service, you should persist the error to a database (along with the session key, naturally); furthermore, you should provide an additional method to retrieve this info and send it to the client side on a subsequent request.
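For the stateless-service case, the persist-then-retrieve idea might look roughly like this. The `ErrorLog` class and its methods are hypothetical, and the in-memory dictionary is only a stand-in for a real database table keyed by session; it is not thread-safe and does not survive a restart.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical store; the dictionary stands in for a database table.
public class ErrorLog
{
    private readonly Dictionary<string, List<string>> errorsBySession =
        new Dictionary<string, List<string>>();

    // Called by the service when a request fails.
    public void Persist(string sessionKey, Exception ex)
    {
        List<string> entries;
        if (!errorsBySession.TryGetValue(sessionKey, out entries))
        {
            entries = new List<string>();
            errorsBySession[sessionKey] = entries;
        }
        // "o" is the round-trip (ISO 8601) date format.
        entries.Add(DateTime.UtcNow.ToString("o") + " " + ex.GetType().Name + ": " + ex.Message);
    }

    // Called on a subsequent request so the client can show what went wrong.
    // Returns an empty list when the session has no recorded errors.
    public IList<string> Retrieve(string sessionKey)
    {
        List<string> entries;
        if (errorsBySession.TryGetValue(sessionKey, out entries))
        {
            return entries.AsReadOnly();
        }
        return new List<string>();
    }
}
```

The service would call `Persist` from its outermost catch clause and expose `Retrieve` through an extra endpoint that the client calls with its session key.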

When your application hits an error it is like a sick child that needs some loving attention. The difference between just bailing with a message-box and providing fully nuanced error-handling is like the difference between giving the sick kid some consommé or giving him a nice hot bowl of chunky chicken noodle soup and calling the doctor. Take the time to handle your errors with the same care you would dedicate to your own sick child.