We had a requirement recently to manipulate the content of an incoming pdf document. The .net application in question already used a pdf document manipulation library so we decided to use that to interrogate the content of the document (after scanning it for viruses). If the document was not a valid pdf document (ie it was corrupt or it was not a pdf file) we needed to return a message to the user politely asking them to try a different file.
public bool ValidatePdf(byte[] pdfFile)
{
ThirdPartyLibrary.Pdf = new Pdf(pdfFile);
}
We pass the pdf as a byte array to the ThirdPartyLibrary Pdf constructor. If the byte array isn't a valid pdf document the ThirdPartyLibrary throws System.Exception.
An error occurred, but we can recover.
The key here is that while an error has occurred, it's a recoverable error. Rather than the application crashing we want to handle this particular case and return a friendly message to the user.
Swallowing errors!
public bool ValidatePdf(byte[] pdfFile)
{
try
{
ThirdPartyLibrary.Pdf = new Pdf(pdfFile);
}
catch(Exception)
{
return false;
}
return true;
}
While this code may do what we want, it also allows any number of errors that occur lower in the call stack to go unnoticed. That same constructor also throws System.ArgumentException, System.IOException and System.NullReferenceException (oh and it throws System.Exception for a number of other error cases too). If we catch the System.Exception thrown when an invalid pdf (or non-pdf) file is provided and return false, how can we guarantee that an invalid byte array caused the exception? How do we know that a perfectly valid PDF wasn't provided but something else went wrong?
Ideally System.Exception should only be caught in the global exception handler. We can do a number of things with the code above to try and avoid suppressing other errors, including checking the message on the Exception object and re-throwing if it doesn't match the string we expect. This isn't particularly elegant or robust.
When something goes wrong, throw a custom exception.
If the ThirdPartyLibrary had thrown a custom exception (e.g. InvalidPdfContentException) that extended Exception (not ApplicationException) the code could have looked like this:
public bool ValidatePdf(byte[] pdfFile)
{
try
{
ThirdPartyLibrary.Pdf = new Pdf(pdfFile);
}
catch(InvalidPdfContentException exception)
{
//logging could be useful here
return false;
}
return true;
}
With this code any other exceptions that are thrown (errors that we can't or don't want to recover from) will not be swallowed and can be handled by the global exception handler.
You should never write throw new System.Exception(). If the framework doesn't contain an appropriate exception type, create a custom exception. It doesn't need to do anything, other than being an indentifiable type.
Error codes vs exceptions
Chris Brumme from the CLR team wrote a great article on The Exception Model where he explains in depth how exceptions work. As exceptions are expensive, you should use them for exceptional (funny that) circumstances only. If exceptions are being thrown as part of the normal flow of the application or on a frequent basis you should re-work your API to return the error without the use of an exception (think Int.TryParse).
As our pdf code above likely falls into the case where exceptions are occuring too frequently (however bear in mind that our code is triggered by a user clicking a button and providing a document, not a server process that executes 1000 times per second) the best solution to the above could have been for our ThirdPartyLibrary to provide a bool ValidatePdfContent(byte[] pdfContent) method that checked the document content internally without the use of exceptions.
What are your thoughts and experiences with .net error handling?