How do I stop parsing an XML document with IVBSAXXMLReader in Delphi?
In order to quickly parse some large XML documents in a Delphi (2007) program, I have implemented the IVBSAXContentHandler interface and use it like this:
FXMLReader := CoSAXXMLReader60.Create;
FXMLReader.contentHandler := Self;
FXMLReader.parseURL(FXmlFile);
This works fine, as long as I simply parse the whole file, but I'd like to stop once I found the content that I am looking for. So my implementation of IVBSAXContentHandler.startElement checks for some condition and when it is true should abort further parsing. I tried this:
procedure TContentHandler.startElement(var strNamespaceURI, strLocalName, strQName: WideString; const oAttributes: IVBSAXAttributes);
begin
if SomeCondition then
SysUtils.Abort;
end;
Unfortunately this raises the rather unhelpful EOleException "Catastrophic failure". (I also tried raising a custom exception with the same result.)
MSDN says the following:
The ErrorHandler interface essentially allows the XMLReader to signal the ContentHandler implementation that it wants to abort processing. Conversely, ContentHandler implementations can indicate to the XMLReader that it wants to abort processing. This can be accomplished by simply raising an application-specific exception. This is especially useful for aborting processing once the implementation finds what it is looking for:
Private Sub IVBSAXContentHandler_characters(ByVal strChars As String)
' I found what I was looking for, abort processing
Err.Raise vbObjectError + errDone, "startElement", _
"I got what I want, let's go play!"
End Sub
So, apparently somehow I also need to implement the IVBSAXErrorHandler interface somehow. This interface needs three methods:
procedure TContentHandler.error(const oLocator: IVBSAXLocator;
var strErrorMessage: WideString; nErrorCode: Integer);
begin
end;
procedure TContentHandler.fatalError(const oLocator: IVBSAXLocator;
var strErrorMessage: WideString; nErrorCode: Integer);
begin
end;
procedure TContentHandler.ignorableWarning(const oLocator: IVBSAXLocator;
var strErrorMessage: WideString; nErrorCode: Integer);
begin
end;
and also must be assigned before calling the ParseURL method:
FXMLReader := CoSAXXMLReader60.Create;
FXMLReader.contentHandler := Self;
FXMLReader.errorHandler := Self;
FXMLReader.parseURL(FXmlFile);
Unfortunately that doesn't make any difference, because now the fatalError handler gets called with strErrorMessage = 'Catastrophic failure'. With an empty method body this still results in the above mentioned unhelpful EOleException "Catastrophic failure".
So, now I am out of ideas:
EDIT:
Based on Ondrej Kelle's answer, here is the solution I finally used:
Declare the following constant:
const
// idea taken from Delphi 10.1 unit System.Win.ComObj:
EExceptionRaisedHRESULT = HResult(E_UNEXPECTED or (1 shl 29)); // turn on customer bit
Add two new fields to the TContentHandler class:
FExceptObject: TObject;
FExceptAddr: Pointer;
Add this code to the destructor:
FreeAndNil(FExceptObject);
Add a new method SafeCallException:
function TContentHandler.SafeCallException(ExceptObject: TObject; ExceptAddr: Pointer): HResult;
var
GUID: TGUID;
exc: Exception;
begin
if ExceptObject is Exception then begin
exc := Exception(ExceptObject);
// Create a copy of the exception object and store it in the FExceptObject field
FExceptObject := exc.NewInstance;
Exception(FExceptObject).Create(exc.Message);
Exception(FExceptObject).HelpContext := exc.HelpContext;
// Store the exception address in the FExceptAddr field
FExceptAddr := ExceptAddr;
// return a custom HRESULT
Result := EExceptionRaisedHRESULT;
end else begin
ZeroMemory(@GUID, SizeOf(GUID));
Result := HandleSafeCallException(ExceptObject, ExceptAddr, GUID, '', '');
end;
end;
Add an exception handler to the calling code:
var
exc: Exception;
begin
try
FXMLReader := CoSAXXMLReader60.Create;
FXMLReader.contentHandler := Self;
// we do not need an errorHandler
FXMLReader.parseURL(FXmlFile);
FXMLReader := nil;
except
on e: EOleException do begin
// Check for the custom HRESULT
if e.ErrorCode = EExceptionRaisedHRESULT then begin
// Check that the exception object is assigned
if Assigned(FExceptObject) then begin
exc := Exception(FExceptObject);
// set the pointer to NIL
FExceptObject := nil;
// raise the exception a the given address
raise exc at FExceptAddr;
end;
end;
// fallback: raise the original exception
raise;
end;
end;
end;
While this works for me, it has a serious flaw: It copies only the Message and the HelpContext property of the original exception. So, if there are more properties/fields, eg
EInOutError = class(Exception)
public
ErrorCode: Integer;
end;
These will not be initialized when the exception is re-raised in the calling code.
The advantage is that you will get the correct exception address in the debugger. Beware that you won't get the correct call stack.
Simply calling Abort;
is fine. In this case, just override SafeCallException
in your IVBSAXContentHandler
implementor class:
function TContentHandler.SafeCallException(ExceptObject: TObject; ExceptAddr: Pointer): HRESULT;
begin
Result := HandleSafeCallException(ExceptObject, ExceptAddr, TGUID.Empty, '', '');
end;
HandleSafeCallException
supplied in ComObj
will cause EAbort
you're raising to be translated into a HRESULT
value E_ABORT
which will then be translated back to EAbort
by SafeCallError
.
Alternatively, you can raise your own exception class, override SafeCallException
to translate it into your specific HRESULT
value and replace SafeCallErrorProc
with your own to translate it back into your Delphi exception which you can then handle on the calling side.