How to debug nondeterministic access violation crash?

Our C#/COM/C++ application is crashing and I need help debugging it. Running with gflags enabled and WinDbg attached, we determined the crashes are caused by an access violation, but we haven't been able to narrow it down any more than that. We are not seeing the issue on all machines; there are a couple of machines that seem to reproduce the issue frequently but not deterministically. We have observed the application crash from simply switching away from the application (say, Alt-Tab) and then back. Output from WinDbg is below.

We have been trying to systematically comment out areas of code that could be causing the problem, but we haven't had much success yet.

Any suggestions on what debugging steps or tools we should try?

!analyze -v

EXCEPTION_RECORD: ffffffff -- (.exr 0xffffffffffffffff) ExceptionAddress: 1a584ff2 (+0x1a584ff1)
ExceptionCode: c0000005 (Access violation) ExceptionFlags: 00000000 NumberParameters: 2 Parameter[0]: 00000000 Parameter[1]: 1a584ff2 Attempt to read from address 1a584ff2

PROCESS_NAME: ProcessFiles.exe

ERROR_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.

EXCEPTION_CODE: (NTSTATUS) 0xc0000005 - The instruction at 0x%08lx referenced memory at 0x%08lx. The memory could not be %s.

EXCEPTION_PARAMETER1: 00000000

EXCEPTION_PARAMETER2: 1a584ff2

READ_ADDRESS: 1a584ff2

FOLLOWUP_IP: Ed20+1a584ff1 1a584ff2 ?? ???

NTGLOBALFLAG: 2000000

APPLICATION_VERIFIER_FLAGS: 0

IP_MODULE_UNLOADED: Ed20+1a584ff1 1a584ff2 ?? ???

MANAGED_STACK: (TransitionMU) 0EC6F6F4 7B1D8CCE System_Windows_Forms_ni!System.Windows.Forms.Application+ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(Int32, Int32, Int32)+0x24e 0EC6F790 7B1D8937 System_Windows_Forms_ni!System.Windows.Forms.Application+ThreadContext.RunMessageLoopInner(Int32, System.Windows.Forms.ApplicationContext)+0x177 0EC6F7E4 7B1D8781 System_Windows_Forms_ni!System.Windows.Forms.Application+ThreadContext.RunMessageLoop(Int32, System.Windows.Forms.ApplicationContext)+0x61 0EC6F814 7B195911 System_Windows_Forms_ni!System.Windows.Forms.Application.Run(System.Windows.Forms.Form)+0x31 0EC6F828 0969D97A Extract_Utilities_Forms!Extract.Utilities.Forms.VerificationForm`1[[System.__Canon, mscorlib]].A(System.Object)+0x23a 0EC6F8C0 79A00EEE mscorlib_ni!System.Threading.ThreadHelper.ThreadStart_Context(System.Object)+0x72a25e 0EC6F8CC 792E019F mscorlib_ni!System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext , System.Threading.ContextCallback, System.Object)+0x6f 0EC6F8E4 797DB48A mscorlib_ni!System.Threading.ThreadHelper.ThreadStart(System.Object)+0x4a (TransitionUM)

LAST_CONTROL_TRANSFER: from 7e418734 to 1a584ff2

FAULTING_THREAD: ffffffff

ADDITIONAL_DEBUG_TEXT: Followup set based on attribute [ip_not_executable] from Frame:[0] on thread:[e30]

BUGCHECK_STR: APPLICATION_FAULT_BAD_INSTRUCTION_PTR_INVALID_POINTER_READ_WRONG_SYMBOLS_WINDOW_HOOK

PRIMARY_PROBLEM_CLASS: BAD_INSTRUCTION_PTR

DEFAULT_BUCKET_ID: BAD_INSTRUCTION_PTR

STACK_TEXT: 7b1d8cce System_Windows_Forms_ni!System.Windows.Forms.Application+ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop+0xc 7b1d8937 System_Windows_Forms_ni!System.Windows.Forms.Application+ThreadContext.RunMessageLoopInner+0x0 7b1d8781 System_Windows_Forms_ni!System.Windows.Forms.Application+ThreadContext.RunMessageLoop+0x0 7b195911 System_Windows_Forms_ni!System.Windows.Forms.Application.Run+0x31 0969d97a Extract_Utilities_Forms!Extract.Utilities.Forms.VerificationForm`1[[System.__Canon, mscorlib]].A+0x23a 79a00eee mscorlib_ni!System.Threading.ThreadHelper.ThreadStart_Context+0x72a25e 792e019f mscorlib_ni!System.Threading.ExecutionContext.Run+0x6f 797db48a mscorlib_ni!System.Threading.ThreadHelper.ThreadStart+0x4a

STACK_COMMAND: .ecxr ; ~~[e30] ; .frame 0 ; ** Pseudo Context ** ; kb

FAILED_INSTRUCTION_ADDRESS: Ed20+1a584ff1 1a584ff2 ??
???

SYMBOL_NAME: Ed20

FOLLOWUP_NAME: MachineOwner

MODULE_NAME: Ed20

IMAGE_NAME: Ed20

DEBUG_FLR_IMAGE_TIMESTAMP: 0

FAILURE_BUCKET_ID: BAD_INSTRUCTION_PTR_c0000005_Ed20!Unloaded

BUCKET_ID: APPLICATION_FAULT_BAD_INSTRUCTION_PTR_INVALID_POINTER_READ_WRONG_SYMBOLS_WINDOW_HOOK_BAD_IP_Ed20

Followup: MachineOwner


Find a computer that reproduces the crash fairly often, and install WinDbg on that computer. Then run windbg.exe -I , which will make WinDbg the post-mortem crash handler.

Wait for the crash to occur. When that happens, WinDbg will automatically open at the point of the crash. Use the WinDbg command kpn to get a stack trace. (You may need to ensure that you have symbols on the machine, too.)


Thanks for the replies.

We ended up finding the problem by evaluating all code changes since the last version of the software (which didn't crash). The culprit was that HideSelection was being set to false in the OnLostFocus override of a text control. Per this post-- that causes bad things to happen (well, on some machines, anyway).

链接地址: http://www.djcxy.com/p/82312.html

上一篇: 如何检测MinGW下的堆损坏错误?

下一篇: 如何调试非确定性访问冲突崩溃?