Hi all, got a nasty system hang issue that I have just about run out
of ideas on:


We only get the hang on an Asus A6000 U; we haven't seen it on any
other laptop and, in fact, the driver for this device has been WHQLed
in various versions for the last 6 months or so.

The device is a PCCard device with an FPGA interface (it is also made
by us). We are using WinDbg over a 1394 cable to debug.

The hang happens with release drivers, and with debug drivers that
have no DbgPrint calls. With a lot of DbgPrint output, the laptop
doesn't hang.

With just one DbgPrint at the very start of the ISR, we do get the
hang, but WinDbg does not show a continuous trace, so this is not an
unhandled interrupt: the ISR is not being called over and over.

We cannot break into the hung laptop using Ctrl+Break. We also cannot
generate a user-initiated crash via the Right Ctrl + Scroll Lock x 2
technique.

Re: System hang by fat_boy

fat_boy
Fri Apr 07 09:04:37 CDT 2006

Damn, hit the send button too soon...

It is also running with verifier enabled with default settings, so
deadlock detection is on.

If anyone has any ideas about how to break into the system, or what
could cause a laptop hang like this I would be very interested.

Thanks in advance.


Re: System hang by Eliyas

Eliyas
Sat Apr 08 09:47:49 CDT 2006

I would put a debug trace statement at the end of the ISR and in all the
routines that either acquire the interrupt lock (KeAcquireInterruptSpinLock)
explicitly or are called by KeSynchronizeExecution - just to make sure you
are not deadlocking at DIRQL. I don't have access to the source right now,
so I'm not sure whether deadlock verification is done for interrupt spinlocks.
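
As a rough illustration of the entry/exit tracing idea, here is a small
portable C model. It is only a sketch: trace_mark, trace_last_unfinished
and model_isr are hypothetical names, and the in-memory log stands in for
the DbgPrint calls a real driver would make at the start and end of the
ISR and of each routine that runs under the interrupt lock. The point is
that the last routine with an entry marker but no exit marker is where
execution stopped.

```c
#include <assert.h>
#include <string.h>

#define TRACE_MAX 64

/* One marker: which routine, and whether this is its entry or its exit. */
typedef struct {
    const char *routine;
    int         entered;   /* 1 = entry marker, 0 = exit marker */
} trace_rec_t;

static trace_rec_t g_trace[TRACE_MAX];
static int         g_trace_count;

/* Hypothetical stand-in for a DbgPrint at the top or bottom of a routine. */
void trace_mark(const char *routine, int entered)
{
    assert(routine != 0);
    if (g_trace_count < TRACE_MAX) {
        g_trace[g_trace_count].routine = routine;
        g_trace[g_trace_count].entered = entered;
        g_trace_count++;
    }
}

/* Return the most recently entered routine that never logged an exit,
 * or NULL if every entry has a later exit. Assumes no re-entrancy. */
const char *trace_last_unfinished(void)
{
    for (int i = g_trace_count - 1; i >= 0; i--) {
        int closed = 0;
        if (!g_trace[i].entered)
            continue;
        for (int j = i + 1; j < g_trace_count; j++) {
            if (!g_trace[j].entered &&
                strcmp(g_trace[j].routine, g_trace[i].routine) == 0) {
                closed = 1;
                break;
            }
        }
        if (!closed)
            return g_trace[i].routine;
    }
    return NULL;
}

/* Model of an ISR that calls one routine under the interrupt lock.
 * stuck_in_sync != 0 simulates a hang inside that routine. */
void model_isr(int stuck_in_sync)
{
    trace_mark("Isr", 1);
    trace_mark("SyncRoutine", 1);
    if (stuck_in_sync)
        return;            /* exit markers are never logged */
    trace_mark("SyncRoutine", 0);
    trace_mark("Isr", 0);
}
```

If the debugger output ends on an entry marker with no matching exit,
that routine is the first place to look for a spin at DIRQL.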

-Eliyas



Re: System hang by fat_boy

fat_boy
Mon Apr 10 09:16:02 CDT 2006

Interesting, I did that, and the PC didn't hang, even with the most
minimal tracing.

I eventually discovered that putting a KeStallExecutionProcessor(1) at
the end of the ISR (with no tracing) also 'fixes' the problem.

This only occurs, don't forget, on an Asus A6000. The cause is some
kind of timing issue between the bus driver and the Ricoh CardBus
controller. This isn't my driver, and I don't follow the design
precisely, but the ISR only disables interrupts at the end, when it is
close to calling IoInsertDpcForIsr(). And even then it only disables
interrupts on selected channels.

I think that perhaps the CardBus controller hangs the PCI bus if it
gets an interrupt disable followed by a register access, or, because
this is a two-channel device, if interrupts are too close together.

Thanks anyway.


Re: System hang by fat_boy

fat_boy
Wed Apr 19 04:32:33 CDT 2006

Hi all

Turns out it is a caching issue.

A data read from device memory after a write flushes the cache, and we
don't get the hang.
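
For anyone finding this later, the write-then-read-back pattern looks
roughly like the sketch below. It is plain portable C, not the actual
driver code: fpga_regs_t and write_then_read_back are made-up names, and
the register layout is an assumption since the real one is not shown in
this thread. In a Windows driver the two accesses would more likely be
WRITE_REGISTER_ULONG followed by READ_REGISTER_ULONG on the mapped
device address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register block; the real device layout is not known here. */
typedef struct {
    volatile uint32_t control;
    volatile uint32_t status;
} fpga_regs_t;

/* Write a device register, then immediately read it back.
 * The read forces the preceding write to actually reach the device
 * before the caller carries on. The value read is returned so the
 * caller can check it if that is useful. */
uint32_t write_then_read_back(volatile uint32_t *reg, uint32_t value)
{
    assert(reg != 0);
    *reg = value;    /* the write may be held up on its way to the device */
    return *reg;     /* volatile read: the compiler cannot drop it */
}
```

The volatile qualifier matters: without it the compiler is free to
remove the read-back as dead code, and the flush never happens.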