We have our WDF driver up and running with an 8-lane PCI Express
card. It uses hardware scatter-gather DMA and appears to be working
well. However, we are only getting about 50% of the theoretical
bandwidth over 8 lanes (1 GB/s instead of 2 GB/s). Unfortunately, we
need to hit around 60%. The 50% figure doesn't include driver overhead;
it is measured by the hardware during the transfer.
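For reference, here is a rough sketch of where the 2 GB/s figure comes from. It assumes a first-generation link (2.5 GT/s per lane with 8b/10b encoding), which the post does not state but which matches the 2 GB/s quoted for 8 lanes. The function and constant names are made up for illustration.

```python
# Assumption (not stated in the post): Gen1 link, 2.5 GT/s per lane,
# 8b/10b encoding (8 data bits carried in every 10 bits on the wire).
TRANSFERS_PER_S_PER_LANE = 2_500_000_000
DATA_BITS_PER_SYMBOL = 8
WIRE_BITS_PER_SYMBOL = 10


def link_bandwidth_bytes_per_s(lanes):
    """Raw one-direction link bandwidth in bytes/s, before packet overhead."""
    data_bits_per_s = (lanes * TRANSFERS_PER_S_PER_LANE
                       * DATA_BITS_PER_SYMBOL // WIRE_BITS_PER_SYMBOL)
    return data_bits_per_s // 8


theoretical = link_bandwidth_bytes_per_s(8)   # x8 link
measured = 1_000_000_000                      # ~1 GB/s seen by the hardware
target = theoretical * 60 // 100              # the 60% we need

print(f"theoretical: {theoretical / 1e9:.2f} GB/s")
print(f"measured:    {measured / theoretical:.0%} of theoretical")
print(f"target:      {target / 1e9:.2f} GB/s")
```

Under these assumptions the 60% target works out to about 1.2 GB/s, so the gap to close is roughly 200 MB/s.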
One of the main factors that can affect PCI Express efficiency is the
payload size. From what I have read, the maximum payload is 4 KB, which
is supposed to give close to 100%. For 128-byte payloads with an
estimated application overhead it's around 55%, and for 64 bytes it is
around 40%. On our AMD platform, as our measurement does not include
that overhead, it is feasible that we're getting 64-byte payloads (our
card supports 4 KB).
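To make the payload-size effect concrete, here is a minimal model of per-packet efficiency. It counts only the fixed bytes wrapped around each memory-write TLP on a Gen1 link and assumes a 3DW (32-bit address) header; 64-bit addressing adds 4 more bytes. It ignores flow-control and acknowledgement traffic and any application overhead, so the results are upper bounds and come out higher than the 55% and 40% estimates quoted above.

```python
# Assumed fixed cost per TLP (Gen1 framing, 3DW header, no ECRC):
#   2 bytes start/end framing + 2 bytes sequence number
#   + 12 bytes header + 4 bytes LCRC
TLP_OVERHEAD_BYTES = 2 + 2 + 12 + 4


def payload_efficiency(payload_bytes, overhead_bytes=TLP_OVERHEAD_BYTES):
    """Fraction of link bytes that carry payload for a given TLP size."""
    return payload_bytes / (payload_bytes + overhead_bytes)


for size in (64, 128, 256, 4096):
    print(f"{size:5d}-byte payload: {payload_efficiency(size):.1%}")
```

Even in this optimistic model, moving from 64-byte to 128-byte payloads recovers roughly ten percentage points, which is about the size of the gap we are trying to close.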
A few things we have noticed so far:
- The ATI X1900 XT has a maximum payload size of 128 bytes (according
  to the SiSoft tool).
- The Intel 965 chipset has a maximum payload size of 128 bytes
  (according to the data sheet).
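The payload sizes reported by such tools come from the PCI Express capability structure in config space. The sketch below decodes them from raw register values; how the values are read (a config-space dump, a tool, or the driver itself) is left open, and the function names are hypothetical. The field positions are the standard ones: Max_Payload_Size Supported in Device Capabilities bits 2:0, the configured Max_Payload_Size in Device Control bits 7:5, and Max_Read_Request_Size in Device Control bits 14:12.

```python
def decode_size_field(field):
    """Map a 3-bit size encoding to bytes: 0 -> 128, 1 -> 256, ... 5 -> 4096."""
    if not 0 <= field <= 5:
        raise ValueError(f"reserved size encoding: {field}")
    return 128 << field


def max_payload_supported(dev_cap):
    """Largest payload the device can handle (Device Capabilities, bits 2:0)."""
    return decode_size_field(dev_cap & 0x7)


def max_payload_configured(dev_ctl):
    """Payload size actually programmed (Device Control, bits 7:5)."""
    return decode_size_field((dev_ctl >> 5) & 0x7)


def max_read_request(dev_ctl):
    """Largest read request the device may issue (Device Control, bits 14:12)."""
    return decode_size_field((dev_ctl >> 12) & 0x7)


# Example values, made up for illustration: a card that supports 4096 bytes
# but has been configured for 128-byte payloads and 512-byte read requests.
dev_cap, dev_ctl = 0x0005, 0x2810
print(max_payload_supported(dev_cap),
      max_payload_configured(dev_ctl),
      max_read_request(dev_ctl))
```

The configured value is the one that matters for throughput: it is set by system software and cannot exceed what the other end of the link supports, so a card that supports 4 KB may still be running with 128-byte payloads.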
Here are a couple of questions I hope someone can shed some light on:
1) On an AMD system, PCI Express is interfaced through HyperTransport,
which is packet-based and has a maximum payload size of 64 bytes. Does
this mean its PCI Express interface is also limited to 64-byte payloads?
2) Does anyone know of a motherboard chipset that supports large
payloads (4 KB)?
Apologies if this is slightly off topic, but I figured it is something
that will affect anyone writing a PCI Express driver.