My question is as follows:

Suppose I have a number of driver-allocated buffers. How can I map them
into a single contiguous user-space buffer? I guess I should build an
MDL that describes the physical pages backing these buffers and call
MmMapLockedPagesSpecifyCache(), but has anyone done this in practice?
There seems to be no function to add several different pages to an MDL.

Thanks a lot in advance,
Greg.

Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Maxim

Maxim
Thu Aug 03 13:34:29 CDT 2006

Very bad idea. You expose kernel pool memory to user mode, which can cause
security leaks.
Why do such a strange thing?

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com

<grishka@gmail.com> wrote in message
news:1154629702.353304.130590@i42g2000cwa.googlegroups.com...
> My question is as follows:
>
> Suppose I have a number of driver-allocated buffers. How can I map them
> into a single contiguous user-space buffer. I guess I should build an
> MDL that describes the physical pages mapped for these buffers and call
> MmMapLockedPagesSpecifyCache(), but did someone do this in practice?
> There seems to be no function to add several different pages to an MDL.
>
> Thanks a lot in advance,
> Greg.
>


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Don

Don
Thu Aug 03 13:35:10 CDT 2006

Even if you describe them with an MDL, you have not exported them to user
space. Your model is flawed; have the user-space application pass the buffer
to the driver. Having the driver do the allocation always produces a copy.

--
Don Burn (MVP, Windows DDK)
Windows 2k/XP/2k3 Filesystem and Driver Consulting
http://www.windrvr.com
Remove StopSpam from the email to reply

<grishka@gmail.com> wrote in message
news:1154629702.353304.130590@i42g2000cwa.googlegroups.com...
> My question is as follows:
>
> Suppose I have a number of driver-allocated buffers. How can I map them
> into a single contiguous user-space buffer. I guess I should build an
> MDL that describes the physical pages mapped for these buffers and call
> MmMapLockedPagesSpecifyCache(), but did someone do this in practice?
> There seems to be no function to add several different pages to an MDL.
>
> Thanks a lot in advance,
> Greg.
>



Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by grishka

grishka
Thu Aug 03 15:44:14 CDT 2006

Don,

I know that this is an unusual practice, but due to severe performance
constraints I do have to expose kernel-space buffers to user mode. In
short, these buffers are allocated by some other device driver, and I
can't copy the data into another buffer because of the associated
performance penalty. Were it a single buffer, it would be clear how to
map it into user space, but I'm talking about multiple buffers here.


Don Burn wrote:
> Even if you map them to a MDL you have not exported them to user space.
> Your model is flawed, have the user space application pass the buffer to the
> driver. Having the driver do the alloc always produces a copy.
>
> --
> Don Burn (MVP, Windows DDK)
> Windows 2k/XP/2k3 Filesystem and Driver Consulting
> http://www.windrvr.com
> Remove StopSpam from the email to reply


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by grishka

grishka
Thu Aug 03 15:45:03 CDT 2006

Maxim,

I know that this is an unusual practice, but due to severe performance
constraints I do have to expose kernel-space buffers to user mode. In
short, these buffers are allocated by some other device driver, and I
can't copy the data into another buffer because of the associated
performance penalty. Were it a single buffer, it would be clear how to
map it into user space, but I'm talking about multiple buffers here.



Maxim S. Shatskih wrote:
> Very bad idea. You expose kernel pool memory to user mode, which can cause
> security leaks.
> Why do such stange a thing?
>
> --
> Maxim Shatskih, Windows DDK MVP
> StorageCraft Corporation
> maxim@storagecraft.com
> http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Doron

Doron
Thu Aug 03 20:40:27 CDT 2006

Have you measured perf and seen that you have a perf problem, or is this just
a guess based on potential bandwidth, without measurement?

--
Please do not send e-mail directly to this alias. this alias is for
newsgroup purposes only.
This posting is provided "AS IS" with no warranties, and confers no rights.


<grishka@gmail.com> wrote in message
news:1154637854.081643.210730@h48g2000cwc.googlegroups.com...
> Don,
>
> I know that this is an unusual practice, but, due to some severe
> performance restrictions, I do have to expose kernel-space buffers to
> the user mode. In short, these buffers are allocated by some other
> device driver and I can't copy the data into another buffer because of
> the associated performance penalty. Would it be a single buffer, it's
> clear how to map it to the user space, but I'm talking about multiple
> ones here.



Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Maxim

Maxim
Fri Aug 04 00:51:53 CDT 2006

> I know that this is an unusual practice, but, due to some severe
> performance restrictions, I do have to expose kernel-space buffers to
> the user mode. In short, these buffers are allocated by some other
> device driver and I can't copy the data into another buffer because of
> the associated performance penalty. Would it be a single buffer, it's
> clear how to map it to the user space, but I'm talking about multiple
> ones here.

Correct approach:

- Allocate the buffer in user mode and pass it to the driver in a
DeviceIoControl call that uses METHOD_IN_DIRECT and stays pending. In the
driver, map Irp->MdlAddress into kernel space and enjoy your shared
memory :-) Do not forget to provide a cancel routine for the IRP.

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by grishka

grishka
Fri Aug 04 05:50:01 CDT 2006

Yes, this came out as the only solution after quite a bit of research.
Copying this amount of data eats up an unacceptable amount of CPU.

Doron Holan [MS] wrote:
> have you measured perf and seen that you have a perf problem or is this just
> a guess based on potential bandwidth and w/out measurement?
>
> --
> Please do not send e-mail directly to this alias. this alias is for
> newsgroup purposes only.
> This posting is provided "AS IS" with no warranties, and confers no rights.


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by grishka

grishka
Fri Aug 04 05:53:40 CDT 2006

OK, this is good, but my question was about mapping more than one
kernel-space buffer into ONE CONTIGUOUS user-mode virtual range. I know
that the MDL header is followed by an array of page indexes, but
there's no API to add separate pages to that array. Would it be OK if I
initialize an MDL, build this list manually and map it through
MmMap...()?
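
For reference, an MDL describes one virtually contiguous range with a single first-page offset and a byte count, so every page between the first and the last is covered in full. The following is a minimal sketch in portable C of the page-span arithmetic (it computes the same quantity as the WDK macro ADDRESS_AND_SIZE_TO_SPAN_PAGES) to show what a hand-built page list would expose when a buffer is not page-aligned. The 4 KB page size and all sketch_* names are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KB pages, as on x86/x64 Windows. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_PAGE_SIZE  ((uintptr_t)1 << SKETCH_PAGE_SHIFT)

/* Offset of an address within its page (what an MDL keeps as ByteOffset). */
uintptr_t sketch_byte_offset(uintptr_t va)
{
    return va & (SKETCH_PAGE_SIZE - 1);
}

/* Number of pages touched by the byte range [va, va + size). */
size_t sketch_span_pages(uintptr_t va, size_t size)
{
    return (size_t)((sketch_byte_offset(va) + size + (SKETCH_PAGE_SIZE - 1))
                    >> SKETCH_PAGE_SHIFT);
}

/* Bytes inside the spanned pages that do NOT belong to the buffer.
   Mapping whole pages to user mode would make these bytes visible too. */
size_t sketch_exposed_slack(uintptr_t va, size_t size)
{
    assert(size > 0);
    return sketch_span_pages(va, size) * SKETCH_PAGE_SIZE - size;
}
```

For example, a 4 KB buffer that starts 0x800 bytes into a page spans two pages, so mapping those pages would also make 4 KB of unrelated neighbouring memory visible.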

Maxim S. Shatskih wrote:
> > I know that this is an unusual practice, but, due to some severe
> > performance restrictions, I do have to expose kernel-space buffers to
> > the user mode. In short, these buffers are allocated by some other
> > device driver and I can't copy the data into another buffer because of
> > the associated performance penalty. Would it be a single buffer, it's
> > clear how to map it to the user space, but I'm talking about multiple
> > ones here.
>
> Correct approach:
>
> - allocate in user mode, pass the pending DeviceIoControl with METHOD_IN_DIRECT
> to the driver. In the driver, map Irp->MdlAddress to kernel space and enjoy
> your shared memory :-) do not forget to provide the cancel routine for the IRP.
>
> --
> Maxim Shatskih, Windows DDK MVP
> StorageCraft Corporation
> maxim@storagecraft.com
> http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Maxim

Maxim
Fri Aug 04 05:57:00 CDT 2006

> OK, this is good, but my question was about mapping more than one
> kernel-space buffer into ONE CONTIGUOUS user-mode virtual range.

Impossible. If you map several buffers, MM provides no guarantees at all
about where they will be mapped in user space.

Use one huge user-allocated buffer, chop it into parts and submit an
overlapped IOCTL for each part. This is the solution.
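
The "chop it into parts" step is plain offset arithmetic. A minimal sketch in portable C, assuming the big buffer is page-aligned and each part is a whole number of 4 KB pages; chunk_t, split_buffer and CHUNK_PAGE_SIZE are hypothetical names, and the overlapped DeviceIoControl call that would be issued per part is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_PAGE_SIZE 4096u   /* assumption: 4 KB pages */

typedef struct {
    size_t offset;   /* byte offset of the part within the big buffer */
    size_t length;   /* byte length of the part */
} chunk_t;

/* Split total_len bytes into parts of at most part_len bytes each.
   Fills out[] and returns the number of parts. Returns 0 if total_len is 0
   or if out[] (max_parts entries) is too small to hold every part. */
size_t split_buffer(size_t total_len, size_t part_len,
                    chunk_t *out, size_t max_parts)
{
    /* Page-multiple parts keep every part page-aligned. */
    assert(part_len != 0 && part_len % CHUNK_PAGE_SIZE == 0);

    size_t needed = (total_len + part_len - 1) / part_len;
    if (needed > max_parts)
        return 0;

    for (size_t i = 0; i < needed; i++) {
        size_t remaining;
        out[i].offset = i * part_len;
        remaining = total_len - out[i].offset;
        out[i].length = remaining < part_len ? remaining : part_len;
    }
    return needed;
}
```

Each resulting (offset, length) pair would then be handed to its own pending request, so the driver sees one MDL per part.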

> that the MDL header is followed by an array of page indexes, but
> there's no API to add separate pages to that array.

Surely not so.

>Would it be OK if I
> initialize an MDL, build this list manually and map it through
> MmMap...()?

Once more: mapping kernel pool memory to user space is a very bad idea.

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Maxim

Maxim
Fri Aug 04 05:54:30 CDT 2006

Then at least allocate in user mode and map in kernel mode, not vice versa.

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com

<grishka@gmail.com> wrote in message
news:1154688601.603603.57700@m73g2000cwd.googlegroups.com...
> Yes, this came out as the only solution after quite a bit of research.
> Copying this amount of data eats up an unacceptable amount of CPU.


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Maxim

Maxim
Fri Aug 04 05:57:30 CDT 2006

> Copying this amount of data eats up an unacceptable amount of CPU.

How many MB/s? What is your exact task?

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by grishka

grishka
Fri Aug 04 07:09:06 CDT 2006

1 GB/s. The data is delivered by a PCI device via DMA into kernel buffers
allocated by another (opaque) driver. Copying the data into an
IOCTL-supplied buffer pushes CPU usage above the acceptable limit (about
5% CPU usage on the most powerful machine I have, 20-30% on a typical
one). Once more, I cannot supply my own buffers for delivery; I can only
get pointers to those owned by the other driver.

Maxim S. Shatskih wrote:
> > Copying this amount of data eats up an unacceptable amount of CPU.
>
> How many MB/s? What is your exact task?
>
> --
> Maxim Shatskih, Windows DDK MVP
> StorageCraft Corporation
> maxim@storagecraft.com
> http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Don

Don
Fri Aug 04 07:17:18 CDT 2006

Sorry, but the opaque driver is bad news. If you are going to get
performance you need to fix that driver; any other scheme of yours is going
to result in a copy sooner or later.


--
Don Burn (MVP, Windows DDK)
Windows 2k/XP/2k3 Filesystem and Driver Consulting
http://www.windrvr.com
Remove StopSpam from the email to reply



<grishka@gmail.com> wrote in message
news:1154693345.963482.31520@p79g2000cwp.googlegroups.com...
>1 GB/s. The data is delivered by a PCI device DMA into kernel buffers
> allocated by another (opaque) driver. Copying the data into
> IOCTL-supplied buffer utilizes CPU above the acceptable limit (about 5%
> CPU usage on the most powerful machine I have, 20-30% on a typical
> one). Once more, I can not supply my buffers for delivery. I can just
> have pointers to those owned by the other driver.



Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Maxim

Maxim
Fri Aug 04 07:19:20 CDT 2006

> 1 GB/s. The data is delivered by a PCI device DMA into kernel buffers
> allocated by another (opaque) driver.

Why such a strange architecture? Why "opaque drivers"? Why not send pending
IOCTLs from the app and run DMA over their MDLs?

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by grishka

grishka
Fri Aug 04 07:22:29 CDT 2006

If that were good news, I wouldn't be asking my question here :)
I do have quite a bit of experience in kernel mode and I'm aware of the
"right" and "wrong" ways of doing things. However, that's the task I
have, and changing the other driver is definitely out of the question.

Don Burn wrote:
> Sorry but the opaque driver is bad news. If you are going to get
> performance you need to fix that driver, any other schema by you is going to
> result in a copy sooner or later.
>
>
> --
> Don Burn (MVP, Windows DDK)
> Windows 2k/XP/2k3 Filesystem and Driver Consulting
> http://www.windrvr.com
> Remove StopSpam from the email to reply


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by grishka

grishka
Fri Aug 04 07:26:50 CDT 2006

OK, I think you're pointing me in the right direction.
If I use a DIRECT_IO IOCTL, the system allocates and locks down the
buffer for me.
Do you mean I can actually CHANGE the MDL in the IRP or even REPLACE it
completely? If so, splitting the user-mode-allocated buffer and mapping
each separate part into one kernel-allocated buffer would solve my
problem.

Maxim S. Shatskih wrote:
> > 1 GB/s. The data is delivered by a PCI device DMA into kernel buffers
> > allocated by another (opaque) driver.
>
> Why such a strange architecture? Why "opaque drivers"? why not send the pending
> IOCTLs from the app and run DMA over their MDLs?
>
> --
> Maxim Shatskih, Windows DDK MVP
> StorageCraft Corporation
> maxim@storagecraft.com
> http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Don

Don
Fri Aug 04 07:29:18 CDT 2006

Then so is reasonable performance. Anything you try will be an extreme hack
that will likely not be stable under stress. Since you have already said
the system is going to be stressed with large inputs, and since you care
about performance, you are in a lose-lose situation.

Note, trying to hack around with the MDL is a dead end; there is more to the
memory manager than that.


--
Don Burn (MVP, Windows DDK)
Windows 2k/XP/2k3 Filesystem and Driver Consulting
http://www.windrvr.com
Remove StopSpam from the email to reply


<grishka@gmail.com> wrote in message
news:1154694149.595192.268790@m73g2000cwd.googlegroups.com...
> If that was a good news, I wouldn't ask my question here :)
> I do have quite a bit of experience in kernel mode and I'm aware of the
> "right" and "wrong" ways of doing things. However, that's the task I
> have and changing the other driver is definitely out of question.



Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by grishka

grishka
Fri Aug 04 07:35:33 CDT 2006

I agree with you about "extreme hacking", but remember that brilliant
things such as RegMon by SysInternals wouldn't have been possible had
their authors stuck to "standard" methods only and avoided patching the
system call pointer table.

Don Burn wrote:
> Then so is reasonable performance, anything you try will be an extreme hack
> that will likely not be stable under stress. Since you have already said
> the system is going to be stressed with large inputs since you care about
> performance, you are in a lose or lose situation.
>
> Note, trying to hack around with the MDL is a dead end, there is more to the
> memory manager than that.
>
>
> --
> Don Burn (MVP, Windows DDK)
> Windows 2k/XP/2k3 Filesystem and Driver Consulting
> http://www.windrvr.com
> Remove StopSpam from the email to reply


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Don

Don
Fri Aug 04 07:47:12 CDT 2006

No, call table hacking was minor hacking compared to what you are looking at.
It might be easier to look at writing your own kernel.


--
Don Burn (MVP, Windows DDK)
Windows 2k/XP/2k3 Filesystem and Driver Consulting
http://www.windrvr.com
Remove StopSpam from the email to reply


<grishka@gmail.com> wrote in message
news:1154694933.880099.60480@i42g2000cwa.googlegroups.com...
>I agree with you about "extreme hacking", but remember that brilliant
> things, such as RegMon by SysInternals wouldn't be possible had its
> authors stick to "standard" methods only and avoid patching system
> call pointer table.



Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Maxim

Maxim
Fri Aug 04 08:28:47 CDT 2006

> If I use a DIRECT_IO IOCTL, the system allocates and locks down the
> buffer for me.

No. The OS just locks the pages _of the user buffer_, describes them with an
MDL, and provides you with this MDL.

> Do you mean I can actually CHANGE the MDL in the IRP or even REPLACE it
> completely? If so, splitting the user-mode-allocated buffer and mapping
> each separate part into one kernel-allocated buffer would solve my
> problem.

Correct. IoBuildPartialMdl is your friend.
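
IoBuildPartialMdl takes a source MDL that already describes locked pages and fills in a second MDL describing a sub-range of the same buffer, reusing the source's page entries instead of locking anything again. The following is only a toy model of that idea in portable C: toy_mdl is a hypothetical structure, not the real MDL layout, toy_build_partial is not a WDK function, and the 4 KB page size is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12u
#define TOY_PAGE_SIZE  ((uintptr_t)1 << TOY_PAGE_SHIFT)
#define TOY_MAX_PAGES  16u

/* Toy stand-in for an MDL (NOT the real layout). */
typedef struct {
    uintptr_t start_va;            /* page-aligned base of the buffer */
    uint32_t  byte_offset;         /* offset of first byte in first page */
    uint32_t  byte_count;          /* length of the described buffer */
    uint64_t  pfn[TOY_MAX_PAGES];  /* one page number per spanned page */
} toy_mdl;

/* Number of pages the described buffer touches. */
size_t toy_span_pages(const toy_mdl *m)
{
    return (size_t)((m->byte_offset + (uintptr_t)m->byte_count
                     + TOY_PAGE_SIZE - 1) >> TOY_PAGE_SHIFT);
}

/* Describe the sub-range [va, va + length) of src in dst by reusing the
   page numbers of src. Returns 0 on success, -1 if the range is empty or
   not fully inside the buffer described by src. */
int toy_build_partial(const toy_mdl *src, toy_mdl *dst,
                      uintptr_t va, uint32_t length)
{
    uintptr_t src_begin = src->start_va + src->byte_offset;
    uintptr_t src_end = src_begin + src->byte_count;
    size_t first, count, i;

    assert(toy_span_pages(src) <= TOY_MAX_PAGES);
    if (length == 0 || va < src_begin || va > src_end
        || length > src_end - va)
        return -1;

    dst->start_va = va & ~(TOY_PAGE_SIZE - 1);
    dst->byte_offset = (uint32_t)(va & (TOY_PAGE_SIZE - 1));
    dst->byte_count = length;

    /* Index of the first source page that the sub-range touches. */
    first = (size_t)((dst->start_va - src->start_va) >> TOY_PAGE_SHIFT);
    count = toy_span_pages(dst);
    for (i = 0; i < count; i++)
        dst->pfn[i] = src->pfn[first + i];
    return 0;
}
```

The point of the model is that the partial description shares pages with the original, so one locked user buffer can be carved into several independently usable pieces.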

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Maxim

Maxim
Fri Aug 04 08:37:53 CDT 2006

> I agree with you about "extreme hacking", but remember that brilliant
> things, such as RegMon by SysInternals wouldn't be possible had its

Correct, and I don't think RegMon is intended for _production_ servers. It is a
lab tool.

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Don

Don
Fri Aug 04 08:45:16 CDT 2006

And of course RegMon no longer does this for OS'es Microsoft supports.


--
Don Burn (MVP, Windows DDK)
Windows 2k/XP/2k3 Filesystem and Driver Consulting
http://www.windrvr.com
Remove StopSpam from the email to reply



"Maxim S. Shatskih" <maxim@storagecraft.com> wrote in message
news:eaviji$btm$1@news.mtu.ru...
>> I agree with you about "extreme hacking", but remember that brilliant
>> things, such as RegMon by SysInternals wouldn't be possible had its
>
> Correct, and I don't think RegMon is intended for _production_ servers. It
> is a
> lab tool.
>
> --
> Maxim Shatskih, Windows DDK MVP
> StorageCraft Corporation
> maxim@storagecraft.com
> http://www.storagecraft.com
>



Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by grishka

grishka
Fri Aug 04 09:16:59 CDT 2006

Then again: I have pointers to kernel-mode buffers that were NOT
allocated by me. The PCI device fills those buffers using its DMA. I
can't copy the data from the buffers because it takes too much CPU. I
must have these buffers merged into a contiguous address range for fast
sequential processing by a user-mode application. I don't mind using
"extreme hacking" if there's no standard way to do this. What are my options?

Maxim S. Shatskih wrote:
> > If I use a DIRECT_IO IOCTL, the system allocates and locks down the
> > buffer for me.
>
> No. The OS just locks the pages _of the user buffer_ to the MDL and provide you
> with this MDL.
>
> > Do you mean I can actually CHANGE the MDL in the IRP or even REPLACE it
> > completely? If so, splitting the user-mode-allocated buffer and mapping
> > each separate part into one kernel-allocated buffer would solve my
> > problem.
>
> Correct. IoBuildPartialMdl is your friend.
>
> --
> Maxim Shatskih, Windows DDK MVP
> StorageCraft Corporation
> maxim@storagecraft.com
> http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by grishka

grishka
Fri Aug 04 09:18:50 CDT 2006

Sorry, I just missed your reply. Thanks very much for your help!

Maxim S. Shatskih wrote:
> > If I use a DIRECT_IO IOCTL, the system allocates and locks down the
> > buffer for me.
>
> No. The OS just locks the pages _of the user buffer_ to the MDL and provide you
> with this MDL.
>
> > Do you mean I can actually CHANGE the MDL in the IRP or even REPLACE it
> > completely? If so, splitting the user-mode-allocated buffer and mapping
> > each separate part into one kernel-allocated buffer would solve my
> > problem.
>
> Correct. IoBuildPartialMdl is your friend.
>
> --
> Maxim Shatskih, Windows DDK MVP
> StorageCraft Corporation
> maxim@storagecraft.com
> http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Alexander

Alexander
Fri Aug 04 11:14:42 CDT 2006

And what do you do with that amount of data?
Is it really PCI, or the more modern PCI-X? PCI theoretically cannot handle more
than 133 MB/s, unless you have the 64-bit, 66 MHz variant, which gives you only
4 times as much.
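
As a quick sanity check of those figures: the theoretical peak of a parallel
PCI bus is the bus width in bytes times the clock rate. The sketch below
assumes the nominal 33.33 MHz and 66.66 MHz clocks; the helper name
pci_peak_mb_per_s is made up for illustration and is not part of any API.

```c
#include <assert.h>

/* Theoretical peak of a parallel PCI bus in MB/s (1 MB = 10^6 bytes):
 * one transfer of bus_bits/8 bytes per clock cycle.
 * Hypothetical helper for illustration only; real throughput is lower
 * because of arbitration, address phases and wait states. */
static double pci_peak_mb_per_s(unsigned bus_bits, double clock_mhz)
{
    assert(bus_bits % 8 == 0);          /* bus width is a whole number of bytes */
    return (bus_bits / 8) * clock_mhz;  /* bytes per clock * million clocks per second */
}
```

With these assumptions, pci_peak_mb_per_s(32, 33.33) is roughly 133 MB/s and
pci_peak_mb_per_s(64, 66.66) is roughly 533 MB/s, i.e. four times as much.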

<grishka@gmail.com> wrote in message
news:1154693345.963482.31520@p79g2000cwp.googlegroups.com...
>1 GB/s. The data is delivered by a PCI device DMA into kernel buffers
> allocated by another (opaque) driver. Copying the data into
> IOCTL-supplied buffer utilizes CPU above the acceptable limit (about 5%
> CPU usage on the most powerful machine I have, 20-30% on a typical
> one). Once more, I can not supply my buffers for delivery. I can just
> have pointers to those owned by the other driver.
>
> Maxim S. Shatskih wrote:
>> > Copying this amount of data eats up an unacceptable amount of CPU.
>>
>> How many MB/s? What is your exact task?
>>
>> --
>> Maxim Shatskih, Windows DDK MVP
>> StorageCraft Corporation
>> maxim@storagecraft.com
>> http://www.storagecraft.com
>



Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by grishka

grishka
Fri Aug 04 12:51:54 CDT 2006

It's not PCI-X (currently), but I'm definitely getting close to 1
Gbit/s, so maybe it really is a 64-bit/66 MHz one.

Alexander Grigoriev wrote:
> And what you do with that amount of data?
> Is it really PCI or modern PCI-X? PCI theoretically cannot handle >133MB/s,
> unless you have 64 bit, 66MHz, which gives you only 4 times as much.
>
> <grishka@gmail.com> wrote in message
> news:1154693345.963482.31520@p79g2000cwp.googlegroups.com...
> >1 GB/s. The data is delivered by a PCI device DMA into kernel buffers
> > allocated by another (opaque) driver. Copying the data into
> > IOCTL-supplied buffer utilizes CPU above the acceptable limit (about 5%
> > CPU usage on the most powerful machine I have, 20-30% on a typical
> > one). Once more, I can not supply my buffers for delivery. I can just
> > have pointers to those owned by the other driver.
> >
> > Maxim S. Shatskih wrote:
> >> > Copying this amount of data eats up an unacceptable amount of CPU.
> >>
> >> How many MB/s? What is your exact task?
> >>
> >> --
> >> Maxim Shatskih, Windows DDK MVP
> >> StorageCraft Corporation
> >> maxim@storagecraft.com
> >> http://www.storagecraft.com
> >


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Maxim

Maxim
Fri Aug 04 14:25:35 CDT 2006

> must have these buffers merged into continuous address space for fast

Why contiguous? Map the buffers into user space separately, and reference them
through an array of pointers. Is that bad?
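
A minimal user-mode sketch of that idea, assuming the driver has already
mapped each buffer separately and handed the application a base address and
length for each one. The names mapped_buf, locate and process_stream are
hypothetical, and summing the bytes merely stands in for the real processing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One separately mapped buffer as the application sees it (hypothetical type). */
struct mapped_buf {
    const uint8_t *base;  /* user-space address of this mapping */
    size_t         len;   /* number of valid bytes in it */
};

/* Translate a logical stream offset into (buffer index, offset in buffer).
 * Returns 0 on success, -1 if the offset lies past the end of the data. */
static int locate(const struct mapped_buf *bufs, size_t count,
                  size_t stream_off, size_t *idx, size_t *off)
{
    assert(bufs != NULL && idx != NULL && off != NULL);
    for (size_t i = 0; i < count; i++) {
        if (stream_off < bufs[i].len) {
            *idx = i;
            *off = stream_off;
            return 0;
        }
        stream_off -= bufs[i].len;  /* skip this buffer entirely */
    }
    return -1;
}

/* Visit every byte in stream order without copying anything. */
static uint64_t process_stream(const struct mapped_buf *bufs, size_t count)
{
    uint64_t sum = 0;
    assert(bufs != NULL);
    for (size_t i = 0; i < count; i++)
        for (size_t j = 0; j < bufs[i].len; j++)
            sum += bufs[i].base[j];
    return sum;
}
```

This keeps sequential access cheap, but it only helps if the consumer of the
data can accept scattered pieces rather than one flat block.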

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Maxim

Maxim
Fri Aug 04 14:27:42 CDT 2006

A note about IoBuildPartialMdl.

Having the VirtualAddress parameter in this call is a strange design
choice by MS. The correct value for this parameter is:

(PUCHAR)(MmGetMdlVirtualAddress(MasterMdl)) + Offset

This lets you describe a part by the usual offset/length pair.
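
The arithmetic behind that offset/length pair can be sketched in portable C.
This is a model only, not driver code: it assumes 4 KB pages, and the names
subrange, make_subrange and PAGE_SIZE_ASSUMED are made up for illustration.
In a driver, the computed address would be what goes into the VirtualAddress
parameter of IoBuildPartialMdl(SourceMdl, TargetMdl, VirtualAddress, Length).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u  /* assumption: 4 KB pages */

/* Description of one part of a master range (hypothetical type). */
struct subrange {
    uintptr_t va;        /* master virtual address + offset */
    size_t    byte_off;  /* offset of va within its first page */
    size_t    pages;     /* number of pages the part touches */
};

/* Compute the part [offset, offset + length) of a master range.
 * Returns 0 on success, -1 if the part is empty or does not fit. */
static int make_subrange(uintptr_t master_va, size_t master_len,
                         size_t offset, size_t length, struct subrange *out)
{
    assert(out != NULL);
    if (length == 0 || offset > master_len || length > master_len - offset)
        return -1;

    out->va = master_va + offset;
    out->byte_off = out->va % PAGE_SIZE_ASSUMED;
    /* Round up: a part that starts mid-page can spill into one more page. */
    out->pages = (out->byte_off + length + PAGE_SIZE_ASSUMED - 1)
                 / PAGE_SIZE_ASSUMED;
    return 0;
}
```

Note that the part may start and end anywhere inside the master range; only
the page count changes when it straddles a page boundary.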

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com

<grishka@gmail.com> wrote in message
news:1154701130.401524.159130@p79g2000cwp.googlegroups.com...
> Sorry, I just missed your reply. Thanks very much for your help!
> Maxim S. Shatskih wrote:
> > > If I use a DIRECT_IO IOCTL, the system allocates and locks down the
> > > buffer for me.
> >
> > No. The OS just locks the pages _of the user buffer_ to the MDL and provide you
> > with this MDL.
> >
> > > Do you mean I can actually CHANGE the MDL in the IRP or even REPLACE it
> > > completely? If so, splitting the user-mode-allocated buffer and mapping
> > > each separate part into one kernel-allocated buffer would solve my
> > > problem.
> >
> > Correct. IoBuildPartialMdl is your friend.
> >
> > --
> > Maxim Shatskih, Windows DDK MVP
> > StorageCraft Corporation
> > maxim@storagecraft.com
> > http://www.storagecraft.com
>


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Pavel

Pavel
Fri Aug 04 15:49:39 CDT 2006

"Maxim S. Shatskih" <maxim@storagecraft.com> wrote in message news:eb073d$f7s$1@news.mtu.ru...
> Note about IoBuildPartialMdl.
>
> Having the VirtualAddress parameter in this call is some strange design
> idea by MS. The correct value for this parameter is:
>
> (PUCHAR)(MmGetMdlVirtualAddress(MasterMdl)) + Offset
>
> This allows you to make a part by usual offset/length pair.
>

But must the offset be page aligned? You can't glue together
two memory blocks smaller than a page without a hole between them?

--PA



Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Maxim

Maxim
Fri Aug 04 15:58:20 CDT 2006

> But the offset must be page aligned? you can't glue together
> two memory blocks less than a page, without a hole between?

No. I'm not talking about gluing buffers together; I'm talking about splitting,
i.e. making a sub-range descriptor from a range descriptor.
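
On the gluing question itself, the constraint can be sketched as follows. An
MDL carries a list of whole pages plus a single byte offset into the first
page and a byte count, so when pieces are concatenated into one description,
every piece except the first has to start on a page boundary and every piece
except the last has to end on one. This is a model in portable C assuming
4 KB pages; piece and can_glue_without_holes are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096u  /* assumption: 4 KB pages */

/* One source buffer: starting virtual address and length (hypothetical type). */
struct piece {
    uintptr_t va;
    size_t    len;
};

/* Returns 1 if the pieces can be described back to back by a single
 * page list with no unused bytes between them, 0 otherwise. */
static int can_glue_without_holes(const struct piece *p, size_t n)
{
    assert(p != NULL);
    for (size_t i = 0; i < n; i++) {
        /* Every piece after the first is used from the start of its page. */
        if (i > 0 && p[i].va % PAGE_SIZE_ASSUMED != 0)
            return 0;
        /* Every piece before the last is used up to the end of its page. */
        if (i + 1 < n && (p[i].va + p[i].len) % PAGE_SIZE_ASSUMED != 0)
            return 0;
    }
    return 1;
}
```

Two blocks smaller than a page therefore fail the check: the first one stops
short of its page end, which leaves a hole before the second.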

--
Maxim Shatskih, Windows DDK MVP
StorageCraft Corporation
maxim@storagecraft.com
http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Alexander

Alexander
Sat Aug 05 00:29:26 CDT 2006

So it's 1 Gbit/s (about 125 MB/s), not 1 GB/s?

<grishka@gmail.com> wrote in message
news:1154713914.142853.62810@m73g2000cwd.googlegroups.com...
> It's not PCI-X (currently), but I'm definitely getting close to 1
> GBit/sec, so maybe it's really 64/66 one.
>
> Alexander Grigoriev wrote:
>> And what you do with that amount of data?
>> Is it really PCI or modern PCI-X? PCI theoretically cannot handle >133MB/s,
>> unless you have 64 bit, 66MHz, which gives you only 4 times as much.
>>
>> <grishka@gmail.com> wrote in message
>> news:1154693345.963482.31520@p79g2000cwp.googlegroups.com...
>> >1 GB/s. The data is delivered by a PCI device DMA into kernel buffers
>> > allocated by another (opaque) driver. Copying the data into
>> > IOCTL-supplied buffer utilizes CPU above the acceptable limit (about 5%
>> > CPU usage on the most powerful machine I have, 20-30% on a typical
>> > one). Once more, I can not supply my buffers for delivery. I can just
>> > have pointers to those owned by the other driver.
>> >
>> > Maxim S. Shatskih wrote:
>> >> > Copying this amount of data eats up an unacceptable amount of CPU.
>> >>
>> >> How many MB/s? What is your exact task?
>> >>
>> >> --
>> >> Maxim Shatskih, Windows DDK MVP
>> >> StorageCraft Corporation
>> >> maxim@storagecraft.com
>> >> http://www.storagecraft.com
>> >
>



Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by grishka

grishka
Sat Aug 05 02:00:38 CDT 2006

No, this won't work because the user-mode application must output the
data directly to a third device as a single, contiguous block.

Maxim S. Shatskih wrote:
> > must have these buffers merged into continuous address space for fast
>
> Why contiguous? Map several buffers to the user space, and reference them by
> pointer array. Is it bad?
>
> --
> Maxim Shatskih, Windows DDK MVP
> StorageCraft Corporation
> maxim@storagecraft.com
> http://www.storagecraft.com


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by grishka

grishka
Sat Aug 05 02:02:54 CDT 2006

Right.

Alexander Grigoriev wrote:
> So it's 1 Gbit/s (128 MB/s) not 1 GB/s?
>
> <grishka@gmail.com> wrote in message
> news:1154713914.142853.62810@m73g2000cwd.googlegroups.com...
> > It's not PCI-X (currently), but I'm definitely getting close to 1
> > GBit/sec, so maybe it's really 64/66 one.
> >
> > Alexander Grigoriev wrote:
> >> And what you do with that amount of data?
> >> Is it really PCI or modern PCI-X? PCI theoretically cannot handle >133MB/s,
> >> unless you have 64 bit, 66MHz, which gives you only 4 times as much.
> >>
> >> <grishka@gmail.com> wrote in message
> >> news:1154693345.963482.31520@p79g2000cwp.googlegroups.com...
> >> >1 GB/s. The data is delivered by a PCI device DMA into kernel buffers
> >> > allocated by another (opaque) driver. Copying the data into
> >> > IOCTL-supplied buffer utilizes CPU above the acceptable limit (about 5%
> >> > CPU usage on the most powerful machine I have, 20-30% on a typical
> >> > one). Once more, I can not supply my buffers for delivery. I can just
> >> > have pointers to those owned by the other driver.
> >> >
> >> > Maxim S. Shatskih wrote:
> >> >> > Copying this amount of data eats up an unacceptable amount of CPU.
> >> >>
> >> >> How many MB/s? What is your exact task?
> >> >>
> >> >> --
> >> >> Maxim Shatskih, Windows DDK MVP
> >> >> StorageCraft Corporation
> >> >> maxim@storagecraft.com
> >> >> http://www.storagecraft.com
> >> >
> >


Re: Mapping multiple physical/system address virtual buffers into a single user-space virtual range. by Maxim

Maxim
Sat Aug 05 05:30:54 CDT 2006

Can you get rid of the "opaque driver" in the middle, it seems to be mis