Hello all,

I have created a VPN application which installs a virtual NIC (an NDIS
miniport driver) in the system. When you connect to the VPN, a virtual NIC
appears in the system and all the IP packets for the VPN network are
routed to it. My virtual NIC passes the IP packets up to a user-level
application (via the file interface: CreateFile(), ReadFile(), WriteFile()),
which then sends them over a socket connection to the remote machine (a
computer running the same VPN application and VNIC driver). I need to do
the following:

1. I would like to know how changing (increasing and/or decreasing) the MTU
of the virtual NIC could affect the performance of the system. Please feel
free to give your thoughts/comments on it.

2. I would also like to measure the time it takes to transfer an IP packet
from my VNIC driver to the user-level application. Which APIs can I use in
the driver and in the user application to take the timestamps? My idea is
to read the system time in the driver, put it in a packet, and send that
packet to the user application. The application then takes its own
timestamp when it receives the packet and subtracts the driver's value
from it. But I don't know which user-level API returns time in the same
format as the corresponding driver API.

Please reply,

Thanks,

Arsalan

Re: A question regarding MTU: how it can effect TCP performance + other queries by Stephan

Stephan
Wed Nov 15 04:27:29 CST 2006

1) MTU should be chosen such that there is no extra fragmentation
required. That is, if IP sends a frame to your VNIC, you should also
see one frame sent to the remote station. So you should see a 1:1
relation between the number of IP frames handled by the virtual NIC and
the physical NIC.
(Note that fragmentation can still take place at some router, but this
is another story.)

2) In the driver, use KeQueryPerformanceCounter(). In the app., use
QueryPerformanceCounter().
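
A minimal sketch of the arithmetic behind this suggestion, written in Python
for brevity rather than driver C. It assumes the driver stamps each packet
with the raw tick count from KeQueryPerformanceCounter(), that the
application reads QueryPerformanceCounter() when the packet arrives, and
that both counts tick at the frequency reported by
QueryPerformanceFrequency(). The function name and the sample readings are
made up for illustration.

```python
def elapsed_microseconds(driver_ticks: int, app_ticks: int, frequency_hz: int) -> float:
    """Convert the gap between two performance-counter readings to microseconds."""
    if frequency_hz <= 0:
        raise ValueError("frequency_hz must be positive")
    # Ticks are only meaningful relative to the counter frequency.
    return (app_ticks - driver_ticks) * 1_000_000 / frequency_hz


# Hypothetical readings: driver stamp, application stamp, counter frequency.
print(elapsed_microseconds(1_000_000, 1_000_358, 3_579_545))
```

Carrying the raw tick count in the packet and converting only in the
application keeps the driver side down to a single counter read.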

Stephan


Re: A question regarding MTU: how it can effect TCP performance + other queries by Arsalan

Arsalan
Wed Nov 15 05:04:19 CST 2006

But suppose I set the MTU such that there is a 1:1 relation between the
number of IP frames handled by the virtual NIC and the physical NIC. Then
the final IP frame sent to the physical NIC will be <IP header for physical
NIC> + <IP header for VNIC> + <Application data>, so every application
packet carries the overhead of two IP headers. I was thinking that if I
increase the MTU of my VNIC to some arbitrary number, let's say 3000
bytes, then my VNIC frame could be up to 3000 bytes (<IP header for
VNIC> + <Application data>). Broken down into multiple IP frames for the
physical NIC, the first would be <IP header for physical NIC> + <IP header
for VNIC> + <Some application data> and the rest would be <IP header for
physical NIC> + <Rest of application data>. I think this will increase the
throughput of the VPN. Won't it?



Re: A question regarding MTU: how it can effect TCP performance + other queries by Stephan

Stephan
Wed Nov 15 10:28:49 CST 2006

Arsalan Ahmad wrote:
> But suppose I set the MTU such that there is a 1:1 relation between the
> number of IP frames handled by the virtual NIC and the physical NIC. Then
> the final IP frame sent to the physical NIC will be <IP header for physical
> NIC> + <IP header for VNIC> + <Application data>.

Exactly, so the MTU of your VNIC should be *smaller* than that of the
physical NIC. This will allow each virtual IP packet to be packaged in
one physical IP packet.

Otherwise, if you choose a larger MTU for your VNIC, you will actually
see IP fragments sent over the physical NIC. This is ok if no fragments
get lost and reassembly is reliable. If however physical IP fragments
get lost, you are likely to see many retransmission attempts by higher
layers like TCP.
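
The trade-off above can be put into numbers. This is a rough sketch only,
assuming plain IP-in-IP encapsulation with a 20-byte outer IPv4 header (no
options) and a 1500-byte physical MTU; the function names and the overhead
parameter are illustrative and are not taken from the OP's driver.

```python
def max_vnic_mtu(physical_mtu: int, overhead: int) -> int:
    """Largest virtual MTU whose packets still fit in one physical IP packet."""
    return physical_mtu - overhead


def outer_fragments(vnic_packet: int, physical_mtu: int, overhead: int,
                    ip_header: int = 20) -> int:
    """Physical IP fragments needed to carry one encapsulated virtual packet."""
    # Bytes that follow the outer IP header once the packet is encapsulated.
    payload = vnic_packet + overhead - ip_header
    # Every fragment except the last carries a multiple of 8 data bytes.
    per_fragment = (physical_mtu - ip_header) // 8 * 8
    return max(1, -(-payload // per_fragment))  # ceiling division


print(max_vnic_mtu(1500, 20))           # virtual MTU for 1:1 encapsulation
print(outer_fragments(1480, 1500, 20))  # packet at that MTU
print(outer_fragments(3000, 1500, 20))  # the 3000-byte example from the thread
```

With these assumptions, a 3000-byte virtual packet is split into several
physical fragments, and losing any one of them loses the whole virtual
packet.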

Well I guess you should simply try and see what gives you the best
results.

Stephan


Re: A question regarding MTU: how it can effect TCP performance + other queries by Steve

Steve
Wed Nov 15 12:43:22 CST 2006

On 2006-11-15 10:28:49 -0600, "Stephan Wolf [MVP]" <stewo68@hotmail.com> said:
> Otherwise, if you choose a larger MTU for your VNIC, you will actually
> see IP fragments sent over the physical NIC. This is ok if no fragments
> get lost and reassembly is reliable. If however physical IP fragments
> get lost, you are likely to see many retransmission attempts by higher
> layers like TCP.

TCP tends to set the DF bit.

-sd


Re: A question regarding MTU: how it can effect TCP performance + other queries by Stephan

Stephan
Thu Nov 16 08:37:55 CST 2006

Steve Dispensa wrote:
> TCP tends to set the DF bit.

Hmm, ok, so here is what I guess that means for the OP:

- If the DF bit is set, you should also set this bit in the physical IP
frame.
- Thus, if the virtual IP frame does not fit in one physical IP
fragment, you are lost.
- Consequently, as I suggested earlier, the virtual MTU must be less
than the physical MTU such that a maximum virtual (IP) frame fits in a
single physical IP frame.

Please correct me if I am wrong.

Stephan


Re: A question regarding MTU: how it can effect TCP performance + other queries by Steve

Steve
Thu Nov 16 19:46:13 CST 2006

On 2006-11-16 08:37:55 -0600, "Stephan Wolf [MVP]" <stewo68@hotmail.com> said:

> Steve Dispensa wrote:
>> TCP tends to set the DF bit.
>
> Hmm, ok, so here is what I guess that means for the OP:
>
> - If the DF bit is set, you should also set this bit in the physical IP
> frame.
> - Thus, if the virtual IP frame does not fit in one physical IP
> fragment, you are lost.
> - Consequently, as I suggested earlier, the virtual MTU must be less
> than the physical MTU such that a maximum virtual (IP) frame fits in a
> single physical IP frame.
>
> Please correct me if I am wrong.

That makes sense to me. Depending on the application that the virtual
miniport is designed for, though, it's not a given that you really have
to (or want to) do 1:1 packet encapsulation. In fact, if the OP winds
up tunneling through TCP, the interface's MTU doesn't matter as much
since TCP will do its own segmentation regardless. There are obvious
performance considerations there too, which are left as an exercise to
the OP. :-)

-Steve


Re: A question regarding MTU: how it can effect TCP performance + other queries by Stephan

Stephan
Fri Nov 17 03:02:02 CST 2006

Steve Dispensa wrote:
> In fact, if the OP winds
> up tunneling through TCP, the interface's MTU doesn't matter as much
> since TCP will do its own segmentation regardless.

Couldn't agree more. Did the OP mention whether he wraps frames sent to
his virtual NIC in IP or in TCP packets?

Stephan


Re: A question regarding MTU: how it can effect TCP performance + other queries by Arsalan

Arsalan
Tue Nov 28 04:03:23 CST 2006

Hello all,

I have been doing some throughput tests with Iperf on my VNIC and physical
NIC. I connected two P-4 3.4 GHz computers with 1 Gbps NICs via a
cross-over cable and then ran my experiments.

For TCP, I get around 900 Mbps on the physical interface and around 90 Mbps
on the virtual interface (only 10% of the physical one). All the tests were
run with Iperf using 15 parallel streams.

For UDP, however, it is totally different. With Iperf using 15 parallel
streams, I get only around 500 Mbps on the physical interface but
approximately 1000 Mbps (1 Gbps) on the virtual interface. I am unable to
figure out why I am getting this strange behaviour. Can you guess?

My virtual NIC passes all the packets to a user-level application, which
then sends the IP packets of the virtual interface to the user-level
application running on the other computer. The two user-level applications
communicate via TCP.

So, for TCP over the virtual interface, I am sending a TCP packet
encapsulated within another TCP packet on the physical interface, while
for UDP I am sending a UDP packet encapsulated within a TCP packet.

Thanks,

Arsalan




Re: A question regarding MTU: how it can effect TCP performance + other queries by Steve

Steve
Wed Nov 29 00:06:42 CST 2006

On 2006-11-28 04:03:23 -0600, "Arsalan Ahmad" <arsal__@hotmail.com> said:

> Hello all,
>
> I have been doing some throughput tests with Iperf on my VNIC and
> physical NIC. I connected two P-4 3.4 GHz computers with 1 Gbps NICs
> via a cross-over cable and then ran my experiments.
>
> For TCP, I get around 900 Mbps on the physical interface and around
> 90 Mbps on the virtual interface (only 10% of the physical one). All
> the tests were run with Iperf using 15 parallel streams.
>
> For UDP, however, it is totally different. With Iperf using 15 parallel
> streams, I get only around 500 Mbps on the physical interface but
> approximately 1000 Mbps (1 Gbps) on the virtual interface. I am unable
> to figure out why I am getting this strange behaviour. Can you guess?

Yes, I can guess; are you tunneling the packets? Do you have Nagle's
algorithm turned on? Did you do the math on the bandwidth-delay product
and make sure your TCP window was set appropriately?
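
For reference, the bandwidth-delay product math looks like this. It is a
back-of-the-envelope sketch: the round-trip times and the 65535-byte window
are placeholder values, not measurements from the OP's setup, and the
function names are made up.

```python
def bdp_bytes(bandwidth_bps: int, rtt_us: int) -> int:
    """Bytes that must be in flight to keep a link of this speed full."""
    return bandwidth_bps * rtt_us // 8_000_000


def window_limited_bps(window_bytes: int, rtt_us: int) -> int:
    """Upper bound on TCP throughput when at most one window is sent per round trip."""
    return window_bytes * 8 * 1_000_000 // rtt_us


# Hypothetical: 1 Gbps link, 1 ms round trip through the tunnel.
print(bdp_bytes(1_000_000_000, 1000))
# Hypothetical: 65535-byte window (no window scaling), same round trip.
print(window_limited_bps(65535, 1000))
```

If the window is smaller than the bandwidth-delay product, the connection
cannot fill the link no matter how fast the endpoints are, and any latency
added by the user-mode hop raises the product further.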


> My virtual NIC passes all the packets to a user-level application, which
> then sends the IP packets of the virtual interface to the user-level
> application running on the other computer. The two user-level
> applications communicate via TCP.

Windows is sensitive to the exact structure of your sockets code. To
get the best possible performance from the OS, you may want to look
into using I/O completion ports for your tunnel code on both ends, and
make sure you keep enough buffers pending that the protocol is never
waiting on you.


> So, for TCP over the virtual interface, I am sending a TCP packet
> encapsulated within another TCP packet on the physical interface,
> while for UDP I am sending a UDP packet encapsulated within a TCP
> packet.

Given this, I'm voting for Nagle's. Turn on NODELAY and re-test.
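
A short sketch of what "turn on NODELAY" means in socket terms, shown in
Python to keep it brief. The helper name is made up; in a C tunnel
application the corresponding call is setsockopt() with IPPROTO_TCP and
TCP_NODELAY on the outer tunnel socket.

```python
import socket


def make_tunnel_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY = 1 sends small writes immediately instead of coalescing them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


sock = make_tunnel_socket()
# A non-zero value confirms the option is set.
print(sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
sock.close()
```

The option has to be set on both ends of the tunnel, since each side
applies Nagle's algorithm only to its own outgoing data.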

-sd


Re: A question regarding MTU: how it can effect TCP performance + other queries by Arsalan

Arsalan
Wed Nov 29 07:37:48 CST 2006

My question was why Iperf shows lower UDP throughput over the physical
interface (around 550 Mbps for 15 parallel streams) than over the virtual
NIC (around 1 Gbps for 15 parallel streams). I expected UDP throughput to
be higher on the physical interface, which is not what I am getting. And
how can Nagle's algorithm, or turning on NODELAY for TCP, affect UDP
throughput?

Thanks,

Arsalan




Re: A question regarding MTU: how it can effect TCP performance + other queries by Steve

Steve
Wed Nov 29 22:55:28 CST 2006

I'm sorry, I misread the UDP part of your question. You're right,
something seems off there. Are you buffering somewhere somehow? Are
some packets hitting the bit bucket after being delivered to the
virtual NIC, perhaps?

As for the TCP case, you said:

>>> For TCP, I get around 900 Mbps on the physical interface and around
>>> 90 Mbps on the virtual interface (only 10% of the physical one). All
>>> the tests were run with Iperf using 15 parallel streams.

and

>>> So, for TCP over the virtual interface, I am sending a TCP packet
>>> encapsulated within another TCP packet on the physical interface,
>>> while for UDP I am sending a UDP packet encapsulated within a TCP
>>> packet.

Nagle would matter for UDP packets encapsulated within TCP packets,
depending (a great deal) on the characteristics of the traffic you are
testing with. Nagle's has an adverse effect on transmission speed,
because it introduces an artificial queuing delay, regardless of what
it's carrying. I suggested turning on NODELAY for the "outside" TCP
connection, not (necessarily) the "inside" connection.

At any rate, your results do seem strange to me; not a whole lot else
comes to mind though. You might want to instrument your miniport and
track where the packets spend their time between receipt at your send
handler and final send() to the other system.

Good luck.

-sd






Re: A question regarding MTU: how it can effect TCP performance + by Pankaj

Pankaj
Thu Nov 30 01:38:17 CST 2006


*Removing multiple newsgroups as cross posting usually generates tons of
extra traffic.*

What packet size IO are you doing in case of TCP? Can you check if your
physical NIC has TCP large send offload enabled?

The reason for the large difference in TCP performance may be due to the
offload happening in case of physical NIC.

I can't think of anything for the UDP case however, that just seems
strange to me. You probably need to dig deeper and tell us what exactly is
happening here. How many packets are being sent, what was the size of
these packets? Are you grouping multiple UDP packets in one TCP packet?

--
Pankaj Garg
2006-11-29 at 11:32pm
http://www.intellectualheaven.com



Re: A question regarding MTU: how it can effect TCP performance + other queries by Arsalan

Arsalan
Thu Nov 30 03:55:01 CST 2006

Hi all,

Thanks for all the replies.

Again, my question is not about TCP. I have identified the reasons why TCP
throughput is lower on the virtual NIC.

My concern is UDP: why does Iperf report higher UDP throughput on the
virtual NIC than on the physical NIC? I am using Iperf with default options
for UDP. Below is the output of the Iperf UDP client for the physical NIC.

------------------------------------------------------------
Client connecting to 192.168.1.2, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 8.00 KByte (default)
------------------------------------------------------------
[1184] local 192.168.1.1 port 2281 connected with 192.168.1.2 port 5001
[1564] local 192.168.1.1 port 2262 connected with 192.168.1.2 port 5001
[1524] local 192.168.1.1 port 2264 connected with 192.168.1.2 port 5001
[1444] local 192.168.1.1 port 2268 connected with 192.168.1.2 port 5001
[1364] local 192.168.1.1 port 2272 connected with 192.168.1.2 port 5001
[1384] local 192.168.1.1 port 2271 connected with 192.168.1.2 port 5001
[1324] local 192.168.1.1 port 2274 connected with 192.168.1.2 port 5001
[1404] local 192.168.1.1 port 2270 connected with 192.168.1.2 port 5001
[1284] local 192.168.1.1 port 2276 connected with 192.168.1.2 port 5001
[1244] local 192.168.1.1 port 2278 connected with 192.168.1.2 port 5001
[1204] local 192.168.1.1 port 2280 connected with 192.168.1.2 port 5001
[1264] local 192.168.1.1 port 2277 connected with 192.168.1.2 port 5001
[1684] local 192.168.1.1 port 2257 connected with 192.168.1.2 port 5001
[1424] local 192.168.1.1 port 2269 connected with 192.168.1.2 port 5001
[1544] local 192.168.1.1 port 2263 connected with 192.168.1.2 port 5001
[1224] local 192.168.1.1 port 2279 connected with 192.168.1.2 port 5001
[1464] local 192.168.1.1 port 2267 connected with 192.168.1.2 port 5001
[1624] local 192.168.1.1 port 2259 connected with 192.168.1.2 port 5001
[1304] local 192.168.1.1 port 2275 connected with 192.168.1.2 port 5001
[1504] local 192.168.1.1 port 2265 connected with 192.168.1.2 port 5001
[1584] local 192.168.1.1 port 2261 connected with 192.168.1.2 port 5001
[1344] local 192.168.1.1 port 2273 connected with 192.168.1.2 port 5001
[1484] local 192.168.1.1 port 2266 connected with 192.168.1.2 port 5001
[1604] local 192.168.1.1 port 2260 connected with 192.168.1.2 port 5001
[1644] local 192.168.1.1 port 2258 connected with 192.168.1.2 port 5001
[ ID] Interval Transfer Bandwidth
[1564] 0.0-10.0 sec 21.8 MBytes 18.2 Mbits/sec
[1184] 0.0-10.0 sec 21.7 MBytes 18.2 Mbits/sec
[1444] 0.0-10.0 sec 21.8 MBytes 18.2 Mbits/sec
[1524] 0.0-10.0 sec 21.8 MBytes 18.3 Mbits/sec
[1364] 0.0-10.0 sec 21.9 MBytes 18.4 Mbits/sec
[1384] 0.0-10.0 sec 21.7 MBytes 18.2 Mbits/sec
[1404] 0.0-10.0 sec 21.8 MBytes 18.3 Mbits/sec
[1204] 0.0-10.0 sec 22.0 MBytes 18.4 Mbits/sec
[1244] 0.0-10.0 sec 22.1 MBytes 18.5 Mbits/sec
[1324] 0.0-10.0 sec 21.9 MBytes 18.4 Mbits/sec
[1284] 0.0-10.0 sec 22.0 MBytes 18.4 Mbits/sec
[1264] 0.0-10.0 sec 21.9 MBytes 18.4 Mbits/sec
[1464] 0.0-10.0 sec 22.1 MBytes 18.5 Mbits/sec
[1224] 0.0-10.0 sec 22.1 MBytes 18.5 Mbits/sec
[1424] 0.0-10.0 sec 22.0 MBytes 18.5 Mbits/sec
[1684] 0.0-10.0 sec 22.0 MBytes 18.5 Mbits/sec
[1544] 0.0-10.0 sec 22.0 MBytes 18.4 Mbits/sec
[1504] 0.0-10.0 sec 22.2 MBytes 18.6 Mbits/sec
[1624] 0.0-10.0 sec 22.2 MBytes 18.6 Mbits/sec
[1304] 0.0-10.0 sec 22.3 MBytes 18.6 Mbits/sec
[ ID] Interval Transfer Bandwidth
[1584] 0.0-10.0 sec 22.5 MBytes 18.8 Mbits/sec
[1344] 0.0-10.0 sec 22.7 MBytes 19.0 Mbits/sec
[1484] 0.0-10.0 sec 22.7 MBytes 19.0 Mbits/sec
[1604] 0.0-10.0 sec 23.0 MBytes 19.3 Mbits/sec
[1644] 0.0-10.0 sec 23.0 MBytes 19.3 Mbits/sec
[1564] WARNING: did not receive ack of last datagram after 10 tries.
[1184] WARNING: did not receive ack of last datagram after 10 tries.
[1564] Sent 15516 datagrams
[1184] Sent 15474 datagrams
[1524] WARNING: did not receive ack of last datagram after 10 tries.
[1524] Sent 15582 datagrams
[1444] WARNING: did not receive ack of last datagram after 10 tries.
[1444] Sent 15535 datagrams
[1384] WARNING: did not receive ack of last datagram after 10 tries.
[1364] WARNING: did not receive ack of last datagram after 10 tries.
[1384] Sent 15459 datagrams
[1364] Sent 15647 datagrams
[1264] WARNING: did not receive ack of last datagram after 10 tries.
[1204] WARNING: did not receive ack of last datagram after 10 tries.
[1264] Sent 15638 datagrams
[1204] Sent 15675 datagrams
[1284] WARNING: did not receive ack of last datagram after 10 tries.
[1284] Sent 15684 datagrams
[1244] WARNING: did not receive ack of last datagram after 10 tries.
[1244] Sent 15734 datagrams
[1324] WARNING: did not receive ack of last datagram after 10 tries.
[1324] Sent 15630 datagrams
[1404] WARNING: did not receive ack of last datagram after 10 tries.
[1404] Sent 15580 datagrams
[1464] WARNING: did not receive ack of last datagram after 10 tries.
[1464] Sent 15763 datagrams
[1424] WARNING: did not receive ack of last datagram after 10 tries.
[1424] Sent 15717 datagrams
[1544] WARNING: did not receive ack of last datagram after 10 tries.
[1684] WARNING: did not receive ack of last datagram after 10 tries.
[1544] Sent 15704 datagrams
[1684] Sent 15714 datagrams
[1224] WARNING: did not receive ack of last datagram after 10 tries.
[1224] Sent 15773 datagrams
[1504] WARNING: did not receive ack of last datagram after 10 tries.
[1624] WARNING: did not receive ack of last datagram after 10 tries.
[1504] Sent 15842 datagrams
[1624] Sent 15844 datagrams
[1304] WARNING: did not receive ack of last datagram after 10 tries.
[1304] Sent 15875 datagrams
[1584] WARNING: did not receive ack of last datagram after 10 tries.
[1584] Sent 16050 datagrams
[1484] WARNING: did not receive ack of last datagram after 10 tries.
[1344] WARNING: did not receive ack of last datagram after 10 tries.
[1484] Sent 16209 datagrams
[1344] Sent 16180 datagrams
[1604] WARNING: did not receive ack of last datagram after 10 tries.
[1604] Sent 16398 datagrams
[1644] WARNING: did not receive ack of last datagram after 10 tries.
[1644] Sent 16399 datagrams
[SUM] 0.0-10.2 sec 553 MBytes 456 Mbits/sec

The output of the Iperf UDP client for the virtual NIC is as follows:

------------------------------------------------------------
Client connecting to 192.168.100.16, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 8.00 KByte (default)
------------------------------------------------------------
[1196] local 192.168.100.17 port 2579 connected with 192.168.100.16 port 5001
[1684] local 192.168.100.17 port 2558 connected with 192.168.100.16 port 5001
[1580] local 192.168.100.17 port 2560 connected with 192.168.100.16 port 5001
[1560] local 192.168.100.17 port 2561 connected with 192.168.100.16 port 5001
[1540] local 192.168.100.17 port 2562 connected with 192.168.100.16 port 5001
[1520] local 192.168.100.17 port 2563 connected with 192.168.100.16 port 5001
[1500] local 192.168.100.17 port 2564 connected with 192.168.100.16 port 5001
[1480] local 192.168.100.17 port 2565 connected with 192.168.100.16 port 5001
[1460] local 192.168.100.17 port 2566 connected with 192.168.100.16 port 5001
[1440] local 192.168.100.17 port 2567 connected with 192.168.100.16 port 5001
[1420] local 192.168.100.17 port 2568 connected with 192.168.100.16 port 5001
[1400] local 192.168.100.17 port 2569 connected with 192.168.100.16 port 5001
[1380] local 192.168.100.17 port 2570 connected with 192.168.100.16 port 5001
[1360] local 192.168.100.17 port 2571 connected with 192.168.100.16 port 5001
[1340] local 192.168.100.17 port 2572 connected with 192.168.100.16 port 5001
[1320] local 192.168.100.17 port 2573 connected with 192.168.100.16 port 5001
[1300] local 192.168.100.17 port 2574 connected with 192.168.100.16 port 5001
[1276] local 192.168.100.17 port 2575 connected with 192.168.100.16 port 5001
[1236] local 192.168.100.17 port 2577 connected with 192.168.100.16 port 5001
[1156] local 192.168.100.17 port 2581 connected with 192.168.100.16 port 5001
[1260] local 192.168.100.17 port 2576 connected with 192.168.100.16 port 5001
[1216] local 192.168.100.17 port 2578 connected with 192.168.100.16 port 5001
[1176] local 192.168.100.17 port 2580 connected with 192.168.100.16 port 5001
[1604] local 192.168.100.17 port 2559 connected with 192.168.100.16 port 5001
[1136] local 192.168.100.17 port 2582 connected with 192.168.100.16 port 5001
[ ID] Interval Transfer Bandwidth
[1196] 0.0-10.0 sec 47.8 MBytes 40.0 Mbits/sec
[1684] 0.0-10.0 sec 30.7 MBytes 25.7 Mbits/sec
[1560] 0.0-10.0 sec 48.3 MBytes 40.4 Mbits/sec
[1580] 0.0-10.0 sec 30.8 MBytes 25.8 Mbits/sec
[1176] 0.0-10.0 sec 30.7 MBytes 25.7 Mbits/sec
[1320] 0.0-10.0 sec 47.9 MBytes 40.1 Mbits/sec
[1276] 0.0-10.0 sec 48.1 MBytes 40.3 Mbits/sec
[1400] 0.0-10.0 sec 48.0 MBytes 40.2 Mbits/sec
[1216] 0.0-10.0 sec 30.6 MBytes 25.6 Mbits/sec
[1540] 0.0-10.0 sec 30.4 MBytes 25.5 Mbits/sec
[1300] 0.0-10.0 sec 31.3 MBytes 26.2 Mbits/sec
[1480] 0.0-10.0 sec 47.9 MBytes 40.2 Mbits/sec
[1440] 0.0-10.0 sec 47.9 MBytes 40.1 Mbits/sec
[1236] 0.0-10.0 sec 47.9 MBytes 40.1 Mbits/sec
[1360] 0.0-10.0 sec 47.7 MBytes 39.9 Mbits/sec
[1156] 0.0-10.0 sec 48.5 MBytes 40.6 Mbits/sec
[1520] 0.0-10.0 sec 48.2 MBytes 40.3 Mbits/sec
[1420] 0.0-10.0 sec 31.4 MBytes 26.3 Mbits/sec
[1260] 0.0-10.0 sec 30.0 MBytes 25.2 Mbits/sec
[1380] 0.0-10.0 sec 31.2 MBytes 26.1 Mbits/sec
[ ID] Interval Transfer Bandwidth
[1500] 0.0-10.0 sec 29.9 MBytes 25.0 Mbits/sec
[1460] 0.0-10.0 sec 30.4 MBytes 25.5 Mbits/sec
[1340] 0.0-10.0 sec 30.7 MBytes 25.7 Mbits/sec
[1136] 0.0-10.0 sec 34.3 MBytes 28.7 Mbits/sec
[1604] 0.0-10.0 sec 52.0 MBytes 43.5 Mbits/sec
[1196] WARNING: did not receive ack of last datagram after 10 tries.
[1196] Sent 34096 datagrams
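
As a rough cross-check of the figures above, the per-stream numbers the Iperf client prints can be recomputed from its "Sent N datagrams" lines. The sketch below is in Python and assumes that Iperf counts 1 MByte as 1024*1024 bytes and 1 Mbit as 10**6 bits, and that the interval is exactly 10.0 seconds; the function name is made up for illustration. Note that these are sender-side figures: as far as I understand Iperf, the "did not receive ack of last datagram" warning means no server report came back, so the numbers say how much was handed to the stack, not how much arrived at the other end.

```python
# Recompute iperf's client-side UDP figures from the datagram counts.
DATAGRAM_BYTES = 1470   # "Sending 1470 byte datagrams" in the output above
INTERVAL_SEC = 10.0     # assumed exact; iperf prints "0.0-10.0 sec"


def client_rate(datagrams, payload=DATAGRAM_BYTES, seconds=INTERVAL_SEC):
    """Return (MBytes, Mbits/sec) for one stream.

    Assumption: MBytes use 1024*1024 bytes, Mbits use 10**6 bits.
    """
    total_bytes = datagrams * payload
    mbytes = total_bytes / (1024 * 1024)
    mbits_per_sec = total_bytes * 8 / seconds / 1e6
    return mbytes, mbits_per_sec


# Virtual NIC, stream 1196: "Sent 34096 datagrams"
virt = client_rate(34096)
# Physical NIC, stream 1284: "Sent 15684 datagrams"
phys = client_rate(15684)

print("virtual  [1196]: %.1f MBytes, %.1f Mbits/sec" % virt)
print("physical [1284]: %.1f MBytes, %.1f Mbits/sec" % phys)
```

Both results land within rounding of what Iperf printed for those streams, which suggests the reported bandwidth is simply datagrams sent times datagram size over the interval.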

So, any idea why it behaves like that?

Thanks,

Arsalan

"Pankaj Garg" <pankajg@intellectualheaven.com> wrote in message
news:Pine.WNT.4.64.0611292332321.2364@pankajg-notevt...
>
> *Removing multiple newsgroups as cross posting usually generates tons of
> extra traffic.*
>
> What packet size I/O are you doing in the TCP case? Can you check if your
> physical NIC has TCP large send offload enabled?
>
> The reason for the large difference in TCP performance may be the
> offload happening on the physical NIC.
>
> I can't think of anything for the UDP case, however; that just seems
> strange to me. You probably need to dig deeper and tell us what exactly is
> happening here. How many packets are being sent, and what was the size of
> these packets? Are you grouping multiple UDP packets in one TCP packet?
>
> --
> Pankaj Garg
> 2006-11-29 at 11:32pm
> http://www.intellectualheaven.com
>
>
> On Wed, 29 Nov 2006, Steve Dispensa wrote:
>
>>
>> I'm sorry, I misread the UDP part of your question. You're right,
>> something seems off there. Are you buffering somewhere somehow? Are some
>> packets hitting the bit bucket after being delivered to the virtual NIC,
>> perhaps?
>>
>> As for TCP case, you said:
>>
>>>>> What I found is that for TCP, the throughput I am getting on the
>>>>> physical interface is around 900 Mbps, and on the virtual interface I
>>>>> am getting around 90 Mbps (only 10% of the physical one). All the
>>>>> tests were performed with Iperf with 15 parallel streams.
>>
>> and
>>
>>>>> So, for TCP packets over the virtual interface, I am sending a TCP
>>>>> packet encapsulated within another TCP packet when passed to the
>>>>> physical interface, while for UDP I am sending a UDP packet
>>>>> encapsulated within a TCP packet when passed to the physical interface.
>>
>> Nagle would matter for UDP packets encapsulated within TCP packets,
>> depending (a great deal) on the characteristics of the traffic you are
>> testing with. Nagle's algorithm has an adverse effect on transmission
>> speed, because it introduces an artificial queuing delay, regardless of
>> what it's carrying. I suggested turning on NODELAY for the "outside" TCP
>> connection, not (necessarily) the "inside" connection.
>>
>> At any rate, your results do seem strange to me; not a whole lot else
>> comes to mind though. You might want to instrument your miniport and
>> track where the packets spend their time between receipt at your send
>> handler and final send() to the other system.
>>
>> Good luck.
>>
>> -sd
>>
>>
>> On 2006-11-29 07:37:48 -0600, "Arsalan Ahmad" <arsal__@hotmail.com> said:
>>
>>> My question was that Iperf is showing lower UDP throughput over the
>>> physical interface (around 550 Mbps for 15 parallel streams), while for
>>> the virtual NIC it is showing greater throughput (around 1 Gbps for 15
>>> parallel streams). So why is this? I thought UDP throughput should be
>>> higher on the physical interface, which is not what I am getting. And
>>> how can turning on NODELAY (disabling Nagle's) for TCP affect UDP
>>> throughput performance?
>>>
>>> Thanks,
>>>
>>> Arsalan
>>>
>>>
>>> "Steve Dispensa" <dispensa@takethisout.positivenetworks.net> wrote in
>>> message
>>> news:2006112900064250073-dispensa@takethisoutpositivenetworksnet...
>>>> On 2006-11-28 04:03:23 -0600, "Arsalan Ahmad" <arsal__@hotmail.com>
>>>> said:
>>>>
>>>>> Hello all,
>>>>>
>>>>> I have been doing some throughput tests with Iperf for my VNIC and
>>>>> physical NIC. I connected two P4 3.4 GHz computers with 1 Gbps NICs
>>>>> via a cross-over cable and then performed my experiments.
>>>>>
>>>>> What I found is that for TCP, the throughput I am getting on the
>>>>> physical interface is around 900 Mbps, and on the virtual interface I
>>>>> am getting around 90 Mbps (only 10% of the physical one). All the
>>>>> tests were performed with Iperf with 15 parallel streams.
>>>>>
>>>>> However, for UDP it's totally different. With Iperf with 15 parallel
>>>>> streams, I am getting only around 500 Mbps throughput on the physical
>>>>> interface but approximately 1000 Mbps (1 Gbps) on the virtual
>>>>> interface. I am unable to figure out why I am getting this strange
>>>>> behaviour. Can you guess?
>>>>
>>>> Yes, I can guess; are you tunneling the packets? Do you have Nagle's
>>>> turned on? Did you do the math on the bandwidth-delay product and make
>>>> sure your TCP window was set appropriately?
>>>>
>>>>
>>>>> My virtual NIC passes all the packets to a user-level application,
>>>>> which then sends all the IP packets of the virtual interface to the
>>>>> user-level application running on the other computer. The
>>>>> communication between the two user-level applications is via TCP.
>>>>
>>>> Windows is sensitive to the exact structure of your sockets code. To
>>>> get the best possible performance from the OS, you may want to look
>>>> into using I/O Completion for your tunnel code on both ends, and make
>>>> sure you keep enough buffers pending such that the protocol is never
>>>> waiting on you.
>>>>
>>>>
>>>>> So, for TCP packets over the virtual interface, I am sending a TCP
>>>>> packet encapsulated within another TCP packet when passed to the
>>>>> physical interface, while for UDP I am sending a UDP packet
>>>>> encapsulated within a TCP packet when passed to the physical interface.
>>>>
>>>> Given this, I'm voting for Nagle's. Turn on NODELAY and re-test.
>>>>
>>>> -sd
>>
>>
>>
>>
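
As a rough illustration of two suggestions quoted above (turn on NODELAY for the "outside" tunnel connection, and size the TCP window from the bandwidth-delay product), here is a minimal Python sketch. The function names, the 1 ms round-trip time and the 64 KB floor are assumptions made up for this example; the thread does not show the actual tunnel code, which presumably uses Winsock from C, where the same options are set with setsockopt().

```python
import socket


def bdp_bytes(bandwidth_bits_per_sec, rtt_sec):
    """Bandwidth-delay product in bytes: data that must be in flight to fill the link."""
    return int(bandwidth_bits_per_sec * rtt_sec / 8)


def configure_tunnel_socket(sock, bandwidth_bits_per_sec, rtt_sec):
    """Disable Nagle and request send/receive buffers of at least one BDP.

    Meant to be called on the outer (tunnel) TCP socket before connect(),
    so the requested buffer size can influence the negotiated window.
    """
    # TCP_NODELAY = 1 turns Nagle's algorithm off for this connection.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    # 64 KB floor is an arbitrary choice for this sketch.
    wanted = max(bdp_bytes(bandwidth_bits_per_sec, rtt_sec), 64 * 1024)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, wanted)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, wanted)
    return wanted


# Example: 1 Gbps link, assumed 1 ms round-trip time (not measured in the thread).
tunnel = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
requested = configure_tunnel_socket(tunnel, 1e9, 0.001)
nodelay_on = tunnel.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
print("requested buffer:", requested, "bytes; NODELAY on:", nodelay_on)
tunnel.close()
```

The operating system is free to round or cap the requested buffer sizes, so it is worth reading them back with getsockopt() on the real system rather than trusting the requested value.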



Re: A question regarding MTU: how it can effect TCP performance + other queries by Arsalan

Arsalan
Thu Nov 30 03:58:41 CST 2006

Hi,

Just to add, I am sending each IP packet that I receive over the virtual
interface to the physical one using a socket. Depending on its size and on
the MTU, a packet received on the virtual interface could be split into two
packets on the physical NIC. For the UDP test, Iperf is using a datagram
size of 1470 bytes and a UDP buffer size of 8 KB.
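
To put numbers on the splitting described above, here is a small Python sketch of the size arithmetic. It assumes IPv4 and TCP headers without options (20 bytes each), a physical MTU of 1500 bytes, and no extra framing header added by the tunnel application; all function names are invented for the example. Since TCP is a byte stream, the segment count applies to a packet written on its own, not to packets that get coalesced in the send buffer.

```python
import math

IPV4_HEADER = 20   # bytes, assuming no IP options
TCP_HEADER = 20    # bytes, assuming no TCP options
UDP_HEADER = 8     # bytes


def inner_packet_size(udp_payload):
    """Size of the IP packet the virtual NIC sees for one UDP datagram."""
    return udp_payload + UDP_HEADER + IPV4_HEADER


def outer_segments(inner_bytes, physical_mtu=1500, framing=0):
    """TCP segments needed to carry one inner packet written on its own."""
    mss = physical_mtu - IPV4_HEADER - TCP_HEADER
    return math.ceil((inner_bytes + framing) / mss)


def max_virtual_mtu(physical_mtu=1500, framing=0):
    """Largest virtual-NIC MTU whose packets still fit in one outer segment."""
    return physical_mtu - IPV4_HEADER - TCP_HEADER - framing


inner = inner_packet_size(1470)     # Iperf's 1470-byte datagram
segments = outer_segments(inner)
print("inner IP packet:", inner, "bytes")
print("outer TCP segments:", segments)
print("largest virtual MTU for one segment:", max_virtual_mtu())
```

Under these assumptions a 1470-byte Iperf datagram becomes a 1498-byte IP packet on the virtual NIC, which does not fit in a single 1460-byte TCP payload on the physical side.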

So any idea why the UDP throughput reported by Iperf is greater for the
virtual NIC than for the physical NIC?

Thanks,

Arsalan
