public inbox for cygwin-patches@cygwin.com
* Fix performance on 10Gb networks
@ 2014-11-18 19:30 Iuliu Rus
  2014-11-18 20:43 ` Corinna Vinschen
  0 siblings, 1 reply; 6+ messages in thread
From: Iuliu Rus @ 2014-11-18 19:30 UTC
  To: cygwin-patches

[-- Attachment #1: Type: text/plain, Size: 663 bytes --]

Hello,
Google is running Cygwin apps on its 10Gb networks and we are seeing
extremely bad performance in a couple of cases. For example, iperf
with the defaults achieves only 10Mbits/sec.
We tracked this down to a combination of non-blocking sockets with
Nagle + delayed ACK kicking in, since the apps eventually end up sending
very small packets (2 bytes).
We have a case open with Microsoft, but since everything is moving
very slowly we would like to work around the problem by picking socket
buffers that are a multiple of 4k.

Change log:
2014-11-18 Iuliu Rus <rus.iuliu@gmail.com>

* net.cc (fdsock): Change default values for socket buffers to fix
performance on 10Gb networks.

[-- Attachment #2: net_patch --]
[-- Type: application/octet-stream, Size: 1452 bytes --]

Index: winsup/cygwin/net.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/net.cc,v
retrieving revision 1.320
diff -u -p -r1.320 net.cc
--- winsup/cygwin/net.cc	13 Oct 2014 08:18:18 -0000	1.320
+++ winsup/cygwin/net.cc	18 Nov 2014 19:12:00 -0000
@@ -621,13 +621,16 @@ fdsock (cygheap_fdmanip& fd, const devic
      this is no problem on 64 bit.  So we set the default buffer size to
      the default values in current 3.x Linux versions.
 
-     (*) Maximum normal TCP window size.  Coincidence?  */
+     (*) Maximum normal TCP window size.  Coincidence?  
+
+     NOTE 3. Setting the window size to 65535 results in extremely bad performance for apps that send data in multiples of Kb, as they eventually end up sending 1 byte on the network and naggle + delay ack kicks in. For example, iperf on a 10Gb network gives only 10 Mbits/sec with a 65535 send buffer. We want this to be a multiple of PAGE_SIZE, but since 64k breaks WSADuplicateSocket we use 60Kb.
+*/
 #ifdef __x86_64__
   ((fhandler_socket *) fd)->rmem () = 212992;
   ((fhandler_socket *) fd)->wmem () = 212992;
 #else
-  ((fhandler_socket *) fd)->rmem () = 65535;
-  ((fhandler_socket *) fd)->wmem () = 65535;
+  ((fhandler_socket *) fd)->rmem () = 63488;
+  ((fhandler_socket *) fd)->wmem () = 63488;
 #endif
   if (::setsockopt (soc, SOL_SOCKET, SO_RCVBUF,
 		    (char *) &((fhandler_socket *) fd)->rmem (), sizeof (int)))
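
A minimal sketch of the arithmetic behind this diagnosis, under a simplified model that is an assumption and not taken from net.cc: a non-blocking send accepts only as many bytes as fit in the free part of the send buffer, and the application writes fixed-size blocks into an initially empty buffer. The helper name `leftover_after_fill` and the 8192-byte block size are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical model, not Cygwin code: write `chunk`-byte blocks into an
// empty send buffer of `capacity` bytes.  Each write accepts at most the
// free space that is left.  Returns the number of bytes of the first
// partially accepted block that the application still has to send later,
// or 0 if the buffer fills exactly on a block boundary.
constexpr std::size_t
leftover_after_fill (std::size_t capacity, std::size_t chunk)
{
  if (chunk == 0)
    return 0;                       // nothing is ever written
  std::size_t free_space = capacity;
  for (;;)
    {
      std::size_t accepted = std::min (chunk, free_space);
      free_space -= accepted;
      if (accepted < chunk)
        return chunk - accepted;    // partial write: this tail goes out later
      if (free_space == 0)
        return 0;                   // filled exactly, no tail
    }
}
```

Under this model, `leftover_after_fill (65535, 8192)` is 1: the eighth block fits except for its last byte, which is the kind of tiny trailing send that Nagle and delayed ACK then stall. With the 63488-byte buffer from the patch the tail is 2048 bytes instead.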


* Re: Fix performance on 10Gb networks
  2014-11-18 19:30 Fix performance on 10Gb networks Iuliu Rus
@ 2014-11-18 20:43 ` Corinna Vinschen
  2014-11-19 18:18   ` Iuliu Rus
  0 siblings, 1 reply; 6+ messages in thread
From: Corinna Vinschen @ 2014-11-18 20:43 UTC
  To: cygwin-patches

[-- Attachment #1: Type: text/plain, Size: 2610 bytes --]

Hi Iuliu,

On Nov 18 19:30, Iuliu Rus wrote:
> Hello,
> Google is running Cygwin apps on its 10Gb networks and we are seeing
> extremely bad performance in a couple of cases. For example, iperf
> with the defaults achieves only 10Mbits/sec.
> We tracked this down to a combination of non-blocking sockets with
> Nagle + delayed ACK kicking in, since the apps eventually end up sending
> very small packets (2 bytes).
> We have a case open with Microsoft, but since everything is moving
> very slowly we would like to work around the problem by picking socket
> buffers that are a multiple of 4k.

Thanks for the patch.  One question:

> Change log:
> 2014-11-18 Iuliu Rus <rus.iuliu@gmail.com>
> 
> * net.cc (fdsock): Change default values for socket buffers to fix
> performance on 10Gb networks.
> 
> Index: winsup/cygwin/net.cc
> ===================================================================
> RCS file: /cvs/src/src/winsup/cygwin/net.cc,v
> retrieving revision 1.320
> diff -u -p -r1.320 net.cc
> --- winsup/cygwin/net.cc	13 Oct 2014 08:18:18 -0000	1.320
> +++ winsup/cygwin/net.cc	18 Nov 2014 19:12:00 -0000
> @@ -621,13 +621,16 @@ fdsock (cygheap_fdmanip& fd, const devic
>       this is no problem on 64 bit.  So we set the default buffer size to
>       the default values in current 3.x Linux versions.
>  
> -     (*) Maximum normal TCP window size.  Coincidence?  */
> +     (*) Maximum normal TCP window size.  Coincidence?  
> +
> +     NOTE 3. Setting the window size to 65535 results in extremely bad performance for apps that send data in multiples of Kb, as they eventually end up sending 1 byte on the network and naggle + delay ack kicks in. For example, iperf on a 10Gb network gives only 10 Mbits/sec with a 65535 send buffer. We want this to be a multiple of PAGE_SIZE, but since 64k breaks WSADuplicateSocket we use 60Kb.

We do?  See below.

> +*/
>  #ifdef __x86_64__
>    ((fhandler_socket *) fd)->rmem () = 212992;
>    ((fhandler_socket *) fd)->wmem () = 212992;
>  #else
> -  ((fhandler_socket *) fd)->rmem () = 65535;
> -  ((fhandler_socket *) fd)->wmem () = 65535;
> +  ((fhandler_socket *) fd)->rmem () = 63488;
> +  ((fhandler_socket *) fd)->wmem () = 63488;

This is 62K, certainly not a multiple of the native PAGE_SIZE of 4K.
And this makes me wonder.  Did you intend to use 60K and end up with
62K for a reason?  And then, why not 63K as a multiple of 1K?
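
The size arithmetic in question can be checked at compile time. This is only a sketch; the names `kKiB`, `kPageSize` and `is_multiple_of` are hypothetical, and the 4K page size is the one named above.

```cpp
#include <cstddef>

constexpr std::size_t kKiB = 1024;
constexpr std::size_t kPageSize = 4 * kKiB;   // native page size, per the thread

// True if `value` is a whole multiple of `unit`.
constexpr bool
is_multiple_of (std::size_t value, std::size_t unit)
{
  return unit != 0 && value % unit == 0;
}

// The value in the patch is 62K: a multiple of 1K but not of the page size.
static_assert (63488 == 62 * kKiB, "63488 is 62K");
static_assert (!is_multiple_of (63488, kPageSize), "62K is not page-aligned");

// 60K, as the comment in the patch claims, would be page-aligned.
static_assert (is_multiple_of (60 * kKiB, kPageSize), "60K is page-aligned");

// 63K is the largest multiple of 1K below 64K.
static_assert (63 * kKiB == 64512, "63K is 64512");
static_assert (65535 == 64 * kKiB - 1, "old default is 64K - 1");
```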


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


* Re: Fix performance on 10Gb networks
  2014-11-18 20:43 ` Corinna Vinschen
@ 2014-11-19 18:18   ` Iuliu Rus
       [not found]     ` <CAD97vhocMs1xoSoPsLWzJrMqahkONyx_KrVYwFJSeoupvfsvRQ@mail.gmail.com>
  2014-11-20  8:30     ` Corinna Vinschen
  0 siblings, 2 replies; 6+ messages in thread
From: Iuliu Rus @ 2014-11-19 18:18 UTC
  To: cygwin-patches

[-- Attachment #1: Type: text/plain, Size: 2963 bytes --]

You are right, of course. We initially thought it had to be a
multiple of page_size, but it doesn't. I just re-tested with 63k and it
gives good perf too.
We get 600Mbits/second, compared with 10Mbits/second for the old default.
Attached is the modified patch.

[-- Attachment #2: net_patch --]
[-- Type: application/octet-stream, Size: 1445 bytes --]

Index: winsup/cygwin/net.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/net.cc,v
retrieving revision 1.320
diff -u -p -r1.320 net.cc
--- winsup/cygwin/net.cc	13 Oct 2014 08:18:18 -0000	1.320
+++ winsup/cygwin/net.cc	19 Nov 2014 18:02:18 -0000
@@ -621,13 +621,16 @@ fdsock (cygheap_fdmanip& fd, const devic
      this is no problem on 64 bit.  So we set the default buffer size to
      the default values in current 3.x Linux versions.
 
-     (*) Maximum normal TCP window size.  Coincidence?  */
+     (*) Maximum normal TCP window size.  Coincidence?  
+
+     NOTE 3. Setting the window size to 65535 results in extremely bad performance for apps that send data in multiples of Kb, as they eventually end up sending 1 byte on the network and naggle + delay ack kicks in. For example, iperf on a 10Gb network gives only 10 Mbits/sec with a 65535 send buffer. We want this to be a multiple of 1k, but since 64k breaks WSADuplicateSocket we use 63Kb.
+*/
 #ifdef __x86_64__
   ((fhandler_socket *) fd)->rmem () = 212992;
   ((fhandler_socket *) fd)->wmem () = 212992;
 #else
-  ((fhandler_socket *) fd)->rmem () = 65535;
-  ((fhandler_socket *) fd)->wmem () = 65535;
+  ((fhandler_socket *) fd)->rmem () = 64512;
+  ((fhandler_socket *) fd)->wmem () = 64512;
 #endif
   if (::setsockopt (soc, SOL_SOCKET, SO_RCVBUF,
 		    (char *) &((fhandler_socket *) fd)->rmem (), sizeof (int)))


* Re: Fix performance on 10Gb networks
       [not found]     ` <CAD97vhocMs1xoSoPsLWzJrMqahkONyx_KrVYwFJSeoupvfsvRQ@mail.gmail.com>
@ 2014-11-19 19:30       ` Lev Bishop
  2014-11-20  8:37         ` Corinna Vinschen
  0 siblings, 1 reply; 6+ messages in thread
From: Lev Bishop @ 2014-11-19 19:30 UTC
  To: cygwin-patches

Maybe my analysis from some years ago can be relevant here? Another
issue with delayed acks and winsock. I haven't been following cygwin
for some time, so not sure exactly what the status is:
https://cygwin.com/ml/cygwin-patches/2006-q2/msg00031.html

Lev

-- 
Lev Bishop


* Re: Fix performance on 10Gb networks
  2014-11-19 18:18   ` Iuliu Rus
       [not found]     ` <CAD97vhocMs1xoSoPsLWzJrMqahkONyx_KrVYwFJSeoupvfsvRQ@mail.gmail.com>
@ 2014-11-20  8:30     ` Corinna Vinschen
  1 sibling, 0 replies; 6+ messages in thread
From: Corinna Vinschen @ 2014-11-20  8:30 UTC
  To: cygwin-patches

[-- Attachment #1: Type: text/plain, Size: 1434 bytes --]

On Nov 19 18:18, Iuliu Rus wrote:
> You are right, of course. We initially thought it had to be a
> multiple of page_size, but it doesn't. I just re-tested with 63k and it
> gives good perf too.
> We get 600Mbits/second, compared with 10Mbits/second for the old default.
> Attached is the modified patch.

Patch applied.


Thanks a lot,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat


* Re: Fix performance on 10Gb networks
  2014-11-19 19:30       ` Lev Bishop
@ 2014-11-20  8:37         ` Corinna Vinschen
  0 siblings, 0 replies; 6+ messages in thread
From: Corinna Vinschen @ 2014-11-20  8:37 UTC
  To: cygwin-patches

[-- Attachment #1: Type: text/plain, Size: 816 bytes --]

Hi Lev,

On Nov 19 14:30, Lev Bishop wrote:
> Maybe my analysis from some years ago can be relevant here? Another
> issue with delayed acks and winsock. I haven't been following cygwin
> for some time, so not sure exactly what the status is:
> https://cygwin.com/ml/cygwin-patches/2006-q2/msg00031.html

The code has changed quite a bit in the meantime.  Your patch was against
Cygwin 1.5.x; since then we have added IPv6 support and native 64 bit
support, dropped support for OS versions prior to XP SP3, etc.

But there are certainly still good chances for optimization.  I would
very much appreciate it if you would take another look into this.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
