public inbox for cygwin-developers@cygwin.com
* Implement sched_[gs]etaffinity()
@ 2019-04-11  4:21 Mark Geisert
  2019-04-11  8:26 ` Corinna Vinschen
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Geisert @ 2019-04-11  4:21 UTC (permalink / raw)
  To: cygwin-developers

I've recently sent a patch to cygwin-patches that implements these 
Linux-specific functions.  I used the following test program to debug and test 
the implementation.  When the program is run, you can watch it migrate from CPU 
to CPU with Windows Task Manager.

I've only tested on 64-bit Windows 7 so far.  If the code (in the patch) is 
adequate I will supply another patch for doc updates, etc.

..mark

P.S. Here's the test program:

~ cat afftest.c
#define _GNU_SOURCE
#include <errno.h>
#include <math.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include "/oss/src/newlib/libc/include/sched.h" //XXX for demo only

size_t cpusetsize;
volatile int nprocs;
volatile int proc;

void handler (int unused)
{
   char buf[2] = "\000";   /* demo output assumes fewer than 10 CPUs */
   cpu_set_t mask;

   ++proc;
   if (proc >= nprocs)
     proc = 0;
   buf[0] = '0' + proc;
   write (2, buf, 1);

   CPU_ZERO (&mask);       /* use the standard macros; cpu_set_t is not */
   CPU_SET (proc, &mask);  /* directly assignable from an integer mask  */
   int res = sched_setaffinity (0, cpusetsize, &mask);
   if (res < 0)
     {
       perror ("handler");
       exit (2);
     }
   alarm (8);
}

int
main (int argc, char **argv)
{
   char *ptr = getenv ("NUMBER_OF_PROCESSORS");

   if (!ptr)
     return 1;
   nprocs = atoi (ptr);
   proc = nprocs;
   cpusetsize = (nprocs + 7) / 8;   /* bytes needed for one bit per CPU */

   signal (SIGALRM, handler);
   alarm (1);

   /* Spin forever on floating-point work so the process stays busy and
      its migration between CPUs is visible in Task Manager.  */
   double x = 92837492873.2398749827394723984723;
   while (x++)
     x = sqrt (x), x *= x;
}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Implement sched_[gs]etaffinity()
  2019-04-11  4:21 Implement sched_[gs]etaffinity() Mark Geisert
@ 2019-04-11  8:26 ` Corinna Vinschen
  2019-04-11  8:38   ` Corinna Vinschen
  0 siblings, 1 reply; 11+ messages in thread
From: Corinna Vinschen @ 2019-04-11  8:26 UTC (permalink / raw)
  To: Mark Geisert; +Cc: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1317 bytes --]

Hi Mark,

On Apr 10 21:21, Mark Geisert wrote:
> I've recently sent a patch to cygwin-patches that implements these
> Linux-specific functions.  I used the following test program to debug and
> test the implementation.  When the program is run, you can watch it migrate
> from CPU to CPU with Windows Task Manager.
> 
> I've only tested on 64-bit Windows 7 so far.  If the code (in the patch) is
> adequate I will supply another patch for doc updates, etc.

Your patch is nicely done, but what about machines with more than 64
CPUs?  Your patch only uses the standard API for up to 64 CPUs, so a
process can never use more than 64 CPUs or use CPUs from different CPU
groups.  There was also the case of this weird machine Achim Gratz once
worked on, which had less than 64 CPUs but *still* used multiple CPU
groups under Windows, for some reason.

Any chance you could update your patch to support this functionality?
For some info, see MSDN:

https://docs.microsoft.com/en-us/windows/desktop/ProcThread/processor-groups

Also, there's already some code in fhandler_proc.cc, function
format_proc_cpuinfo to handle CPU groups.  You can use the
wincap.has_processor_groups() method to check if the system
supports CPU groups.


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Implement sched_[gs]etaffinity()
  2019-04-11  8:26 ` Corinna Vinschen
@ 2019-04-11  8:38   ` Corinna Vinschen
  2019-04-11 20:52     ` Mark Geisert
  0 siblings, 1 reply; 11+ messages in thread
From: Corinna Vinschen @ 2019-04-11  8:38 UTC (permalink / raw)
  To: Mark Geisert; +Cc: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 1594 bytes --]

On Apr 11 10:25, Corinna Vinschen wrote:
> Hi Mark,
> 
> On Apr 10 21:21, Mark Geisert wrote:
> > I've recently sent a patch to cygwin-patches that implements these
> > Linux-specific functions.  I used the following test program to debug and
> > test the implementation.  When the program is run, you can watch it migrate
> > from CPU to CPU with Windows Task Manager.
> > 
> > I've only tested on 64-bit Windows 7 so far.  If the code (in the patch) is
> > adequate I will supply another patch for doc updates, etc.
> 
> Your patch is nicely done, but what about machines with more than 64
> CPUs?  Your patch only uses the standard API for up to 64 CPUs, so a
> process can never use more than 64 CPUs or use CPUs from different CPU
> groups.  There was also the case of this weird machine Achim Gratz once
> worked on, which had less than 64 CPUs but *still* used multiple CPU
> groups under Windows, for some reason.
> 
> Any chance you could update your patch to support this functionality?
> For some info, see MSDN:
> 
> https://docs.microsoft.com/en-us/windows/desktop/ProcThread/processor-groups
> 
> Also, there's already some code in fhandler_proc.cc, function
> format_proc_cpuinfo to handle CPU groups.  You can use the
> wincap.has_processor_groups() method to check if the system
> supports CPU groups.

Btw., Glibc's cpu_set_t supports up to 1024 CPUs.  See
https://sourceware.org/git/?p=glibc.git;a=blob;f=posix/bits/cpu-set.h
This may be ok for the foreseeable future, I guess.


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Implement sched_[gs]etaffinity()
  2019-04-11  8:38   ` Corinna Vinschen
@ 2019-04-11 20:52     ` Mark Geisert
  2019-04-12  7:46       ` Corinna Vinschen
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Geisert @ 2019-04-11 20:52 UTC (permalink / raw)
  To: cygwin-developers

On Thu, 11 Apr 2019, Corinna Vinschen wrote:
> On Apr 11 10:25, Corinna Vinschen wrote:
>> Hi Mark,
>>
>> On Apr 10 21:21, Mark Geisert wrote:
>>> I've recently sent a patch to cygwin-patches that implements these
>>> Linux-specific functions.  I used the following test program to debug and
>>> test the implementation.  When the program is run, you can watch it migrate
>>> from CPU to CPU with Windows Task Manager.
>>>
>>> I've only tested on 64-bit Windows 7 so far.  If the code (in the patch) is
>>> adequate I will supply another patch for doc updates, etc.
>>
>> Your patch is nicely done, but what about machines with more than 64
>> CPUs?  Your patch only uses the standard API for up to 64 CPUs, so a
>> process can never use more than 64 CPUs or use CPUs from different CPU
>> groups.  There was also the case of this weird machine Achim Gratz once
>> worked on, which had less than 64 CPUs but *still* used multiple CPU
>> groups under Windows, for some reason.
>>
>> Any chance you could update your patch to support this functionality?
>> For some info, see MSDN:
>>
>> https://docs.microsoft.com/en-us/windows/desktop/ProcThread/processor-groups
>>
>> Also, there's already some code in fhandler_proc.cc, function
>> format_proc_cpuinfo to handle CPU groups.  You can use the
>> wincap.has_processor_groups() method to check if the system
>> supports CPU groups.
>
> Btw., Glibc's cpu_set_t supports up to 1024 CPUs.  See
> https://sourceware.org/git/?p=glibc.git;a=blob;f=posix/bits/cpu-set.h
> This may be ok for the foreseeable future, I guess.

Hi Corinna,
I will look into CPU group support; thanks for the pointers.  I also need 
to fix the assumption I made about which flavor of pid is handed to the 
functions: they will be Cygwin pids but need conversion to Windows pids 
internally.

..mark

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Implement sched_[gs]etaffinity()
  2019-04-11 20:52     ` Mark Geisert
@ 2019-04-12  7:46       ` Corinna Vinschen
  2019-04-16  8:19         ` Mark Geisert
  0 siblings, 1 reply; 11+ messages in thread
From: Corinna Vinschen @ 2019-04-12  7:46 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 2161 bytes --]

On Apr 11 13:52, Mark Geisert wrote:
> On Thu, 11 Apr 2019, Corinna Vinschen wrote:
> > On Apr 11 10:25, Corinna Vinschen wrote:
> > > Hi Mark,
> > > 
> > > On Apr 10 21:21, Mark Geisert wrote:
> > > > I've recently sent a patch to cygwin-patches that implements these
> > > > Linux-specific functions.  I used the following test program to debug and
> > > > test the implementation.  When the program is run, you can watch it migrate
> > > > from CPU to CPU with Windows Task Manager.
> > > > 
> > > > I've only tested on 64-bit Windows 7 so far.  If the code (in the patch) is
> > > > adequate I will supply another patch for doc updates, etc.
> > > 
> > > Your patch is nicely done, but what about machines with more than 64
> > > CPUs?  Your patch only uses the standard API for up to 64 CPUs, so a
> > > process can never use more than 64 CPUs or use CPUs from different CPU
> > > groups.  There was also the case of this weird machine Achim Gratz once
> > > worked on, which had less than 64 CPUs but *still* used multiple CPU
> > > groups under Windows, for some reason.
> > > 
> > > Any chance you could update your patch to support this functionality?
> > > For some info, see MSDN:
> > > 
> > > https://docs.microsoft.com/en-us/windows/desktop/ProcThread/processor-groups
> > > 
> > > Also, there's already some code in fhandler_proc.cc, function
> > > format_proc_cpuinfo to handle CPU groups.  You can use the
> > > wincap.has_processor_groups() method to check if the system
> > > supports CPU groups.
> > 
> > Btw., Glibc's cpu_set_t supports up to 1024 CPUs.  See
> > https://sourceware.org/git/?p=glibc.git;a=blob;f=posix/bits/cpu-set.h
> > This may be ok for the foreseeable future, I guess.
> 
> Hi Corinna,
> I will look into CPU group support; thanks for the pointers.  I also need to
> fix the assumption I made about which flavor of pid would be handed to the
> functions.. they will be Cygwin pids but need conversion to Windows pids
> internally.

Yeah, right, I failed to notice that.  I'll add a few notes inline
over @ cygwin-patches.


Corinna

-- 
Corinna Vinschen
Cygwin Maintainer

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Implement sched_[gs]etaffinity()
  2019-04-12  7:46       ` Corinna Vinschen
@ 2019-04-16  8:19         ` Mark Geisert
  2019-04-16 10:45           ` Corinna Vinschen
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Geisert @ 2019-04-16  8:19 UTC (permalink / raw)
  To: cygwin-developers

On Fri, 12 Apr 2019, Corinna Vinschen wrote:
> On Apr 11 13:52, Mark Geisert wrote:
>> On Thu, 11 Apr 2019, Corinna Vinschen wrote:
>>> On Apr 11 10:25, Corinna Vinschen wrote:
[...]
>>>> Your patch is nicely done, but what about machines with more than 64
>>>> CPUs?  Your patch only uses the standard API for up to 64 CPUs, so a
>>>> process can never use more than 64 CPUs or use CPUs from different CPU
>>>> groups.  There was also the case of this weird machine Achim Gratz once
>>>> worked on, which had less than 64 CPUs but *still* used multiple CPU
>>>> groups under Windows, for some reason.
>>>>
>>>> Any chance you could update your patch to support this functionality?
>>>> For some info, see MSDN:
>>>>
>>>> https://docs.microsoft.com/en-us/windows/desktop/ProcThread/processor-groups
>>>>
>>>> Also, there's already some code in fhandler_proc.cc, function
>>>> format_proc_cpuinfo to handle CPU groups.  You can use the
>>>> wincap.has_processor_groups() method to check if the system
>>>> supports CPU groups.
>>>
>>> Btw., Glibc's cpu_set_t supports up to 1024 CPUs.  See
>>> https://sourceware.org/git/?p=glibc.git;a=blob;f=posix/bits/cpu-set.h
>>> This may be ok for the foreseeable future, I guess.
>>
>> Hi Corinna,
>> I will look into CPU group support; thanks for the pointers.  I also need to
>> fix the assumption I made about which flavor of pid would be handed to the
>> functions.. they will be Cygwin pids but need conversion to Windows pids
>> internally.
>
> Yeah, right, I failed to notice that.  I'll add a few notes inline
> over @ cygwin-patches.

I've updated my code locally to account for your notes on cygwin-patches; 
thanks!  I've also spent some time researching Windows affinities vs Linux 
affinities and have come to some conclusions.  I'm airing these for review
before I start coding in earnest.  I appreciate all comments from anybody 
interested.

(1) On Linux, one deals with processor affinities using a huge mask that 
lets one choose from all processors on the system.  On Windows, one deals 
with processor affinities for only the current processor group, max 64 
processors in a group.  This implies conversion between the two "views" 
when getting or setting processor affinities on Cygwin.

(2) On Linux, sched_get/setaffinity() take a pid_t argument, but it can 
be either a process id or a thread id.  If one selects a process id, the 
action affects just the main thread of that process.  On Windows, 
selecting the process id affects all threads of that process.

(3) For completeness, Linux's pthread_get/setaffinity_np() should probably 
be supplied by the proposed code too.

(4) I was looking at Cygwin's fhandler_proc.cc, function 
format_proc_cpuinfo().  There's a call to __get_cpus_per_group() which is 
implemented in miscfuncs.cc.  I haven't seen in the MSDN docs whether each 
processor group is guaranteed to have the same number of processors.  I 
might even expect variations on a NUMA system.  Anybody know if one can 
depend on the group membership of the first processor group to apply to 
all groups?

(5) On Windows, a process starts out in a particular processor group.  One 
can then change thread affinities in such a way that some threads run in a 
different processor group than other threads of the same process.  The 
process becomes a "multi-group" process.  This has implications for the 
conversions discussed in (1).

(6) On Linux, processor affinity is inherited across fork() and execve(). 
I'll need to ensure Cygwin's implementation of those calls handle affinity 
the same way.

So this is looking like a more substantial project :-).
Thanks for reading,

..mark

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Implement sched_[gs]etaffinity()
  2019-04-16  8:19         ` Mark Geisert
@ 2019-04-16 10:45           ` Corinna Vinschen
  2019-04-17  4:31             ` Mark Geisert
  0 siblings, 1 reply; 11+ messages in thread
From: Corinna Vinschen @ 2019-04-16 10:45 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 3377 bytes --]

On Apr 16 01:19, Mark Geisert wrote:
> On Fri, 12 Apr 2019, Corinna Vinschen wrote:
> > Yeah, right, I failed to notice that.  I'll add a few notes inline
> > over @ cygwin-patches.
> 
> I've updated my code locally to account for your notes on cygwin-patches;
> thanks!  I've also spent some time researching Windows affinities vs Linux
> affinities and have come to some conclusions.  I'm airing these for review
> before I start coding in earnest.  I appreciate all comments from anybody
> interested.
> 
> (1) On Linux, one deals with processor affinities using a huge mask that
> lets one choose from all processors on the system.  On Windows, one deals
> with processor affinities for only the current processor group, max 64
> processors in a group.  This implies conversion between the two "views" when
> getting or setting processor affinities on Cygwin.
> 
> (2) On Linux, sched_get/setaffinity() take a pid_t argument, but it can be
> either a process id or a thread id.  If one selects a process id, the action
> affects just the main thread of that process.  On Windows, selecting the
> process id affects all threads of that process.
> 
> (3) For completeness, Linux's pthread_get/setaffinity_np() should probably
> be supplied by the proposed code too.
> 
> (4) I was looking at Cygwin's fhandler_proc.cc, function
> format_proc_cpuinfo().  There's a call to __get_cpus_per_group() which is
> implemented in miscfuncs.cc.  I haven't seen in the MSDN docs whether each
> processor group is guaranteed to have the same number of processors.  I
> might even expect variations on a NUMA system.  Anybody know if one can
> depend on the group membership of the first processor group to apply to all
> groups?

Maybe https://go.microsoft.com/fwlink/p/?linkid=147914 helps?

 "If the number of logical processors exceeds the maximum group size,
  Windows creates multiple groups by splitting the node into n groups,
  where the first n-1 groups have capacities that are equal to the group
  size."

We were over that already when creating the code in format_proc_cpuinfo.
So, IIUC and IIRC, the idea is that the logical CPUs are split into
equal chunks of logical CPUs, along NUMA node borders on a NUMA system,
and the last group potentially, but seldom, has fewer CPUs.
In the end, the important thing is that all groups have equal size,
except the last one.

Therefore:

  WORD cpu_group = cpu_number / num_cpu_per_group;
  KAFFINITY cpu_mask = 1L << (cpu_number % num_cpu_per_group);

That also means the transposition between the groupless Linux system
and the Windows system is fairly easy.

> (5) On Windows, a process starts out in a particular processor group.  One
> can then change thread affinities in such a way that some threads run in a
> different processor group than other threads of the same process.  The
> process becomes a "multi-group" process.  This has implications for the
> conversions discussed in (1).

Don't see how.  Care to explain?

> (6) On Linux, processor affinity is inherited across fork() and execve().
> I'll need to ensure Cygwin's implementation of those calls handle affinity
> the same way.

Just passing the INHERIT_PARENT_AFFINITY flag to CreateProcess{AsUser}
should do the trick.


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Implement sched_[gs]etaffinity()
  2019-04-16 10:45           ` Corinna Vinschen
@ 2019-04-17  4:31             ` Mark Geisert
  2019-04-17  7:57               ` Corinna Vinschen
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Geisert @ 2019-04-17  4:31 UTC (permalink / raw)
  To: cygwin-developers

On Tue, 16 Apr 2019, Corinna Vinschen wrote:
> On Apr 16 01:19, Mark Geisert wrote:
>> (4) I was looking at Cygwin's fhandler_proc.cc, function
>> format_proc_cpuinfo().  There's a call to __get_cpus_per_group() which is
>> implemented in miscfuncs.cc.  I haven't seen in the MSDN docs whether each
>> processor group is guaranteed to have the same number of processors.  I
>> might even expect variations on a NUMA system.  Anybody know if one can
>> depend on the group membership of the first processor group to apply to all
>> groups?
>
> Maybe https://go.microsoft.com/fwlink/p/?linkid=147914 helps?
>
> "If the number of logical processors exceeds the maximum group size,
>  Windows creates multiple groups by splitting the node into n groups,
>  where the first n-1 groups have capacities that are equal to the group
>  size."

Great; thanks for that.

> We were over that already when creating the code in format_proc_cpuinfo.
> So, IIUC and IIRC, the idea is that the logical CPUs are split into
> equal chunks of logical CPUs, along NUMA node borders on a NUMA system,
> and the last group potentially, but seldom, has fewer CPUs.
> In the end, the important thing is that all groups have equal size,
> except the last one.
>
> Therefore:
>
>  WORD cpu_group = cpu_number / num_cpu_per_group;
>  KAFFINITY cpu_mask = 1L << (cpu_number % num_cpu_per_group);
>
> That also means the transposition between the groupless Linux system
> and the Windows system is fairly easy.

Yes, dealing with an array of unsigned longs vs bitblt ops FTW.

>> (5) On Windows, a process starts out in a particular processor group.  One
>> can then change thread affinities in such a way that some threads run in a
>> different processor group than other threads of the same process.  The
>> process becomes a "multi-group" process.  This has implications for the
>> conversions discussed in (1).
>
> Don't see how.  Care to explain?

I was just whinging in advance that a single sched_get/setaffinity will 
result in multiple Windows affinity ops to gather/scatter among processor 
groups the process belongs to.  At least they won't be bitblt ops.

>> (6) On Linux, processor affinity is inherited across fork() and execve().
>> I'll need to ensure Cygwin's implementation of those calls handle affinity
>> the same way.
>
> Just passing the INHERIT_PARENT_AFFINITY flag to CreateProcess{AsUser}
> should do the trick.

OK.  Hope so.

(7), to make a prime number: I don't see any need for the Cygwin DLL to 
keep any affinity info (process or thread) or processor group membership 
info around, do you?  I believe the sched_get/setaffinity functions will 
do whatever Windows ops they need to do on the fly based on the args 
passed in.  That allows the user to do Windows affinity ops at will 
outside of Cygwin without screwing up any Cygwin-maintained context.

Thanks again,

..mark

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Implement sched_[gs]etaffinity()
  2019-04-17  4:31             ` Mark Geisert
@ 2019-04-17  7:57               ` Corinna Vinschen
  2019-04-26  8:44                 ` Mark Geisert
  0 siblings, 1 reply; 11+ messages in thread
From: Corinna Vinschen @ 2019-04-17  7:57 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 2352 bytes --]

On Apr 16 21:31, Mark Geisert wrote:
> On Tue, 16 Apr 2019, Corinna Vinschen wrote:
> > On Apr 16 01:19, Mark Geisert wrote:
> > >   Anybody know if one can
> > > depend on the group membership of the first processor group to apply to all
> > > groups?
> > 
> > Maybe https://go.microsoft.com/fwlink/p/?linkid=147914 helps?
> > 
> > "If the number of logical processors exceeds the maximum group size,
> >  Windows creates multiple groups by splitting the node into n groups,
> >  where the first n-1 groups have capacities that are equal to the group
> >  size."
> 
> Great; thanks for that.
> 
> > [...]
> > Therefore:
> > 
> >  WORD cpu_group = cpu_number / num_cpu_per_group;
> >  KAFFINITY cpu_mask = 1L << (cpu_number % num_cpu_per_group);
> > 
> > That also means the transposition between the groupless Linux system
> > and the Windows system is fairly easy.
> 
> Yes, dealing with an array of unsigned longs vs bitblt ops FTW.
> 
> > > (6) On Linux, processor affinity is inherited across fork() and execve().
> > > I'll need to ensure Cygwin's implementation of those calls handle affinity
> > > the same way.
> > 
> > Just passing the INHERIT_PARENT_AFFINITY flag to CreateProcess{AsUser}
> > should do the trick.
> 
> OK.  Hope so.

Well, nope, sorry.  Per MSDN:

  The process inherits its parent's affinity. If the parent process has
  threads in more than one processor group, the new process inherits the
  group-relative affinity of an arbitrary group in use by the parent.

  Also important: This value is not supported on Vista, so it should
  only be used if wincap.has_processor_groups() is true.

> (7), to make a prime number: I don't see any need for the Cygwin DLL to keep
> any affinity info (process or thread) or processor group membership info
> around, do you?  I believe the sched_get/setaffinity functions will do
> whatever Windows ops they need to do on the fly based on the args passed in.
> That allows the user to do Windows affinity ops at will outside of Cygwin
> without screwing up any Cygwin-maintained context.

I agree.  Additionally I think we should not overvalue affinity
inheritance.  Specifying INHERIT_PARENT_AFFINITY should be enough
for a start.  There's no reason for overkill.


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Implement sched_[gs]etaffinity()
  2019-04-17  7:57               ` Corinna Vinschen
@ 2019-04-26  8:44                 ` Mark Geisert
  2019-04-26  8:53                   ` Corinna Vinschen
  0 siblings, 1 reply; 11+ messages in thread
From: Mark Geisert @ 2019-04-26  8:44 UTC (permalink / raw)
  To: cygwin-developers

On Wed, 17 Apr 2019, Corinna Vinschen wrote:
> On Apr 16 21:31, Mark Geisert wrote:
>> On Tue, 16 Apr 2019, Corinna Vinschen wrote:
>>> On Apr 16 01:19, Mark Geisert wrote:
>>>>   Anybody know if one can
>>>> depend on the group membership of the first processor group to apply to all
>>>> groups?
>>>
>>> Maybe https://go.microsoft.com/fwlink/p/?linkid=147914 helps?
>>>
>>> "If the number of logical processors exceeds the maximum group size,
>>>  Windows creates multiple groups by splitting the node into n groups,
>>>  where the first n-1 groups have capacities that are equal to the group
>>>  size."
>>
>> Great; thanks for that.
>>
>>> [...]
>>> Therefore:
>>>
>>>  WORD cpu_group = cpu_number / num_cpu_per_group;
>>>  KAFFINITY cpu_mask = 1L << (cpu_number % num_cpu_per_group);
>>>
>>> That also means the transposition between the groupless Linux system
>>> and the Windows system is fairly easy.
>>
>> Yes, dealing with an array of unsigned longs vs bitblt ops FTW.

I've been doing research to more fully understand the non-symmetric API 
for Windows affinity ops.  I came across a non-MS document online that 
discusses affinity on Windows with >64 CPUs.  The author works on "Process 
Lasso", a product that attempts to balance performance of apps across 
CPUs.

Anyway, he says processors are divided evenly among groups.  One reason 
for this is that Windows allocates new processes round-robin among the 
processor groups.  This won't balance properly if some groups have more 
processors than other groups.  Here's a link to the doc:
https://bitsum.com/general/the-64-core-threshold-processor-groups-and-windows/

I'm not trying to muddy the waters, I'm just trying to figure out if there 
are different processor group assignment methods for different kinds of 
systems, SMP vs NUMA for instance.

I don't think the code I've got is robust enough to submit yet.  I suppose 
I could ship what should work, i.e., single-group processes and threads 
and just return ENOSYS for multi-group ops.  Or just hold off 'til done.

..mark

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: Implement sched_[gs]etaffinity()
  2019-04-26  8:44                 ` Mark Geisert
@ 2019-04-26  8:53                   ` Corinna Vinschen
  0 siblings, 0 replies; 11+ messages in thread
From: Corinna Vinschen @ 2019-04-26  8:53 UTC (permalink / raw)
  To: cygwin-developers

[-- Attachment #1: Type: text/plain, Size: 2705 bytes --]

On Apr 26 01:44, Mark Geisert wrote:
> On Wed, 17 Apr 2019, Corinna Vinschen wrote:
> > On Apr 16 21:31, Mark Geisert wrote:
> > > On Tue, 16 Apr 2019, Corinna Vinschen wrote:
> > > > On Apr 16 01:19, Mark Geisert wrote:
> > > > >   Anybody know if one can
> > > > > depend on the group membership of the first processor group to apply to all
> > > > > groups?
> > > > 
> > > > Maybe https://go.microsoft.com/fwlink/p/?linkid=147914 helps?
> > > > 
> > > > "If the number of logical processors exceeds the maximum group size,
> > > >  Windows creates multiple groups by splitting the node into n groups,
> > > >  where the first n-1 groups have capacities that are equal to the group
> > > >  size."
> > > 
> > > Great; thanks for that.
> > > 
> > > > [...]
> > > > Therefore:
> > > > 
> > > >  WORD cpu_group = cpu_number / num_cpu_per_group;
> > > >  KAFFINITY cpu_mask = 1L << (cpu_number % num_cpu_per_group);
> > > > 
> > > > That also means the transposition between the groupless Linux system
> > > > and the Windows system is fairly easy.
> > > 
> > > Yes, dealing with an array of unsigned longs vs bitblt ops FTW.
> 
> I've been doing research to more fully understand the non-symmetric API for
> Windows affinity ops.  I came across a non-MS document online that discusses
> affinity on Windows with >64 CPUs.  The author works on "Process Lasso", a
> product that attempts to balance performance of apps across CPUs.
> 
> Anyway, he says processors are divided evenly among groups.  One reason for
> this is that Windows allocates new processes round-robin among the processor
> groups.  This won't balance properly if some groups have more processors
> than other groups.  Here's a link to the doc:
> https://bitsum.com/general/the-64-core-threshold-processor-groups-and-windows/
> 
> I'm not trying to muddy the waters, I'm just trying to figure out if there
> are different processor group assignment methods for different kinds of
> systems, SMP vs NUMA for instance.

That's what the __get_cpus_per_group function in miscfuncs.cc is for, so
you know the number of CPUs per group, and the transposition from
grouped vs. linear representation and vice versa is no problem.

The non-NUMA vs. NUMA problem is just an under-the-hood design decision
that tries to keep closely related nodes together where possible.

> I don't think the code I've got is robust enough to submit yet.  I suppose I
> could ship what should work, i.e., single-group processes and threads and
> just return ENOSYS for multi-group ops.  Or just hold off 'til done.

Nah, no worries.  We're in no hurry.


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2019-04-26  8:53 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-04-11  4:21 Implement sched_[gs]etaffinity() Mark Geisert
2019-04-11  8:26 ` Corinna Vinschen
2019-04-11  8:38   ` Corinna Vinschen
2019-04-11 20:52     ` Mark Geisert
2019-04-12  7:46       ` Corinna Vinschen
2019-04-16  8:19         ` Mark Geisert
2019-04-16 10:45           ` Corinna Vinschen
2019-04-17  4:31             ` Mark Geisert
2019-04-17  7:57               ` Corinna Vinschen
2019-04-26  8:44                 ` Mark Geisert
2019-04-26  8:53                   ` Corinna Vinschen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).