From: jeff <jeff@jeffunit.com>
To: cygwin@cygwin.com
Subject: Re: posix thread scaling issue
Date: Sat, 2 Sep 2023 11:27:09 -0700
Message-ID: <cf618819-c30c-439d-ad5f-54b2311bd936@jeffunit.com>
In-Reply-To: <2cfbcf8d-911f-a64b-8916-12b005c9f6f6@Shaw.ca>

On 9/2/2023 10:56, Brian Inglis wrote:
> On 2023-09-02 08:57, jeff via Cygwin wrote:
>> I have a program that is embarrassingly parallel.
>> On my older computer which has an epyc 7302 (16 cores, 32 threads)
>> it scales very well using cygwin, and fully utilizes all threads.
>> On my new computer which has an epyc 7B13 (64 cores, 128 threads) it
>> does not scale very well.
>>
>> According to the windows task manager, it only uses 74% of the cpu
>> resources.
>> The time it takes the program to run on windows is 166 seconds.
>> Using the same hardware on a recent version of linux, I can get 100%
>> cpu utilization and the program takes 100 seconds to run.
>>
>> I suspect there may be something in cygwin that doesn't scale well
>> with lots of posix threads.
>> I know this is a bit of an unusual situation, but you can buy a 128
>> core / 256 thread system now.
>>
>> Enclosed is the output of cygcheck.
>> I updated my version of cygwin to be current as of today, Sep 2 2023.
>
> What Windows edition and version are you running?
> For details run:
>
> $ reg query "HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion" \
> | sed '/^\s\+\.*\s/!d;/^.\{80,\}/d'
>
> Some retail editions limit you to 64 threads and that seems to be your
> case:
>
> NUMBER_OF_PROCESSORS = '64'
>
> To make full use of your processors, you may have to upgrade your
> Windows to a commercial licence (and installation) of Windows 10/11
> Pro for Workstations, enabling server features on non-server
> "Workstations" ~ HEDTs (High-End DeskTops); see:
>
> https://www.anandtech.com/show/15483/amd-threadripper-3990x-review/3
>
> or just run Linux!
>
> Watch out for terms misused like processor == socket on some sites!
>
> Also, you have to consider these are server systems, mainly designed
> for VM not HPC (High Performance Computing) parallelism.
>
> Your older system has higher base and boost/turbo clocks 3.0-3.3GHz:
> your newer system has lower clocks 2.25-2.65/3/3.5GHz which seems to
> depend on OEM target.
>
> You may also need to upgrade your memory, as each core could run
> ~10GB/s instructions, and these workstations are often provisioned
> with 128-256GB (2-4GB/core), so that may also need a Windows edition
> upgrade.
I am running windows 10 professional. The task manager shows 64 cores
and 128 threads for my processor.
Here is the output of your command:
SystemRoot REG_SZ C:\Windows
BaseBuildRevisionNumber REG_DWORD 0x1
BuildBranch REG_SZ vb_release
BuildGUID REG_SZ ffffffff-ffff-ffff-ffff-ffffffffffff
BuildLab REG_SZ 19041.vb_release.191206-1406
BuildLabEx REG_SZ 19041.1.amd64fre.vb_release.191206-1406
CompositionEditionID REG_SZ Enterprise
CurrentBuild REG_SZ 19045
CurrentBuildNumber REG_SZ 19045
CurrentMajorVersionNumber REG_DWORD 0xa
CurrentMinorVersionNumber REG_DWORD 0x0
CurrentType REG_SZ Multiprocessor Free
CurrentVersion REG_SZ 6.3
EditionID REG_SZ Professional
EditionSubManufacturer REG_SZ
EditionSubstring REG_SZ
EditionSubVersion REG_SZ
InstallationType REG_SZ Client
InstallDate REG_DWORD 0x61e2300a
ProductName REG_SZ Windows 10 Pro
ReleaseId REG_SZ 2009
SoftwareType REG_SZ System
UBR REG_DWORD 0xcfc
PathName REG_SZ C:\Windows
ProductId REG_SZ 00330-80000-00000-AA073
DisplayVersion REG_SZ 22H2
RegisteredOwner REG_SZ jdeifik
RegisteredOrganization REG_SZ
InstallTime REG_QWORD 0x1d809b6d4ce7b09
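For anyone reading the dump above, the hexadecimal values can be decoded with a few lines of Python. This is only a sketch: the variable names are mine, and treating InstallDate as a Unix timestamp (seconds since 1970, UTC) is an assumption about how that value is stored, not something the output itself states.

```python
from datetime import datetime, timezone

# Values copied from the registry output above.
current_build = 19045        # CurrentBuild
ubr = 0xcfc                  # UBR (update build revision)
install_date = 0x61e2300a    # InstallDate, assumed to be a Unix timestamp

# Full build string in the usual "build.revision" form.
full_build = f"{current_build}.{ubr}"

# Interpret InstallDate as seconds since the epoch, in UTC (assumption).
installed = datetime.fromtimestamp(install_date, tz=timezone.utc)

print("build:", full_build)
print("installed:", installed.date().isoformat())
```

So the system reports build 19045 with revision 3324, which matches the 22H2 DisplayVersion.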
In practice, both the new and old processors typically run at about 3ghz
when under load.
When idling, both processors use about the same amount of power.
I have 128gb of ram, in 4 slots. Using that configuration, I can get
100% load and significantly faster performance on linux.
Therefore I conclude the issue is either with windows or cygwin, and is
not a hardware issue.
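To put rough numbers on that, here is the arithmetic from the figures quoted earlier in this thread (166 seconds and 74% utilization on windows, 100 seconds on linux). It is a back-of-the-envelope sketch that assumes runtime scales inversely with cpu utilization, which is only an approximation; the variable names are illustrative.

```python
# Figures quoted in this thread.
windows_seconds = 166.0
linux_seconds = 100.0
windows_utilization = 0.74   # from the windows task manager

# Overall slowdown of the windows run relative to linux.
slowdown = windows_seconds / linux_seconds

# Runtime expected if lost utilization were the only cost
# (assumption: runtime is inversely proportional to utilization).
expected_seconds = linux_seconds / windows_utilization

# Time not accounted for by the lower utilization alone.
unexplained_seconds = windows_seconds - expected_seconds

print(f"slowdown: {slowdown:.2f}x")
print(f"expected at 74% utilization: {expected_seconds:.0f} s")
print(f"unexplained: {unexplained_seconds:.0f} s")
```

Under that assumption, lower utilization alone would give about 135 seconds, so roughly 30 seconds of the 166 are not explained by idle threads.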
When I run cinebench, I can get to 100% cpu utilization (at around 3ghz)
on windows.
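One quick check is how many logical processors a process is actually told about, compared with the 128 the task manager shows. The sketch below uses Python's os.cpu_count(); what it reports depends on the OS and on the runtime it is run under, so treat the result as a hint rather than a diagnosis. The helper name and the expected count of 128 are illustrative and taken from this thread.

```python
import os

# 64 cores x 2 threads, as shown by the task manager in this thread.
EXPECTED_LOGICAL_CPUS = 128

def describe_visibility(visible, expected=EXPECTED_LOGICAL_CPUS):
    """Summarize how many logical processors the runtime reports."""
    if visible is None:
        return "runtime could not determine the processor count"
    if visible < expected:
        return f"only {visible} of {expected} logical processors visible"
    return f"at least {expected} logical processors visible"

# os.cpu_count() may return None if the count cannot be determined.
print(describe_visibility(os.cpu_count()))
```

A result of 64 here would line up with the NUMBER_OF_PROCESSORS = '64' value quoted above.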
As for what the processors are 'designed' for, I really don't care.
I want a reliable, fast computer with ECC memory, and I can get that
with an EPYC processor.
If a workload needs more than 128gb of memory, you pretty much need to
use server processors.
I can put in up to 2tb of memory in my system, if I have the need for that.
jeff