From: Richard Biener
Date: Wed, 4 Aug 2021 11:54:55 +0200
Subject: Re: Add ops_num to targetm.sched.reassociation_width hook
To: Aaron Sawdey
Cc: gcc, Segher Boessenkool, Bill Schmidt

On Wed, Aug 4, 2021 at 2:07 AM Aaron Sawdey wrote:
>
> Richard,
>
> So, I’m noticing that in get_reassociation_width() we know how many
> ops (ops_num) are in the expression being considered for parallel
> reassociation, but this is not passed to the target hook. In my
> testing this seems like it might be useful to have. If you determine
> the maximum width that gives additional speedup for a large number of
> terms, and then use that as the width from the target hook,
> get_reassociation_width() is more aggressive than you would like for
> small expressions with maybe 4-16 terms and produces code that is
> slower than optimal. For example in many cases you want to continue
> using a width of 1 until you get to 16 terms or so. My testing shows
> this to be the case for power8, power9, and power10 processors.
>
> So, I’m wondering how it might be received if I posted a patch that
> adds this to the reassociation_width target hook (and of course fixes
> all uses of that target hook)?

You probably saw that get_reassociation_width already tries to
optimize things. So what exactly would you change and why is it
slower for 4-16 terms but not for 17+ ones? I suppose "is slower"
is --param mining on some benchmarks on your side and eventually
you manage to pick the best threshold to not run into register
pressure issues (by luck) for those benchmarks?

That said, I question you can explain why it is slower, right?

Richard.

> Thanks!
>     Aaron
>
>
> Aaron Sawdey, Ph.D. sawdey@linux.ibm.com
> IBM Linux on POWER Toolchain
>
>
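
For readers following the thread, here is a minimal sketch in C of the change proposed in the quoted message. It is not GCC code: the function names, the maximum width of 4, and the exact cutoff are hypothetical, and only the "width of 1 until 16 terms or so" rule comes from Aaron's message. The first function models a hook that cannot see ops_num; the second models one that can.

```c
#include <assert.h>

/* Hypothetical tuning values, not taken from any GCC back end.  */
enum
{
  SKETCH_MAX_WIDTH = 4,      /* assumed widest width that still helps */
  SKETCH_OPS_THRESHOLD = 16  /* "16 terms or so" from the quoted message */
};

/* Models a hook that is not told how many operands the expression
   has, so it must return one width for every expression size.  */
int
sketch_width_without_ops_num (void)
{
  return SKETCH_MAX_WIDTH;
}

/* Models the proposed hook shape: ops_num is passed in, so the target
   can keep a width of 1 for small expressions and only go wide once
   the expression reaches the (assumed) threshold.  */
int
sketch_width_with_ops_num (int ops_num)
{
  /* A reassociable chain has at least two operands.  */
  assert (ops_num >= 2);

  if (ops_num < SKETCH_OPS_THRESHOLD)
    return 1;
  return SKETCH_MAX_WIDTH;
}
```

With these assumed values, a 6-term expression keeps a width of 1 under the second function, while the first returns 4 regardless of size. Whether a fixed cutoff like this reflects register pressure or something else is the open question raised in the reply.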