From: Philipp Tomsich
Date: Fri, 7 Jul 2023 16:31:40 +0200
Subject: Re: [PING][PATCH] tree-optimization/110279- Check for nested FMA chains in reassoc
To: Di Zhao OS
Cc: "gcc-patches@gcc.gnu.org"

On Fri, 7 Jul 2023 at 10:28, Di Zhao OS via Gcc-patches wrote:
>
> Update the patch so it can apply.
>
> Tested on spec2017 fprate cases again. With option "-funroll-loops -Ofast -flto",
> the improvements of 1-copy run are:
>
> Ampere1:
>     508.namd_r       4.26%
>     510.parest_r     2.55%
>     Overall          0.54%
> Intel Xeon:
>     503.bwaves_r     1.3%
>     508.namd_r       1.58%
>     overall          0.42%

This looks like a worthwhile improvement.
From reviewing the patch, a few nit-picks:

- given that 'has_fma' can now take three values { -1, 0, 1 }, an enum
  with more descriptive names for these 3 states should be used;
- tests such as "has_fma >= 0" and "fma > 0" are hard to read; after
  changing this to an enum, you can use macros or helper functions to
  test the predicates (i.e., *_P macros or *_p helpers) for readability;
- the meaning of the return values of rank_ops_for_fma should be
  documented in the comment describing the function;
- changing convert_mult_to_fma_1 to return a tree* (i.e., return_lhs or
  NULL_TREE) removes the need for an in/out parameter.

Thanks,
Philipp.

>
>
> Thanks,
> Di Zhao
>
>
> > -----Original Message-----
> > From: Di Zhao OS
> > Sent: Friday, June 16, 2023 4:51 PM
> > To: gcc-patches@gcc.gnu.org
> > Subject: [PATCH] tree-optimization/110279- Check for nested FMA chains in
> > reassoc
> >
> > This patch is to fix the regressions found in SPEC2017 fprate cases
> > on aarch64.
> >
> > 1. Reused code in pass widening_mul to check for nested FMA chains
> > (those connected by MULT_EXPRs), since re-writing to parallel
> > generates worse codes.
> >
> > 2. Avoid re-arrange to produce less FMA chains that can be slow.
> >
> > Tested on ampere1 and neoverse-n1, this fixed the regressions in
> > 508.namd_r and 510.parest_r 1 copy run. While I'm still collecting data
> > on x86 machines we have, I'd like to know what do you think of this.
> >
> > (Previously I tried to improve things with FMA by adding a widening_mul
> > pass before reassoc2 for it's easier to recognize different patterns
> > of FMA chains and decide whether to split them. But I suppose handling
> > them all in reassoc pass is more efficient.)
> >
> > Thanks,
> > Di Zhao
> >
> > ---
> > gcc/ChangeLog:
> >
> >         * tree-ssa-math-opts.cc (convert_mult_to_fma_1): Add new parameter.
> >         Support new mode that merely do the checking.
> >         (struct fma_transformation_info): Moved to header.
> >         (class fma_deferring_state): Moved to header.
> >         (convert_mult_to_fma): Add new parameter.
> >         * tree-ssa-math-opts.h (struct fma_transformation_info):
> >         (class fma_deferring_state): Moved from .cc.
> >         (convert_mult_to_fma): Add function decl.
> >         * tree-ssa-reassoc.cc (rewrite_expr_tree_parallel):
> >         (rank_ops_for_fma): Return -1 if nested FMAs are found.
> >         (reassociate_bb): Avoid rewriting to parallel if nested FMAs are
> >         found.
>
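
To illustrate the enum and *_p helper nit-picks from the review, here is a
minimal sketch of what such a change might look like. All names below
(fma_state, FMA_STATE_*, fma_state_*_p) are hypothetical and not taken from
the patch. Only the -1 case ("nested FMAs are found") is stated in the
ChangeLog; the meanings given to 0 and 1 are assumptions.

```cpp
/* Hypothetical names, not taken from the patch.  Only the -1 case is
   documented in the ChangeLog; the meanings of 0 and 1 are assumed.  */
enum fma_state
{
  FMA_STATE_NESTED = -1, /* Nested FMA chains found ("Return -1").  */
  FMA_STATE_NONE = 0,    /* Assumed: no FMA candidates.  */
  FMA_STATE_SIMPLE = 1   /* Assumed: FMA candidates, none nested.  */
};

/* Stands in for a test like "has_fma >= 0".  */
constexpr bool
fma_state_not_nested_p (fma_state s)
{
  return s != FMA_STATE_NESTED;
}

/* Stands in for a test like "has_fma > 0".  */
constexpr bool
fma_state_has_fma_p (fma_state s)
{
  return s == FMA_STATE_SIMPLE;
}

/* Compile-time checks that the helpers match the integer comparisons.  */
static_assert (fma_state_not_nested_p (FMA_STATE_NONE), "0 >= 0");
static_assert (!fma_state_not_nested_p (FMA_STATE_NESTED), "-1 < 0");
static_assert (!fma_state_has_fma_p (FMA_STATE_NONE), "0 is not > 0");
```

With helpers along these lines, a caller could write
"if (fma_state_not_nested_p (state))" rather than comparing against a bare
integer, and the enumerator comments would give a natural place to document
the return values of rank_ops_for_fma.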