From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-lj1-x230.google.com (mail-lj1-x230.google.com [IPv6:2a00:1450:4864:20::230]) by sourceware.org (Postfix) with ESMTPS id 793AF3858D28 for ; Mon, 19 Jun 2023 08:43:30 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 793AF3858D28 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-lj1-x230.google.com with SMTP id 38308e7fff4ca-2b479d53d48so7836031fa.1 for ; Mon, 19 Jun 2023 01:43:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1687164209; x=1689756209; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=UY0AJUDIydUqIb3BsiilwuQrvWde+oTJzJcfkqb0hs4=; b=BPkduYzoNQmZ4v8JnHAc+2kGPmpqwIYTJePp8WCHMzpKj9EveR+hipkZrwX4z1cF2C fbDzkPlTmgKxW3tKViRwqSFsx9/LPpXqx427bbitLu6P01wHt3fnmuj9hzREcTN9jtuh OK0fyFj/AywCyZTCtIzHALY0xccO6XtFFvvmPH8LuwlF9oorQmWL/gh/WjqYkoVfh24l 3htEIpVVRD9gSovEBbhpEfpsxWAfcgUA+OO+/pKO3glNVU3v7alIEdLkX44Yvm7sfFtK 9eRdyvj0JoIdj7k6/URFOiZaYZKi498vOorJ+H6vjYj3ORluKWaKqZgb3F8PSJdtnuny oaXA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687164209; x=1689756209; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=UY0AJUDIydUqIb3BsiilwuQrvWde+oTJzJcfkqb0hs4=; b=hGeQaAJ36RI4MmxQJM/4dns8dwdVL2CeSf8GYtus5fyGelN/gkYLdPiUT2Fqxu8eJ7 PWC4vgB33LLx53tDEZBn5U5Jv2/zGLn2GiXbtZ+hVut09uM6LIqoKP4ESMBu3FgVViH8 e0URg3WeAwQX840az3W6RNjAT9tpq5aE0SVyTBKtuHa9baRCS/zLBsUtd8s9/2KNzl7V B9HI6ROIeYF0re/jHS1TX4qdUDBapMCYYofwhjLSjQkGtsf89oOdQsOdcvuFY2BXemgG pwUrADwk1PICYiBpYTWt1l4Rt7TpFs09vdLzJwn6fxiEH2NWue1T2qtZv5G6vVysWS2l oRpg== X-Gm-Message-State: AC+VfDx7W6EDIUtScwHNaM3BDUbXjAtgAN9KPdmu6laEv/pF7OW3PsMF gYg69kAxJjzM3Srjwtxyvk9uhnuZlXLxv2A0IpQ= X-Google-Smtp-Source: ACHHUZ6+wJCS4gmb7UNKEOw2GIk2rMXwb6oXl5F38pkmGs5wVTPAIFn7/a43YlbwK2bAQXrBtbcq7g1VZ87KeOxpe/Q= X-Received: by 2002:a2e:b70a:0:b0:2b3:4eb9:246a with SMTP id j10-20020a2eb70a000000b002b34eb9246amr4978390ljo.25.1687164208618; Mon, 19 Jun 2023 01:43:28 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Richard Biener Date: Mon, 19 Jun 2023 10:40:34 +0200 Message-ID: Subject: Re: Optimize std::max early To: Jan Hubicka Cc: gcc-patches@gcc.gnu.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-7.2 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Sun, Jun 18, 2023 at 5:55=E2=80=AFPM Jan Hubicka via Gcc-patches wrote: > > Hi, > we currently produce very bad code on loops using std::vector as a stack,= since > we fail to inline push_back which in turn prevents SRA and we fail to opt= imize > out some store-to-load pairs (PR109849). > > I looked into why this function is not inlined and it is inlined by clang= . We > currently estimate it to 66 instructions and inline limits are 15 at -O2 = and 30 > at -O3. Clang has similar estimate, but still decides to inline at -O2. > > I looked into reason why the body is so large and one problem I spotted i= s the > way std::max is implemented by taking and returning reference to the valu= es. > > const T& max( const T& a, const T& b ); > > This makes it necessary to store the values to memory and load them later > (Max is used by code computing new size of vector on resize.) > Two stores, conditional and load accounts as 8 instructions, while > MAX_EXPR as 1 and has a lot better chance to fold with the surrounding > code. > > We optimize this to MAX_EXPR, but only during late optimizations. I thin= k this > is a common enough coding pattern and we ought to make this transparent t= o > early opts and IPA. The following is easist fix that simply adds phiprop= pass > that turns the PHI of address values into PHI of values so later FRE can > propagate values across memory, phiopt discover the MAX_EXPR pattern and = DSE > remove the memory stores. > > Bootstrapped/regtested x86_64-linux, does this look resonable thing to do= ? OK. Thanks, Richard. > Looking into how expensive the pass is, I think it is very cheap, except > that it computes postdominator and updates ssa even if no patterns > are matched. I will send patch to avoid that. > > gcc/ChangeLog: > > PR tree-optimization/109811 > PR tree-optimization/109849 > * passes.def: Add phiprop to early optimization passes. > * tree-ssa-phiprop.cc: Allow clonning. > > gcc/testsuite/ChangeLog: > > PR tree-optimization/109811 > PR tree-optimization/109849 > * gcc.dg/tree-ssa/phiprop-1.c: New test. > > diff --git a/gcc/passes.def b/gcc/passes.def > index c9a8f19747b..faa5208b26b 100644 > --- a/gcc/passes.def > +++ b/gcc/passes.def > @@ -88,6 +88,8 @@ along with GCC; see the file COPYING3. If not see > /* pass_build_ealias is a dummy pass that ensures that we > execute TODO_rebuild_alias at this point. */ > NEXT_PASS (pass_build_ealias); > + /* Do phiprop before FRE so we optimize std::min and std::max w= ell. */ > + NEXT_PASS (pass_phiprop); > NEXT_PASS (pass_fre, true /* may_iterate */); > NEXT_PASS (pass_early_vrp); > NEXT_PASS (pass_merge_phi); > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phiprop-1.c b/gcc/testsuite/gc= c.dg/tree-ssa/phiprop-1.c > new file mode 100644 > index 00000000000..9f52c2a7298 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/tree-ssa/phiprop-1.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O1 -fdump-tree-phiprop1-details -fdump-tree-release_ss= a" } */ > +int max(int a, int b) > +{ > + int *ptr; > + if (a > b) > + ptr =3D &a; > + else > + ptr =3D &b; > + return *ptr; > +} > + > +/* { dg-final { scan-tree-dump-times "Inserting PHI for result of load" = 1 "phiprop1"} } */ > +/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "release_ssa"} } */ > diff --git a/gcc/tree-ssa-phiprop.cc b/gcc/tree-ssa-phiprop.cc > index 3cb4900b6be..5dc505df420 100644 > --- a/gcc/tree-ssa-phiprop.cc > +++ b/gcc/tree-ssa-phiprop.cc > @@ -476,6 +476,7 @@ public: > {} > > /* opt_pass methods: */ > + opt_pass * clone () final override { return new pass_phiprop (m_ctxt);= } > bool gate (function *) final override { return flag_tree_phiprop; } > unsigned int execute (function *) final override; >