From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 9F3FA3857835; Mon, 19 Jun 2023 16:29:01 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 9F3FA3857835
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1687192141;
	bh=co1V1G9tP238/zKpdwoI2wEfWCl0RReGPLP+9E14U64=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=jyiWJv2C88PznbKI/+DiHtWb+0EAQqGOszJzbs+qRBHRnD/Og1xsNmNOVTp0uv6Px
	 nq494peH45ND11ccr4Kla08pVDGuf0AXr6gS60Hus5j/IplCjq9IpP2RPKwXD8q+GJ
	 XXAd0kZNvV/O7NOG6jMDoO9n0sqpHZOarF7S8TU0=
From: "cvs-commit at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/109811] libjxl 0.7 is a lot slower in GCC 13.1 vs Clang
 16
Date: Mon, 19 Jun 2023 16:28:57 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 13.1.1
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: cvs-commit at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-109811-4-swXo0hyTb0@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-109811-4@http.gcc.gnu.org/bugzilla/>
References: <bug-109811-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109811
--- Comment #12 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jan Hubicka <hubicka@gcc.gnu.org>:

https://gcc.gnu.org/g:7b34cacc5735385e7e2855d7c0a6fad60ef4a99b

commit r14-1951-g7b34cacc5735385e7e2855d7c0a6fad60ef4a99b
Author: Jan Hubicka <jh@suse.cz>
Date:   Mon Jun 19 18:28:17 2023 +0200

    optimize std::max early

    We currently produce very bad code for loops that use std::vector as a
    stack, since we fail to inline push_back, which in turn prevents SRA, and
    we fail to optimize out some store-to-load pairs.
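
For illustration, a loop of the kind described above might look like the
following minimal sketch.  The function name count_nodes and the implicit
binary tree are hypothetical and are not taken from libjxl; the point is only
that push_back and pop_back sit on the hot path, so whether push_back gets
inlined matters.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: count the nodes of a complete binary tree of the
// given depth, using std::vector as an explicit stack instead of recursion.
std::size_t count_nodes(unsigned depth)
{
    std::vector<unsigned> stack;  // remaining depth of each pending node
    std::size_t visited = 0;

    stack.push_back(depth);
    while (!stack.empty()) {
        unsigned d = stack.back();
        stack.pop_back();
        ++visited;
        if (d > 0) {
            // Each push_back may have to grow the vector, which is where
            // the size computation discussed in this commit comes in.
            stack.push_back(d - 1);
            stack.push_back(d - 1);
        }
    }
    return visited;
}
```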

    I looked into why this function is not inlined by GCC while it is inlined
    by clang.  We currently estimate it at 66 instructions, and the inline
    limits are 15 at -O2 and 30 at -O3.  Clang has a similar estimate, but
    still decides to inline at -O2.

    I looked into the reason why the body is so large, and one problem I
    spotted is the way std::max is implemented: it takes and returns
    references to the values.

      const T& max( const T& a, const T& b );

    This makes it necessary to store the values to memory and load them later,
    and max is used by the code computing the new size of the vector on resize.
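
A minimal sketch of such a size computation is shown here.  It is only
loosely modelled on the kind of growth policy a vector might use; the helper
name grown_capacity and its parameters are assumptions made for illustration,
not libstdc++'s actual code.  Because std::max hands back a reference to one
of its arguments, the compiler initially sees an address being selected and
then loaded from, rather than a plain maximum of two values.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: capacity to allocate when `extra` more elements are
// needed on top of the current `size`, clamped to `max_size`.
std::size_t grown_capacity(std::size_t size, std::size_t extra,
                           std::size_t max_size)
{
    // std::max takes both arguments by const reference and returns a
    // reference to the larger one, so the result is read through memory.
    const std::size_t& larger = std::max(size, extra);
    std::size_t len = size + larger;

    // Guard against wrap-around and against exceeding the maximum size.
    return (len < size || len > max_size) ? max_size : len;
}
```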

    We optimize this to MAX_EXPR, but only during late optimizations.  I think
    this is a common enough coding pattern and we ought to make it transparent
    to early opts and IPA.  The following is the easiest fix: it simply adds a
    phiprop pass that turns the PHI of address values into a PHI of values, so
    that later FRE can propagate values across memory, phiopt can discover the
    MAX_EXPR pattern, and DSE can remove the memory stores.
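
As a rough source-level analogy of what the phiprop transformation achieves
(the real pass works on GIMPLE, and both function names below are
hypothetical), the first function selects an address and loads through it,
while the second selects the value directly.  The second form is the one in
which a maximum is easy to recognize.

```cpp
// Before: the branches choose between two addresses, and the value is
// loaded only after the join point (a PHI of addresses).
int max_via_address(int a, int b)
{
    const int *p;
    if (a < b)
        p = &b;
    else
        p = &a;
    return *p;
}

// After: the loads are moved into the branches, so the join point merges
// two values instead of two addresses (a PHI of values).
int max_via_value(int a, int b)
{
    int v;
    if (a < b)
        v = b;
    else
        v = a;
    return v;
}
```

Both functions return the same result for every input; only the shape the
optimizer has to work with differs.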

    gcc/ChangeLog:

            PR tree-optimization/109811
            PR tree-optimization/109849
            * passes.def: Add phiprop to early optimization passes.
            * tree-ssa-phiprop.cc: Allow cloning.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/109811
            PR tree-optimization/109849
            * gcc.dg/tree-ssa/phiprop-1.c: New test.
            * gcc.dg/tree-ssa/pr21463.c: Adjust template.