From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id ABBDF3858034; Tue,  6 Apr 2021 08:25:25 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org ABBDF3858034
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/99912] Unnecessary / inefficient spilling of AVX2 ymm
 registers
Date: Tue, 06 Apr 2021 08:25:24 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 11.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: cc keywords cf_gcctarget
Message-ID: <bug-99912-4-3yph6Qu9mZ@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-99912-4@http.gcc.gnu.org/bugzilla/>
References: <bug-99912-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 06 Apr 2021 08:25:25 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99912

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenth at gcc dot gnu.org
           Keywords|                            |missed-optimization
             Target|                            |x86_64-*-*
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Which function does the loop kernel reside in?  I see you have some lambdas
in Z4c_RHS, set up as out-of-line functions, that look like they could
comprise the actual kernels.  In apply_upwind_diss I see cases without
stack usage.

I'm looking at -O2 -march=skylake compiles.

Note that with C++ it's easy to retain some abstraction and thus misinterpret
stack accesses as spilling when they are really accesses to aggregates that
were not eliminated.  For example, in one of the lambdas I see

  _61489 = __builtin_ia32_maskloadpd256 (_104487, _61513);
  D.545024[1].elts.car = _61489;
...
  MEM[(struct vect *)&D.544982].elts._M_elems[1] = MEM[(const struct simd &)&D.545024 + 32];
...
  MEM[(struct mat3 *)&vars + 992B] = MEM[(const struct mat3 &)&D.544982];

and D.544982 is later variable-indexed in some code using MIN/MAX and FMA
(instead of using 'vars' there).  Looking at what -fdump-tree-optimized
produces sometimes points at such problems.
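
To illustrate the distinction, here is a minimal sketch that is not taken from
the reported code.  The names vec3, sum_constant_index, pick_variable_index and
run_checks are hypothetical.  With constant indices the temporary aggregate can
usually be scalarized into registers, while a runtime index typically forces it
to stay in memory, and those stack accesses are easy to mistake for register
spills.  The exact outcome depends on the compiler version and options.

```cpp
// Hypothetical illustration only; none of these names come from the bug report.
#include <array>
#include <cassert>
#include <cstddef>

// A thin wrapper around a fixed-size array, standing in for the kind of
// abstraction layer discussed above.
struct vec3 {
    std::array<double, 3> elts;
};

// Every index is a compile-time constant, so the compiler can usually
// replace 'tmp' by three scalars and no stack object needs to remain.
double sum_constant_index(const double* p) {
    vec3 tmp{{p[0], p[1], p[2]}};
    return tmp.elts[0] + tmp.elts[1] + tmp.elts[2];
}

// The index is only known at run time.  Registers cannot be indexed by a
// variable, so 'tmp' typically stays on the stack: the generated code may
// store to it and load from it even though nothing was spilled.
double pick_variable_index(const double* p, std::size_t i) {
    vec3 tmp{{p[0], p[1], p[2]}};
    return tmp.elts[i];
}

// Small self-check of the two functions on exactly representable values.
void run_checks() {
    const double p[3] = {1.0, 2.0, 3.0};
    assert(sum_constant_index(p) == 6.0);
    assert(pick_variable_index(p, 1) == 2.0);
}
```

Compiling such a file with -O2 -fdump-tree-optimized and comparing the two
functions in the dump is one way to check whether the temporary survives as a
memory object in the variable-index case.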

That said, the code is large, so please point at some source lines within the
important kernel(s) (of the preprocessed source, that is) and state the
compile options used.