From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
 id 36FE23858C51; Fri, 22 Apr 2022 16:28:08 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 36FE23858C51
From: "jakub at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/104723] [12 regression] Redundant usage of stack
Date: Fri, 22 Apr 2022 16:28:07 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: jakub at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 12.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-104723-4-DM7kTvG7dJ@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-104723-4@http.gcc.gnu.org/bugzilla/>
References: <bug-104723-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-BeenThere: gcc-bugs@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-bugs mailing list <gcc-bugs.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-bugs/>
List-Post: <mailto:gcc-bugs@gcc.gnu.org>
List-Help: <mailto:gcc-bugs-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-bugs>,
 <mailto:gcc-bugs-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Fri, 22 Apr 2022 16:28:08 -0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723
--- Comment #10 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to H.J. Lu from comment #8)
> > DSE can remove redundant load/store for TI, but not OI/XI.

DSE can remove redundant load/store for OI/XI just fine; just remove the
last 7 from the string so that it is 48 bytes instead of 49 and all of a
sudden it works fine.
It is indeed due to:

> It is due to overlapping store.

this.
I wonder if we couldn't special-case overlapping stores if they are loaded
from the constant pool and the overlapping bytes have the same values.
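
As a rough illustration of that condition only (this is not GCC's DSE code;
struct const_store and overlap_bytes_agree are hypothetical names made up for
this sketch), the check on two constant stores could look like:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical description of one constant store: SIZE bytes taken from
   DATA are written at byte offset OFFSET of the destination.  */
struct const_store
{
  size_t offset;
  size_t size;
  const unsigned char *data;
};

/* Return true if A and B either do not overlap, or write identical bytes
   to every destination byte that both of them cover.  */
bool
overlap_bytes_agree (const struct const_store *a,
                     const struct const_store *b)
{
  size_t a_end = a->offset + a->size;
  size_t b_end = b->offset + b->size;
  /* [lo, hi) is the range of destination bytes written by both stores.  */
  size_t lo = a->offset > b->offset ? a->offset : b->offset;
  size_t hi = a_end < b_end ? a_end : b_end;

  if (lo >= hi)
    return true;

  return memcmp (a->data + (lo - a->offset),
                 b->data + (lo - b->offset), hi - lo) == 0;
}
```

For the 49-byte case with two 32-byte stores at offsets 0 and 17, the shared
range is bytes 17..31, and it trivially agrees when both stores take their
data from the same constant.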

And for the backend, the question is how big the penalty for the overlapping
store is compared to doing multiple non-overlapping stores.  Say for those
49 bytes one could do one OI, one TI/V1TI and one QI load/store as opposed
to one aligned and one misaligned OI load/store.
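
To make the two shapes concrete, here is a sketch in plain C with fixed-size
__builtin_memcpy-style chunks standing in for the OI (32-byte), TI (16-byte)
and QI (1-byte) moves.  The function names are made up for illustration, and
whether the compiler really turns each chunk into a single vector load/store
depends on the target flags, so treat that part as an assumption:

```c
#include <string.h>

/* Two 32-byte chunks covering 49 bytes: the second one starts at offset
   17, so bytes 17..31 are loaded and stored twice.  */
void
copy49_overlapping (void *dst, const void *src)
{
  unsigned char *d = dst;
  const unsigned char *s = src;
  unsigned char lo[32], hi[32];

  /* Do both loads before both stores, like a memcpy expansion would.  */
  memcpy (lo, s, 32);
  memcpy (hi, s + 17, 32);
  memcpy (d, lo, 32);
  memcpy (d + 17, hi, 32);
}

/* Three non-overlapping chunks: 32 + 16 + 1 = 49 bytes.  */
void
copy49_split (void *dst, const void *src)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  memcpy (d, s, 32);
  memcpy (d + 32, s + 32, 16);
  d[48] = s[48];
}
```

Both produce the same 49 destination bytes; the difference is only two
(partly redundant, one misaligned) wide accesses versus three narrower ones.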

For say:
void
foo (void *p, void *q)
{
  __builtin_memcpy (p, q, 49);
}
we emit the 2 overlapping loads/stores for -mavx512f and 4 non-overlapping
loads/stores with say -mavx2.