From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=eMem=BO=gmail.com=richard.guenther@sourceware.org>
Received: from mail-lj1-x22d.google.com (mail-lj1-x22d.google.com [IPv6:2a00:1450:4864:20::22d])
	by sourceware.org (Postfix) with ESMTPS id 57C503858D32
	for <gcc-patches@gcc.gnu.org>; Thu, 25 May 2023 10:51:29 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 57C503858D32
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com
Received: by mail-lj1-x22d.google.com with SMTP id 38308e7fff4ca-2af177f12d1so4721611fa.0
        for <gcc-patches@gcc.gnu.org>; Thu, 25 May 2023 03:51:29 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=gmail.com; s=20221208; t=1685011888; x=1687603888;
        h=content-transfer-encoding:cc:to:subject:message-id:date:from
         :in-reply-to:references:mime-version:from:to:cc:subject:date
         :message-id:reply-to;
        bh=VlwlNhwG5vVqLRFZL2WkvgnTo5mgu3SMhnpe4sK4SPc=;
        b=YCnuddeB3bfCimBHSx2swT8xSzH6Fl812jXwPaRY16nPCdqChkfUpPSOzt2cccXe7c
         IjL5/LrItgeMPjD2YgY/U09zrxTms34a9rMUy6/wYoTq37QDc3IhPvZ5A/bNifFEi4uW
         ZR6oZIClTM19Vl4YLZMGSo8Rk50rA6YWjb7IbYrb0Q/lyp6lbvy1cRvj5rQfbhXvVufT
         hnUvdmA0zG4MU0qR9LVE3L1sxI0sWRF8gQg5flER09OPIF4+tR36KDOPpa0lb+Vbx2oB
         c3WLg2ZBd5YtPGR2MOkeyOX2tVgmvUzNGAre/O046ZedvnNy9uEflZI2KqJOUvhDKrjP
         sh0Q==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=1e100.net; s=20221208; t=1685011888; x=1687603888;
        h=content-transfer-encoding:cc:to:subject:message-id:date:from
         :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc
         :subject:date:message-id:reply-to;
        bh=VlwlNhwG5vVqLRFZL2WkvgnTo5mgu3SMhnpe4sK4SPc=;
        b=ZgPNPX4ubrv/F+V2IKGNB/2vPtoaWx+7VElpiwlnEJnagngFuxzruYB8NkK45AOokW
         /V+EaigcPdZPuqitJ4Jmlb8o0jeugIbhUJ0roy0mCkZEDH8pCaWGuAsfF1dknU63x6F4
         SifgFY/tmdKONph2Hhj7u+E6sbsIxOfnREaxHSP2zN9y7rb3+ENXqzyXFi1JKlitr55b
         oEHH085PbynRfxs1a0ZFRPDg04wzBNQtIXRB0oj/XAARHXXMMR9JyErO7FrWC6NOCl3E
         OrbKFb8l+qFlLVwtYytOnwGF+JtUrKEPAHbvjH8nvuFfrpadLHmzRIqezbzhb8ustf7w
         dQwg==
X-Gm-Message-State: AC+VfDwdHraNSl3EmKpSCNshsW4nDi45AHLVwtBMUpaP0J9OdLlFuo9H
	Km2qjY62LkJRMESqc5bt+PAEpI3gD4DkWWW8vLAzVyfw
X-Google-Smtp-Source: ACHHUZ7S5DRI+08CvdiR64rD+bFlb0qYSYbQsYfr60NthuNRoB4OqzecNGq0kAWLHOT/ZKN/f4itquvUSBZAFBkrzEc=
X-Received: by 2002:a2e:b177:0:b0:2a8:d021:4121 with SMTP id
 a23-20020a2eb177000000b002a8d0214121mr872071ljm.26.1685011887700; Thu, 25 May
 2023 03:51:27 -0700 (PDT)
MIME-Version: 1.0
References: <oredn6p9gk.fsf@lxoliva.fsfla.org> <CAFiYyc069Qz=RYjt5NhARwRYiRjLw4tP5pDcEvoM3xsLtcvrRg@mail.gmail.com>
 <ora5xsbuhu.fsf@lxoliva.fsfla.org>
In-Reply-To: <ora5xsbuhu.fsf@lxoliva.fsfla.org>
From: Richard Biener <richard.guenther@gmail.com>
Date: Thu, 25 May 2023 12:49:20 +0200
Message-ID: <CAFiYyc0+qtvnK4hyRhC16Q3o9B9OuupXijCZFvJU0w09-jXvbw@mail.gmail.com>
Subject: Re: [PATCH] [x86] reenable dword MOVE_MAX for better memmove inlining
To: Alexandre Oliva <oliva@adacore.com>
Cc: gcc-patches@gcc.gnu.org, "H.J. Lu" <hjl.tools@gmail.com>, 
	Jan Hubicka <hubicka@ucw.cz>, Uros Bizjak <ubizjak@gmail.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Spam-Status: No, score=-1.4 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

On Thu, May 25, 2023 at 12:01 PM Alexandre Oliva <oliva@adacore.com> wrote:
>
> On May 24, 2023, Richard Biener <richard.guenther@gmail.com> wrote:
>
> > gimple_fold_builtin_memory_op tries to expand the call to a single
> > load plus a single store so we can handle overlaps by first loading
> > everything to registers and then storing:
>
> *nod*, that's why I figured we could afford to go back to allowing
> DImode (with -m32) or TImode (with -m64) even without vector modes: we'd
> just use a pair of registers, a single insn, even though not a single
> hardware instruction.
>
> > using DImode on i?86 without SSE means we eventually perform two
> > loads and two stores which means we need two registers available.
>
> *nod*.  But the alternative is to issue an out-of-line call to memmove,
> which would clobber more than 2 registers.  ISTM that inlining such
> calls is better, whether optimizing for speed or size.
>
> > So I think if we want to expand this further at the GIMPLE level we
> > should still honor MOVE_MAX but eventually emit multiple loads/stores
> > honoring the MOVE_MAX_PIECES set of constraints there and avoid
> > expanding to sequences where we cannot interleave the loads/stores
> > (aka for the memmove case).
>
> But...  don't we already?  If I'm reading the code right, we'll already
> issue gimple code to load the whole block into a temporary and then
> store it, but current MOVE_MAX won't let us go past 4 bytes on SSE-less
> x86.

I mean we could do what RTL expansion would do later and expand
by pieces: emit multiple loads/stores, not as n loads followed by
n stores but interleaved.
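
To illustrate the two orderings, here is a minimal sketch in plain C.
It is not GCC code: the helper names and the fixed 8-byte block split
into two 4-byte pieces are assumptions made up for this example.  It
also shows why the interleaved form is only usable when the accesses
cannot overlap, i.e. not for the memmove case.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers for illustration only.  Each copies 8 bytes
   as two 4-byte pieces, as an SSE-less 32-bit target would have to.  */

/* n loads, then n stores: both pieces are live at the same time, so
   two registers are needed, but overlapping src/dst is handled.  */
static void
copy8_load_all_then_store (unsigned char *dst, const unsigned char *src)
{
  uint32_t lo, hi;
  memcpy (&lo, src, 4);      /* load piece 0 */
  memcpy (&hi, src + 4, 4);  /* load piece 1 */
  memcpy (dst, &lo, 4);      /* store piece 0 */
  memcpy (dst + 4, &hi, 4);  /* store piece 1 */
}

/* Interleaved: only one piece is live at a time, but the first store
   may overwrite source bytes that the second load still has to read.  */
static void
copy8_interleaved (unsigned char *dst, const unsigned char *src)
{
  uint32_t tmp;
  memcpy (&tmp, src, 4);      /* load piece 0 */
  memcpy (dst, &tmp, 4);      /* store piece 0 */
  memcpy (&tmp, src + 4, 4);  /* load piece 1 */
  memcpy (dst + 4, &tmp, 4);  /* store piece 1 */
}
```

With disjoint buffers both helpers produce the same result.  With
dst == src + 2 only the first one behaves like memmove.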

Richard.

>
> --
> Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
>    Free Software Activist                       GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about <https://stallmansupport.org>