From: Richard Biener
Date: Thu, 25 May 2023 15:32:20 +0200
Subject: Re: [PATCH] [x86] reenable dword MOVE_MAX for better memmove inlining
To: Alexandre Oliva
Cc: gcc-patches@gcc.gnu.org, "H.J. Lu", Jan Hubicka, Uros Bizjak

On Thu, May 25, 2023 at 3:25 PM Alexandre Oliva wrote:
>
> On May 25, 2023, Richard Biener wrote:
>
> > On Thu, May 25, 2023 at 1:10 PM Alexandre Oliva wrote:
> >>
> >> On May 25, 2023, Richard Biener wrote:
> >>
> >> > I mean we could do what RTL expansion would do later and do
> >> > by-pieces, thus emit multiple loads/stores but not n loads and then
> >> > n stores but interleaved.
> >>
> >> That wouldn't help e.g. gcc.dg/memcpy-6.c's fold_move_8, because
> >> MOVE_MAX and MOVE_MAX_PIECES currently limits inline expansion to 4
> >> bytes on x86 without SSE, both in gimple and RTL, and interleaved loads
> >> and stores wouldn't help with memmove. We can't fix that by changing
> >> code that uses MOVE_MAX and/or MOVE_MAX_PIECES, when these limits are
> >> set too low.
>
> > Btw, there was a short period where the MOVE_MAX limit was restricted
> > but that had fallout and we've reverted since then.
>
> Erhm... Are we even talking about the same issue?
>
> i386/i386.h reduced the 32-bit non-SSE MOVE_MAX from 16 to 4, which
> broke this test; I'm proposing to bounce it back up to 8, so that we get
> a little more memmove inlining, enough for tests that expect that much
> to pass.
>
> You may be focusing on the gimple-fold bit, because I mentioned it, but
> even the rtl expander is failing to expand the memmove because of the
> setting, as evidenced by the test's failure in the scan for memmove in
> the final dump.

So indeed fold_move_8 expands to the following, even with
-minline-all-stringops:

fold_move_8:
.LFB5:
        .cfi_startproc
        pushl   %ebp
        .cfi_def_cfa_offset 8
        .cfi_offset 5, -8
        movl    %esp, %ebp
        .cfi_def_cfa_register 5
        subl    $8, %esp
        movl    $a+3, %eax
        subl    $4, %esp
        pushl   $8
        pushl   $a
        pushl   %eax
        call    memmove
        addl    $16, %esp
        nop

I do think it's still up to RTL expansion or the target to decide whether
it's worth spending two registers to handle the overlap, or maybe to emit
a compare & jump to do forward and backward variants.

Yes, increasing MOVE_MAX to 8 makes this expand at the GIMPLE level
already, which I believe is premature and difficult to undo.

> That MOVE_MAX change was a significant regression in codegen for 32-bit
> non-SSE x86, and I'm proposing to fix that. Compensating for that
> regression elsewhere doesn't seem desirable to me: MOVE_MAX can be much
> higher even on other x86 variants, so the effects of such attempts may
> harm quite significantly more modern CPUs.
>
> Conversely, I don't expect the reduction of MOVE_MAX on SSE-less x86 a
> couple of years ago to have been measured for performance effects, given
> the little overall relevance of such CPUs, and the very visible and
> undesirable effects on codegen that change brought onto them. And yet,
> I'm being very conservative in the proposed reversion, because
> benchmarking such targets in any meaningful way would be somewhat
> challenging for myself as well.
>
> So, could we please have this narrow fix of this limited regression at
> the spot where it was introduced accepted, rather than debating
> tangents?
>
> --
> Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
> Free Software Activist GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts. Ask me about