From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <richard.guenther@gmail.com>
Received: from mail-ed1-x52a.google.com (mail-ed1-x52a.google.com
 [IPv6:2a00:1450:4864:20::52a])
 by sourceware.org (Postfix) with ESMTPS id C5C4E3851C10
 for <gcc-patches@gcc.gnu.org>; Fri, 20 Aug 2021 07:28:45 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org C5C4E3851C10
Received: by mail-ed1-x52a.google.com with SMTP id i6so12665407edu.1
 for <gcc-patches@gcc.gnu.org>; Fri, 20 Aug 2021 00:28:45 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:mime-version:references:in-reply-to:from:date
 :message-id:subject:to:cc;
 bh=h87VHUlAPoZ6rLoQH4NrWAZtFByDg+jzngi3O/0fa7I=;
 b=D69KxTH06WSQC5YI9I0vXZ0SPKoawyJRHWz2M41mkV5hyZMew3U1LwY/+NqO9SH0ab
 a9oZl1NXR8M5ZkigIFegG3uaxdHvdU7pcXrXl3+8SzjXae+TDyH7ISz+tk/X3Be/EsBG
 X7PwUej3PE+lItSxaGjttHbEzsnnH0WCoDs/0Ejrn7TRHQvnoDHF1MKGYEjPQD0TdN3E
 zqKa53ZaBP6MwljM53zZfmmKNVL8X3nYbnvxPGCAU5Z0aeGsTGE6wiXhnCea8C4QV15i
 sfxpeumDqhgd/D9bXUozMyDFb4fyOhYToFEMuQa2SqiOr21gxgQo8NnMVm+EKqkD49E3
 +/+w==
X-Gm-Message-State: AOAM5321AB34/sjboNHEWRZsjKKHX8lSuS4KVL/imSQ+21CmmWg1NC49
 mo5OOPKa1uy8cFcyi8CGwLgKfOInAlJovOgWYHqlPTJdGqc=
X-Google-Smtp-Source: ABdhPJzEV7nWVmhX/hb8NG0dOsHcyZ9qRW9zMseXKn4MxTjVFbrikT0VaUPxrLCHo3DX1tIcupjhLwvnQbVHPnn/mzA=
X-Received: by 2002:a50:d749:: with SMTP id i9mr21266219edj.248.1629444524697; 
 Fri, 20 Aug 2021 00:28:44 -0700 (PDT)
MIME-Version: 1.0
References: <02bd01d79513$619dd590$24d980b0$@nextmovesoftware.com>
In-Reply-To: <02bd01d79513$619dd590$24d980b0$@nextmovesoftware.com>
From: Richard Biener <richard.guenther@gmail.com>
Date: Fri, 20 Aug 2021 09:28:33 +0200
Message-ID: <CAFiYyc2zXanoc7MHa5fjNfA7ffZvy4W=OMVyupT3reLz1GMHSw@mail.gmail.com>
Subject: Re: [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.
To: Roger Sayle <roger@nextmovesoftware.com>
Cc: GCC Patches <gcc-patches@gcc.gnu.org>
Content-Type: text/plain; charset="UTF-8"
X-Spam-Status: No, score=-2.7 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, KAM_SHORT,
 RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Fri, 20 Aug 2021 07:28:56 -0000

On Thu, Aug 19, 2021 at 6:01 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> Doh!  ENOPATCH.
>
> -----Original Message-----
> From: Roger Sayle <roger@nextmovesoftware.com>
> Sent: 19 August 2021 16:59
> To: 'GCC Patches' <gcc-patches@gcc.gnu.org>
> Subject: [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.
>
>
> Back in June I briefly mentioned in one of my gcc-patches posts that a
> change that should have always reduced code size, would mysteriously
> occasionally result in slightly larger code (according to CSiBE):
> https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573233.html
>
> Investigating further, the cause turns out to be that x86_64's
> scalar-to-vector (stv) pass is relying on poor estimates of the size
> costs/benefits.  This patch tweaks the backend's compute_convert_gain method
> to provide slightly more accurate values when compiling with -Os.
> Compilation without -Os is (should be) unaffected.  And for completeness,
> I'll mention that the stv pass is a net win for code size so it's much
> better to improve its heuristics than simply gate the pass on
> !optimize_for_size.
>
> The net effect of this change is to save 1399 bytes on the CSiBE code size
> benchmark when compiling with -Os.
>
> This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
> and "make -k check" with no new failures.
>
> Ok for mainline?

+                   /* xor (2 bytes) vs. xorps (3 bytes).  */
+                   if (src == const0_rtx)
+                     igain -= COSTS_N_BYTES (1);
+                   /* movdi_internal vs. movv2di_internal.  */
+                   /* => mov (5 bytes) vs. movaps (7 bytes).  */
+                   else if (x86_64_immediate_operand (src, SImode))
+                     igain -= COSTS_N_BYTES (2);

Doesn't 32-bit DImode need two GPR xors and two movs?  In that case
the non-SSE cost should be multiplied by 'm'.  For const0_rtx we may
be able to re-use the zero reg for the high part, so there the
unscaled cost may well be correct.
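
To make the 'm' point concrete, here is a rough standalone sketch of the
arithmetic, not code from the patch.  The byte sizes are the ones quoted
in the patch comments; size_gain_bytes and the enum names are hypothetical,
and it assumes 'm' is the number of GPR instructions needed (1 for 64-bit,
2 for 32-bit DImode).

```c
#include <assert.h>

/* Byte sizes as quoted in the patch comments.  The names are
   hypothetical and only used for this illustration.  */
enum
{
  XOR_BYTES = 2,      /* xor reg,reg */
  XORPS_BYTES = 3,    /* xorps xmm,xmm */
  MOV_IMM_BYTES = 5,  /* mov $imm32,reg */
  MOVAPS_BYTES = 7    /* SSE constant load, per the patch comment */
};

/* Size gain in bytes of one SSE instruction over M GPR instructions.
   Positive means the SSE form is smaller.  Hypothetical helper.  */
static int
size_gain_bytes (int gpr_bytes, int sse_bytes, int m)
{
  return m * gpr_bytes - sse_bytes;
}

/* Walk through the four cases discussed above.  */
static void
size_gain_self_check (void)
{
  /* m == 1 reproduces the patch's -1 and -2 byte adjustments.  */
  assert (size_gain_bytes (XOR_BYTES, XORPS_BYTES, 1) == -1);
  assert (size_gain_bytes (MOV_IMM_BYTES, MOVAPS_BYTES, 1) == -2);
  /* m == 2 flips the sign: two GPR insns are larger than one SSE insn.  */
  assert (size_gain_bytes (XOR_BYTES, XORPS_BYTES, 2) == 1);
  assert (size_gain_bytes (MOV_IMM_BYTES, MOVAPS_BYTES, 2) == 3);
}
```

So with m == 2 and no sharing of the zero register, the unscaled
adjustment would have the wrong sign under these assumptions.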

Also I'm missing an 'else': in the default case, is there no cost/benefit
to using SSE vs. GPR regs?  For SSE it would be a constant pool
load.

I also wonder, since I now see COSTS_N_BYTES for the first time (heh),
whether with -Os we'd need to replace all COSTS_N_INSNS (1)
scaling with COSTS_N_BYTES scaling?  OTOH costs_add_n_insns
uses COSTS_N_INSNS for the size part as well.

That said, it looks like we may be mixing apples and oranges now,
or perhaps were already before the patch?
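
A small sketch of the unit mismatch I mean, assuming the two macros are
defined as (N) * 4 in rtl.h and (N) * 2 in i386.c (please check the tree,
I'm reproducing them here rather than quoting the headers).
cost_units_to_bytes is a hypothetical helper.

```c
#include <assert.h>

/* Assumed to mirror the definitions in rtl.h and i386.c; reproduced
   here only so the example is self-contained.  */
#define COSTS_N_INSNS(N) ((N) * 4)
#define COSTS_N_BYTES(N) ((N) * 2)

/* Hypothetical helper: interpret a cost value as a byte count on the
   COSTS_N_BYTES scale.  */
static int
cost_units_to_bytes (int cost)
{
  return cost / COSTS_N_BYTES (1);
}

/* Under the assumed definitions, one "insn" of gain is worth exactly
   two bytes, whatever the real encoding length is.  */
static void
cost_scale_self_check (void)
{
  assert (COSTS_N_INSNS (1) == COSTS_N_BYTES (2));
  assert (cost_units_to_bytes (COSTS_N_INSNS (1)) == 2);
  assert (cost_units_to_bytes (COSTS_N_BYTES (5)) == 5);
}
```

If that holds, any igain term still expressed as COSTS_N_INSNS (1) is
implicitly treated as a 2-byte saving when it is summed with the new
COSTS_N_BYTES terms.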

Thanks,
Richard.

>
>
> 2021-08-19  Roger Sayle  <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * config/i386/i386-features.c (compute_convert_gain): Provide
>         more accurate values for CONST_INT, when optimizing for size.
>         * config/i386/i386.c (COSTS_N_BYTES): Move definition from here...
>         * config/i386/i386.h (COSTS_N_BYTES): to here.
>
> Roger
> --
>