From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <roger@nextmovesoftware.com>
Received: from server.nextmovesoftware.com (server.nextmovesoftware.com
 [162.254.253.69])
 by sourceware.org (Postfix) with ESMTPS id 454E83857C43
 for <gcc-patches@gcc.gnu.org>; Tue,  2 Aug 2022 17:02:26 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 454E83857C43
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none)
 header.from=nextmovesoftware.com
Authentication-Results: sourceware.org;
 spf=pass smtp.mailfrom=nextmovesoftware.com
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed;
 d=nextmovesoftware.com; s=default; h=Content-Transfer-Encoding:Content-Type:
 MIME-Version:Message-ID:Date:Subject:In-Reply-To:References:Cc:To:From:Sender
 :Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:
 Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:
 List-Unsubscribe:List-Subscribe:List-Post:List-Owner:List-Archive;
 bh=digjTpj/d5xKjqA9O/3WsAWjkPKVArk1723HcATRQNw=; b=Pw4ydX7z+HMQsboUXig6rx6Aa8
 uTz53XTDAQJUWmc6AceV26zVUXx/z1sOZT5IN1hwsH/HJe+nOe29lf9iuJF/a+sZillEYUe0OAXmD
 D9cVM758/2wJt5fG3F0P+nVyDykymMqxytr/PEVkgfurhtH4ITKuUKGUuCr30ixQ/BKKomxOhDAwp
 Kv7oJKqquwhoOOjvK7pi/3NRwYXw24SnaJRToL94xr7BaLumNim5+DY94TDBuPQfXUXnS4sP/E79L
 JzscU8Z+hamM1WcSnR00lcmQPG/yRrh/JK0zgUTbNyiTg0cF0N4w7rpGeLR0WB+kECmgEsdf6D1A2
 paHlj2Ng==;
Received: from host86-169-41-119.range86-169.btcentralplus.com
 ([86.169.41.119]:57561 helo=Dell)
 by server.nextmovesoftware.com with esmtpsa (TLS1.2) tls
 TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2)
 (envelope-from <roger@nextmovesoftware.com>)
 id 1oIvHl-0002XZ-Fa; Tue, 02 Aug 2022 13:02:25 -0400
From: "Roger Sayle" <roger@nextmovesoftware.com>
To: "'Uros Bizjak'" <ubizjak@gmail.com>
Cc: <gcc-patches@gcc.gnu.org>
References: <032901d8a2cf$fc07cfd0$f4176f70$@nextmovesoftware.com>
 <CAFULd4Z98PCfCt5a3skRGDVLcNWFvZ5RRiLFBP2Lonw4WUjmOA@mail.gmail.com>
In-Reply-To: <CAFULd4Z98PCfCt5a3skRGDVLcNWFvZ5RRiLFBP2Lonw4WUjmOA@mail.gmail.com>
Subject: RE: [x86 PATCH] Support logical shifts by (some) integer constants in
 TImode STV.
Date: Tue, 2 Aug 2022 18:02:23 +0100
Message-ID: <007f01d8a691$a3b95310$eb2bf930$@nextmovesoftware.com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="utf-8"
Content-Transfer-Encoding: quoted-printable
X-Mailer: Microsoft Outlook 16.0
Thread-Index: AQDLH3GmL9E1BuvLEfV2ftcFX9KAgwIzRAwvr6UvzbA=
Content-Language: en-gb
X-AntiAbuse: This header was added to track abuse,
 please include it with any abuse report
X-AntiAbuse: Primary Hostname - server.nextmovesoftware.com
X-AntiAbuse: Original Domain - gcc.gnu.org
X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
X-AntiAbuse: Sender Address Domain - nextmovesoftware.com
X-Get-Message-Sender-Via: server.nextmovesoftware.com: authenticated_id:
 roger@nextmovesoftware.com
X-Authenticated-Sender: server.nextmovesoftware.com: roger@nextmovesoftware.com
X-Source: 
X-Source-Args: 
X-Source-Dir: 
X-Spam-Status: No, score=-3.5 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_SHORT, RCVD_IN_BARRACUDACENTRAL,
 SPF_HELO_NONE, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
X-List-Received-Date: Tue, 02 Aug 2022 17:02:27 -0000


Hi Uros,

> From: Uros Bizjak <ubizjak@gmail.com>
> Sent: 31 July 2022 18:23
> To: Roger Sayle <roger@nextmovesoftware.com>
> On Fri, Jul 29, 2022 at 12:18 AM Roger Sayle <roger@nextmovesoftware.com>
> wrote:
> >
> > This patch improves TImode STV by adding support for logical shifts by
> > integer constants that are multiples of 8.  For the test case:
> >
> > __int128 a, b;
> > void foo() { a = b << 16; }
> >
> > on x86_64, gcc -O2 currently generates:
> >
> >         movq    b(%rip), %rax
> >         movq    b+8(%rip), %rdx
> >         shldq   $16, %rax, %rdx
> >         salq    $16, %rax
> >         movq    %rax, a(%rip)
> >         movq    %rdx, a+8(%rip)
> >         ret
> >
> > with this patch we now generate:
> >
> >         movdqa  b(%rip), %xmm0
> >         pslldq  $2, %xmm0
> >         movaps  %xmm0, a(%rip)
> >         ret
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32},
> > with no new failures.  Ok for mainline?
> >
> >
> > 2022-07-28  Roger Sayle  <roger@nextmovesoftware.com>
> >
> > gcc/ChangeLog
> >         * config/i386/i386-features.cc (compute_convert_gain): Add gain
> >         for converting suitable TImode shift to a V1TImode shift.
> >         (timode_scalar_chain::convert_insn): Add support for converting
> >         suitable ASHIFT and LSHIFTRT.
> >         (timode_scalar_to_vector_candidate_p): Consider logical shifts
> >         by integer constants that are multiples of 8 to be candidates.
> >
> > gcc/testsuite/ChangeLog
> >         * gcc.target/i386/sse4_1-stv-7.c: New test case.
>
> + case ASHIFT:
> + case LSHIFTRT:
> +   /* For logical shifts by constant multiples of 8.  */
> +   igain = optimize_insn_for_size_p () ? COSTS_N_BYTES (4)
> +                                       : COSTS_N_INSNS (1);
>
> Isn't the conversion a universal win for -O2 as well as for -Os? The
> conversion to/from XMM register is already accounted for, so for -Os
> substituting shldq/salq with pslldq should always be a win. I'd expect
> the cost calculation to be similar to the
> general_scalar_chain::compute_convert_gain cost calculation with m = 2.

I agree that the terminology is perhaps a little confusing.  The
compute_convert_gain function calculates the total "gain" from an
STV chain, summing the igain of each instruction, and performs
the STV transformation if this total gain is greater than zero.
Hence positive values are good and negative values are bad.

In this case, a logical shift by a multiple of 8, converting the chain
is indeed always beneficial: it saves 4 bytes when optimizing for size,
and avoids 1 fast instruction when optimizing for speed.  Having a
"positive gain of four bytes" sounds bad, but in this sense "gain" is
used as a synonym for "benefit", not "magnitude".

By comparison, shifting a 128-bit value by a single bit is always a net
loss, requiring three additional fast instructions, or 15 extra bytes
in size.  However, it's still worth considering/capturing these
counter-productive (i.e. negative) values, as they might be compensated
for by other wins in the chain.
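
To make the summation concrete, here is a toy model of the decision,
not the actual compute_convert_gain.  The function names are invented
for this sketch, and the COSTS_N_INSNS stand-in (with its factor of 4)
is an assumption made so the example is self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for GCC's COSTS_N_INSNS; the factor of 4 is assumed here.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Hypothetical helper: sum the per-instruction gains of one chain.
   Positive entries are wins, negative entries are losses.  */
static int chain_gain(const int *igain, size_t n)
{
  int total = 0;
  assert(igain != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    total += igain[i];
  return total;
}

/* Hypothetical helper: convert the chain only if the total is > 0.  */
static bool chain_worth_converting(const int *igain, size_t n)
{
  return chain_gain(igain, n) > 0;
}
```

With the speed figures above, a chain containing only a single-bit
shift has a gain of -COSTS_N_INSNS (3) and is left alone, whereas the
same shift alongside four instructions that each gain COSTS_N_INSNS (1)
sums to a small positive total and is converted.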

Dealing with COSTS_N_BYTES (when optimizing for size) and COSTS_N_INSNS
(when optimizing for speed) allows much finer granularity.  For example,
the constant number of bits used in a shift/rotate, or the value of an
immediate constant in a compare, have significant effects on the
size/speed of scalar vs. vector code, and this isn't (yet) something
easily handled by the simple "m" approximation used in
general_scalar_chain::compute_convert_gain.

See (comment #5 of) PR target/105034 which mentions the need for more
accurate parameterization of compute_convert_gain (in response to the
undesirable suggestion of simply disabling STV when optimizing for size).

I hope this explains the above idiom.  Hopefully, things will become
clearer when support for shifts by other bit counts, and for arithmetic
shifts, is added to this part of the code (STV).  I'll be sure to add
more comments.

[Ok for mainline?]

Cheers,
Roger
--