From: Uros Bizjak <ubizjak@gmail.com>
To: Roger Sayle <roger@nextmovesoftware.com>
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [x86 PATCH] Fine tune STV register conversion costs for -Os.
Date: Tue, 24 Oct 2023 13:39:10 +0200
Message-ID: <CAFULd4YgHhOM1vT2oDqAKZ4gJ4=Cs53wCK++L=oJsXDe9BxaCw@mail.gmail.com>
In-Reply-To: <008701da05bf$e2196b20$a64c4160$@nextmovesoftware.com>
On Mon, Oct 23, 2023 at 4:47 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> The eagle-eyed may have spotted that my recent testcases for DImode shifts
> on x86_64 included -mno-stv in the dg-options. This is because the
> Scalar-To-Vector (STV) pass currently transforms these shifts to use
> SSE vector operations, producing larger code even with -Os. The issue
> is that compute_convert_gain currently underestimates the size of the
> instructions required for inter-unit moves, which is corrected by the
> patch below.
>
> For the simple test case:
>
> unsigned long long shl1(unsigned long long x) { return x << 1; }
>
> Without this patch, GCC with -m32 -Os -mavx2 currently generates:
>
> shl1:   push    %ebp                // 1 byte
>         mov     %esp,%ebp           // 2 bytes
>         vmovq   0x8(%ebp),%xmm0     // 5 bytes
>         pop     %ebp                // 1 byte
>         vpaddq  %xmm0,%xmm0,%xmm0   // 4 bytes
>         vmovd   %xmm0,%eax          // 4 bytes
>         vpextrd $0x1,%xmm0,%edx     // 6 bytes
>         ret                         // 1 byte = 24 bytes total
>
> With this patch, we now generate the shorter sequence:
>
> shl1:   push    %ebp                // 1 byte
>         mov     %esp,%ebp           // 2 bytes
>         mov     0x8(%ebp),%eax      // 3 bytes
>         mov     0xc(%ebp),%edx      // 3 bytes
>         pop     %ebp                // 1 byte
>         add     %eax,%eax           // 2 bytes
>         adc     %edx,%edx           // 2 bytes
>         ret                         // 1 byte = 15 bytes total
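
For readers less familiar with the scalar sequence, here is a minimal C sketch of why the add/adc pair implements the 64-bit shift on a 32-bit target. It is illustrative only and not part of the patch; the helper name shl1_add_adc is hypothetical, and the carry flag is modelled explicitly as a variable.

```c
#include <stdint.h>

/* Hypothetical helper, not from the patch: emulate the scalar
   add/adc sequence on the two 32-bit halves of a 64-bit value. */
static uint64_t shl1_add_adc(uint64_t x)
{
    uint32_t lo = (uint32_t)x;          /* mov 0x8(%ebp),%eax */
    uint32_t hi = (uint32_t)(x >> 32);  /* mov 0xc(%ebp),%edx */

    /* add %eax,%eax sets the carry flag to the old top bit of lo. */
    uint32_t carry = lo >> 31;
    lo = lo + lo;                       /* add %eax,%eax */
    hi = hi + hi + carry;               /* adc %edx,%edx */

    return ((uint64_t)hi << 32) | lo;
}
```

For any input x, shl1_add_adc(x) should equal x << 1, with the bit shifted out of the low half arriving in the high half through the carry.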
>
> Benchmarking with CSiBE shows that this patch saves 1361 bytes
> when compiling with -m32 -Os, and 172 bytes when compiling
> with -Os.
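
As a sanity check on the per-instruction sizes quoted for shl1, the totals can be tallied with a short C sketch. The byte counts are copied from the two listings above; the array and function names are hypothetical, and this is not how compute_convert_gain itself is written.

```c
#include <stddef.h>

/* Per-instruction sizes in bytes, copied from the two shl1 listings. */
static const int stv_bytes[]    = {1, 2, 5, 1, 4, 4, 6, 1}; /* SSE/AVX version */
static const int scalar_bytes[] = {1, 2, 3, 3, 1, 2, 2, 1}; /* add/adc version */

/* Sum an array of instruction sizes. */
static int total_bytes(const int *sizes, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += sizes[i];
    return total;
}

/* Total size of the vectorized sequence. */
static int stv_total(void)
{
    return total_bytes(stv_bytes, sizeof stv_bytes / sizeof stv_bytes[0]);
}

/* Total size of the scalar sequence. */
static int scalar_total(void)
{
    return total_bytes(scalar_bytes, sizeof scalar_bytes / sizeof scalar_bytes[0]);
}
```

The difference stv_total() - scalar_total() is the saving for this one function, which gives a sense of scale for the CSiBE figures.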
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures. Ok for mainline?
>
>
> 2023-10-23 Roger Sayle <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
> * config/i386/i386-features.cc (compute_convert_gain): Provide
> more accurate values (sizes) for inter-unit moves with -Os.
LGTM.
Thanks,
Uros.
>
>
> Thanks in advance,
> Roger
> --
>