public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/96938] New: Failure to optimize bit-setting pattern when not using temporary
@ 2020-09-04 14:10 gabravier at gmail dot com
  2020-09-04 16:31 ` [Bug tree-optimization/96938] " glisse at gcc dot gnu.org
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: gabravier at gmail dot com @ 2020-09-04 14:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96938

            Bug ID: 96938
           Summary: Failure to optimize bit-setting pattern when not using
                    temporary
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: gabravier at gmail dot com
  Target Milestone: ---

void g(char *f, int offset, char value)
{
        *f = (int)(*f & ~(1 << (offset & 0x1F))) | (value << (offset & 0x1F));
}

This generates much worse code than the following:

void g(char *f, int offset, char value)
{
        int tmp = *f & ~(1 << (offset & 0x1F));
        *f = tmp | (value << (offset & 0x1F));
}

This should be equivalent to the first example.

As an example of the worse code generation, on x86 the first example compiles
to this:

g(char*, int, char):
  movzx ecx, BYTE PTR [rdi]
  mov eax, 1
  movsx edx, dl
  shlx eax, eax, esi
  shlx edx, edx, esi
  andn eax, eax, ecx
  or eax, edx
  mov BYTE PTR [rdi], al
  ret

Whereas the second example compiles to this:

g(char*, int, char):
  movsx eax, BYTE PTR [rdi]
  movsx edx, dl
  shlx edx, edx, esi
  btr eax, esi
  or eax, edx
  mov BYTE PTR [rdi], al
  ret
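
The claimed equivalence of the two source forms can be checked exhaustively.
The following is a minimal sketch and not part of the original report: the
names g_direct, g_tmp and count_mismatches are made up for illustration, and
the loops stay within offsets 0 to 30 and values 0 and 1 so that every shift
is well defined in ISO C.

```c
#include <limits.h>

/* First form from the report: no temporary. */
void g_direct(char *f, int offset, char value)
{
    *f = (int)(*f & ~(1 << (offset & 0x1F))) | (value << (offset & 0x1F));
}

/* Second form from the report: int temporary. */
void g_tmp(char *f, int offset, char value)
{
    int tmp = *f & ~(1 << (offset & 0x1F));
    *f = tmp | (value << (offset & 0x1F));
}

/* Returns the number of tested inputs on which the two forms disagree. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (int byte = CHAR_MIN; byte <= CHAR_MAX; byte++)
        for (int offset = 0; offset <= 30; offset++)
            for (int value = 0; value <= 1; value++) {
                char a = (char)byte, b = (char)byte;
                g_direct(&a, offset, (char)value);
                g_tmp(&b, offset, (char)value);
                if (a != b)
                    mismatches++;
            }
    return mismatches;
}
```

A caller would expect count_mismatches() to return 0. This only checks the
stored values; it says nothing about the instructions the compiler selects.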


* [Bug tree-optimization/96938] Failure to optimize bit-setting pattern when not using temporary
From: glisse at gcc dot gnu.org @ 2020-09-04 16:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96938

--- Comment #1 from Marc Glisse <glisse at gcc dot gnu.org> ---
With "char tmp" instead of "int tmp", we get the same code as the first
function.
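
For reference, the variant described here might look like the sketch below.
The function names are hypothetical, and the helper only confirms that the
char-temporary form stores the same byte as the int-temporary form on the
tested inputs (offsets 0 to 30, values 0 and 1); it does not reproduce the
code generation difference.

```c
#include <limits.h>

/* Variant from this comment: the temporary is char instead of int. */
void g_char_tmp(char *f, int offset, char value)
{
    char tmp = *f & ~(1 << (offset & 0x1F));
    *f = tmp | (value << (offset & 0x1F));
}

/* The int-temporary form from the original report, for comparison. */
void g_int_tmp(char *f, int offset, char value)
{
    int tmp = *f & ~(1 << (offset & 0x1F));
    *f = tmp | (value << (offset & 0x1F));
}

/* Returns 1 if both variants store the same byte for every tested input. */
int variants_agree(void)
{
    for (int byte = CHAR_MIN; byte <= CHAR_MAX; byte++)
        for (int offset = 0; offset <= 30; offset++)
            for (int value = 0; value <= 1; value++) {
                char a = (char)byte, b = (char)byte;
                g_char_tmp(&a, offset, (char)value);
                g_int_tmp(&b, offset, (char)value);
                if (a != b)
                    return 0;
            }
    return 1;
}
```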


* [Bug tree-optimization/96938] Failure to optimize bit-setting pattern when not using temporary
From: jakub at gcc dot gnu.org @ 2021-01-12 16:22 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96938

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Reduced testcase:
void
foo (char *x, int y)
{
  *x &= ~(1 << y);
}

void
bar (char *x, int y)
{
  int t = *x & ~(1 << y);
  *x = t;
}
With type narrowing, which currently happens only in the FEs, we are then
unable to match a bt* instruction.
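
To illustrate what narrowing means for this testcase, here is a sketch with
hypothetical names: bar's AND at int width next to a hand-narrowed form in
which the shift is still done at int width but the AND is done on char
values. It does not show GCC's internal representation. It assumes that an
out-of-range int converted to char wraps modulo 256, as GCC documents, and it
restricts y to 0..30 so the shift is well defined.

```c
#include <limits.h>

/* bar from the reduced testcase: the AND is done at int width. */
void bar_wide(char *x, int y)
{
    int t = *x & ~(1 << y);
    *x = t;
}

/* Hand-narrowed form: the shift is still an int shift, but the mask is
   truncated to char before the AND. */
void foo_narrowed(char *x, int y)
{
    char mask = (char)~(1 << y);
    *x = (char)(*x & mask);
}

/* Returns 1 if both forms store the same byte for every tested input. */
int narrowing_preserves_result(void)
{
    for (int byte = CHAR_MIN; byte <= CHAR_MAX; byte++)
        for (int y = 0; y <= 30; y++) {
            char a = (char)byte, b = (char)byte;
            bar_wide(&a, y);
            foo_narrowed(&b, y);
            if (a != b)
                return 0;
        }
    return 1;
}
```

Both forms store the same byte, which is why the narrowing is valid; the
difference is only in the shape of the expression the later passes see.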


* [Bug tree-optimization/96938] Failure to optimize bit-setting pattern when not using temporary
From: jakub at gcc dot gnu.org @ 2021-01-12 16:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96938

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Plus the less reduced testcase:
void
baz (char *f, int o, char v)
{
  *f = (*f & ~(1 << o)) | (v << o);
}
On bar above, we match this during combine:
Trying 10, 11 -> 13:
   10: r96:SI=0xfffffffffffffffe
   11: {r98:SI=r96:SI<-<r93:SI#0;clobber flags:CC;}
      REG_UNUSED flags:CC
      REG_DEAD r96:SI
   13: {r100:SI=r98:SI&r99:SI;clobber flags:CC;}
      REG_DEAD r99:SI
      REG_DEAD r98:SI
      REG_UNUSED flags:CC
Successfully matched this instruction into *btrsi:
(parallel [
        (set (reg:SI 100 [ t ])
            (and:SI (rotate:SI (const_int -2 [0xfffffffffffffffe])
                    (subreg:QI (reg/v:SI 93 [ o ]) 0))
                (reg:SI 99 [ *f_11(D) ])))
        (clobber (reg:CC 17 flags))
    ])
but on baz the similar 3 insns fail:
Trying 10, 11 -> 12:
   10: r95:SI=0xfffffffffffffffe
   11: {r97:QI#0=r95:SI<-<r92:SI#0;clobber flags:CC;}
      REG_UNUSED flags:CC
      REG_DEAD r95:SI
   12: {r98:QI=r97:QI&[r91:DI];clobber flags:CC;}
      REG_DEAD r97:QI
      REG_UNUSED flags:CC
Failed to match this instruction:
(parallel [
        (set (reg:QI 98)
            (and:QI (subreg:QI (rotate:SI (const_int -2 [0xfffffffffffffffe])
                        (subreg:QI (reg/v:SI 92 [ o ]) 0)) 0)
                (mem:QI (reg/v/f:DI 91 [ f ]) [0 *f_11(D)+0 S1 A8])))
        (clobber (reg:CC 17 flags))
    ])
and similarly with foo:
Trying 8, 9 -> 10:
    8: r89:SI=0xfffffffffffffffe
    9: {r91:QI#0=r89:SI<-<r93:SI#0;clobber flags:CC;}
      REG_UNUSED flags:CC
      REG_DEAD r93:SI
      REG_DEAD r89:SI
   10: {[r92:DI]=[r92:DI]&r91:QI;clobber flags:CC;}
      REG_DEAD r92:DI
      REG_UNUSED flags:CC
      REG_DEAD r91:QI
Failed to match this instruction:
(parallel [
        (set (zero_extract:HI (mem:QI (reg:DI 92) [0 *x_7(D)+0 S1 A8])
                (const_int 1 [0x1])
                (zero_extend:SI (subreg:QI (reg:SI 93) 0)))
            (const_int 0 [0]))
        (clobber (reg:CC 17 flags))
    ])
There is no btrb instruction (there is btrw though) but we couldn't really use
that anyway because the shifts are 32-bit and need to be well defined even if
the shift count is >= 8.
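
The point about shift counts >= 8 can be made concrete with a small sketch.
Both function names are hypothetical, and clear_bit_mod8 models an assumed
byte-wide bit reset that reduces the bit index modulo 8; no such x86
instruction exists, so this is only a model of why such an instruction would
not match the C semantics.

```c
/* Semantics required by the C source: the shift is done at int width and
   only then is the result truncated to a byte. */
unsigned char clear_bit_c(unsigned char b, int y)
{
    return (unsigned char)(b & ~(1 << y));
}

/* Hypothetical byte-wide bit reset that reduces the bit index modulo 8.
   This is an assumption made for illustration only. */
unsigned char clear_bit_mod8(unsigned char b, int y)
{
    return (unsigned char)(b & ~(1 << (y & 7)));
}
```

For y = 3 both functions turn 0xFF into 0xF7. For y = 9 the C semantics
leave the byte at 0xFF, because bit 9 lies outside the byte, while the
modulo-8 model clears bit 1 and returns 0xFD.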


* [Bug tree-optimization/96938] Failure to optimize bit-setting pattern when not using temporary
From: jakub at gcc dot gnu.org @ 2021-01-12 18:17 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96938

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org
   Last reconfirmed|                            |2021-01-12

--- Comment #4 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 49956
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49956&action=edit
gcc11-pr96938.patch

Untested fix.


* [Bug tree-optimization/96938] Failure to optimize bit-setting pattern when not using temporary
From: cvs-commit at gcc dot gnu.org @ 2021-01-13  9:16 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96938

--- Comment #5 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:5d057bfeff70e5b8d00e521844c476f62d51e22c

commit r11-6631-g5d057bfeff70e5b8d00e521844c476f62d51e22c
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Jan 13 10:15:13 2021 +0100

    i386: Add define_insn_and_split patterns for btrl [PR96938]

    In the following testcase we only optimize f2 and f7 to btrl, although we
    should optimize that way all of the functions.  The problem is the type
    demotion/narrowing (which is performed solely during the generic folding and
    not later), without it we see the AND performed in SImode and match it as
    btrl, but with it while the shifts are still performed in SImode, the
    AND is already done in QImode or HImode low part of the shift.

    2021-01-13  Jakub Jelinek  <jakub@redhat.com>

            PR target/96938
            * config/i386/i386.md (*btr<mode>_1, *btr<mode>_2): New
            define_insn_and_split patterns.
            (splitter after *btr<mode>_2): New splitter.

            * gcc.target/i386/pr96938.c: New test.


* [Bug tree-optimization/96938] Failure to optimize bit-setting pattern when not using temporary
From: jakub at gcc dot gnu.org @ 2021-01-13  9:25 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96938

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #6 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Fixed.


Thread overview: 7+ messages
2020-09-04 14:10 [Bug tree-optimization/96938] New: Failure to optimize bit-setting pattern when not using temporary gabravier at gmail dot com
2020-09-04 16:31 ` [Bug tree-optimization/96938] " glisse at gcc dot gnu.org
2021-01-12 16:22 ` jakub at gcc dot gnu.org
2021-01-12 16:42 ` jakub at gcc dot gnu.org
2021-01-12 18:17 ` jakub at gcc dot gnu.org
2021-01-13  9:16 ` cvs-commit at gcc dot gnu.org
2021-01-13  9:25 ` jakub at gcc dot gnu.org