public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/114490] New: Optimization: x86 "shl" condition codes never reused
@ 2024-03-26 22:50 Explorer09 at gmail dot com
  2024-03-26 23:11 ` [Bug target/114490] " pinskia at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Explorer09 at gmail dot com @ 2024-03-26 22:50 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114490

            Bug ID: 114490
           Summary: Optimization: x86 "shl" condition codes never reused
           Product: gcc
           Version: 13.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: Explorer09 at gmail dot com
  Target Milestone: ---

This seems to happen with x86 target only.

For the example code below, I can use the SF (sign flag) after a left shift
(<<) operation, but somehow I cannot make GCC reuse the condition code of a
"shl" instruction. It always generates an unnecessary "testl %edx, %edx" before
the conditional jump.

This is one case where the condition code of a bit shift can be useful. I know
in the ARM target of GCC, the condition code is reused after the shift.

```c
#include <stdbool.h>
#include <stdint.h>

bool my_isxdigit(unsigned char ch) {
  uint32_t mask1 = 0x7E00FFC0;
  // Prevent the compiler from transforming the constants.
  // Suppose we have to use them as written.
  __asm__ ("" : "+r" (mask1));

  if (!((mask1 << (ch & 0x1F)) >> 31))
    return false;

  uint32_t mask2 = 0x1A << 24;
  __asm__ ("" : "+r" (mask2));

  if (!((mask2 << (ch >> 4)) >> 31))
    return false;

  return true;
}
```

x86-64 gcc with "-O3" option
(Actually I tested in Compiler Explorer (godbolt.org))

```x86asm
my_isxdigit:
  movl   %edi, %ecx
  movl   $2113994688, %eax
  sall   %cl, %eax
  movl   %eax, %edx
  xorl   %eax, %eax
  testl  %edx, %edx
  jns    .L1
  shrb   $4, %cl
  movl   $436207616, %eax
  sall   %cl, %eax
  shrl   $31, %eax
.L1:
  ret
```

Possible shorter code:

```x86asm
my_isxdigit_2:
  movl   %edi, %ecx
  xorl   %eax, %eax
  movl   $2113994688, %edx
  sall   %cl, %edx
  jns    .L2
  shrb   $4, %cl
  movl   $436207616, %eax
  sall   %cl, %eax
  shrl   $31, %eax
.L2:
  ret
```

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug target/114490] Optimization: x86 "shl" condition codes never reused
  2024-03-26 22:50 [Bug target/114490] New: Optimization: x86 "shl" condition codes never reused Explorer09 at gmail dot com
@ 2024-03-26 23:11 ` pinskia at gcc dot gnu.org
  2024-03-26 23:26 ` pinskia at gcc dot gnu.org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-26 23:11 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114490

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Hmm, the backend has:
;; This pattern can't accept a variable shift count, since shifts by
;; zero don't affect the flags.  We assume that shifts by constant
;; zero are optimized away.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug target/114490] Optimization: x86 "shl" condition codes never reused
  2024-03-26 22:50 [Bug target/114490] New: Optimization: x86 "shl" condition codes never reused Explorer09 at gmail dot com
  2024-03-26 23:11 ` [Bug target/114490] " pinskia at gcc dot gnu.org
@ 2024-03-26 23:26 ` pinskia at gcc dot gnu.org
  2024-03-26 23:52 ` Explorer09 at gmail dot com
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-26 23:26 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114490

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |INVALID
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Read `7.3.6.1 Shift Instructions` from the `Intel® 64 and IA-32 Architectures
Software Developer’s Manual` volume 1.

Basically if you are shifting by 0, then the CF bit never changes from the
previous and this can't be done.

"The CF flag is loaded with the last bit shifted out of the
operand."
There was no bit shifted if doing a shift by 0.

So the comment in the backend is correct and there is no way to remove the test
(unless you know ch can't be 0).

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug target/114490] Optimization: x86 "shl" condition codes never reused
  2024-03-26 22:50 [Bug target/114490] New: Optimization: x86 "shl" condition codes never reused Explorer09 at gmail dot com
  2024-03-26 23:11 ` [Bug target/114490] " pinskia at gcc dot gnu.org
  2024-03-26 23:26 ` pinskia at gcc dot gnu.org
@ 2024-03-26 23:52 ` Explorer09 at gmail dot com
  2024-03-27 11:28 ` Explorer09 at gmail dot com
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Explorer09 at gmail dot com @ 2024-03-26 23:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114490

--- Comment #3 from Kang-Che Sung <Explorer09 at gmail dot com> ---
(In reply to Andrew Pinski from comment #2)
> 
> Basically if you are shifting by 0, then the CF bit never changes from the
> previous and this can't be done.
> 
> So the comment in the backend is correct and there is no way to remove the
> test (unless you know ch can't be 0).

What if I had the CF cleared to 0 already before the shift?

Even when the flags don't change when the value in the CL register is zero, if
I had the flags in a known state, the optimization can still be done.

And my suggested 'my_isxdigit_2' code worked before I know the caveat of the
shift instructions... ("xorl  %eax, %eax" -> CF = SF = 0)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug target/114490] Optimization: x86 "shl" condition codes never reused
  2024-03-26 22:50 [Bug target/114490] New: Optimization: x86 "shl" condition codes never reused Explorer09 at gmail dot com
                   ` (2 preceding siblings ...)
  2024-03-26 23:52 ` Explorer09 at gmail dot com
@ 2024-03-27 11:28 ` Explorer09 at gmail dot com
  2024-03-27 12:41 ` Explorer09 at gmail dot com
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Explorer09 at gmail dot com @ 2024-03-27 11:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114490

--- Comment #4 from Kang-Che Sung <Explorer09 at gmail dot com> ---
1. I just read "AMD64 Architecture Programmer's Manual - Volume 3:
General-Purpose and System Instructions"
(https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/24594.pdf)

It has a clearer wording in the "SAL / SHL" section:

"If the shift count is 0, no flags are modified."

Just mention for reference.

2. I still don't believe there is no chance of optimizing this thing, but it
requires GCC to track the state of the FLAGS register. I don't know if GCC can
do this internally. If GCC can't do this for now, that's OK for me (the example
I posted can be rewritten to another pattern that might produce even smaller
code in x86). But maybe label this as a WONTFIX and not INVALID?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug target/114490] Optimization: x86 "shl" condition codes never reused
  2024-03-26 22:50 [Bug target/114490] New: Optimization: x86 "shl" condition codes never reused Explorer09 at gmail dot com
                   ` (3 preceding siblings ...)
  2024-03-27 11:28 ` Explorer09 at gmail dot com
@ 2024-03-27 12:41 ` Explorer09 at gmail dot com
  2024-03-27 19:08 ` pinskia at gcc dot gnu.org
  2024-03-28  8:53 ` Explorer09 at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: Explorer09 at gmail dot com @ 2024-03-27 12:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114490

Kang-Che Sung <Explorer09 at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|INVALID                     |---
             Status|RESOLVED                    |UNCONFIRMED

--- Comment #5 from Kang-Che Sung <Explorer09 at gmail dot com> ---
(Trying to mark this bug as UNCONFIRMED again in the hope of getting some
attention. I'm not sure if this is the right way of using GCC's Bugzilla here.)

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug target/114490] Optimization: x86 "shl" condition codes never reused
  2024-03-26 22:50 [Bug target/114490] New: Optimization: x86 "shl" condition codes never reused Explorer09 at gmail dot com
                   ` (4 preceding siblings ...)
  2024-03-27 12:41 ` Explorer09 at gmail dot com
@ 2024-03-27 19:08 ` pinskia at gcc dot gnu.org
  2024-03-28  8:53 ` Explorer09 at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-03-27 19:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114490

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
           Severity|normal                      |enhancement

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Kang-Che Sung from comment #4)
> I don't know if GCC
> can do this internally. If GCC can't do this for now, that's OK for me (the
> example I posted can be rewritten to another pattern that might produce even
> smaller code in x86). But maybe label this as a WONTFIX and not INVALID?

The way GCC tracks the flags register is either clobber or set. Almost never
merge. In the case of the shift it is either set (for constant case) or
clobber. 
Changing this will almost surely be a very invasive patch to the backend for a
very small gain. I will leave it up to the target maintainers to decide if they
want to do it or not.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [Bug target/114490] Optimization: x86 "shl" condition codes never reused
  2024-03-26 22:50 [Bug target/114490] New: Optimization: x86 "shl" condition codes never reused Explorer09 at gmail dot com
                   ` (5 preceding siblings ...)
  2024-03-27 19:08 ` pinskia at gcc dot gnu.org
@ 2024-03-28  8:53 ` Explorer09 at gmail dot com
  6 siblings, 0 replies; 8+ messages in thread
From: Explorer09 at gmail dot com @ 2024-03-28  8:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114490

--- Comment #7 from Kang-Che Sung <Explorer09 at gmail dot com> ---
I just come here to say that PR 114512 is the real issue that I wish to see
fixed. It might be okay for me when the FLAGS of "shl" becomes unused by GCC,
since other than CF (carry) and SF (sign) I don't see any other flag of
"shl"/"shr"/”sar” being useful in most applications.

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2024-03-28  8:53 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-26 22:50 [Bug target/114490] New: Optimization: x86 "shl" condition codes never reused Explorer09 at gmail dot com
2024-03-26 23:11 ` [Bug target/114490] " pinskia at gcc dot gnu.org
2024-03-26 23:26 ` pinskia at gcc dot gnu.org
2024-03-26 23:52 ` Explorer09 at gmail dot com
2024-03-27 11:28 ` Explorer09 at gmail dot com
2024-03-27 12:41 ` Explorer09 at gmail dot com
2024-03-27 19:08 ` pinskia at gcc dot gnu.org
2024-03-28  8:53 ` Explorer09 at gmail dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).