public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/99405] New: Rotate with mask not optimized on x86 for QI/HImode rotates
@ 2021-03-05 12:38 jakub at gcc dot gnu.org
  2021-03-05 12:43 ` [Bug target/99405] " jakub at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-05 12:38 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99405

            Bug ID: 99405
           Summary: Rotate with mask not optimized on x86 for QI/HImode
                    rotates
           Product: gcc
           Version: 11.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: jakub at gcc dot gnu.org
                CC: jakub at gcc dot gnu.org, unlvsur at live dot com
        Depends on: 99396
  Target Milestone: ---

+++ This bug was initially created as a clone of Bug #99396 +++

In
unsigned char f1 (unsigned char x, unsigned y) { return (x << (y & 7)) | (x >>
(-y & 7)); }
unsigned short f2 (unsigned short x, unsigned y) { return (x << (y & 15)) | (x
>> (-y & 15)); }
unsigned int f3 (unsigned int x, unsigned y) { return (x << (y & 31)) | (x >>
(-y & 31)); }
unsigned char f4 (unsigned char x, unsigned y) { return (x >> (y & 7)) | (x <<
(-y & 7)); }
unsigned short f5 (unsigned short x, unsigned y) { return (x >> (y & 15)) | (x
<< (-y & 15)); }
unsigned int f6 (unsigned int x, unsigned y) { return (x >> (y & 31)) | (x <<
(-y & 31)); }
unsigned char f7 (unsigned char x, unsigned char y) { unsigned char v = y & 7;
unsigned char w = -y & 7; return (x << v) | (x >> w); }
unsigned short f8 (unsigned short x, unsigned char y) { unsigned char v = y &
15; unsigned char w = -y & 15; return (x << v) | (x >> w); }
unsigned int f9 (unsigned int x, unsigned char y) { unsigned char v = y & 31;
unsigned char w = -y & 31; return (x << v) | (x >> w); }
unsigned char f10 (unsigned char x, unsigned char y) { unsigned char v = y & 7;
unsigned char w = -y & 7; return (x >> v) | (x << w); }
unsigned short f11 (unsigned short x, unsigned char y) { unsigned char v = y &
15; unsigned char w = -y & 15; return (x >> v) | (x << w); }
unsigned int f12 (unsigned int x, unsigned char y) { unsigned char v = y & 31;
unsigned char w = -y & 31; return (x >> v) | (x << w); }
#ifdef __x86_64__
unsigned long long f13 (unsigned long long x, unsigned y) { return (x << (y &
63)) | (x >> (-y & 63)); }
unsigned long long f14 (unsigned long long x, unsigned y) { return (x >> (y &
63)) | (x << (-y & 63)); }
unsigned long long f15 (unsigned long long x, unsigned char y) { unsigned char
v = y & 63; unsigned char w = -y & 63; return (x << v) | (x >> w); }
unsigned long long f16 (unsigned long long x, unsigned char y) { unsigned char
v = y & 63; unsigned char w = -y & 63; return (x >> v) | (x << w); }
#endif

we don't optimize away the and instructions in f{1,2,4,5,7,8,10,11}.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99396
[Bug 99396] std::rotl and std::rotr Does not convert into ROTATE on the gimple
level

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/99405] Rotate with mask not optimized on x86 for QI/HImode rotates
  2021-03-05 12:38 [Bug target/99405] New: Rotate with mask not optimized on x86 for QI/HImode rotates jakub at gcc dot gnu.org
@ 2021-03-05 12:43 ` jakub at gcc dot gnu.org
  2021-03-05 12:59 ` ubizjak at gmail dot com
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-05 12:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99405

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |jakub at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2021-03-05

--- Comment #1 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 50306
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50306&action=edit
gcc11-pr99405.patch

Untested fix.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/99405] Rotate with mask not optimized on x86 for QI/HImode rotates
  2021-03-05 12:38 [Bug target/99405] New: Rotate with mask not optimized on x86 for QI/HImode rotates jakub at gcc dot gnu.org
  2021-03-05 12:43 ` [Bug target/99405] " jakub at gcc dot gnu.org
@ 2021-03-05 12:59 ` ubizjak at gmail dot com
  2021-03-05 13:03 ` jakub at gcc dot gnu.org
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: ubizjak at gmail dot com @ 2021-03-05 12:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99405

--- Comment #2 from Uroš Bizjak <ubizjak at gmail dot com> ---
(In reply to Jakub Jelinek from comment #1)
> Created attachment 50306 [details]
> gcc11-pr99405.patch
> 
> Untested fix.

-             (match_operand:SI 2 "register_operand" "c")
+             (match_operand:SI 2 "register_operand")

The constraint is here on purpose, you are risking reload failures with
compound instruction without it. IIRC from discussion with Segher, the combine
pass shouldn't propagate hard registers around anymore, but it can still
happen. So, if there is no compelling reason, I'd suggest to leave the
constraint.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/99405] Rotate with mask not optimized on x86 for QI/HImode rotates
  2021-03-05 12:38 [Bug target/99405] New: Rotate with mask not optimized on x86 for QI/HImode rotates jakub at gcc dot gnu.org
  2021-03-05 12:43 ` [Bug target/99405] " jakub at gcc dot gnu.org
  2021-03-05 12:59 ` ubizjak at gmail dot com
@ 2021-03-05 13:03 ` jakub at gcc dot gnu.org
  2021-03-05 14:55 ` unlvsur at live dot com
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-05 13:03 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99405

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #50306|0                           |1
        is obsolete|                            |

--- Comment #3 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Created attachment 50307
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50307&action=edit
gcc11-pr99405.patch

No particular reason, it just feeled weird to have constraints on pre-reload
splitters.

Anyway, I've tried also to replace the pre-reload splitters with combine
splitters, but for some reason it didn't work.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/99405] Rotate with mask not optimized on x86 for QI/HImode rotates
  2021-03-05 12:38 [Bug target/99405] New: Rotate with mask not optimized on x86 for QI/HImode rotates jakub at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-03-05 13:03 ` jakub at gcc dot gnu.org
@ 2021-03-05 14:55 ` unlvsur at live dot com
  2021-03-05 14:59 ` jakub at gcc dot gnu.org
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: unlvsur at live dot com @ 2021-03-05 14:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99405

--- Comment #4 from cqwrteur <unlvsur at live dot com> ---
(In reply to Jakub Jelinek from comment #3)
> Created attachment 50307 [details]
> gcc11-pr99405.patch
> 
> No particular reason, it just feeled weird to have constraints on pre-reload
> splitters.
> 
> Anyway, I've tried also to replace the pre-reload splitters with combine
> splitters, but for some reason it didn't work.

I am a noob of the compiler. A question here. Are these pattern matchings as
good as intrinsics? Particularly things like emulated addcarry are extremely
likely to fail.

Should I still prefer intrinsics rol/ror if I can compare to pattern matching
versions of std::rotl/std::rotr after this patch?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/99405] Rotate with mask not optimized on x86 for QI/HImode rotates
  2021-03-05 12:38 [Bug target/99405] New: Rotate with mask not optimized on x86 for QI/HImode rotates jakub at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2021-03-05 14:55 ` unlvsur at live dot com
@ 2021-03-05 14:59 ` jakub at gcc dot gnu.org
  2021-03-05 15:06 ` unlvsur at live dot com
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-05 14:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99405

--- Comment #5 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The valid C code if it is correct without UB and is pattern matched is
definitely better than some intrinsics, it is portable and can be optimized
generally sooner and better than the intrinsics.
Just be warned that there are many ways how to write not very good rotates,
e.g. (x << y) | (x >> (32 - y)) which has UB for y <= 0 or >= 32 etc.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/99405] Rotate with mask not optimized on x86 for QI/HImode rotates
  2021-03-05 12:38 [Bug target/99405] New: Rotate with mask not optimized on x86 for QI/HImode rotates jakub at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2021-03-05 14:59 ` jakub at gcc dot gnu.org
@ 2021-03-05 15:06 ` unlvsur at live dot com
  2021-03-06 10:13 ` jakub at gcc dot gnu.org
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 10+ messages in thread
From: unlvsur at live dot com @ 2021-03-05 15:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99405

--- Comment #6 from cqwrteur <unlvsur at live dot com> ---
(In reply to Jakub Jelinek from comment #5)
> The valid C code if it is correct without UB and is pattern matched is
> definitely better than some intrinsics, it is portable and can be optimized
> generally sooner and better than the intrinsics.
> Just be warned that there are many ways how to write not very good rotates,
> e.g. (x << y) | (x >> (32 - y)) which has UB for y <= 0 or >= 32 etc.

what about addcarry?

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/99405] Rotate with mask not optimized on x86 for QI/HImode rotates
  2021-03-05 12:38 [Bug target/99405] New: Rotate with mask not optimized on x86 for QI/HImode rotates jakub at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2021-03-05 15:06 ` unlvsur at live dot com
@ 2021-03-06 10:13 ` jakub at gcc dot gnu.org
  2021-04-27  8:18 ` cvs-commit at gcc dot gnu.org
  2022-05-27 18:43 ` roger at nextmovesoftware dot com
  8 siblings, 0 replies; 10+ messages in thread
From: jakub at gcc dot gnu.org @ 2021-03-06 10:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99405
Bug 99405 depends on bug 99396, which changed state.

Bug 99396 Summary: std::rotl and std::rotr Does not convert into ROTATE on the gimple level
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99396

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/99405] Rotate with mask not optimized on x86 for QI/HImode rotates
  2021-03-05 12:38 [Bug target/99405] New: Rotate with mask not optimized on x86 for QI/HImode rotates jakub at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2021-03-06 10:13 ` jakub at gcc dot gnu.org
@ 2021-04-27  8:18 ` cvs-commit at gcc dot gnu.org
  2022-05-27 18:43 ` roger at nextmovesoftware dot com
  8 siblings, 0 replies; 10+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-04-27  8:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99405

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:71c8aaf29bb122ebe5e67c84903fd23ff05f04ec

commit r12-140-g71c8aaf29bb122ebe5e67c84903fd23ff05f04ec
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Tue Apr 27 10:17:45 2021 +0200

    i386: Improve [QH]Imode rotates with masked shift count [PR99405]

    The following testcase shows that while we nicely optimize away the
    useless and? of shift count before rotation for [SD]Imode rotates,
    we don't do that for [QH]Imode.

    The following patch optimizes that by using the right iterator on those
    4 patterns.

    2021-04-27  Jakub Jelinek  <jakub@redhat.com>

            PR target/99405
            * config/i386/i386.md (*<insn><mode>3_mask, *<insn><mode>3_mask_1):
            For any_rotate define_insn_split and following splitters, use
            SWI iterator instead of SWI48.

            * gcc.target/i386/pr99405.c: New test.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [Bug target/99405] Rotate with mask not optimized on x86 for QI/HImode rotates
  2021-03-05 12:38 [Bug target/99405] New: Rotate with mask not optimized on x86 for QI/HImode rotates jakub at gcc dot gnu.org
                   ` (7 preceding siblings ...)
  2021-04-27  8:18 ` cvs-commit at gcc dot gnu.org
@ 2022-05-27 18:43 ` roger at nextmovesoftware dot com
  8 siblings, 0 replies; 10+ messages in thread
From: roger at nextmovesoftware dot com @ 2022-05-27 18:43 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99405

Roger Sayle <roger at nextmovesoftware dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |roger at nextmovesoftware dot com
   Target Milestone|---                         |12.0
             Status|ASSIGNED                    |RESOLVED
      Known to work|                            |12.0
         Resolution|---                         |FIXED

--- Comment #8 from Roger Sayle <roger at nextmovesoftware dot com> ---
This was fixed for GCC 12.0.  Thanks Jakub.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-05-27 18:43 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-05 12:38 [Bug target/99405] New: Rotate with mask not optimized on x86 for QI/HImode rotates jakub at gcc dot gnu.org
2021-03-05 12:43 ` [Bug target/99405] " jakub at gcc dot gnu.org
2021-03-05 12:59 ` ubizjak at gmail dot com
2021-03-05 13:03 ` jakub at gcc dot gnu.org
2021-03-05 14:55 ` unlvsur at live dot com
2021-03-05 14:59 ` jakub at gcc dot gnu.org
2021-03-05 15:06 ` unlvsur at live dot com
2021-03-06 10:13 ` jakub at gcc dot gnu.org
2021-04-27  8:18 ` cvs-commit at gcc dot gnu.org
2022-05-27 18:43 ` roger at nextmovesoftware dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).