public inbox for gcc-bugs@sourceware.org
* [Bug target/111252] New: LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m
@ 2023-08-31  4:31 xry111 at gcc dot gnu.org
  2023-08-31  4:33 ` [Bug target/111252] " xry111 at gcc dot gnu.org
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: xry111 at gcc dot gnu.org @ 2023-08-31  4:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111252

            Bug ID: 111252
           Summary: LoongArch: Suboptimal code for (a & ~mask) | (b &
                    mask) where mask is a constant with value ((1 << n) -
                    1) << m
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: xry111 at gcc dot gnu.org
  Target Milestone: ---

int test(int a, int b)
{
  return (a & ~0x10) | (b & 0x10);
}

compiles to:

        addi.w  $r12,$r0,-17                    # 0xffffffffffffffef
        and     $r12,$r12,$r4
        andi    $r5,$r5,16
        or      $r12,$r12,$r5
        slli.w  $r4,$r12,0
        jr      $r1

It could be improved to (a in $r4, b in $r5, result left in $r4):

bstrpick.w $r5, $r5, 4, 4
bstrins.w  $r4, $r5, 4, 4
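
The extract-then-insert idea behind the proposed sequence can be modelled in plain C. This is only a minimal sketch: `bstrpick_w`, `bstrins_w` and `merge_field` are hypothetical helper names that model the instruction semantics on 32-bit values (they are not compiler intrinsics), and the sign extension of the 32-bit result into a 64-bit register is ignored. It assumes n >= 1 and m + n <= 32.

```c
#include <stdint.h>

/* Hypothetical model of bstrpick.w: extract bits [msb:lsb] of src,
   zero-extended into the low bits of the result. */
static uint32_t bstrpick_w(uint32_t src, unsigned msb, unsigned lsb)
{
    unsigned width = msb - lsb + 1;
    uint32_t field = (width == 32) ? UINT32_MAX
                                   : ((UINT32_C(1) << width) - 1);
    return (src >> lsb) & field;
}

/* Hypothetical model of bstrins.w: replace bits [msb:lsb] of dst with
   the low (msb - lsb + 1) bits of src. */
static uint32_t bstrins_w(uint32_t dst, uint32_t src,
                          unsigned msb, unsigned lsb)
{
    unsigned width = msb - lsb + 1;
    uint32_t field = (width == 32) ? UINT32_MAX
                                   : ((UINT32_C(1) << width) - 1);
    return (dst & ~(field << lsb)) | ((src & field) << lsb);
}

/* (a & ~mask) | (b & mask) for mask = ((1 << n) - 1) << m, written as
   one extract from b followed by one insert into a. */
static uint32_t merge_field(uint32_t a, uint32_t b, unsigned n, unsigned m)
{
    unsigned msb = m + n - 1;
    return bstrins_w(a, bstrpick_w(b, msb, m), msb, m);
}
```

For the test case in this report, `merge_field(a, b, 1, 4)` should agree with `(a & ~0x10) | (b & 0x10)` for any 32-bit a and b.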


* [Bug target/111252] LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m
  2023-08-31  4:31 [Bug target/111252] New: LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m xry111 at gcc dot gnu.org
@ 2023-08-31  4:33 ` xry111 at gcc dot gnu.org
  2023-08-31  4:42 ` pinskia at gcc dot gnu.org
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: xry111 at gcc dot gnu.org @ 2023-08-31  4:33 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111252

Xi Ruoyao <xry111 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |chenglulu at loongson dot cn,
                   |                            |chenxiaolong at loongson dot cn
             Target|                            |loongarch*-*-*

--- Comment #1 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
In particular this issue causes the compiler to compile __builtin_copysignf128
into very stupid code.


* [Bug target/111252] LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m
  2023-08-31  4:31 [Bug target/111252] New: LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m xry111 at gcc dot gnu.org
  2023-08-31  4:33 ` [Bug target/111252] " xry111 at gcc dot gnu.org
@ 2023-08-31  4:42 ` pinskia at gcc dot gnu.org
  2023-08-31  4:44 ` xry111 at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-31  4:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111252

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pinskia at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2023-08-31
           Keywords|                            |missed-optimization

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Interesting:
int test(int a, int b)
{
  return (a & ~0x80000000) | (b & 0x80000000);
}

Produces better code:
        lu12i.w $r12,-2147483648>>12                  # 0xffffffff80000000
        and     $r12,$r12,$r5
        bstrpick.w      $r4,$r4,30,0
        or      $r4,$r4,$r12
        slli.w  $r4,$r4,0
        jr      $r1


But note the expansion of __builtin_copysignf128 should be using
extract_bit_field and friends to extract the bit and do the insertion. I have
not looked into that yet, though.


* [Bug target/111252] LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m
  2023-08-31  4:31 [Bug target/111252] New: LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m xry111 at gcc dot gnu.org
  2023-08-31  4:33 ` [Bug target/111252] " xry111 at gcc dot gnu.org
  2023-08-31  4:42 ` pinskia at gcc dot gnu.org
@ 2023-08-31  4:44 ` xry111 at gcc dot gnu.org
  2023-08-31  4:46 ` pinskia at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: xry111 at gcc dot gnu.org @ 2023-08-31  4:44 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111252

Xi Ruoyao <xry111 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement


* [Bug target/111252] LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m
  2023-08-31  4:31 [Bug target/111252] New: LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m xry111 at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2023-08-31  4:44 ` xry111 at gcc dot gnu.org
@ 2023-08-31  4:46 ` pinskia at gcc dot gnu.org
  2023-08-31  4:53 ` xry111 at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: pinskia at gcc dot gnu.org @ 2023-08-31  4:46 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111252

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The easiest fix for __builtin_copysignf128 is to change expand_copysign_bit in
optabs.cc to use extract_bit_field for the extraction and store_bit_field for
the insertion, instead of the ANDs and ORs it currently uses ...
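
To illustrate the data flow being suggested (extract one bit, then insert it), here is a minimal C sketch. It is not GCC's expand_copysign_bit and does not use GCC internals; `copysign_bits` is a hypothetical name, and the sketch uses `double` rather than `_Float128`, assuming an IEEE 754 binary64 representation with the sign in bit 63.

```c
#include <stdint.h>
#include <string.h>

/* Copy the sign of y onto x by extracting bit 63 of y's representation
   and inserting it into bit 63 of x's representation.
   Assumes IEEE 754 binary64 doubles. */
static double copysign_bits(double x, double y)
{
    uint64_t xb, yb;
    memcpy(&xb, &x, sizeof xb);
    memcpy(&yb, &y, sizeof yb);

    uint64_t sign = yb >> 63;                          /* extract the bit */
    xb = (xb & ~(UINT64_C(1) << 63)) | (sign << 63);   /* insert the bit  */

    memcpy(&x, &xb, sizeof x);
    return x;
}
```

On a target with bit-field instructions, the extract and the insert each map naturally onto a single instruction, which is the motivation for avoiding a materialized mask.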


* [Bug target/111252] LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m
  2023-08-31  4:31 [Bug target/111252] New: LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m xry111 at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2023-08-31  4:46 ` pinskia at gcc dot gnu.org
@ 2023-08-31  4:53 ` xry111 at gcc dot gnu.org
  2023-08-31  4:54 ` xry111 at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: xry111 at gcc dot gnu.org @ 2023-08-31  4:53 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111252

--- Comment #4 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #2)
> Interesting:
> int test(int a, int b)
> {
>   return (a & ~0x80000000) | (b & 0x80000000);
> }
> 
> Produces better code:
>         lu12i.w $r12,-2147483648>>12                  # 0xffffffff80000000
>         and     $r12,$r12,$r5
>         bstrpick.w      $r4,$r4,30,0
>         or      $r4,$r4,$r12
>         slli.w  $r4,$r4,0
>         jr      $r1

Hmm, this seems a separate issue.  The compiler knows to optimize (a & mask) if
mask is ((1 << a) - 1) << b iff a + b = 32 or b = 0, but not for any other
masks even if it's "expensive" to materialize the mask:

long test(long a, long b)
{
  return a & 0xfffff0000l;
}

compiles to:

        lu12i.w $r12,-65536>>12                 # 0xffffffffffff0000
        lu32i.d $r12,0xf00000000>>32
        and     $r4,$r4,$r12
        jr      $r1

But the following would be better (a is in $r4):

bstrpick.d $r4, $r4, 35, 16
slli.d     $r4, $r4, 16
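
The equivalence between the AND with 0xfffff0000 and an extract followed by a shift can be checked with a small C model. This is a sketch only: `bstrpick_d` is a hypothetical helper modelling the instruction on 64-bit values, and `and_shifted_mask` is an illustrative name.

```c
#include <stdint.h>

/* Hypothetical model of bstrpick.d: extract bits [msb:lsb] of src,
   zero-extended into the low bits of the result. */
static uint64_t bstrpick_d(uint64_t src, unsigned msb, unsigned lsb)
{
    unsigned width = msb - lsb + 1;
    uint64_t field = (width == 64) ? UINT64_MAX
                                   : ((UINT64_C(1) << width) - 1);
    return (src >> lsb) & field;
}

/* a & 0xfffff0000, i.e. keep bits [35:16], computed without
   materializing the mask: extract the field, then shift it back. */
static uint64_t and_shifted_mask(uint64_t a)
{
    return bstrpick_d(a, 35, 16) << 16;
}
```

Here the mask is ((1 << 20) - 1) << 16, so the extract covers bits 35 down to 16 and the shift restores the field to its original position.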


* [Bug target/111252] LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m
  2023-08-31  4:31 [Bug target/111252] New: LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m xry111 at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2023-08-31  4:53 ` xry111 at gcc dot gnu.org
@ 2023-08-31  4:54 ` xry111 at gcc dot gnu.org
  2023-09-07  7:59 ` cvs-commit at gcc dot gnu.org
  2023-09-07  8:01 ` xry111 at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: xry111 at gcc dot gnu.org @ 2023-08-31  4:54 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111252

--- Comment #5 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
(In reply to Xi Ruoyao from comment #4)

> Hmm, this seems a separate issue.  The compiler knows to optimize (a & mask)
> if mask is ((1 << a) - 1) << b iff a + b = 32 or b = 0, but not for any

I mean "32 or 64".

> other masks even if it's "expensive" to materialize the mask:


* [Bug target/111252] LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m
  2023-08-31  4:31 [Bug target/111252] New: LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m xry111 at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2023-08-31  4:54 ` xry111 at gcc dot gnu.org
@ 2023-09-07  7:59 ` cvs-commit at gcc dot gnu.org
  2023-09-07  8:01 ` xry111 at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2023-09-07  7:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111252

--- Comment #6 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Xi Ruoyao <xry111@gcc.gnu.org>:

https://gcc.gnu.org/g:5b857e87201335148f23ec7134cf7fbf97c04c72

commit r14-3773-g5b857e87201335148f23ec7134cf7fbf97c04c72
Author: Xi Ruoyao <xry111@xry111.site>
Date:   Tue Sep 5 19:42:30 2023 +0800

    LoongArch: Use bstrins instruction for (a & ~mask) and (a & mask) | (b &
    ~mask) [PR111252]

    If mask is a constant with value ((1 << N) - 1) << M we can perform this
    optimization.

    gcc/ChangeLog:

            PR target/111252
            * config/loongarch/loongarch-protos.h
            (loongarch_pre_reload_split): Declare new function.
            (loongarch_use_bstrins_for_ior_with_mask): Likewise.
            * config/loongarch/loongarch.cc
            (loongarch_pre_reload_split): Implement.
            (loongarch_use_bstrins_for_ior_with_mask): Likewise.
            * config/loongarch/predicates.md (ins_zero_bitmask_operand):
            New predicate.
            * config/loongarch/loongarch.md (bstrins_<mode>_for_mask):
            New define_insn_and_split.
            (bstrins_<mode>_for_ior_mask): Likewise.
            (define_peephole2): Further optimize code sequence produced by
            bstrins_<mode>_for_ior_mask if possible.

    gcc/testsuite/ChangeLog:

            * g++.target/loongarch/bstrins-compile.C: New test.
            * g++.target/loongarch/bstrins-run.C: New test.
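
The condition stated in the commit message, that mask has the value ((1 << N) - 1) << M, amounts to the mask being a single contiguous run of one bits. A minimal C sketch of such a check follows. It is not the `ins_zero_bitmask_operand` predicate added by the commit; the function names are hypothetical and serve only to illustrate the condition and how N and M can be recovered.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if mask == ((1 << n) - 1) << m for some n >= 1, i.e. the set
   bits of mask form one contiguous run. */
static bool is_shifted_mask(uint64_t mask)
{
    if (mask == 0)
        return false;
    uint64_t low = mask & (~mask + 1);  /* lowest set bit, 1 << m */
    uint64_t sum = mask + low;          /* carry ripples through the run */
    return (sum & mask) == 0;           /* no bit of mask survives */
}

/* M: position of the lowest set bit. Requires mask != 0. */
static unsigned shifted_mask_shift(uint64_t mask)
{
    unsigned m = 0;
    while ((mask & 1) == 0) {
        mask >>= 1;
        m++;
    }
    return m;
}

/* N: number of set bits, which is the run length for a valid mask. */
static unsigned shifted_mask_width(uint64_t mask)
{
    unsigned n = 0;
    while (mask != 0) {
        n += (unsigned)(mask & 1);
        mask >>= 1;
    }
    return n;
}
```

For the mask 0x10 from the original report this yields N = 1 and M = 4; for 0xfffff0000 it yields N = 20 and M = 16.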


* [Bug target/111252] LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m
  2023-08-31  4:31 [Bug target/111252] New: LoongArch: Suboptimal code for (a & ~mask) | (b & mask) where mask is a constant with value ((1 << n) - 1) << m xry111 at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2023-09-07  7:59 ` cvs-commit at gcc dot gnu.org
@ 2023-09-07  8:01 ` xry111 at gcc dot gnu.org
  7 siblings, 0 replies; 9+ messages in thread
From: xry111 at gcc dot gnu.org @ 2023-09-07  8:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111252

Xi Ruoyao <xry111 at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Xi Ruoyao <xry111 at gcc dot gnu.org> ---
Done.


