public inbox for gcc-bugs@sourceware.org
* [Bug target/104059] New: cprop_hardreg propgates hard registers for mov instructions between different REG_CLASS without considering cost
@ 2022-01-17  3:36 crazylht at gmail dot com
  2022-01-17  4:56 ` [Bug rtl-optimization/104059] [12 Regression] " pinskia at gcc dot gnu.org
                   ` (8 more replies)
  0 siblings, 9 replies; 10+ messages in thread
From: crazylht at gmail dot com @ 2022-01-17  3:36 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104059

            Bug ID: 104059
           Summary: cprop_hardreg propgates hard registers for mov
                    instructions between different REG_CLASS without
                    considering cost
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: crazylht at gmail dot com
  Target Milestone: ---

#include <stdint.h>
int test (uint8_t *p, uint32_t t[1][1], int n) {

  int sum = 0;
  uint32_t a0;
  for (int i = 0; i < 4; i++, p++)
    t[i][0] = p[0];

  for (int i = 0; i < 4; i++) {
    {
      int t0 = t[0][i] + t[0][i];
      a0 = t0;
    };
    sum += a0;
  }
  return (((uint16_t)sum) + ((uint32_t)sum >> 16)) >> 1;
}

The testcase is from PR104049. For x86 with -O3 -mavx2, the RTL before cprop_hardreg is:

----before cprop_hardreg------
(insn 100 79 81 2 (set (reg:SI 1 dx [orig:90 stmp__9.14 ] [90])
        (reg:SI 20 xmm0 [114])) 81 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 20 xmm0 [114])
        (nil)))
(debug_insn 81 100 96 2 (debug_marker) "/app/example.cpp":16:3 -1
     (nil))
(insn 96 81 82 2 (set (reg:SI 0 ax [116])
        (reg:SI 1 dx [orig:90 stmp__9.14 ] [90])) "/app/example.cpp":16:44 81 {*movsi_internal}
     (nil))
---end------------

------after cprop_hardreg--------
(insn 100 79 81 2 (set (reg:SI 1 dx [orig:90 stmp__9.14 ] [90])
        (reg:SI 20 xmm0 [114])) 81 {*movsi_internal}
     (nil))
(debug_insn 81 100 96 2 (debug_marker) "/app/example.cpp":16:3 -1
     (nil))
(insn 96 81 82 2 (set (reg:SI 0 ax [116])
        (reg:SI 20 xmm0 [orig:90 stmp__9.14 ] [90])) "/app/example.cpp":16:44 81 {*movsi_internal}
     (expr_list:REG_DEAD (reg:SI 20 xmm0 [orig:90 stmp__9.14 ] [90])
        (nil)))
------end--------------

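In assembly, the code before the pass is

        vmovd   edx, xmm0
        mov     eax, edx

vs. after the pass

        vmovd   edx, xmm0
        vmovd   eax, xmm0

A vmovd from a vector register to a general-purpose register is more expensive
than a plain mov between general-purpose registers on many x86 targets.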

* [Bug rtl-optimization/104059] [12 Regression] cprop_hardreg propgates hard registers for mov instructions between different REG_CLASS without considering cost
From: pinskia at gcc dot gnu.org @ 2022-01-17  4:56 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104059

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |12.0
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-01-17
            Summary|cprop_hardreg propgates     |[12 Regression]
                   |hard registers for mov      |cprop_hardreg propgates
                   |instructions between        |hard registers for mov
                   |different REG_CLASS without |instructions between
                   |considering cost            |different REG_CLASS without
                   |                            |considering cost
          Component|target                      |rtl-optimization

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Confirmed, I see it even on aarch64:

        addv    s0, v0.4s
        fmov    w0, s0
        fmov    w1, s0


In GCC 11 we get:

        ldr     q0, [x1]
        shl     v0.4s, v0.4s, 1
        addv    s0, v0.4s
        fmov    w0, s0
        lsr     w1, w0, 16
        add     w0, w1, w0, uxth
        lsr     w0, w0, 1


While on the trunk we get:

        shl     v0.4s, v0.4s, 1
        addv    s0, v0.4s
        fmov    w0, s0
        fmov    w1, s0
        and     w0, w0, 65535
        add     w0, w0, w1, lsr 16
        lsr     w0, w0, 1

* [Bug rtl-optimization/104059] [12 Regression] cprop_hardreg propgates hard registers for mov instructions between different REG_CLASS without considering cost
From: rguenth at gcc dot gnu.org @ 2022-01-17  8:40 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104059

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P1
           Keywords|                            |needs-bisection

* [Bug rtl-optimization/104059] [12 Regression] cprop_hardreg propgates hard registers for mov instructions between different REG_CLASS without considering cost
From: jakub at gcc dot gnu.org @ 2022-01-17 17:53 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104059

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Just testing for number of vmovd insns on the testcase (2 in r12-1 and 3 in
current trunk) bisects to r12-5358-g32221357007666124409ec3ee0d3a1cf263ebc9e
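
The bisection predicate described above (count the vmovd insns in the
generated assembly) could be sketched roughly as follows. This is only an
illustration: `count_insn` is a hypothetical helper, not part of GCC or of
any bisection script used for this bug, and it does plain substring
matching on text that is assumed to be already loaded into memory.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count non-overlapping occurrences of MNEMONIC
   in ASM_TEXT using plain substring search.  */
static size_t
count_insn (const char *asm_text, const char *mnemonic)
{
  size_t n = 0;
  size_t len = strlen (mnemonic);

  /* An empty mnemonic would match everywhere; treat it as no match.  */
  if (len == 0)
    return 0;

  for (const char *p = strstr (asm_text, mnemonic); p != NULL;
       p = strstr (p + len, mnemonic))
    n++;

  return n;
}
```

A bisection step would then compare the count for the compiler under test
against the expected number (2 for r12-1 according to the comment above).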

* [Bug rtl-optimization/104059] [12 Regression] cprop_hardreg propgates hard registers for mov instructions between different REG_CLASS without considering cost
From: pinskia at gcc dot gnu.org @ 2022-01-17 17:59 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104059

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs-bisection             |
                 CC|                            |pinskia at gcc dot gnu.org

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Figures. I might take a look tomorrow.

* [Bug rtl-optimization/104059] [12 Regression] cprop_hardreg propgates hard registers for mov instructions between different REG_CLASS without considering cost
From: pinskia at gcc dot gnu.org @ 2022-01-19  3:35 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104059

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|x86_64-*-* i?86-*-*         |x86_64-*-* i?86-*-*
                   |aarch64*-*-*                |

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #3)
> Figures. I might take a look tomorrow.


In GCC 11, the gimple looked like this:
  vect_sum_26.13_34 = VIEW_CONVERT_EXPR<vector(4) int>(vect__7.9_48);
  _32 = (vector(4) unsigned int) vect_sum_26.13_34;
  _31 = VEC_PERM_EXPR <_32, { 0, 0, 0, 0 }, { 2, 3, 4, 5 }>;
  _25 = _31 + _32;
  _19 = VEC_PERM_EXPR <_25, { 0, 0, 0, 0 }, { 1, 2, 3, 4 }>;
  _18 = _19 + _25;
  stmp_sum_26.14_17 = BIT_FIELD_REF <_18, 32, 0>;
  _44 = VEC_PERM_EXPR <vect__7.9_48, { 0, 0, 0, 0 }, { 2, 3, 4, 5 }>;
  _38 = _44 + vect__7.9_48;
  _37 = VEC_PERM_EXPR <_38, { 0, 0, 0, 0 }, { 1, 2, 3, 4 }>;
  _36 = _37 + _38;
  stmp__9.12_35 = BIT_FIELD_REF <_36, 32, 0>;
  _20 = stmp_sum_26.14_17 & 65535;
  _10 = (unsigned int) _20;
  _12 = stmp__9.12_35 >> 16;
  _13 = _10 + _12;
  _14 = _13 >> 1;
  _23 = (int) _14;

On the trunk it is now:
  _43 = VEC_PERM_EXPR <vect__7.11_47, { 0, 0, 0, 0 }, { 2, 3, 4, 5 }>;
  _42 = _43 + vect__7.11_47;
  _41 = VEC_PERM_EXPR <_42, { 0, 0, 0, 0 }, { 1, 2, 3, 4 }>;
  _34 = _41 + _42;
  stmp__9.14_33 = BIT_FIELD_REF <_34, 32, 0>;
  _37 = stmp__9.14_33 & 65535;
  _12 = stmp__9.14_33 >> 16;
  _13 = _12 + _37;
  _14 = _13 >> 1;
  _23 = (int) _14;

As you can see, the trunk needs fewer adds and VEC_PERM_EXPRs. I don't see
anything that should be done at the gimple level; the gimple looks decent now.
Basically, what was _10 previously is now _37, and all of the extra casts were
removed.
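
To make the shape of the trunk gimple concrete, here is a small scalar
sketch. It assumes the reduced value (stmp__9.14_33 above) is an unsigned
32-bit quantity. The function names are made up for illustration. The point
is that the single reduced value is read twice, once for the low half and
once for the high half, which is where the two copies out of the vector
register come from.

```c
#include <stdint.h>

/* The return expression as written in the testcase.  */
static uint32_t
fold_as_written (int sum)
{
  return (((uint16_t) sum) + ((uint32_t) sum >> 16)) >> 1;
}

/* The same computation in the shape of the trunk gimple; the comments
   name the corresponding SSA names from the dump.  */
static uint32_t
fold_as_trunk_gimple (uint32_t stmp)
{
  uint32_t lo = stmp & 65535;   /* _37 */
  uint32_t hi = stmp >> 16;     /* _12 */
  return (hi + lo) >> 1;        /* _13 and _14 */
}
```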

* [Bug rtl-optimization/104059] [12 Regression] cprop_hardreg propgates hard registers for mov instructions between different REG_CLASS without considering cost
From: pinskia at gcc dot gnu.org @ 2022-01-19  3:41 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104059

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Hmm, -fno-tree-ter "fixes" this one too. I have not looked into why though.

* [Bug rtl-optimization/104059] [12 Regression] cprop_hardreg propgates hard registers for mov instructions between different REG_CLASS without considering cost
From: crazylht at gmail dot com @ 2022-01-27  6:02 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104059

--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
A patch is posted at
https://gcc.gnu.org/pipermail/gcc-patches/2022-January/589209.html

* [Bug rtl-optimization/104059] [12 Regression] cprop_hardreg propgates hard registers for mov instructions between different REG_CLASS without considering cost
From: cvs-commit at gcc dot gnu.org @ 2022-02-08  4:39 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104059

--- Comment #7 from CVS Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:0103c2e4082c5a342a6834d31ea52bc7e5498016

commit r12-7089-g0103c2e4082c5a342a6834d31ea52bc7e5498016
Author: liuhongt <hongtao.liu@intel.com>
Date:   Mon Jan 24 18:17:47 2022 +0800

    Don't propagate for a more expensive reg-reg move.

    For i386, it enables optimization like:

            vmovd   %xmm0, %edx
    -       vmovd   %xmm0, %eax
    +       movl    %edx, %eax

    gcc/ChangeLog:

            PR rtl-optimization/104059
            * regcprop.cc (copyprop_hardreg_forward_1): Don't propagate
            for a more expensive reg-reg move.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr104059.c: New test.
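
The idea behind the commit ("don't propagate for a more expensive reg-reg
move") can be illustrated with a toy cost model. This is a sketch under
stated assumptions, not the code in regcprop.cc: the register classes, the
cost table and the function name are all hypothetical, and the cost values
are made up so that a move crossing register classes is more expensive than
one within a class.

```c
#include <stdbool.h>

/* Hypothetical register classes; GCC's real classes are target-defined.  */
enum toy_reg_class { TOY_GPR, TOY_SSE, TOY_NUM_CLASSES };

/* Illustrative move costs indexed [from][to].  The numbers are invented;
   they only encode "crossing classes costs more".  */
static const int toy_move_cost[TOY_NUM_CLASSES][TOY_NUM_CLASSES] = {
  /* to:       GPR  SSE */
  /* GPR */  {  2,   6 },
  /* SSE */  {  6,   2 },
};

/* Return true if replacing the source of "dest = old_src" with NEW_SRC
   does not make the move more expensive under the toy cost table.  */
static bool
toy_propagation_is_profitable (enum toy_reg_class old_src,
                               enum toy_reg_class new_src,
                               enum toy_reg_class dest)
{
  return toy_move_cost[new_src][dest] <= toy_move_cost[old_src][dest];
}
```

In the case from this report, the copy "ax = dx" has a GPR source; replacing
it with xmm0 would turn it into an SSE-to-GPR move, which the toy check
rejects.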

* [Bug rtl-optimization/104059] [12 Regression] cprop_hardreg propgates hard registers for mov instructions between different REG_CLASS without considering cost
From: crazylht at gmail dot com @ 2022-02-15  6:37 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104059

Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #8 from Hongtao.liu <crazylht at gmail dot com> ---
Fixed in GCC 12.
