[Bug target/67325] New: Optimize shift (aka subreg) of load to simple load

* [Bug target/67325] New: Optimize shift (aka subreg) of load to simple load
@ 2015-08-23  6:10 glisse at gcc dot gnu.org
  2015-08-23  6:14 ` [Bug rtl-optimization/67325] " pinskia at gcc dot gnu.org
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: glisse at gcc dot gnu.org @ 2015-08-23  6:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325

            Bug ID: 67325
           Summary: Optimize shift (aka subreg) of load to simple load
           Product: gcc
           Version: 6.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: glisse at gcc dot gnu.org
  Target Milestone: ---
            Target: x86_64-linux-gnu

int f(long*l){
  return *l>>32;
}

        movq    (%rdi), %rax
        sarq    $32, %rax

It seems to me that a single movl (loading the upper 32 bits from offset 4)
would do.

Classified as target (x86_64) for now, but it is more likely tree-optimization
or rtl-optimization.
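
For illustration, here is a minimal sketch of why the two forms are
interchangeable. It assumes a little-endian target such as x86_64 and a
compiler where >> on a negative signed value is an arithmetic shift (true for
GCC). The names high_by_shift, high_by_load and check_equivalence are
hypothetical and are not taken from the report or from the GCC testsuite.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The form from the report: load all 64 bits, then shift right by 32. */
static int32_t high_by_shift(const int64_t *l)
{
    return (int32_t)(*l >> 32);
}

/* The form the report asks for: read only the upper four bytes.
   On a little-endian target they live at byte offset 4.
   memcpy avoids any strict-aliasing problem. */
static int32_t high_by_load(const int64_t *l)
{
    int32_t hi;
    memcpy(&hi, (const unsigned char *)l + 4, sizeof hi);
    return hi;
}

/* Self-check: both forms must agree for the given value. */
static void check_equivalence(int64_t v)
{
    assert(high_by_shift(&v) == high_by_load(&v));
}
```

Calling check_equivalence with a few positive and negative values is enough
to convince oneself that the 32-bit load at offset 4 returns the same int as
the shift.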



* [Bug rtl-optimization/67325] Optimize shift (aka subreg) of load to simple load
  2015-08-23  6:10 [Bug target/67325] New: Optimize shift (aka subreg) of load to simple load glisse at gcc dot gnu.org
@ 2015-08-23  6:14 ` pinskia at gcc dot gnu.org
  2015-08-23  6:41 ` glisse at gcc dot gnu.org
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2015-08-23  6:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |rtl-optimization

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
There was some code in combine/simplify-rtx which does this but maybe it only
handles logical shift right and not arithmetic shift right.



* [Bug rtl-optimization/67325] Optimize shift (aka subreg) of load to simple load
  2015-08-23  6:10 [Bug target/67325] New: Optimize shift (aka subreg) of load to simple load glisse at gcc dot gnu.org
  2015-08-23  6:14 ` [Bug rtl-optimization/67325] " pinskia at gcc dot gnu.org
@ 2015-08-23  6:41 ` glisse at gcc dot gnu.org
  2015-08-23  6:59 ` trippels at gcc dot gnu.org
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: glisse at gcc dot gnu.org @ 2015-08-23  6:41 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325

--- Comment #2 from Marc Glisse <glisse at gcc dot gnu.org> ---
I am seeing the same code (well, with shrq) if I make the types unsigned.



* [Bug rtl-optimization/67325] Optimize shift (aka subreg) of load to simple load
  2015-08-23  6:10 [Bug target/67325] New: Optimize shift (aka subreg) of load to simple load glisse at gcc dot gnu.org
  2015-08-23  6:14 ` [Bug rtl-optimization/67325] " pinskia at gcc dot gnu.org
  2015-08-23  6:41 ` glisse at gcc dot gnu.org
@ 2015-08-23  6:59 ` trippels at gcc dot gnu.org
  2015-08-23  7:08 ` [Bug target/67325] " pinskia at gcc dot gnu.org
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: trippels at gcc dot gnu.org @ 2015-08-23  6:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325

Markus Trippelsdorf <trippels at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-08-23
                 CC|                            |trippels at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #3 from Markus Trippelsdorf <trippels at gcc dot gnu.org> ---
Apparently this never worked with GCC. Clang gets it right.
https://goo.gl/BB4KMZ



* [Bug target/67325] Optimize shift (aka subreg) of load to simple load
  2015-08-23  6:10 [Bug target/67325] New: Optimize shift (aka subreg) of load to simple load glisse at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2015-08-23  6:59 ` trippels at gcc dot gnu.org
@ 2015-08-23  7:08 ` pinskia at gcc dot gnu.org
  2015-08-23  7:40 ` glisse at gcc dot gnu.org
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2015-08-23  7:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|rtl-optimization            |target

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
This is a target issue:
Trying 6 -> 7:
Failed to match this instruction:
(parallel [
        (set (reg:DI 65)
            (sign_extend:DI (mem:SI (plus:DI (reg/v/f:DI 63 [ l ])
                        (const_int 4 [0x4])) [2 *l_1(D)+4 S4 A32])))
        (clobber (reg:CC 17 flags))
    ])
Successfully matched this instruction:
(set (reg:DI 65)
    (sign_extend:DI (mem:SI (plus:DI (reg/v/f:DI 63 [ l ])
                (const_int 4 [0x4])) [2 *l_1(D)+4 S4 A32])))
rejecting combination of insns 6 and 7
original costs 4 + 4 = 8
replacement cost 12
starting the processing of deferred insns
ending the processing of deferred insns

So GCC is able to do it but rejects it because the cost is worse for some
reason.

That is, it is replacing:
(insn 6 3 7 2 (set (reg:DI 66 [ *l_1(D) ])
        (mem:DI (reg/v/f:DI 63 [ l ]) [2 *l_1(D)+0 S8 A64])) t1.c:2 62
{*movdi_internal_rex64}
     (expr_list:REG_DEAD (reg/v/f:DI 63 [ l ])
        (nil)))

(insn 7 6 13 2 (parallel [
            (set (reg:DI 65)
                (ashiftrt:DI (reg:DI 66 [ *l_1(D) ])
                    (const_int 32 [0x20])))
            (clobber (reg:CC 17 flags))
        ]) t1.c:2 530 {*ashrdi3_1}
     (expr_list:REG_DEAD (reg:DI 66 [ *l_1(D) ])
        (expr_list:REG_UNUSED (reg:CC 17 flags)
            (nil))))

with the single sign_extend load shown above.  Note that the clobber gets in
the way of combining with the next insn (the subreg).
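
The rejection in the log comes down to a cost comparison. A toy model of that
gate, using the numbers from the dump, could look like the sketch below. The
function name is hypothetical and this is a simplification, not GCC's actual
code.

```c
#include <stdbool.h>

/* Toy model: keep a combination only if the replacement insn is no more
   expensive than the two insns it replaces. */
static bool combination_is_profitable(int cost_first, int cost_second,
                                      int replacement_cost)
{
    int original_cost = cost_first + cost_second;
    return replacement_cost <= original_cost;
}
```

With the values from the dump (4 + 4 = 8 against 12) the function returns
false, which matches "rejecting combination of insns 6 and 7".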



* [Bug target/67325] Optimize shift (aka subreg) of load to simple load
  2015-08-23  6:10 [Bug target/67325] New: Optimize shift (aka subreg) of load to simple load glisse at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2015-08-23  7:08 ` [Bug target/67325] " pinskia at gcc dot gnu.org
@ 2015-08-23  7:40 ` glisse at gcc dot gnu.org
  2024-05-28 22:59 ` cvs-commit at gcc dot gnu.org
  2024-05-28 23:00 ` liuhongt at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: glisse at gcc dot gnu.org @ 2015-08-23  7:40 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325

--- Comment #5 from Marc Glisse <glisse at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #4)
> So GCC is able to do it but rejects it because the cost is worse for some
> reason.

Indeed, and -Os produces the expected
        movl    4(%rdi), %eax

(I did not benchmark)



* [Bug target/67325] Optimize shift (aka subreg) of load to simple load
  2015-08-23  6:10 [Bug target/67325] New: Optimize shift (aka subreg) of load to simple load glisse at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2015-08-23  7:40 ` glisse at gcc dot gnu.org
@ 2024-05-28 22:59 ` cvs-commit at gcc dot gnu.org
  2024-05-28 23:00 ` liuhongt at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2024-05-28 22:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325

--- Comment #6 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by hongtao Liu <liuhongt@gcc.gnu.org>:

https://gcc.gnu.org/g:1d6199e5f8c1c08083eeb0279f71333234fe14ad

commit r15-882-g1d6199e5f8c1c08083eeb0279f71333234fe14ad
Author: liuhongt <hongtao.liu@intel.com>
Date:   Mon Feb 19 13:57:24 2024 +0800

    Reduce cost of MEM (A + imm).

    For MEM, rtx_cost iterates over each subrtx and adds up the costs,
    so MEM (reg) costs 5 while MEM (reg + 4) costs 9, which is not
    accurate for x86. Ideally address_cost should be used, but it
    reduces the cost too much, so the current solution is to make a
    constant displacement as cheap as possible.

    gcc/ChangeLog:

            PR target/67325
            * config/i386/i386.cc (ix86_rtx_costs): Reduce cost of MEM (A
            + imm) to "cost of MEM (A)" + 1.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr67325.c: New test.
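
As a rough illustration of the commit message, the sketch below models an
expression tree whose cost is the sum of its nodes, next to a variant that
special-cases MEM (A + imm). It is a toy, not the code in ix86_rtx_costs: the
types, names and per-node costs are hypothetical, with the costs chosen only
so that the totals match the 5 and 9 quoted above.

```c
#include <stddef.h>

enum toy_kind { TOY_REG, TOY_CONST_INT, TOY_PLUS, TOY_MEM };

struct toy_rtx {
    enum toy_kind kind;
    const struct toy_rtx *op0;   /* first operand, or NULL */
    const struct toy_rtx *op1;   /* second operand, or NULL */
};

/* Hypothetical per-node costs: MEM = 5, PLUS = 4, leaves are free. */
static int node_cost(enum toy_kind kind)
{
    switch (kind) {
    case TOY_MEM:  return 5;
    case TOY_PLUS: return 4;
    default:       return 0;
    }
}

/* Old behaviour: walk every sub-expression and add up the costs. */
static int additive_cost(const struct toy_rtx *x)
{
    if (x == NULL)
        return 0;
    return node_cost(x->kind) + additive_cost(x->op0) + additive_cost(x->op1);
}

/* New behaviour: MEM (A + imm) costs "cost of MEM (A)" + 1. */
static int adjusted_cost(const struct toy_rtx *x)
{
    if (x == NULL)
        return 0;
    if (x->kind == TOY_MEM && x->op0 != NULL && x->op0->kind == TOY_PLUS
        && x->op0->op1 != NULL && x->op0->op1->kind == TOY_CONST_INT) {
        /* Cost the same access without the constant displacement. */
        struct toy_rtx base_mem = { TOY_MEM, x->op0->op0, NULL };
        return adjusted_cost(&base_mem) + 1;
    }
    return node_cost(x->kind) + adjusted_cost(x->op0) + adjusted_cost(x->op1);
}
```

In this model MEM (reg + 4) drops from 9 to 6, while MEM (reg) stays at 5.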


* [Bug target/67325] Optimize shift (aka subreg) of load to simple load
  2015-08-23  6:10 [Bug target/67325] New: Optimize shift (aka subreg) of load to simple load glisse at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2024-05-28 22:59 ` cvs-commit at gcc dot gnu.org
@ 2024-05-28 23:00 ` liuhongt at gcc dot gnu.org
  6 siblings, 0 replies; 8+ messages in thread
From: liuhongt at gcc dot gnu.org @ 2024-05-28 23:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67325

Hongtao Liu <liuhongt at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |liuhongt at gcc dot gnu.org
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #7 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
Fixed in GCC 15.

