public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/34011] Memory load is not eliminated from tight vectorized loop
       [not found] <bug-34011-4@http.gcc.gnu.org/bugzilla/>
@ 2012-01-20 10:42 ` ubizjak at gmail dot com
  2021-07-26 19:49 ` pinskia at gcc dot gnu.org
  1 sibling, 0 replies; 7+ messages in thread
From: ubizjak at gmail dot com @ 2012-01-20 10:42 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34011

Uros Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|2009-09-12 20:02:13         |2012-01-20 20:02:13
      Known to fail|                            |4.7.0

--- Comment #8 from Uros Bizjak <ubizjak at gmail dot com> 2012-01-20 10:29:54 UTC ---
Reconfirmed with

"GCC: (GNU) 4.7.0 20120118 (experimental) [trunk revision 183277]"



* [Bug tree-optimization/34011] Memory load is not eliminated from tight vectorized loop
       [not found] <bug-34011-4@http.gcc.gnu.org/bugzilla/>
  2012-01-20 10:42 ` [Bug tree-optimization/34011] Memory load is not eliminated from tight vectorized loop ubizjak at gmail dot com
@ 2021-07-26 19:49 ` pinskia at gcc dot gnu.org
  1 sibling, 0 replies; 7+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-07-26 19:49 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=34011

--- Comment #9 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
good function:
.L3:
        movdqu  (%rdi,%rax), %xmm0
        pslld   %xmm1, %xmm0
        movups  %xmm0, (%rsi,%rax)
        addq    $16, %rax
        cmpq    $1024, %rax
        jne     .L3

bad function:
.L11:
        movdqu  (%rdi,%rax), %xmm0
        movdqu  (%rsi,%rax), %xmm2
        pslld   %xmm1, %xmm0
        por     %xmm2, %xmm0
        movups  %xmm0, (%rsi,%rax)
        addq    $16, %rax
        cmpq    $1024, %rax
        jne     .L11


Looks good to me now.


* [Bug tree-optimization/34011] Memory load is not eliminated from tight vectorized loop
  2007-11-07  9:05 [Bug rtl-optimization/34011] New: " ubizjak at gmail dot com
                   ` (3 preceding siblings ...)
  2009-09-16  8:51 ` rguenth at gcc dot gnu dot org
@ 2009-09-17  9:08 ` rguenth at gcc dot gnu dot org
  4 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-09-17  9:08 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #7 from rguenth at gcc dot gnu dot org  2009-09-17 09:08 -------
The problem is now back to the original one.


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|rguenth at gcc dot gnu dot  |unassigned at gcc dot gnu
                   |org                         |dot org
             Status|ASSIGNED                    |NEW
           Keywords|                            |missed-optimization, ra


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34011



* [Bug tree-optimization/34011] Memory load is not eliminated from tight vectorized loop
  2007-11-07  9:05 [Bug rtl-optimization/34011] New: " ubizjak at gmail dot com
                   ` (2 preceding siblings ...)
  2009-09-15 14:40 ` rguenth at gcc dot gnu dot org
@ 2009-09-16  8:51 ` rguenth at gcc dot gnu dot org
  2009-09-17  9:08 ` rguenth at gcc dot gnu dot org
  4 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-09-16  8:51 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #6 from rguenth at gcc dot gnu dot org  2009-09-16 08:50 -------
Subject: Bug 34011

Author: rguenth
Date: Wed Sep 16 08:50:46 2009
New Revision: 151740

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=151740
Log:
2009-09-16  Richard Guenther  <rguenther@suse.de>

        PR middle-end/34011
        * tree-flow-inline.h (may_be_aliased): Compute readonly variables
        as non-aliased.

        * gcc.dg/tree-ssa/ssa-lim-7.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-7.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-flow-inline.h


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34011



* [Bug tree-optimization/34011] Memory load is not eliminated from tight vectorized loop
  2007-11-07  9:05 [Bug rtl-optimization/34011] New: " ubizjak at gmail dot com
  2009-09-12 20:02 ` [Bug tree-optimization/34011] " rguenth at gcc dot gnu dot org
  2009-09-15 14:07 ` rguenth at gcc dot gnu dot org
@ 2009-09-15 14:40 ` rguenth at gcc dot gnu dot org
  2009-09-16  8:51 ` rguenth at gcc dot gnu dot org
  2009-09-17  9:08 ` rguenth at gcc dot gnu dot org
  4 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-09-15 14:40 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #5 from rguenth at gcc dot gnu dot org  2009-09-15 14:40 -------
This is likely because the register allocator decides to allocate $cx for the
load destination (the count operand of the scalar shift) and then needs to
re-load it to $xmm? for the vector shift.  The placement of the re-load inside
the loop is unfortunate...

Reloads for insn # 67
Reload 0: reload_in (SI) = (reg:SI 116 [ pretmp.11 ])
        SSE_REGS, RELOAD_FOR_INPUT (opnum = 2)
        reload_in_reg: (reg:SI 116 [ pretmp.11 ])
        reload_reg_rtx: (reg:SI 22 xmm1)

Reloads for insn # 83
Reload 0: reload_in (QI) = (subreg:QI (reg:SI 116 [ pretmp.11 ]) 0)
        CREG, RELOAD_FOR_INPUT (opnum = 2)
        reload_in_reg: (subreg:QI (reg:SI 116 [ pretmp.11 ]) 0)
        reload_reg_rtx: (reg:QI 2 cx)
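
Editorial sketch: the two reloads above ask for the same value in two different
register classes. The x86 scalar shift takes a variable count in %cl, while the
SSE2 vector shift (pslld) takes its count from an XMM register. The C below
illustrates the placement one would hope for, with the count moved into an XMM
register once, before the loop. It is a hand-written illustration using SSE2
intrinsics, not compiler output; the function names bad_sse2 and bad_scalar,
the parameter names, and the value given to srcshift are assumptions.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; x86 / x86-64 only */

/* Assumption: a read-only global shift count, defined here only so the
   sketch links on its own. */
const int srcshift = 3;

/* Vector form: the count is moved into an XMM register once (movd),
   outside the loop, and reused by every pslld. */
void bad_sse2(const int *srcdata, int *dstdata)
{
    const __m128i count = _mm_cvtsi32_si128(srcshift);

    for (int i = 0; i < 256; i += 4) {
        __m128i s = _mm_loadu_si128((const __m128i *)(srcdata + i));
        __m128i d = _mm_loadu_si128((const __m128i *)(dstdata + i));
        s = _mm_sll_epi32(s, count);   /* pslld %xmm, %xmm */
        d = _mm_or_si128(d, s);        /* por */
        _mm_storeu_si128((__m128i *)(dstdata + i), d);
    }
}

/* Scalar form: the same count is wanted in %cl for the sall instruction. */
void bad_scalar(const int *srcdata, int *dstdata)
{
    const int shift = srcshift;

    for (int i = 0; i < 256; i++)
        dstdata[i] |= srcdata[i] << shift;
}
```

Both functions compute the same result for non-overlapping arrays; they differ
only in which register class holds the shift count.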


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34011



* [Bug tree-optimization/34011] Memory load is not eliminated from tight vectorized loop
  2007-11-07  9:05 [Bug rtl-optimization/34011] New: " ubizjak at gmail dot com
  2009-09-12 20:02 ` [Bug tree-optimization/34011] " rguenth at gcc dot gnu dot org
@ 2009-09-15 14:07 ` rguenth at gcc dot gnu dot org
  2009-09-15 14:40 ` rguenth at gcc dot gnu dot org
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-09-15 14:07 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #4 from rguenth at gcc dot gnu dot org  2009-09-15 14:07 -------
With the alias issue fixed I get

good:
.LFB0:
        .cfi_startproc
        movd    srcshift(%rip), %xmm1
        xorl    %eax, %eax
        .p2align 4,,10
        .p2align 3
.L2:
        movdqu  (%rdi,%rax), %xmm0
        pslld   %xmm1, %xmm0
        movdqu  %xmm0, (%rsi,%rax)
        addq    $16, %rax
        cmpq    $1024, %rax
        jne     .L2
        rep
        ret

bad:
.LFB1:
        .cfi_startproc
        movd    srcshift(%rip), %xmm2
        leaq    1024(%rsi), %rax
        .p2align 4,,10
        .p2align 3
.L6:
        movdqu  (%rdi), %xmm0
        addq    $16, %rdi
        movdqu  (%rsi), %xmm1
        pslld   %xmm2, %xmm0
        por     %xmm1, %xmm0
        movdqu  %xmm0, (%rsi)
        addq    $16, %rsi
        cmpq    %rax, %rsi
        jne     .L6
        rep
        ret

which looks good in both cases.

For the original testcase, which results in a runtime alias check, we get

bad:
.LFB1:
        .cfi_startproc
        leaq    16(%rdi), %rax
        cmpq    %rax, %rsi
        leaq    16(%rsi), %rax
        seta    %dl
        cmpq    %rax, %rdi
        seta    %al
        orb     %al, %dl
        je      .L10
        leaq    1024(%rsi), %rax
        .p2align 4,,10
        .p2align 3
.L11:
        movdqu  (%rdi), %xmm0
        addq    $16, %rdi
        movd    srcshift(%rip), %xmm1
        pslld   %xmm1, %xmm0
        movdqu  (%rsi), %xmm1
        por     %xmm1, %xmm0
        movdqu  %xmm0, (%rsi)
        addq    $16, %rsi
        cmpq    %rax, %rsi
        jne     .L11
        rep
        ret
.L10:
        movzbl  srcshift(%rip), %ecx
        xorl    %eax, %eax
        .p2align 4,,10
        .p2align 3
.L13:
        movl    (%rdi,%rax), %edx
        sall    %cl, %edx
        orl     %edx, (%rsi,%rax)
        addq    $4, %rax
        cmpq    $1024, %rax
        jne     .L13
        rep
        ret

thus still bad: the vector loop (.L11) reloads srcshift on every iteration.
It is IRA / reload that moves the srcshift load back into the loop for some
reason.
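
Editorial sketch: the prologue of the listing above (the two leaq/cmpq/seta
pairs followed by orb and je) is the runtime alias check. Read as C, it selects
the vector loop only when the two pointers are more than one 16-byte vector
apart in either direction, and falls back to the scalar loop at .L10 otherwise.
The code below is a hand-written approximation of that loop versioning, not
what the vectorizer emits; the names far_enough_apart and bad_versioned and the
value given to srcshift are assumptions.

```c
#include <stdint.h>

/* Assumption: a read-only global shift count, defined here only so the
   sketch links on its own. */
const int srcshift = 3;

enum { N = 256, VEC_BYTES = 16 };

/* Mirrors the two compare-and-seta pairs: nonzero when
   dst > src + 16 or src > dst + 16 (unsigned address comparison). */
static int far_enough_apart(const int *srcdata, const int *dstdata)
{
    uintptr_t s = (uintptr_t)srcdata;
    uintptr_t d = (uintptr_t)dstdata;
    return d > s + VEC_BYTES || s > d + VEC_BYTES;
}

void bad_versioned(const int *srcdata, int *dstdata)
{
    const int shift = srcshift;  /* loaded once, before either loop */

    if (far_enough_apart(srcdata, dstdata)) {
        /* Stand-in for the vector loop: four elements are loaded and
           combined before any of the four results is stored. */
        for (int i = 0; i < N; i += 4) {
            int tmp[4];
            for (int j = 0; j < 4; j++)
                tmp[j] = dstdata[i + j] | (srcdata[i + j] << shift);
            for (int j = 0; j < 4; j++)
                dstdata[i + j] = tmp[j];
        }
    } else {
        /* Scalar fallback, one element at a time (the .L13 loop). */
        for (int i = 0; i < N; i++)
            dstdata[i] |= srcdata[i] << shift;
    }
}
```

In this sketch the shift count is read once before the branch, which is the
placement the listing above fails to achieve for the vector loop.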


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |vmakarov at gcc dot gnu dot
                   |                            |org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34011



* [Bug tree-optimization/34011] Memory load is not eliminated from tight vectorized loop
  2007-11-07  9:05 [Bug rtl-optimization/34011] New: " ubizjak at gmail dot com
@ 2009-09-12 20:02 ` rguenth at gcc dot gnu dot org
  2009-09-15 14:07 ` rguenth at gcc dot gnu dot org
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: rguenth at gcc dot gnu dot org @ 2009-09-12 20:02 UTC (permalink / raw)
  To: gcc-bugs



------- Comment #3 from rguenth at gcc dot gnu dot org  2009-09-12 20:02 -------
srcshift is not moved out of the loop because we think the store to dstdata may
alias it.  I'll fix that.

Index: tree-ssa-alias.c
===================================================================
--- tree-ssa-alias.c    (revision 151651)
+++ tree-ssa-alias.c    (working copy)
@@ -633,6 +633,9 @@ indirect_ref_may_alias_decl_p (tree ref1
                               HOST_WIDE_INT offset2, HOST_WIDE_INT max_size2,
                               alias_set_type base2_alias_set)
 {
+  if (TREE_READONLY (base2))
+    return false;
+
   /* If only one reference is based on a variable, they cannot alias if
      the pointer access is beyond the extent of the variable access.
      (the pointer base cannot validly point to an offset less than zero
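
Editorial sketch: for readers without the original report at hand, the C below
shows the kind of testcase the comments in this thread describe. It is an
assumption reconstructed from the assembly quoted in the thread (a 1024-byte
trip count, a shift by srcshift, and an OR into the destination in the bad
case), not the testcase from the report. The parameter name srcdata, the
function bad_hoisted and the value given to srcshift are hypothetical.

```c
#include <stddef.h>

/* Assumption: a read-only global shift count.  It is defined here only so
   the sketch links on its own; with the initializer visible the compiler
   can fold the value, so a real reproduction would declare it
   "extern const int" and define it in another translation unit. */
const int srcshift = 3;

enum { N = 256 };  /* 256 ints = 1024 bytes, matching "cmpq $1024" */

/* "good": the destination is only written. */
void good(const int *srcdata, int *dstdata)
{
    for (size_t i = 0; i < N; i++)
        dstdata[i] = srcdata[i] << srcshift;
}

/* "bad": the destination is read, modified and written back.  If the store
   to dstdata[i] is assumed to possibly alias srcshift, the load of
   srcshift cannot be moved out of the loop. */
void bad(const int *srcdata, int *dstdata)
{
    for (size_t i = 0; i < N; i++)
        dstdata[i] |= srcdata[i] << srcshift;
}

/* Hypothetical source-level equivalent of the hoisting: copy the global
   into a local, which no store through dstdata can modify. */
void bad_hoisted(const int *srcdata, int *dstdata)
{
    const int shift = srcshift;

    for (size_t i = 0; i < N; i++)
        dstdata[i] |= srcdata[i] << shift;
}
```

Because srcshift is const, a store through dstdata cannot validly change it,
which is the property the TREE_READONLY check in the patch above relies on.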


-- 

rguenth at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         AssignedTo|unassigned at gcc dot gnu   |rguenth at gcc dot gnu dot
                   |dot org                     |org
             Status|UNCONFIRMED                 |ASSIGNED
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2009-09-12 20:02:13
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34011


