public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/53705] New: wrong code with custom flags - stores to memory are lost
@ 2012-06-17 12:20 zsojka at seznam dot cz
  2012-06-17 15:21 ` [Bug rtl-optimization/53705] " matz at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: zsojka at seznam dot cz @ 2012-06-17 12:20 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53705

             Bug #: 53705
           Summary: wrong code with custom flags - stores to memory are
                    lost
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
        AssignedTo: unassigned@gcc.gnu.org
        ReportedBy: zsojka@seznam.cz


Created attachment 27640
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27640
reduced testcase (from testsuite/gcc.c-torture/execute/loop-2e.c)

GCC 4.3-4.8 fails, while with GCC 4.2, the code is reduced just to return 0. I
am not sure if this should be marked as a regression.

Compiler output:
$ gcc -O2 -fno-omit-frame-pointer -fpeel-loops -fsched2-use-superblocks \
    -fno-tree-loop-optimize -fno-web --param=max-completely-peel-times=256 \
    testcase.c
$ ./a.out
Aborted

Looking at the assembly:
...
    sub    rsp, 320    #,
    mov    rdx, QWORD PTR p[rip]    # p.0, p
    lea    rcx, [rbp-320]    # q,
...
    mov    QWORD PTR [rcx+240], rdi    # *q_19, tmp73
    lea    rax, [rdx+144]    # tmp73,
    lea    rsi, [rdx+148]    # tmp73,
    lea    rdi, [rdx+152]    # tmp73,
    lea    rdx, [rdx+156]    # tmp73,
    cmp    QWORD PTR [rbp-8], rdx    # q, tmp73
    mov    QWORD PTR [rcx+280], rax    # *q_19, tmp73
    mov    QWORD PTR [rcx+248], r8    # *q_19, tmp73
    mov    QWORD PTR [rcx+256], r9    # *q_19, tmp73
    mov    QWORD PTR [rcx+264], r10    # *q_19, tmp73
    mov    QWORD PTR [rcx+272], r11    # *q_19, tmp73
    mov    QWORD PTR [rcx+288], rsi    # *q_19, tmp73
    mov    QWORD PTR [rcx+296], rdi    # *q_19, tmp73
    mov    QWORD PTR [rcx+304], rdx    # *q_19, tmp73
    jne    .L46    #,
...
It seems the dependency between storing to *q++ (esp. q[39]) in foo() and
verifying it in main() is lost. Furthermore, it seems q[39] ([rcx+312]) is not
stored to at all.
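
The offset arithmetic behind that observation can be checked with a short sketch. It uses only the offsets visible in the assembly excerpt; the element sizes (8-byte pointers in q[], 4-byte ints behind p) are assumptions inferred from the QWORD stores and the 4-byte stride of the lea instructions, and the elided parts of the listing are not modelled.

```python
# Offsets come from the assembly excerpt; element sizes are assumptions.
PTR_SIZE = 8    # assumed size of one q[] element (QWORD stores)
INT_SIZE = 4    # assumed size of one p[] element (lea stride of 4)

Q_BASE = -320   # lea rcx, [rbp-320]  => q == rbp-320
CMP_SLOT = -8   # cmp QWORD PTR [rbp-8], rdx

# Offsets relative to rcx written by the stores visible in the excerpt.
stored_offsets = [240, 248, 256, 264, 272, 280, 288, 296, 304]


def q_index(rbp_offset):
    """Index into q[] of the slot located at rbp + rbp_offset."""
    return (rbp_offset - Q_BASE) // PTR_SIZE


checked_index = q_index(CMP_SLOT)                 # slot read by the cmp
stored_indices = [off // PTR_SIZE for off in stored_offsets]
expected_p_index = 156 // INT_SIZE                # lea rdx, [rdx+156]

print("cmp reads q[%d]" % checked_index)
print("visible stores cover q[%d]..q[%d]"
      % (min(stored_indices), max(stored_indices)))
print("q[%d] stored in excerpt: %s"
      % (checked_index, checked_index in stored_indices))
print("cmp compares it against &p[%d]" % expected_p_index)
```

Under these assumptions the cmp reads q[39] and compares it with &p[39], while the visible stores only cover q[30] through q[38].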



* [Bug rtl-optimization/53705] wrong code with custom flags - stores to memory are lost
  2012-06-17 12:20 [Bug rtl-optimization/53705] New: wrong code with custom flags - stores to memory are lost zsojka at seznam dot cz
@ 2012-06-17 15:21 ` matz at gcc dot gnu.org
  2012-06-17 15:36 ` matz at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: matz at gcc dot gnu.org @ 2012-06-17 15:21 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53705

Michael Matz <matz at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |matz at gcc dot gnu.org

--- Comment #1 from Michael Matz <matz at gcc dot gnu.org> 2012-06-17 15:21:10 UTC ---
Problem is somewhere in cselib.  We test true_dependence of these two insns
(last write to q[] and the read of the last element):

(insn 13 299 22 2 (set (mem/f:DI (plus:DI (reg/v/f:DI 2 cx [orig:63 q ] [63])
                (const_int 144 [0x90])) [2 *q_19+0 S8 A64])
        (reg:DI 1 dx [73])) pr53705.c:9 62 {*movdi_internal_rex64}
     (expr_list:REG_DEAD (reg/v/f:DI 2 cx [orig:63 q ] [63])
        (nil)))

(insn 21 350 257 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (mem/f/c:DI (plus:DI (reg/f:DI 6 bp)
                    (const_int -8 [0xfffffffffffffff8])) [2 q+152 S8 A64])
            (reg:DI 1 dx [73]))) pr53705.c:19 7 {*cmpdi_1}
     (expr_list:REG_DEAD (reg:DI 1 dx [73])
        (nil)))

For that we check write_dependence of:
  (mem/f:DI (plus:DI (reg/v/f:DI 2 cx [orig:63 q ] [63])
                     (const_int 144 [0x90])) [2 *q_19+0 S8 A64])
vs
  (mem/f/c:DI (plus:DI (reg/f:DI 6 bp)
                       (const_int -8 [0xfffffffffffffff8])) [2 q+152 S8 A64])

(Note that at this point cx == bp-152, hence both accesses refer to the same
memory.)

true_dependence uses base_alias_check to test these two mems and wants
to disambiguate the bases.  cselib is used for that and in valueized form
the two addresses look like so:
  (plus:DI (value:DI 8:12039 @0x1b9ef68/0x1b55410) (const_int 144 [0x90]))
  (plus:DI (value:DI 2:4059 @0x1b9eed8/0x1b552f0) (const_int -8))

From cselib we have these details of the two involved VALUEs:

(value:DI 8:12039 @0x1b9ef68/0x1b55410)
 locs:
  from insn 43 (reg/v/f:DI 2 cx [orig:63 q ] [63])
  from insn 43 (plus:DI (value:DI 5:7965 @0x1b9ef20/0x1b55380)
            (const_int 8 [0x8]))

(value:DI 2:4059 @0x1b9eed8/0x1b552f0)
 locs:
  from insn 352 (reg/f:DI 6 bp)
  from insn 351 (plus:DI (value:DI 1:1 @0x1b9eec0/0x1b552c0)
            (const_int -8 [0xfffffffffffffff8]))

That is, value 2 is bp-based (or based on value 1), and value 8 is based on
value 5, which itself is:

(value:DI 5:7965 @0x1b9ef20/0x1b55380)
 locs:
  from insn 353 (reg/f:DI 7 sp)
  from insn 353 (plus:DI (value:DI 2:4059 @0x1b9eed8/0x1b552f0)
            (const_int -160 [0xffffffffffffff60]))

I.e. value 5 (and hence 8) is value 2 based (like the other mem), or sp based.

Now, find_base_term() for [bp-8] will return "(address:DI -4)", which comes
from the REG_BASE_VALUE of the (reg/f:DI 6 bp).

find_base_term ([cx+144]) otoh will go via value 8 to value 5, from there
to REG_BASE_VALUE([sp]), which returns "(address:DI -1)".  If find_base_term
skipped the first loc ("sp") and looked into the second loc (val 2) instead,
it would also return (address:DI -4).

Now, two ADDRESS rtxes that aren't pointer-equal aren't equivalent, and hence
the disambiguator thinks that the two mems cannot point into the same memory.

Obviously the problem is some confusion in setting up REG_BASE_VALUE for
sp and bp.  When we have a frame pointer then both should have the same base,
not different ones.
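
The walk described above can be reproduced with a small model. This is a simplified sketch, not GCC's implementation: the loc lists and the ADDRESS ids are copied from the dumps in this comment, while the function names, the empty loc list for value 1 (its locs are not shown), and the absence of a REG_BASE_VALUE for cx are assumptions made for illustration.

```python
# Simplified model of the cselib VALUEs quoted above (NOT GCC code).
# A loc is ("reg", name) or ("plus", value_id, constant), in dump order.
LOCS = {
    8: [("reg", "cx"), ("plus", 5, 8)],
    5: [("reg", "sp"), ("plus", 2, -160)],
    2: [("reg", "bp"), ("plus", 1, -8)],
    1: [],  # assumption: the locs of value 1 are not shown in the dump
}

# ADDRESS ids as reported in the comment; cx is assumed to have no base.
REG_BASE_VALUE = {"sp": -1, "bp": -4}


def first_base(value_id):
    """Return the first base found when walking the locs in order."""
    for loc in LOCS[value_id]:
        if loc[0] == "reg":
            base = REG_BASE_VALUE.get(loc[1])
        else:
            base = first_base(loc[1])
        if base is not None:
            return base
    return None


def all_bases(value_id):
    """Return every base reachable through any loc of the VALUE."""
    found = set()
    for loc in LOCS[value_id]:
        if loc[0] == "reg":
            base = REG_BASE_VALUE.get(loc[1])
            if base is not None:
                found.add(base)
        else:
            found |= all_bases(loc[1])
    return found


def offset_from(value_id, root):
    """Constant c with value_id == root + c, following "plus" locs."""
    if value_id == root:
        return 0
    for loc in LOCS[value_id]:
        if loc[0] == "plus":
            inner = offset_from(loc[1], root)
            if inner is not None:
                return inner + loc[2]
    return None


store_base = first_base(8)                 # base seen for [value 8 + 144]
load_base = first_base(2)                  # base seen for [value 2 - 8]
store_addr = offset_from(8, 2) + 144       # store address relative to value 2
load_addr = offset_from(2, 2) - 8          # load address relative to value 2

print("bases:", store_base, load_base, "differ:", store_base != load_base)
print("addresses relative to value 2:", store_addr, load_addr)
print("all bases reachable from value 8:", sorted(all_bases(8)))
```

In this model the two accesses resolve to the same offset from value 2, yet the first-loc walk reports different bases for them, which is the situation in which the disambiguator concludes that they cannot alias.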



* [Bug rtl-optimization/53705] wrong code with custom flags - stores to memory are lost
  2012-06-17 12:20 [Bug rtl-optimization/53705] New: wrong code with custom flags - stores to memory are lost zsojka at seznam dot cz
  2012-06-17 15:21 ` [Bug rtl-optimization/53705] " matz at gcc dot gnu.org
@ 2012-06-17 15:36 ` matz at gcc dot gnu.org
  2012-06-17 17:43 ` pinskia at gcc dot gnu.org
  2013-07-30 13:37 ` amylaar at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: matz at gcc dot gnu.org @ 2012-06-17 15:36 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53705

--- Comment #2 from Michael Matz <matz at gcc dot gnu.org> 2012-06-17 15:36:19 UTC ---
Or alternatively cselib doesn't respect one invariant in constructing the
locations of its VALUEs.  As seen above it constructs two values for the same
memory area, one referring to stack pointer, the other to (hard) frame pointer.

But alias.c explains:

     2. stack_pointer_rtx, frame_pointer_rtx, hard_frame_pointer_rtx
        (if distinct from frame_pointer_rtx) and arg_pointer_rtx.
        Each of these rtxes has a separate ADDRESS associated with it,
        each with a negative id.

        GCC is (and is required to be) precise in which register it
        chooses to access a particular region of stack.  We can therefore
        assume that accesses based on one of these rtxes do not alias
        accesses based on another of these rtxes.

Note the last paragraph.  The RTL instructions themselves respect this invariant
(there are no accesses via [sp], only via [bp] or derived values).  But the
cselib values don't.  I'd say value 5 (the one referring to sp and value 2)
is the broken one.  It should only refer to value 2.
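
One way to picture the invariant is a check that no VALUE has locs leading to different special-register bases. The sketch below is a hypothetical consistency check over the loc lists from comment #1, not something cselib or alias.c provides; the function names, the empty loc list for value 1, and the missing base for cx are assumptions.

```python
# Hypothetical invariant check over the VALUEs from comment #1 (NOT GCC code).
# A loc is ("reg", name) or ("plus", value_id, constant).
LOCS = {
    8: [("reg", "cx"), ("plus", 5, 8)],
    5: [("reg", "sp"), ("plus", 2, -160)],
    2: [("reg", "bp"), ("plus", 1, -8)],
    1: [],  # assumption: the locs of value 1 are not shown in the dump
}

# ADDRESS ids from comment #1; cx is assumed to carry no base value.
REG_BASE_VALUE = {"sp": -1, "bp": -4}


def loc_bases(loc):
    """Set of ADDRESS ids reachable through a single loc."""
    if loc[0] == "reg":
        base = REG_BASE_VALUE.get(loc[1])
        return set() if base is None else {base}
    return value_bases(loc[1])


def value_bases(value_id):
    """Set of ADDRESS ids reachable through any loc of a VALUE."""
    found = set()
    for loc in LOCS[value_id]:
        found |= loc_bases(loc)
    return found


def mixes_bases(value_id):
    """True if two locs of this VALUE lead to different base sets."""
    per_loc = [b for b in (loc_bases(l) for l in LOCS[value_id]) if b]
    return any(b != per_loc[0] for b in per_loc[1:])


offenders = [v for v in sorted(LOCS) if mixes_bases(v)]
print("VALUEs whose locs disagree about the base:", offenders)
```

With the data above, only value 5 is reported: its register loc leads to the sp base and its second loc leads to the bp base. Value 8 merely inherits the ambiguity through value 5.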



* [Bug rtl-optimization/53705] wrong code with custom flags - stores to memory are lost
  2012-06-17 12:20 [Bug rtl-optimization/53705] New: wrong code with custom flags - stores to memory are lost zsojka at seznam dot cz
  2012-06-17 15:21 ` [Bug rtl-optimization/53705] " matz at gcc dot gnu.org
  2012-06-17 15:36 ` matz at gcc dot gnu.org
@ 2012-06-17 17:43 ` pinskia at gcc dot gnu.org
  2013-07-30 13:37 ` amylaar at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2012-06-17 17:43 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53705

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |alias
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-06-17
     Ever Confirmed|0                           |1



* [Bug rtl-optimization/53705] wrong code with custom flags - stores to memory are lost
  2012-06-17 12:20 [Bug rtl-optimization/53705] New: wrong code with custom flags - stores to memory are lost zsojka at seznam dot cz
                   ` (2 preceding siblings ...)
  2012-06-17 17:43 ` pinskia at gcc dot gnu.org
@ 2013-07-30 13:37 ` amylaar at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: amylaar at gcc dot gnu.org @ 2013-07-30 13:37 UTC (permalink / raw)
  To: gcc-bugs

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53705

Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amylaar at gcc dot gnu.org

--- Comment #3 from Jorn Wolfgang Rennecke <amylaar at gcc dot gnu.org> ---
(In reply to Zdenek Sojka from comment #0)
> Created attachment 27640 [details]
> reduced testcase (from testsuite/gcc.c-torture/execute/loop-2e.c)
> 
> GCC 4.3-4.8 fails, while with GCC 4.2, the code is reduced just to return 0.
> I am not sure if this should be marked as a regression.
> 
> Compiler output:
> $ gcc -O2 -fno-omit-frame-pointer -fpeel-loops -fsched2-use-superblocks
> -fno-tree-loop-optimize -fno-web --param=max-completely-peel-times=256
> testcase.c
> $ ./a.out
> Aborted
> 
> Looking at the assembly:
> ...
> 	sub	rsp, 320	#,
> 	mov	rdx, QWORD PTR p[rip]	# p.0, p
> 	lea	rcx, [rbp-320]	# q,
> ...
> 	mov	QWORD PTR [rcx+240], rdi	# *q_19, tmp73
> 	lea	rax, [rdx+144]	# tmp73,
> 	lea	rsi, [rdx+148]	# tmp73,
> 	lea	rdi, [rdx+152]	# tmp73,
> 	lea	rdx, [rdx+156]	# tmp73,
> 	cmp	QWORD PTR [rbp-8], rdx	# q, tmp73
> 	mov	QWORD PTR [rcx+280], rax	# *q_19, tmp73
> 	mov	QWORD PTR [rcx+248], r8	# *q_19, tmp73
> 	mov	QWORD PTR [rcx+256], r9	# *q_19, tmp73
> 	mov	QWORD PTR [rcx+264], r10	# *q_19, tmp73
> 	mov	QWORD PTR [rcx+272], r11	# *q_19, tmp73
> 	mov	QWORD PTR [rcx+288], rsi	# *q_19, tmp73
> 	mov	QWORD PTR [rcx+296], rdi	# *q_19, tmp73
> 	mov	QWORD PTR [rcx+304], rdx	# *q_19, tmp73
> 	jne	.L46	#,
> ...
> It seems the dependency between storing to *q++ (esp. q[39]) in foo() and
> verifying it in main() is lost. Furthermore, it seems q[39] ([rcx+312]) is
> not stored to at all.

I can't reproduce the unrolling with an i686-pc-linux-gnu X x86_64-pc-linux-gnu
compiler, either with 4.8.2 20130729 (prerelease) or with 4.9.0 20130729
(experimental) sources.

Is that an issue with the cross-compiler, or with the compiler version?

From: amylaar at gcc dot gnu.org @ 2013-07-30 13:46 UTC
  To: gcc-bugs
Subject: [Bug rtl-optimization/58029] New: base_alias_check says pretend-args saves and varargs accesses don't alias

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58029

            Bug ID: 58029
           Summary: base_alias_check says pretend-args saves and varargs
                    accesses don't alias
           Product: gcc
           Version: 4.9.0
            Status: UNCONFIRMED
          Keywords: wrong-code
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: amylaar at gcc dot gnu.org

When the frame is set up by the prologue, some pretend args saves have to
be done using the stack pointer, as the frame pointer can't be saved until
later.
For va_arg, OTOH, it is natural to use the hard frame pointer, if no frame
pointer elimination takes place.

gcc.dg/torture/stackalign/vararg-1.c fails at -Os / -O2 or higher for
epiphany-elf because base_alias_check says the store of the pretend args
and the va_arg loads don't alias, and the stores get scheduled after the
load.
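
The effect of such a reordering can be illustrated with a toy model. This is only a sketch under stated assumptions: the operation records, the address 0x1000, the value 42, and the helper names are hypothetical, and the "scheduler" simply swaps two operations whenever a base-only oracle claims they are independent, which a real scheduler is permitted but not obliged to do.

```python
# Toy model of a store/load pair reordered because of a base-only alias
# answer (NOT GCC code; address, value and names are hypothetical).

def run(ops):
    """Execute the ops in order and return the list of loaded values."""
    memory, loaded = {}, []
    for op in ops:
        if op["kind"] == "store":
            memory[op["addr"]] = op["value"]
        else:
            loaded.append(memory.get(op["addr"], "stale"))
    return loaded


def may_alias_by_base(a, b):
    """Base-only oracle: accesses through different bases never alias."""
    return a["base"] == b["base"]


def schedule(first, second):
    """Swap the two ops when the oracle claims they are independent."""
    if may_alias_by_base(first, second):
        return [first, second]
    return [second, first]


# Prologue store through the stack pointer, va_arg load through the hard
# frame pointer; both refer to the same (hypothetical) address.
store = {"kind": "store", "base": "sp", "addr": 0x1000, "value": 42}
load = {"kind": "load", "base": "hfp", "addr": 0x1000}

print("program order:", run([store, load]))
print("after scheduling:", run(schedule(store, load)))
```

In program order the load observes the stored value; once the pair is swapped on the strength of the base-only answer, the load runs before the store and sees whatever was in the slot beforehand.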

