public inbox for gcc-bugs@sourceware.org
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/60172] ARM performance regression from trunk@207239
Date: Fri, 14 Feb 2014 10:22:00 -0000
Message-ID: <bug-60172-4-BwkIxBXRmW@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-60172-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=60172

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
I can't really interpret the asm differences, but it seems we need more
registers?

Forwprop applies the association transform (the same one that fold-const.c
already does when presented with large enough GENERIC trees): it transforms
(p +p off1) +p off2 to (p +p (off1 + off2)), that is, it adds the offsets
first, using unsigned integer arithmetic, and then applies the combined
offset to the pointer.

That enables the reassociation pass to process the offset expression and
simplify it (that pass cannot handle a pointer addition chain).

This happens in forwprop4 only, so does -fdisable-tree-forwprop4 fix the
regression?

I really can't see a fundamental difference (other than the reassociated
adds) in the resulting code. So I wonder which RTL transform does / does not
trigger with one of the variants.
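As a rough illustration of the association described above, here is a minimal C sketch. The helper names `chained` and `associated` are hypothetical and do not appear in GCC; the sketch only mirrors the shape of the two forms at the source level, with `size_t` standing in for the unsigned offset type, and is not how forwprop itself is implemented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: a pointer addition chain, (p +p off1) +p off2.  */
static char *
chained (char *p, size_t off1, size_t off2)
{
  char *q = p + off1;   /* first pointer addition */
  return q + off2;      /* second pointer addition */
}

/* Hypothetical helper: the associated form, p +p (off1 + off2).
   The offsets are summed in unsigned arithmetic first, so the offset
   expression is an ordinary integer expression that an integer
   reassociation step could simplify.  */
static char *
associated (char *p, size_t off1, size_t off2)
{
  size_t off = off1 + off2;   /* unsigned integer addition */
  assert (p != NULL);
  return p + off;             /* single pointer addition */
}
```

For an in-bounds buffer both helpers yield the same address, for example `chained (buf, 1000, 20)` and `associated (buf, 1000, 20)` both point at `buf + 1020`. The difference is only in which intermediate values exist, which is what changes the use counts discussed in this comment.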
On x86_64 the code difference with -O2 [-fno-tree-forwprop4] is

@@ -11,22 +11,25 @@
        .cfi_startproc
        leal    5(%rdx), %r8d
        movslq  %edx, %rdx
+       salq    $2, %rdx
        movslq  %r8d, %rax
        leaq    0(,%rax,4), %r9
-       addq    %r9, %rax
        leaq    (%rdi,%r9), %r10
-       leaq    (%rax,%rax,4), %rax
+       addq    %r9, %rax
        movl    %ecx, (%r10)
        movl    %ecx, 4(%rdi,%r9)
-       leaq    (%rsi,%rax,4), %rax
+       leaq    (%rax,%rax,4), %rcx
        movl    %r8d, 60(%rdi,%r9)
-       leaq    (%rax,%rdx,4), %rax
+       salq    $2, %rcx
+       leaq    (%rdx,%rcx), %rax
+       addq    %rsi, %rax
        addl    $1, 16(%rax)
        movl    %r8d, 20(%rax)
        movl    %r8d, 24(%rax)
-       movl    (%r10), %edx
+       movl    (%r10), %edi
+       leaq    1000(%rsi,%rcx), %rax
        movl    $5, Int_Glob(%rip)
-       movl    %edx, 1020(%rax)
+       movl    %edi, 20(%rdx,%rax)
        ret
        .cfi_endproc

If we look at immediate uses before RTL expansion, the relevant changes
(single-use -> non-single-use change or vice versa, which enables
combine/fwprop) are

-_32 : --> single use.
+_32 : -->2 uses.
+_16 = _41 + _32;
 _33 = Arr_2_Par_Ref_22(D) + _32;

which happens when associating

   _32 = pretmp_20 + 1000;
   _33 = Arr_2_Par_Ref_22(D) + _32;
   _34 = *_8;
-  _51 = _33 + _41;
+  _16 = _41 + _32;
+  _51 = Arr_2_Par_Ref_22(D) + _16;
   MEM[(int[25] *)_51 + 20B] = _34;

but _33 is dead after the transform.

+_33 : --> no uses

so that's a spurious difference. Stmts with no uses are not expanded, but it
seems to change what TER does. Hmm.
-_32 replace with --> _32 = pretmp_20 + 1000;
-

Killing dead stmts with

Index: gcc/tree-outof-ssa.c
===================================================================
--- gcc/tree-outof-ssa.c        (revision 207757)
+++ gcc/tree-outof-ssa.c        (working copy)
@@ -876,6 +876,21 @@ eliminate_useless_phis (void)
            }
        }
     }
+
+  for (unsigned i = 1; i < num_ssa_names; ++i)
+    {
+      tree name = ssa_name (i);
+      if (!name || !has_zero_uses (name) || virtual_operand_p (name))
+       continue;
+      gimple def_stmt = SSA_NAME_DEF_STMT (name);
+      if (!is_gimple_assign (def_stmt)
+         || gimple_has_side_effects (def_stmt)
+         || stmt_could_throw_p (def_stmt))
+       continue;
+      gimple_stmt_iterator gsi = gsi_for_stmt (def_stmt);
+      gsi_remove (&gsi, true);
+      release_defs (def_stmt);
+    }
 }

fixes that (hack alert). With that we get strictly more TER.

Does -fno-tree-ter also make the testcase regress, even with
-fdisable-tree-forwprop4?
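The sweep in the patch above can be modelled outside of GCC. The following is only a toy sketch: `struct toy_def` and `toy_sweep_dead_defs` are hypothetical names invented for illustration, the flags are assumed stand-ins for the predicates the patch calls (`has_zero_uses`, `is_gimple_assign`, `gimple_has_side_effects`, `stmt_could_throw_p`), and removal is modelled by setting a flag rather than by editing a statement list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an SSA definition; not a GCC data structure.  */
struct toy_def
{
  int num_uses;           /* number of statements that read this name */
  bool is_assign;         /* defined by a plain assignment */
  bool has_side_effects;  /* e.g. a volatile access */
  bool could_throw;       /* the defining statement may throw */
  bool removed;           /* set once the definition is swept */
};

/* Single sweep over all definitions: mark as removed every plain
   assignment whose result has no uses and whose statement has no side
   effects and cannot throw.  Returns how many definitions were removed.  */
static size_t
toy_sweep_dead_defs (struct toy_def *defs, size_t n)
{
  size_t removed = 0;

  assert (defs != NULL || n == 0);
  for (size_t i = 0; i < n; ++i)
    {
      struct toy_def *d = &defs[i];
      if (d->removed || d->num_uses != 0)
        continue;
      if (!d->is_assign || d->has_side_effects || d->could_throw)
        continue;
      d->removed = true;
      ++removed;
    }
  return removed;
}
```

In this model a definition like `_33` above (a plain assignment with no uses) is swept, while a used definition or one with side effects is kept. Like the patch, the sketch makes one pass and does not revisit definitions whose use counts might drop as a consequence of a removal.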