From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 26217 invoked by alias); 30 Sep 2013 09:06:59 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 26146 invoked by uid 48); 30 Sep 2013 09:06:54 -0000
From: "amker.cheng at gmail dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug rtl-optimization/50749] Auto-inc-dec does not find subsequent contiguous mem accesses
Date: Mon, 30 Sep 2013 09:06:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: rtl-optimization
X-Bugzilla-Version: 4.7.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: enhancement
X-Bugzilla-Who: amker.cheng at gmail dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: cc
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2013-09/txt/msg02032.txt.bz2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50749

bin.cheng changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |amker.cheng at gmail dot com

--- Comment #15 from bin.cheng ---
There is another scenario for the example. In this case, the example:

int test_0 (char* p, int c)
{
  int r = 0;
  r += *p++;
  r += *p++;
  r += *p++;
  return r;
}

should be translated into something like:
    //...
    ldrb [rx]
    ldrb [rx+1]
    ldrb [rx+2]
    add rx, rx, #3
    //...
This way all loads are independent and can be issued in parallel on a superscalar machine.

Actually, for targets like ARM, which support post-increment by a constant other than the size of the memory access, it can be further changed into:
    //...
    ldrb [rx], #3
    ldrb [rx-2]
    ldrb [rx-1]
    //...

For now, the auto-increment pass can't do this optimization. I once had a patch for this, but benchmarks showed the case is not common.
This case is common, especially after loop unrolling, and RTL passes deliberately break down the long dependence on RX, which I think is right.
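
To make the equivalence of the three forms concrete, here is a minimal C sketch that models each addressing pattern at the source level. It is only an illustration, not compiler output: the function names (sum3_chained, sum3_offsets, sum3_postinc, check_forms_agree) are hypothetical, the pointer is passed by reference so the final value of rx can be compared, and unsigned char is assumed so the loads match the zero-extending ldrb.

```c
#include <assert.h>

/* Form 1: the source as written. Each load uses the pointer produced by
   the previous p++, so the address computations form one chain. */
int sum3_chained(const unsigned char **pp)
{
    const unsigned char *p = *pp;
    int r = 0;
    r += *p++;
    r += *p++;
    r += *p++;
    *pp = p;
    return r;
}

/* Form 2: three loads at constant offsets from one base, then a single
   add. Models: ldrb [rx]; ldrb [rx+1]; ldrb [rx+2]; add rx, rx, #3 */
int sum3_offsets(const unsigned char **pp)
{
    const unsigned char *p = *pp;
    int r = p[0] + p[1] + p[2];
    *pp = p + 3;
    return r;
}

/* Form 3: the first load post-increments the base by 3, and the other
   two loads use negative offsets from the updated base.
   Models: ldrb [rx], #3; ldrb [rx-2]; ldrb [rx-1] */
int sum3_postinc(const unsigned char **pp)
{
    const unsigned char *p = *pp;
    int r = *p;   /* load from the old base ...        */
    p += 3;       /* ... then bump the base by 3       */
    r += p[-2];   /* old base + 1                      */
    r += p[-1];   /* old base + 2                      */
    *pp = p;
    return r;
}

/* Checks that all three forms read the same bytes and leave the
   pointer at the same place. buf must have at least 3 readable bytes. */
void check_forms_agree(const unsigned char *buf)
{
    const unsigned char *a = buf, *b = buf, *c = buf;
    int ra = sum3_chained(&a);
    int rb = sum3_offsets(&b);
    int rc = sum3_postinc(&c);
    assert(ra == rb && rb == rc);
    assert(a == buf + 3 && b == a && c == a);
}
```

Calling check_forms_agree on any buffer of three or more bytes exercises all three forms. The second and third forms differ only in where the base update happens, which is why the third can fold the add into the first load on a target with a suitable post-modify addressing mode.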