From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 325 invoked by alias); 14 Oct 2014 02:24:06 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 32630 invoked by uid 48); 14 Oct 2014 02:23:54 -0000
From: "congh at google dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/63530] New: GCC generates incorrect aligned store on ARM after the loop is unrolled.
Date: Tue, 14 Oct 2014 02:24:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 5.0
X-Bugzilla-Keywords:
X-Bugzilla-Severity: normal
X-Bugzilla-Who: congh at google dot com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter attachments.created
Message-ID:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2014-10/txt/msg01033.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63530

            Bug ID: 63530
           Summary: GCC generates incorrect aligned store on ARM after the loop is unrolled.
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: congh at google dot com

Created attachment 33710
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33710&action=edit
assembly

When compiling the code shown below with GCC 5.0 for ARM, using the following
options:

-O2 -ftree-vectorize -march=armv7-a -mfpu=neon -funroll-loops
--param=max-completely-peeled-insns=400

// The code:

typedef struct {
  unsigned char map[256];
  int i;
} A, *AP;

void* calloc(int, int);

AP foo(int n) {
  AP b = calloc(1, sizeof(A));
  int i;
  for (i = n; i < 256; i++)
    b->map[i] = i;
  return b;
}

the instruction

vst1.64 {d0-d1}, [r2:64]

is generated. This is an aligned store that requires 8-byte alignment.
However, this requirement cannot be guaranteed: the loop is not peeled for
alignment, and the start address within the array (b->map + n) is unknown at
compile time.

I have attached the generated assembly code.
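
To make the alignment argument concrete, here is a minimal C sketch that computes how far the first stored byte, &b->map[n], lies from an 8-byte boundary. The helper names misalignment and first_store_misalignment are hypothetical and not part of the test case or of GCC. The sketch assumes the A object itself is 8-byte aligned, which is typical of memory returned by calloc on ARM but is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same layout as the struct in the test case. */
typedef struct {
  unsigned char map[256];
  int i;
} A;

/* Hypothetical helper: number of bytes by which p lies past the
   previous `align`-byte boundary. `align` must be a power of two. */
size_t misalignment(const void *p, size_t align)
{
  assert(align != 0 && (align & (align - 1)) == 0);
  return (size_t)((uintptr_t)p & (align - 1));
}

/* Hypothetical helper: misalignment, relative to 8 bytes, of the
   first byte that the loop in foo(n) writes, i.e. &b->map[n].
   Only meaningful for 0 <= n < 256. */
size_t first_store_misalignment(const A *b, int n)
{
  return misalignment(&b->map[n], 8);
}
```

If b is 8-byte aligned, the result for 0 <= n < 256 is n modulo 8, because map sits at offset 0 in the struct. Any n that is not a multiple of 8 therefore gives a first store address that does not meet an 8-byte alignment requirement such as the [r2:64] qualifier.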