From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 32463 invoked by alias); 5 Feb 2015 13:05:11 -0000
Mailing-List: contact gcc-bugs-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-bugs-owner@gcc.gnu.org
Received: (qmail 32427 invoked by uid 48); 5 Feb 2015 13:05:08 -0000
From: "vekumar at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/64946] New: For Aarch64, vectorization with "abs" instruction is not happening with vector elements of char/short type.
Date: Thu, 05 Feb 2015 13:05:00 -0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 5.0
X-Bugzilla-Severity: normal
X-Bugzilla-Who: vekumar at gcc dot gnu.org
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Changed-Fields: bug_id short_desc product version bug_status bug_severity priority component assigned_to reporter
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
X-SW-Source: 2015-02/txt/msg00455.txt.bz2

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64946

            Bug ID: 64946
           Summary: For Aarch64, vectorization with "abs" instruction is
                    not happening with vector elements of char/short type.
           Product: gcc
           Version: 5.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: vekumar at gcc dot gnu.org

For the test case below:

signed char a[100],b[100];
void absolute_s8 (void)
{
  int i;
  for (i=0; i<16; i++)
    a[i] = (b[i] > 0 ? b[i] : -b[i]);
};

gcc version 5.0.0 20150203 (experimental) (GCC) with -O3 -S on
aarch64-none-linux-gnu generates the following assembly:

absolute_s8:
        adrp    x1, b
        adrp    x0, a
        add     x1, x1, :lo12:b
        add     x0, x0, :lo12:a
        ldr     q0, [x1]               <== loads vector of 16 char elements
        sshll   v1.8h, v0.8b, 0        <==
        sshll2  v0.8h, v0.16b, 0       <==
        sshll   v3.4s, v1.4h, 0        <==
        sshll   v2.4s, v0.4h, 0        <==
        sshll2  v1.4s, v1.8h, 0        <==
        sshll2  v0.4s, v0.8h, 0        <== promotes every element to "int"
        abs     v3.4s, v3.4s           <== performs abs as vector of ints
        abs     v2.4s, v2.4s
        abs     v1.4s, v1.4s
        abs     v0.4s, v0.4s
        xtn     v4.4h, v3.4s
        xtn2    v4.8h, v1.4s
        xtn     v1.4h, v2.4s
        xtn2    v1.8h, v0.4s
        xtn     v0.8b, v4.8h
        xtn2    v0.16b, v1.8h
        str     q0, [x0]
        ret

Vectorization is done on int (SImode) elements, although AArch64 supports
abs v0.16b, v0.16b directly on a vector of chars.

Expected code:

absolute_s8:
        adrp    x1, b
        adrp    x0, a
        add     x1, x1, :lo12:b
        add     x0, x0, :lo12:a
        ldr     q0, [x1]               <== loads vector of 16 char elements
        abs     v0.16b, v0.16b         <== abs in vector of chars
        str     q0, [x0]
        ret
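
The expected code is only valid if taking abs at int width and then narrowing gives the same byte as an 8-bit abs for every input. The sketch below is not part of the original report; it checks that equivalence exhaustively in scalar C. The function names are hypothetical, and it assumes that narrowing an out-of-range int to int8_t wraps modulo 2^8 (implementation-defined in C, and the behaviour GCC documents) and that the non-saturating vector abs maps -128 to -128.

```c
#include <stdint.h>

/* Scalar form of the loop body in the report: the operand is promoted
   to int, negated at int width, and narrowed again on the store.
   Assumption: the narrowing conversion wraps modulo 2^8, as GCC does. */
static int8_t abs_via_int(int8_t x)
{
    int wide = x;
    int r = wide > 0 ? wide : -wide;
    return (int8_t)r;
}

/* Hypothetical model of a wrapping (non-saturating) 8-bit abs.
   The negation is done in uint8_t so there is no signed overflow;
   -128 maps back to -128. */
static int8_t abs_8bit_wrapping(int8_t x)
{
    uint8_t u = (uint8_t)x;
    uint8_t r = (x < 0) ? (uint8_t)(0u - u) : u;
    return (int8_t)r;
}

/* Compares the two forms over all 256 inputs and returns the number
   of inputs on which they disagree. */
static int count_mismatches(void)
{
    int mismatches = 0;
    for (int v = -128; v <= 127; v++) {
        if (abs_via_int((int8_t)v) != abs_8bit_wrapping((int8_t)v))
            mismatches++;
    }
    return mismatches;
}
```

Under those assumptions count_mismatches() returns 0, which is consistent with the report's claim that the widen/abs/narrow sequence could be replaced by a single abs on the 16-byte vector.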