From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (qmail 5499 invoked by alias); 18 Jul 2012 08:20:54 -0000 Received: (qmail 5483 invoked by uid 22791); 18 Jul 2012 08:20:52 -0000 X-SWARE-Spam-Status: No, hits=-5.4 required=5.0 tests=AWL,BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,KHOP_RCVD_TRUST,KHOP_THREADED,RCVD_IN_DNSWL_LOW,RCVD_IN_HOSTKARMA_YE,T_RP_MATCHES_RCVD X-Spam-Check-By: sourceware.org Received: from mail-gh0-f175.google.com (HELO mail-gh0-f175.google.com) (209.85.160.175) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Wed, 18 Jul 2012 08:20:38 +0000 Received: by ghbz2 with SMTP id z2so1342769ghb.20 for ; Wed, 18 Jul 2012 01:20:38 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:x-system-of-record:x-gm-message-state; bh=I7Xp8dqyp+IqvQQyc+FsMhHwWLxxDxT13O6JopJ4f4E=; b=B0OPMCb6L0DdHcVT+ZNZHv4N/1jBwMSFxwkboz5e8LguRWRapuvdHP8H91CCeY1OBn Mn4ZHFNF9ZAl1QJgyV8pYfKNbHEK6/CRfRQJxwQaSw3wfzIGZkyH3UI98jCvSB3n10ei qeWJMM6bCGXacXUwHtIGJ6ntcWWZzB1drpwqDw8HiK7EBBOJUzU1fub9oHZhb5VrxzrA COg9tLfcxhzrrn+t4GXSpWkvUb6QAKSnp/pHhG+JhNEcdrRhUQV1UGjNJH1oDoFb+AHc Y59Ya1/+GUssNyHRjicBaV8r6L9aoAfxVsfYILpGrsPbBj2IQtpCP5B/jmFul9D5hJZR Y/8w== Received: by 10.50.41.165 with SMTP id g5mr1197200igl.13.1342599637530; Wed, 18 Jul 2012 01:20:37 -0700 (PDT) MIME-Version: 1.0 Received: by 10.50.41.165 with SMTP id g5mr1197196igl.13.1342599637408; Wed, 18 Jul 2012 01:20:37 -0700 (PDT) Received: by 10.231.180.83 with HTTP; Wed, 18 Jul 2012 01:20:37 -0700 (PDT) In-Reply-To: References: Date: Wed, 18 Jul 2012 08:20:00 -0000 Message-ID: Subject: Re: [ARM Patch 1/3]PR53189: optimizations of 64bit logic operation with constant From: Carrot Wei To: Ramana Radhakrishnan Cc: gcc-patches@gcc.gnu.org Content-Type: text/plain; charset=ISO-8859-1 X-System-Of-Record: true X-Gm-Message-State: ALoCoQkE9OPrZh96Nla4GmFTT/D7CFMvdfgySXCcj8jUrVIRIKzOdgHCp/3IpsdffWZ4bMte5rdwqOxSRtyluq0Ohgiic76wfGfXf2Nce/xyyEFgkBQi9dPEk9rPSsIAB7Uzg6iESHX4FmX+TJ45AWTPmJBFtIxImFrMits4Z5zR0DN6W9GgxhHJYCKX/p3XZ4HXY01uHYaq X-IsSubscribed: yes Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org X-SW-Source: 2012-07/txt/msg00803.txt.bz2 On Tue, Jul 17, 2012 at 9:47 PM, Ramana Radhakrishnan wrote: > Carrot, > > Sorry about the delayed response. > > On 3 July 2012 12:28, Carrot Wei wrote: >> On Thu, Jun 28, 2012 at 12:14 AM, Ramana Radhakrishnan >> wrote: >>> On 28 May 2012 11:08, Carrot Wei wrote: >>>> Hi >>>> >>>> This is the second part of the patches that deals with 64bit and. It directly >>>> extends the patterns anddi3, anddi3_insn and anddi3_neon to handle 64bit >>>> constant operands. >>>> >>> >>> Comments about const_di_ok_for_op still apply from my review of your add patch. >>> >>> However I don't see why and /ior / xor with constants that have either >>> the low or high parts set can't be expanded directly into ands of >>> subregs with moves of zero's or the original value especially if you >>> aren't looking at doing 64 bit operations in neon .With Neon being >>> used for 64 bit arithmetic it gets more interesting. >>> >>> Finally this should target PR target/53189. >>> >> >> Hi Ramana >> >> Thanks for the review. Following is the updated patch according to >> your comments. > > You've missed answering this part of my review :) > >>> However I don't see why and /ior / xor with constants that have either >>> the low or high parts set can't be expanded directly into ands of >>> subregs with moves of zero's or the original value especially if you >>> aren't looking at doing 64 bit operations in neon .With Neon being >>> used for 64 bit arithmetic it gets more interesting. > It has been handled by the const_ok_for_dimode_op and the output part of corresponding SI mode insn. Let's take the IOR case as an example. In the const_ok_for_dimode_op patch --- arm.c (revision 189278) +++ arm.c (working copy) @@ -2524,6 +2524,16 @@ case PLUS: return arm_not_operand (hi, SImode) && arm_add_operand (lo, SImode); + case IOR: + if ((const_ok_for_arm (hi_val) || (hi_val == 0xFFFFFFFF)) + && (const_ok_for_arm (lo_val) || (lo_val == 0xFFFFFFFF))) + return 1; + if (TARGET_THUMB2 + && (const_ok_for_arm (lo_val) || const_ok_for_arm (~lo_val)) + && (const_ok_for_arm (hi_val) || const_ok_for_arm (~hi_val))) + return 1; + return 0; + default: return 0; } The 0xFFFFFFFF is not valid arm mode immediate, but ior 0XFFFFFFFF results in all 1's, so it is also allowed in an iordi3 insn. And the patch in iorsi3_insn pattern explicitly check the all 0's and all 1's cases, and output either a simple register mov instruction or instruction mov all 1's to the destination. @@ -2902,10 +2915,29 @@ (ior:SI (match_operand:SI 1 "s_register_operand" "%r,r,r") (match_operand:SI 2 "reg_or_int_operand" "rI,K,?n")))] "TARGET_32BIT" - "@ - orr%?\\t%0, %1, %2 - orn%?\\t%0, %1, #%B2 - #" + "* + { + if (CONST_INT_P (operands[2])) + { + HOST_WIDE_INT i = INTVAL (operands[2]) & 0xFFFFFFFF; + if (i == 0xFFFFFFFF) + return \"mvn%?\\t%0, #0\"; + if (i == 0) + { + if (!rtx_equal_p (operands[0], operands[1])) + return \"mov%?\\t%0, %1\"; + else + return \"\"; + } + } + + switch (which_alternative) + { + case 0: return \"orr%?\\t%0, %1, %2\"; + case 1: return \"orn%?\\t%0, %1, #%B2\"; + case 2: return \"#\"; + } + }" "TARGET_32BIT && GET_CODE (operands[2]) == CONST_INT && !(const_ok_for_arm (INTVAL (operands[2])) > Is there any reason why we don't split such cases earlier into the > constituent moves and the associated ands earlier than reload in the > non-Neon case? > I referenced pattern arm_adddi3 which is split after reload_completed. And the pattern arm_subdi3 is even not split. I guess they keep the original constant longer may benefit some optimizations involving constants. But it may also lose flexibility in other cases. > In addition, it would be good to have some tests for Thumb2 that deal > with the replicated constants for Thumb2 . Can you have a look at > creating some tests similar to the thumb2*replicated*.c tests in > gcc.target/arm but for 64 bit constants ? > The new test cases involving thumb2 replicated constants are added as following. thanks Carrot 2012-07-18 Wei Guozhi PR target/53189 * gcc.target/arm/pr53189-10.c: New testcase. * gcc.target/arm/pr53189-11.c: New testcase. * gcc.target/arm/pr53189-12.c: New testcase. Index: pr53189-10.c =================================================================== --- pr53189-10.c (revision 0) +++ pr53189-10.c (revision 0) @@ -0,0 +1,9 @@ +/* { dg-options "-mthumb -O2" } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-final { scan-assembler-not "mov" } } */ +/* { dg-final { scan-assembler "and.*#-16843010" } } */ + +void t0p(long long * p) +{ + *p &= 0x9fefefefe; +} Index: pr53189-11.c =================================================================== --- pr53189-11.c (revision 0) +++ pr53189-11.c (revision 0) @@ -0,0 +1,9 @@ +/* { dg-options "-mthumb -O2" } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-final { scan-assembler-not "mov" } } */ +/* { dg-final { scan-assembler "eor.*#-1426019584" } } */ + +void t0p(long long * p) +{ + *p ^= 0x7ab00ab00; +} Index: pr53189-12.c =================================================================== --- pr53189-12.c (revision 0) +++ pr53189-12.c (revision 0) @@ -0,0 +1,9 @@ +/* { dg-options "-mthumb -O2" } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-final { scan-assembler-not "mov" } } */ +/* { dg-final { scan-assembler "orr.*#13435085" } } */ + +void t0p(long long * p) +{ + *p |= 0x500cd00cd; +}