Message-ID: <1b7a0eac-724f-52fe-6bac-ee6c86136525@arm.com>
Date: Wed, 5 Jul 2023 17:13:32 +0100
From: Stamatis Markianos-Wright
Subject: Re: [PATCH 2/2] arm: Add support for MVE Tail-Predicated Low Overhead Loops
To: "Andre Vieira (lists)", "gcc-patches@gcc.gnu.org"
Cc: Kyrylo Tkachov, Richard Earnshaw, ramana.gcc@gmail.com, "nickc@redhat.com"
References: <7dbe3fde-5cdb-cd2d-cda4-16f02385906f@arm.com>
In-Reply-To: <7dbe3fde-5cdb-cd2d-cda4-16f02385906f@arm.com>

On 23/06/2023 11:23, Andre Vieira (lists) wrote:
> +  if (insn != arm_mve_get_loop_vctp (body))
> +    {
>
> probably a good idea to invert the condition here and return false,
> helps reducing the indenting in this function.

Done, thanks

>
>
> +    /* Starting from the current insn, scan backwards through the insn
> +       chain until BB_HEAD: "for each insn in the BB prior to the current".
> +    */
>
> There's a trailing whitespace after insn, but also I'd rewrite this
> bit. The "for each insn in the BB prior to the current" is superfluous
> and even confusing to me. How about:
> "Scan backwards from the current INSN through the instruction chain
> until the start of the basic block.  "

Yes, agreed, it wasn't very clear. Done.

>
>
>  I find 'that previous insn' to be confusing as you don't mention any
> previous insn before. So how about something along the lines of:
> 'If a previous insn defines a register that INSN uses then return true
> if...'

Done

>
>
> Do we need to check: 'insn != prev_insn' ? Any reason why you can't
> start the loop with:
> 'for (rtx_insn *prev_insn = PREV_INSN (insn);'

True! Done.
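For illustration only (this is not the patch code): a minimal sketch of
the loop shape suggested above. It assumes a hypothetical `toy_insn`
struct in place of rtx_insn, a `prev` pointer in place of PREV_INSN, and
a NULL `prev` in place of the real "stop at BB_HEAD" condition. Because
the walk starts at the predecessor, it never visits the insn itself, so
an `insn != prev_insn` test inside the loop becomes redundant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rtx_insn: one node of a backwards-linked
   insn chain.  Not a GCC type.  */
struct toy_insn
{
  struct toy_insn *prev; /* previous insn, NULL at the start of the block */
  int def_regno;         /* register this insn defines, or -1 for none */
};

/* Return the nearest insn before INSN that defines REGNO, or NULL if
   there is none.  The scan begins at INSN's predecessor, so INSN itself
   is never a candidate.  */
static struct toy_insn *
find_prev_def (struct toy_insn *insn, int regno)
{
  assert (insn != NULL);
  for (struct toy_insn *prev = insn->prev; prev != NULL; prev = prev->prev)
    if (prev->def_regno == regno)
      return prev;
  return NULL;
}
```

With a three-insn chain where the first and last insns both define the
same register, calling find_prev_def on the last insn returns the first
one, not the last insn itself.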
>
> Now I also found a case where things might go wrong in:
> +        /* Look at all the DEFs of that previous insn: if one of them is on
> +           the same REG as our current insn, then recurse in order to check
> +           that insn's USEs.  If any of these insns return true as
> +           MVE_VPT_UNPREDICATED_INSN_Ps, then the whole chain is affected
> +           by the change in behaviour from being placed in dlstp/letp loop.
> +        */
> +        df_ref prev_insn_defs = NULL;
> +        FOR_EACH_INSN_DEF (prev_insn_defs, prev_insn)
> +          {
> +        if (DF_REF_REGNO (insn_uses) == DF_REF_REGNO (prev_insn_defs)
> +            && insn != prev_insn
> +            && body == BLOCK_FOR_INSN (prev_insn)
> +            && !arm_mve_vec_insn_is_predicated_with_this_predicate
> +             (insn, vctp_vpr_generated)
> +            && arm_mve_check_df_chain_back_for_implic_predic
> +             (prev_insn, vctp_vpr_generated))
> +          return true;
> +          }
>
> The body == BLOCK_FOR_INSN (prev_insn) hinted me at it, if a def comes
> from outside of the BB (so outside of the loop's body) then its by
> definition unpredicated by vctp.  I think you want to check that if
> prev_insn defines a register used by insn then return true if
> prev_insn isn't in the same BB or has a chain that is not predicated,
> i.e.: '!arm_mve_vec_insn_is_predicated_with_this_predicate (insn,
> vctp_vpr_generated) && arm_mve_check_df_chain_back_for_implic_predic
> prev_insn, vctp_vpr_generated))' you check body != BLOCK_FOR_INSN
> (prev_insn)'

Yes, you're right, this is vulnerable here. A neater fix to this (I
think?) is to make the above REGNO_REG_SET_P more generic, so that it
covers all scalar values and scalar ops, as well.
Then it's an "if this insn in the loop has any input that originates
outside the bb, then it's unsafe" check, and the recursive loop backwards
is only for the recursive "are any previous insns unsafe" check.

>
>
> I also found some other issues, this currently loloops:
>
> uint16_t  test (uint16_t *a, int n)
> {
>   uint16_t res =0;
>   while (n > 0)
>     {
>       mve_pred16_t p = vctp16q (n);
>       uint16x8_t va = vldrhq_u16 (a);
>       res = vaddvaq_u16 (res, va);
>       res = vaddvaq_p_u16 (res, va, p);
>       a += 8;
>       n -= 8;
>     }
>   return res;
> }
>
> But it shouldn't, this is because there's a lack of handling of across
> vector instructions. Luckily in MVE all across vector instructions
> have the side-effect that they write to a scalar register, even the
> vshlcq instruction (it writes to a scalar carry output).

Added support for them (you were right, there was some special handling
needed!)

>
> Did this lead me to find an ICE with:
>
> uint16x8_t  test (uint16_t *a, int n)
> {
>   uint16x8_t res = vdupq_n_u16 (0);
>   while (n > 0)
>     {
>       uint16_t carry = 0;
>       mve_pred16_t p = vctp16q (n);
>       uint16x8_t va = vldrhq_u16 (a);
>       res = vshlcq_u16 (va, &carry, 1);
>       res = vshlcq_m_u16 (res, &carry, 1 , p);
>       a += 8;
>       n -= 8;
>     }
>   return res;
> }
>
> This is because:
> +          /* If the USE is outside the loop body bb, or it is inside, but
> +         is an unpredicated store to memory.  */
> +          if (BLOCK_FOR_INSN (insn) != BLOCK_FOR_INSN (next_use_insn)
> +         || (arm_mve_vec_insn_is_unpredicated_or_uses_other_predicate
> +             (next_use_insn, vctp_vpr_generated)
> +            && mve_memory_operand
> +            (SET_DEST (single_set (next_use_insn)),
> +             GET_MODE (SET_DEST (single_set (next_use_insn))))))
> +        return true;
>
> Assumes single_set doesn't return 0.

Thanks! That is indeed correct.
Corrected this by adding a utility function that scans the insn operands
and checks them against mve_memory_operand; it supports any number of
operands/SETs in the insn.

>
> Let's deal with these issues and I'll continue to review.
>
> On 15/06/2023 12:47, Stamatis Markianos-Wright via Gcc-patches wrote:
>>      Hi all,
>>
>>      This is the 2/2 patch that contains the functional changes needed
>>      for MVE Tail Predicated Low Overhead Loops.  See my previous email
>>      for a general introduction of MVE LOLs.
>>
>>      This support is added through the already existing loop-doloop
>>      mechanisms that are used for non-MVE dls/le looping.
>>
>>      Mid-end changes are:
>>
>>      1) Relax the loop-doloop mechanism in the mid-end to allow for
>>         decrement numbers other than -1 and for `count` to be an
>>         rtx containing a simple REG (which in this case will contain
>>         the number of elements to be processed), rather
>>         than an expression for calculating the number of iterations.
>>      2) Added a new df utility function: `df_bb_regno_only_def_find` that
>>         will return the DEF of a REG only if it is DEF-ed once within the
>>         basic block.
>>
>>      And many things in the backend to implement the above optimisation:
>>
>>      3)  Implement the `arm_predict_doloop_p` target hook to instruct the
>>          mid-end about Low Overhead Loops (MVE or not), as well as
>>          `arm_loop_unroll_adjust` which will prevent unrolling of any loops
>>          that are valid for becoming MVE Tail_Predicated Low Overhead Loops
>>          (unrolling can transform a loop in ways that invalidate the dlstp/
>>          letp transformation logic and the benefit of the dlstp/letp loop
>>          would be considerably higher than that of unrolling)
>>      4)  Appropriate changes to the define_expand of doloop_end, new
>>          patterns for dlstp and letp, new iterators,  unspecs, etc.
>>      5) `arm_mve_loop_valid_for_dlstp` and a number of checking functions:
>>         * `arm_mve_dlstp_check_dec_counter`
>>         * `arm_mve_dlstp_check_inc_counter`
>>         * `arm_mve_check_reg_origin_is_num_elems`
>>         * `arm_mve_check_df_chain_back_for_implic_predic`
>>         * `arm_mve_check_df_chain_fwd_for_implic_predic_impact`
>>         These all, in some way or another, run checks on the loop
>>         structure in order to determine if the loop is valid for dlstp/letp
>>         transformation.
>>      6) `arm_attempt_dlstp_transform`: (called from the define_expand of
>>          doloop_end) this function re-checks for the loop's suitability for
>>          dlstp/letp transformation and then implements it, if possible.
>>      7) Various utility functions:
>>         * `arm_mve_get_vctp_lanes` to map
>>         from vctp unspecs to number of lanes, and `arm_get_required_vpr_reg`
>>         to check an insn to see if it requires the VPR or not.
>>         * `arm_mve_get_loop_vctp`
>>         * `arm_mve_get_vctp_lanes`
>>         * `arm_emit_mve_unpredicated_insn_to_seq`
>>         * `arm_get_required_vpr_reg`
>>         * `arm_get_required_vpr_reg_param`
>>         * `arm_get_required_vpr_reg_ret_val`
>>         * `arm_mve_vec_insn_is_predicated_with_this_predicate`
>>         * `arm_mve_vec_insn_is_unpredicated_or_uses_other_predicate`
>>
>>      No regressions on arm-none-eabi with various targets and on
>>      aarch64-none-elf. Thoughts on getting this into trunk?
>>
>>      Thank you,
>>      Stam Markianos-Wright
>>
>>      gcc/ChangeLog:
>>
>>              * config/arm/arm-protos.h (arm_target_insn_ok_for_lob): Rename to...
>>              (arm_target_bb_ok_for_lob): ...this
>>              (arm_attempt_dlstp_transform): New.
>>              * config/arm/arm.cc (TARGET_LOOP_UNROLL_ADJUST): New.
>>              (TARGET_PREDICT_DOLOOP_P): New.
>>              (arm_block_set_vect):
>>              (arm_target_insn_ok_for_lob): Rename from arm_target_insn_ok_for_lob.
>>              (arm_target_bb_ok_for_lob): New.
>>              (arm_mve_get_vctp_lanes): New.
>>              (arm_get_required_vpr_reg): New.
>>              (arm_get_required_vpr_reg_param): New.
>>              (arm_get_required_vpr_reg_ret_val): New.
>>              (arm_mve_get_loop_vctp): New.
>>              (arm_mve_vec_insn_is_unpredicated_or_uses_other_predicate): New.
>>              (arm_mve_vec_insn_is_predicated_with_this_predicate): New.
>>              (arm_mve_check_df_chain_back_for_implic_predic): New.
>>              (arm_mve_check_df_chain_fwd_for_implic_predic_impact): New.
>>              (arm_mve_check_reg_origin_is_num_elems): New.
>>              (arm_mve_dlstp_check_inc_counter): New.
>>              (arm_mve_dlstp_check_dec_counter): New.
>>              (arm_mve_loop_valid_for_dlstp): New.
>>              (arm_predict_doloop_p): New.
>>              (arm_loop_unroll_adjust): New.
>>              (arm_emit_mve_unpredicated_insn_to_seq): New.
>>              (arm_attempt_dlstp_transform): New.
>>              * config/arm/iterators.md (DLSTP): New.
>>              (mode1): Add DLSTP mappings.
>>              * config/arm/mve.md (*predicated_doloop_end_internal): New.
>>              (dlstp_insn): New.
>>              * config/arm/thumb2.md (doloop_end): Update for MVE LOLs.
>>              * config/arm/unspecs.md: New unspecs.
>>              * df-core.cc (df_bb_regno_only_def_find): New.
>>              * df.h (df_bb_regno_only_def_find): New.
>>              * loop-doloop.cc (doloop_condition_get): Relax conditions.
>>              (doloop_optimize): Add support for elementwise LoLs.
>>
>>      gcc/testsuite/ChangeLog:
>>
>>              * gcc.target/arm/lob.h: Update framework.
>>              * gcc.target/arm/lob1.c: Likewise.
>>              * gcc.target/arm/lob6.c: Likewise.
>>              * gcc.target/arm/mve/dlstp-compile-asm.c: New test.
>>              * gcc.target/arm/mve/dlstp-int16x8.c: New test.
>>              * gcc.target/arm/mve/dlstp-int32x4.c: New test.
>>              * gcc.target/arm/mve/dlstp-int64x2.c: New test.
>>              * gcc.target/arm/mve/dlstp-int8x16.c: New test.
>>              * gcc.target/arm/mve/dlstp-invalid-asm.c: New test.
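
A rough scalar model of the across-vector case from the review, in plain
C rather than real MVE code, to show why the vaddvaq_u16 loop must not
become a dlstp/letp loop. The helpers `active_lanes` and `add_across`
are hypothetical stand-ins for vctp16q and vaddvaq_u16/vaddvaq_p_u16.
The sketch assumes that every vector instruction inside a dlstp/letp
loop is implicitly predicated on the remaining element count, and that
the input buffer is padded to a multiple of 8 elements (the unpredicated
load reads a full vector).

```c
#include <assert.h>
#include <stdint.h>

#define LANES 8 /* 16-bit lanes in a 128-bit MVE vector */

/* Stand-in for vctp16q: how many lanes are active when N elements
   remain.  */
static int
active_lanes (int n)
{
  return n >= LANES ? LANES : (n > 0 ? n : 0);
}

/* Stand-in for vaddvaq: add the first LANES_USED elements of V to ACC,
   with uint16_t wrap-around.  */
static uint16_t
add_across (uint16_t acc, const uint16_t *v, int lanes_used)
{
  assert (lanes_used >= 0 && lanes_used <= LANES);
  for (int i = 0; i < lanes_used; i++)
    acc = (uint16_t) (acc + v[i]);
  return acc;
}

/* The loop as written: the first reduction is unpredicated and sums all
   8 lanes, the second is predicated by vctp.  */
static uint16_t
loop_as_written (const uint16_t *a, int n)
{
  uint16_t res = 0;
  while (n > 0)
    {
      int p = active_lanes (n);
      res = add_across (res, a, LANES);
      res = add_across (res, a, p);
      a += LANES;
      n -= LANES;
    }
  return res;
}

/* The same loop if both reductions were implicitly tail-predicated, as
   they would be inside a dlstp/letp loop.  */
static uint16_t
loop_if_tail_predicated (const uint16_t *a, int n)
{
  uint16_t res = 0;
  while (n > 0)
    {
      int p = active_lanes (n);
      res = add_across (res, a, p);
      res = add_across (res, a, p);
      a += LANES;
      n -= LANES;
    }
  return res;
}
```

With a 16-element buffer of ones and n = 11, loop_as_written returns 27
and loop_if_tail_predicated returns 22, so the transformation would
change the result. The two only agree when n is a multiple of 8.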