From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <SRS0=Vat6=CQ=arm.com=Tamar.Christina@sourceware.org>
Received: from EUR04-VI1-obe.outbound.protection.outlook.com (mail-vi1eur04on2055.outbound.protection.outlook.com [40.107.8.55])
	by sourceware.org (Postfix) with ESMTPS id 2A65A3853D13
	for <gcc-patches@gcc.gnu.org>; Wed, 28 Jun 2023 13:49:18 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2A65A3853D13
Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com;
 s=selector2-armh-onmicrosoft-com;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=qjKrTBvzvHR3uUOHBpjwDwpsGTEkrxCum1GT8OOrXWc=;
 b=xe6ujLdpj1+EzHHmWzc2Fe62tID/mvyirrTuJwoNgOoWun/n5CojkjIqXFKQ5JSf3ZPKyVjnWxl3lP9Fy057d5cWutiEMc8KX6/UGLwiYXYvlg0WTvMJuEd4WjUD+kIzzMYkSVYNVp025n2KMHJqe5uCnSonfR2OohmvHkrsU64=
Received: from DUZPR01CA0058.eurprd01.prod.exchangelabs.com
 (2603:10a6:10:469::12) by PAWPR08MB10281.eurprd08.prod.outlook.com
 (2603:10a6:102:367::8) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6521.26; Wed, 28 Jun
 2023 13:49:15 +0000
Received: from DBAEUR03FT053.eop-EUR03.prod.protection.outlook.com
 (2603:10a6:10:469:cafe::26) by DUZPR01CA0058.outlook.office365.com
 (2603:10a6:10:469::12) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6521.34 via Frontend
 Transport; Wed, 28 Jun 2023 13:49:15 +0000
X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123)
 smtp.mailfrom=arm.com; dkim=pass (signature was verified)
 header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com;
Received-SPF: Pass (protection.outlook.com: domain of arm.com designates
 63.35.35.123 as permitted sender) receiver=protection.outlook.com;
 client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com;
 pr=C
Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by
 DBAEUR03FT053.mail.protection.outlook.com (100.127.142.121) with Microsoft
 SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.20.6544.20 via Frontend Transport; Wed, 28 Jun 2023 13:49:15 +0000
Received: ("Tessian outbound c08fa2e31830:v142"); Wed, 28 Jun 2023 13:49:15 +0000
X-CheckRecipientChecked: true
X-CR-MTA-CID: 6d865725d07fa32d
X-CR-MTA-TID: 64aa7808
Received: from 4056ee67cb68.1
	by 64aa7808-outbound-1.mta.getcheckrecipient.com id 89EC7A12-0A1B-4B6F-94F8-5D095F259C69.1;
	Wed, 28 Jun 2023 13:49:04 +0000
Received: from EUR04-DB3-obe.outbound.protection.outlook.com
    by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 4056ee67cb68.1
    (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384);
    Wed, 28 Jun 2023 13:49:04 +0000
ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;
 b=IKzDOpoZPvc2nRZy+XbKqfuHg8wudEUc2MyU9eZVhF3y+dzRsr9sejXTgfYCamG7L0Nh1/9l48oocyfms5lKgCqBq22+wHvtgLpBu29h5g3FGKFIRQUREbuQf0BaThaekpDP+Cr2ZXpsDx2v5WW1Gk6ozw5fIIOAhsa+mENnkcXvEHpKuY1voLMZZifnPaUhJha5bOlL3X8AS7yLL7wqZUgWFQtZEPZChIIqQ5Dk9oG2GXmdS34kfDu+OaVaN2/RCkoXNmnc7HnqHxYtpcaRDTCa4hTcfpGRve2cne8fuWnZ0CDXuyXXlgpKdC28nCrr6IjdD1bH7cfUxB9g94okjw==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com;
 s=arcselector9901;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1;
 bh=qjKrTBvzvHR3uUOHBpjwDwpsGTEkrxCum1GT8OOrXWc=;
 b=glfotqLkk4xBhsBNLxV1WxTMOzu3SuEZwzkWx1toWDKP5aGLN3jpEDDlwlXThXmCkjGBYukL9A10lnJbgdsl20lIPle8DPQgcat1NGKwBXMTTYC7DTqSoJ3iEM5X+7YoJ1NcL/79QPt4RzfFN+ipl9CfEHJqtq97v8AVFT/XNm3BumKf0uSUxc9Ggpokd7sq7mpQImCJrDKd68cNCTy24+C36VzjDALS7n+i0yuIoohyRfclTIjK6EbwjW3DMybWkdtnyIyul00dcqDhNnZwI6RT11v/8zN8Gsa8JES9YOvRB1H4tml6kgeK/cNm82hrFZZhrCCA0hjE/aAFXsiMSA==
ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass
 smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass
 header.d=arm.com; arc=none
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com;
 s=selector2-armh-onmicrosoft-com;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=qjKrTBvzvHR3uUOHBpjwDwpsGTEkrxCum1GT8OOrXWc=;
 b=xe6ujLdpj1+EzHHmWzc2Fe62tID/mvyirrTuJwoNgOoWun/n5CojkjIqXFKQ5JSf3ZPKyVjnWxl3lP9Fy057d5cWutiEMc8KX6/UGLwiYXYvlg0WTvMJuEd4WjUD+kIzzMYkSVYNVp025n2KMHJqe5uCnSonfR2OohmvHkrsU64=
Authentication-Results-Original: dkim=none (message not signed)
 header.d=none;dmarc=none action=none header.from=arm.com;
Received: from VI1PR08MB5325.eurprd08.prod.outlook.com (2603:10a6:803:13e::17)
 by AM7PR08MB5398.eurprd08.prod.outlook.com (2603:10a6:20b:103::16) with
 Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6521.26; Wed, 28 Jun
 2023 13:49:02 +0000
Received: from VI1PR08MB5325.eurprd08.prod.outlook.com
 ([fe80::2301:1cde:cfe7:eaf0]) by VI1PR08MB5325.eurprd08.prod.outlook.com
 ([fe80::2301:1cde:cfe7:eaf0%6]) with mapi id 15.20.6521.026; Wed, 28 Jun 2023
 13:49:02 +0000
Date: Wed, 28 Jun 2023 14:48:58 +0100
From: Tamar Christina <tamar.christina@arm.com>
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, Richard.Earnshaw@arm.com, Marcus.Shawcroft@arm.com,
	Kyrylo.Tkachov@arm.com, richard.sandiford@arm.com
Subject: [PATCH 17/19]AArch64 Add optimization for vector cbranch combining
 SVE and Advanced SIMD 
Message-ID: <ZJw6SvUWBaXlpQoL@arm.com>
Content-Type: multipart/mixed; boundary="P4dpPdWF+qUNLvcs"
Content-Disposition: inline
In-Reply-To: <patch-17494-tamar@arm.com>
X-ClientProxiedBy: LO3P123CA0002.GBRP123.PROD.OUTLOOK.COM
 (2603:10a6:600:ba::7) To VI1PR08MB5325.eurprd08.prod.outlook.com
 (2603:10a6:803:13e::17)
MIME-Version: 1.0
X-MS-TrafficTypeDiagnostic:
	VI1PR08MB5325:EE_|AM7PR08MB5398:EE_|DBAEUR03FT053:EE_|PAWPR08MB10281:EE_
X-MS-Office365-Filtering-Correlation-Id: 565ff38a-a4c4-4ca0-c8c0-08db77de764f
x-checkrecipientrouted: true
NoDisclaimer: true
X-MS-Exchange-SenderADCheck: 1
X-MS-Exchange-AntiSpam-Relay: 0
X-Microsoft-Antispam-Untrusted: BCL:0;
X-Microsoft-Antispam-Message-Info-Original:
 nJwiKxUaFSiKCL2mEv0gLgud1QdmtDXBT31iCMjHZnbR2GBt8hy9QrGOiLoQWryPOar7VFK/5jEm2tu4zATqm0cMV3H7l7o4RuKn2qqZIFq7kdGYaW/DC2gAL84Yr4Qt23oWrkSVf2gHPBjYTv0lNjws/WUGxUZo60RtfDOa4gudFNywUJKyeLXzqqSsnkWA/j/np5yQCBOxur/4BX8WZJwGiMSAmKAyIIwfv/A0ScP1kMeGAnORyBZqhLBU/0c8hu/y00WHeCtWiwnbIbCJZ/Bt/gLaCKcGIkFQVUlHsY8udVVHt+/6Ow+a3n1gezBDL98MlOE9r/fkxMb62HkSnACWkJzIrNzqxOxAhriPESspfiwAzIZlTxLSfA9Df7M6eXKB3C177YLqosDtQ7eKUKvMmtdSMOI2iXA1ce4nr9LeY5NZpy2vGCPiX3ZnO/xA9WOVEQ8FtK3BjfU3ZrBpmb5PUQFnqaOX0EYcJ1C9w8Gvi+NUgz+Nc0fKlq8NTiQKsqxWMkO15JnqIBAHp+sUj4OhTZY36X5LQuDRKjT99L3Kj2yce6PyjNb6z8k82RDI8+Ng/404Y2rwgWzytfPO1q1Oftod875z7Qa+ss9J1Ot9Dg8ZntAEjS92nJB0rkTdGGdENWpY3pz7PyDvjHFH8A==
X-Forefront-Antispam-Report-Untrusted:
 CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:VI1PR08MB5325.eurprd08.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230028)(4636009)(136003)(376002)(346002)(39860400002)(366004)(396003)(451199021)(6486002)(4743002)(33964004)(26005)(2616005)(478600001)(6666004)(186003)(2906002)(6506007)(83380400001)(44144004)(5660300002)(44832011)(36756003)(316002)(235185007)(8676002)(66946007)(30864003)(41300700001)(38100700002)(4326008)(86362001)(6916009)(66476007)(66556008)(8936002)(6512007)(84970400001)(2700100001);DIR:OUT;SFP:1101;
X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM7PR08MB5398
Original-Authentication-Results: dkim=none (message not signed)
 header.d=none;dmarc=none action=none header.from=arm.com;
X-EOPAttributedMessage: 0
X-MS-Exchange-Transport-CrossTenantHeadersStripped:
 DBAEUR03FT053.eop-EUR03.prod.protection.outlook.com
X-MS-PublicTrafficType: Email
X-MS-Office365-Filtering-Correlation-Id-Prvs:
	de2d1bea-a4ee-450c-77c0-08db77de6e66
X-Microsoft-Antispam: BCL:0;
X-Microsoft-Antispam-Message-Info:
	RL7ahzjK418acGItf2po7NdNJLsABOyogXtlG26ifNDl/Sg7wChFYF07nuRYMMPg8+kWHhB+6Yda685g5VWtMgSNCm0H9Xy692emNDxWm8QIa07jU6O42RvS4PR0W9w6vlqq1ZGprTQNmpjPnY2Wr4IUxbOB10zUi51JT8rsG03Ok8Tq8pGuTB/1Vi1wzgbYmxa7LBk3eXa3skXZlukdtQZ6OuGVggJ12ZD9S/H2nTwSOIlMxoS6Y/oJaz3Wj9zvEVHN0+2e9TDkNftgvw78OK/PFWijUj6qDQsVpnKPv9Sw6wGGiDU9tSKdkEMbpKQrzlYOcYeR4yuF9eR3b93mcW875sX12lrpyj+ThGjk5KUU3pH0oQMdW2an6EtmrpaMZrX++GGZixibwPGbYmN1OTyWVtSKzlHEgFUHqxukNnLU0Fqpw2DYMWJkLnIzsrGY5S1pEqZRT3PCO0EIzU3WMWCg1HsLN1ksY5leiN98DtkzGEQxhazYvffgIBWIUeUw1JaZtR4kW3dNlkgvykpNV3buMqVeQ8owobYj+JffEzNifU5n9kGecqNPq1I4dnuZcM/NF8v9CAxcztk9RDdQLSg1Rm6j4Ro1vP90HWjySt8+X9DoClndb7SgH4jW9gsr59DKQP6MICz+eru5Mbg08i3vfNj5UHQhzb0hVkHNhYkRNBkSHEWYjQuwywpon2JGaHUk54/zonMSnnXvfnBrLkG2uT3ogRNEd2x6R2kWZ6aE+EnnC5ilNlltPEab+nRJguK5x848F5HL3LxaymYl8LCfEL8l7em7KcutHWiN9gg=
X-Forefront-Antispam-Report:
	CIP:63.35.35.123;CTRY:IE;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:64aa7808-outbound-1.mta.getcheckrecipient.com;PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com;CAT:NONE;SFS:(13230028)(4636009)(136003)(376002)(396003)(346002)(39860400002)(451199021)(46966006)(40470700004)(36840700001)(41300700001)(30864003)(2906002)(40460700003)(6486002)(33964004)(44144004)(82310400005)(356005)(82740400003)(6666004)(81166007)(2616005)(336012)(6506007)(6512007)(4743002)(83380400001)(26005)(186003)(47076005)(36860700001)(40480700001)(86362001)(478600001)(316002)(70206006)(70586007)(36756003)(4326008)(6916009)(8936002)(8676002)(5660300002)(235185007)(84970400001)(44832011)(2700100001);DIR:OUT;SFP:1101;
X-OriginatorOrg: arm.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 28 Jun 2023 13:49:15.4286
 (UTC)
X-MS-Exchange-CrossTenant-Network-Message-Id: 565ff38a-a4c4-4ca0-c8c0-08db77de764f
X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d
X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d;Ip=[63.35.35.123];Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com]
X-MS-Exchange-CrossTenant-AuthSource:
	DBAEUR03FT053.eop-EUR03.prod.protection.outlook.com
X-MS-Exchange-CrossTenant-AuthAs: Anonymous
X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem
X-MS-Exchange-Transport-CrossTenantHeadersStamped: PAWPR08MB10281
X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,FORGED_SPF_HELO,GIT_PATCH_0,KAM_ASCII_DIVIDERS,KAM_DMARC_NONE,KAM_LOTSOFHASH,KAM_SHORT,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_NONE,TXREP,T_SCC_BODY_TEXT_LINE,UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
List-Id: <gcc-patches.gcc.gnu.org>

--P4dpPdWF+qUNLvcs
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline

Hi All,

Advanced SIMD lacks flag-setting vector comparisons, which SVE adds.  Since machines
with SVE also support Advanced SIMD, we can use the SVE comparisons to perform the
operation in cases where SVE codegen is allowed but the vectorizer has decided
to generate Advanced SIMD because of loop costing.

e.g. for

void f1 (int x)
{
  for (int i = 0; i < N; i++)
    {
      b[i] += a[i];
      if (a[i] != x)
	break;
    }
}

We currently generate:

        cmeq    v31.4s, v31.4s, v28.4s
        uminp   v31.4s, v31.4s, v31.4s
        fmov    x5, d31
        cbz     x5, .L2

and after this patch:

        ptrue   p7.b, vl16
        ...
        cmpne   p15.s, p7/z, z31.s, z28.s
        b.any   .L2
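
For illustration only (this is not part of the patch), the condition tested by
the cmpne/b.any pair can be modelled in scalar C.  The sketch assumes 32-bit
elements and the four .s lanes governed by "ptrue p7.b, vl16"; the name
any_lane_ne and its nlanes parameter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* ptrue p7.b, vl16 activates 16 bytes, i.e. four 32-bit (.s) lanes.  */
#define ACTIVE_LANES 4

/* Scalar model of "cmpne p15.s, p7/z, z31.s, z28.s" followed by "b.any":
   true when at least one active lane of A differs from the matching lane
   of B.  NLANES is the (hypothetical) number of lanes held in each array.  */
static bool
any_lane_ne (const int32_t *a, const int32_t *b, size_t nlanes)
{
  /* Inactive lanes never contribute, so only the first four are read.  */
  assert (nlanes >= ACTIVE_LANES);
  bool any = false;
  for (size_t i = 0; i < ACTIVE_LANES; i++)
    any |= (a[i] != b[i]);
  return any;
}
```

The branch to .L2 in the SVE sequence corresponds to this function returning
true.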

Because the predicate creation has to be lifted out of the loop, we need to
expand the predicate early.  However, in the cbranch expansion we don't see the
outer compare which we need to consume.
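
As a rough scalar sketch of why the predicate wants to live outside the loop
(an illustration under assumptions, not code from the patch): the "first 16
bytes" predicate depends on nothing computed in the loop body, so it can be
built once and reused by every iteration.  VL_LANES, build_pred and
first_break_chunk are hypothetical names, and the 256-bit vector length is
picked arbitrarily.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VL_LANES 8      /* hypothetical 256-bit SVE vector of .s lanes */
#define NEON_LANES 4    /* 128 bits of Advanced SIMD data */

/* Model of the hoisted "ptrue pN.b, vl16" viewed as .s lanes: only the
   lanes that overlap the 128-bit Advanced SIMD register are active.  */
static void
build_pred (bool pred[VL_LANES])
{
  for (size_t i = 0; i < VL_LANES; i++)
    pred[i] = i < NEON_LANES;
}

/* Return the index of the first 4-element chunk of A that contains an
   element different from X, or N / NEON_LANES if there is none.  */
static size_t
first_break_chunk (const int32_t *a, size_t n, int32_t x)
{
  assert (n % NEON_LANES == 0);

  bool pred[VL_LANES];
  build_pred (pred);            /* loop-invariant: built once, up front */

  for (size_t c = 0; c < n / NEON_LANES; c++)
    {
      bool any = false;
      for (size_t i = 0; i < VL_LANES; i++)
        if (pred[i])            /* inactive lanes are never read */
          any |= (a[c * NEON_LANES + i] != x);
      if (any)
        return c;
    }
  return n / NEON_LANES;
}
```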

For this reason the expansion is twofold: when expanding the cbranch we emit an
SVE predicated comparison, and later on during combine we match the SVE and NEON
comparisons while also consuming the ptest.
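
The equivalence that the combine step relies on can be sketched in scalar C.
This is only a model under assumptions (four 32-bit lanes, an equality
compare, and hypothetical function names), not the RTL the patch matches: the
two-step form is an Advanced SIMD cmeq producing a lane mask, followed by an
SVE "cmpne mask, #0" whose flags are tested; the fused form is a single SVE
cmpeq on the original operands.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LANES 4

/* Step 1, Advanced SIMD "cmeq": all-ones in each lane where A equals B.  */
static void
neon_cmeq (uint32_t mask[LANES], const uint32_t a[LANES],
	   const uint32_t b[LANES])
{
  for (int i = 0; i < LANES; i++)
    mask[i] = (a[i] == b[i]) ? UINT32_MAX : 0;
}

/* Step 2, SVE "cmpne mask, #0" plus the flag test: is any lane non-zero?  */
static bool
sve_cmpne_zero_any (const uint32_t mask[LANES])
{
  bool any = false;
  for (int i = 0; i < LANES; i++)
    any |= (mask[i] != 0);
  return any;
}

/* Fused form: SVE "cmpeq a, b" whose flags feed "b.any" directly.  */
static bool
sve_cmpeq_any (const uint32_t a[LANES], const uint32_t b[LANES])
{
  bool any = false;
  for (int i = 0; i < LANES; i++)
    any |= (a[i] == b[i]);
  return any;
}

/* Both routes must agree for every input.  */
static void
check_fused_equals_two_step (const uint32_t a[LANES], const uint32_t b[LANES])
{
  uint32_t mask[LANES];
  neon_cmeq (mask, a, b);
  assert (sve_cmpne_zero_any (mask) == sve_cmpeq_any (a, b));
}
```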

Unfortunately *aarch64_pred_cmpne<mode><EQL:code>_neon_ptest is needed because
for some reason combine destroys the NOT and transforms it into a plus of -1.
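
One plausible reading of that rewrite, assuming the inner eq yields 0 or 1 per
lane (as the neg wrapper in the other patterns suggests), is the two's
complement identity ~(-e) == e + (-1).  The scalar sketch below checks it for
a single 32-bit lane; the function names are hypothetical and this is not
taken from combine itself.

```c
#include <assert.h>
#include <stdint.h>

/* Lane value of a "not equal" mask written as NOT (NEG (eq)), where EQ is
   the 0-or-1 result of the comparison.  */
static uint32_t
ne_mask_via_not (uint32_t eq)
{
  assert (eq <= 1);
  return ~(0u - eq);
}

/* The same lane value written as PLUS (eq, -1).  */
static uint32_t
ne_mask_via_plus (uint32_t eq)
{
  assert (eq <= 1);
  return eq + UINT32_MAX;	/* adding -1 modulo 2^32 */
}
```

Both forms give all-ones for unequal lanes (eq == 0) and zero for equal lanes
(eq == 1), which is why a separate pattern matching the plus form is enough
to catch the != case.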

For the straight SVE ones, we seem to fail to eliminate the ptest in these cases,
but that's a separate optimization.

Tests show that I'm missing a few, but before I write the patterns for them, are
these OK?

Bootstrapped and regtested on aarch64-none-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* config/aarch64/aarch64-simd.md (cbranch<mode>4): Update with SVE.
	* config/aarch64/aarch64-sve.md
	(*aarch64_pred_cmp<UCOMPARISONS:cmp_op><mode><EQL:code>_neon_ptest,
	*aarch64_pred_cmpeq<mode><EQL:code>_neon_ptest,
	*aarch64_pred_cmpne<mode><EQL:code>_neon_ptest): New.
	(aarch64_ptest<mode>): Rename to...
	(@aarch64_ptest<mode>): ... This.
	* genemit.cc: Include rtx-vector-builder.h.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/sve/vect-early-break-cbranch_1.c: New test.
	* gcc.target/aarch64/sve/vect-early-break-cbranch_2.c: New test.

--- inline copy of patch -- 
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index b1a2c617d7d4106ab725d53a5d0b5c2fb61a0c78..75cb5d6f7f92b70fed8762fe64e23f0c05a99c99 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3843,31 +3843,59 @@ (define_expand "cbranch<mode>4"
   "TARGET_SIMD"
 {
   auto code = GET_CODE (operands[0]);
-  rtx tmp = operands[1];
 
-  /* If comparing against a non-zero vector we have to do a comparison first
-     so we can have a != 0 comparison with the result.  */
-  if (operands[2] != CONST0_RTX (<MODE>mode))
-    emit_insn (gen_vec_cmp<mode><mode> (tmp, operands[0], operands[1],
-					operands[2]));
-
-  /* For 64-bit vectors we need no reductions.  */
-  if (known_eq (128, GET_MODE_BITSIZE (<MODE>mode)))
+  /* If SVE is available, let's borrow some instructions.  We will optimize
+     these further later in combine.  */
+  if (TARGET_SVE)
     {
-      /* Always reduce using a V4SI.  */
-      rtx reduc = gen_lowpart (V4SImode, tmp);
-      rtx res = gen_reg_rtx (V4SImode);
-      emit_insn (gen_aarch64_umaxpv4si (res, reduc, reduc));
-      emit_move_insn (tmp, gen_lowpart (<MODE>mode, res));
+      machine_mode full_mode = aarch64_full_sve_mode (<VEL>mode).require ();
+      rtx in1 = lowpart_subreg (full_mode, operands[1], <MODE>mode);
+      rtx in2 = lowpart_subreg (full_mode, operands[2], <MODE>mode);
+
+      machine_mode pred_mode = aarch64_sve_pred_mode (full_mode);
+      rtx_vector_builder builder (VNx16BImode, 16, 2);
+      for (unsigned int i = 0; i < 16; ++i)
+	builder.quick_push (CONST1_RTX (BImode));
+      for (unsigned int i = 0; i < 16; ++i)
+	builder.quick_push (CONST0_RTX (BImode));
+      rtx ptrue = force_reg (VNx16BImode, builder.build ());
+      rtx cast_ptrue = gen_lowpart (pred_mode, ptrue);
+      rtx ptrue_flag = gen_int_mode (SVE_KNOWN_PTRUE, SImode);
+
+      rtx tmp = gen_reg_rtx (pred_mode);
+      aarch64_expand_sve_vec_cmp_int (tmp, code, in1, in2);
+      emit_insn (gen_aarch64_ptest (pred_mode, ptrue, cast_ptrue, ptrue_flag, tmp));
+      operands[1] = gen_rtx_REG (CC_NZCmode, CC_REGNUM);
+      operands[2] = const0_rtx;
     }
+  else
+    {
+      rtx tmp = operands[1];
 
-  rtx val = gen_reg_rtx (DImode);
-  emit_move_insn (val, gen_lowpart (DImode, tmp));
+      /* If comparing against a non-zero vector we have to do a comparison first
+	 so we can have a != 0 comparison with the result.  */
+      if (operands[2] != CONST0_RTX (<MODE>mode))
+	emit_insn (gen_vec_cmp<mode><mode> (tmp, operands[0], operands[1],
+					    operands[2]));
 
-  rtx cc_reg = aarch64_gen_compare_reg (code, val, const0_rtx);
-  rtx cmp_rtx = gen_rtx_fmt_ee (code, DImode, cc_reg, const0_rtx);
-  emit_jump_insn (gen_condjump (cmp_rtx, cc_reg, operands[3]));
-  DONE;
+      /* For 64-bit vectors we need no reductions.  */
+      if (known_eq (128, GET_MODE_BITSIZE (<MODE>mode)))
+	{
+	  /* Always reduce using a V4SI.  */
+	  rtx reduc = gen_lowpart (V4SImode, tmp);
+	  rtx res = gen_reg_rtx (V4SImode);
+	  emit_insn (gen_aarch64_umaxpv4si (res, reduc, reduc));
+	  emit_move_insn (tmp, gen_lowpart (<MODE>mode, res));
+	}
+
+      rtx val = gen_reg_rtx (DImode);
+      emit_move_insn (val, gen_lowpart (DImode, tmp));
+
+      rtx cc_reg = aarch64_gen_compare_reg (code, val, const0_rtx);
+      rtx cmp_rtx = gen_rtx_fmt_ee (code, DImode, cc_reg, const0_rtx);
+      emit_jump_insn (gen_condjump (cmp_rtx, cc_reg, operands[3]));
+      DONE;
+    }
 })
 
 ;; Avdanced SIMD lacks a vector != comparison, but this is a quite common
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index da5534c3e32b3a8819c57a26582cfa5e22e63753..0e10e497e073ee7cfa4025d9adb19076c1615e87 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -8059,6 +8059,105 @@ (define_insn "*aarch64_pred_cmp<cmp_op><mode>_wide_ptest"
   "cmp<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.d"
 )
 
+;; Predicated integer comparisons over Advanced SIMD arguments in which only
+;; the flags result is interesting.
+(define_insn "*aarch64_pred_cmp<UCOMPARISONS:cmp_op><mode><EQL:code>_neon_ptest"
+  [(set (reg:CC_NZC CC_REGNUM)
+	(unspec:CC_NZC
+	  [(match_operand:VNx16BI 1 "register_operand" "Upl")
+	   (match_operand 4)
+	   (match_operand:SI 5 "aarch64_sve_ptrue_flag")
+	   (unspec:VNx4BI
+	     [(match_operand:VNx4BI 6 "register_operand" "Upl")
+	      (match_operand:SI 7 "aarch64_sve_ptrue_flag")
+	      (EQL:VNx4BI
+		(subreg:SVE_FULL_BHSI
+		 (neg:<V128>
+		  (UCOMPARISONS:<V128>
+		   (match_operand:<V128> 2 "register_operand" "w")
+		   (match_operand:<V128> 3 "aarch64_simd_reg_or_zero" "w"))) 0)
+		(match_operand:SVE_FULL_BHSI 8 "aarch64_simd_imm_zero" "Dz"))]
+	     UNSPEC_PRED_Z)]
+	  UNSPEC_PTEST))
+   (clobber (match_scratch:VNx4BI 0 "=Upa"))]
+  "TARGET_SVE
+   && aarch64_sve_same_pred_for_ptest_p (&operands[4], &operands[6])"
+{
+  operands[2] = lowpart_subreg (<MODE>mode, operands[2], <V128>mode);
+  operands[3] = lowpart_subreg (<MODE>mode, operands[3], <V128>mode);
+  if (EQ == <EQL:CODE>)
+    std::swap (operands[2], operands[3]);
+
+  return "cmp<UCOMPARISONS:cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.<Vetype>";
+}
+)
+
+;; Predicated integer comparisons over Advanced SIMD arguments in which only
+;; the flags result is interesting.
+(define_insn "*aarch64_pred_cmpeq<mode><EQL:code>_neon_ptest"
+  [(set (reg:CC_NZC CC_REGNUM)
+	(unspec:CC_NZC
+	  [(match_operand:VNx16BI 1 "register_operand" "Upl")
+	   (match_operand 4)
+	   (match_operand:SI 5 "aarch64_sve_ptrue_flag")
+	   (unspec:VNx4BI
+	     [(match_operand:VNx4BI 6 "register_operand" "Upl")
+	      (match_operand:SI 7 "aarch64_sve_ptrue_flag")
+	      (EQL:VNx4BI
+		(subreg:SVE_FULL_BHSI
+		 (neg:<V128>
+		  (eq:<V128>
+		   (match_operand:<V128> 2 "register_operand" "w")
+		   (match_operand:<V128> 3 "aarch64_simd_reg_or_zero" "w"))) 0)
+		(match_operand:SVE_FULL_BHSI 8 "aarch64_simd_imm_zero" "Dz"))]
+	     UNSPEC_PRED_Z)]
+	  UNSPEC_PTEST))
+   (clobber (match_scratch:VNx4BI 0 "=Upa"))]
+  "TARGET_SVE
+   && aarch64_sve_same_pred_for_ptest_p (&operands[4], &operands[6])"
+{
+  operands[2] = lowpart_subreg (<MODE>mode, operands[2], <V128>mode);
+  operands[3] = lowpart_subreg (<MODE>mode, operands[3], <V128>mode);
+  if (EQ == <EQL:CODE>)
+    std::swap (operands[2], operands[3]);
+
+  return "cmpeq\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.<Vetype>";
+}
+)
+
+;; Same as the above but version for == and !=
+(define_insn "*aarch64_pred_cmpne<mode><EQL:code>_neon_ptest"
+  [(set (reg:CC_NZC CC_REGNUM)
+	(unspec:CC_NZC
+	  [(match_operand:VNx16BI 1 "register_operand" "Upl")
+	   (match_operand 4)
+	   (match_operand:SI 5 "aarch64_sve_ptrue_flag")
+	   (unspec:VNx4BI
+	     [(match_operand:VNx4BI 6 "register_operand" "Upl")
+	      (match_operand:SI 7 "aarch64_sve_ptrue_flag")
+	      (EQL:VNx4BI
+		(subreg:SVE_FULL_BHSI
+		 (plus:<V128>
+		  (eq:<V128>
+		   (match_operand:<V128> 2 "register_operand" "w")
+		   (match_operand:<V128> 3 "aarch64_simd_reg_or_zero" "w"))
+		  (match_operand:<V128> 9 "aarch64_simd_imm_minus_one" "i")) 0)
+		(match_operand:SVE_FULL_BHSI 8 "aarch64_simd_imm_zero" "Dz"))]
+	     UNSPEC_PRED_Z)]
+	  UNSPEC_PTEST))
+   (clobber (match_scratch:VNx4BI 0 "=Upa"))]
+  "TARGET_SVE
+   && aarch64_sve_same_pred_for_ptest_p (&operands[4], &operands[6])"
+{
+  operands[2] = lowpart_subreg (<MODE>mode, operands[2], <V128>mode);
+  operands[3] = lowpart_subreg (<MODE>mode, operands[3], <V128>mode);
+  if (EQ == <EQL:CODE>)
+    std::swap (operands[2], operands[3]);
+
+  return "cmpne\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.<Vetype>";
+}
+)
+
 ;; -------------------------------------------------------------------------
 ;; ---- [INT] While tests
 ;; -------------------------------------------------------------------------
@@ -8537,7 +8636,7 @@ (define_expand "cbranch<mode>4"
 )
 
 ;; See "Description of UNSPEC_PTEST" above for details.
-(define_insn "aarch64_ptest<mode>"
+(define_insn "@aarch64_ptest<mode>"
   [(set (reg:CC_NZC CC_REGNUM)
 	(unspec:CC_NZC [(match_operand:VNx16BI 0 "register_operand" "Upa")
 			(match_operand 1)
diff --git a/gcc/genemit.cc b/gcc/genemit.cc
index 1ce0564076d8b0d39542f49dd51e5df01cc83c35..73309ca00ec0aa3cd76c85e04535bac44cb2f354 100644
--- a/gcc/genemit.cc
+++ b/gcc/genemit.cc
@@ -906,6 +906,7 @@ from the machine description file `md'.  */\n\n");
   printf ("#include \"tm-constrs.h\"\n");
   printf ("#include \"ggc.h\"\n");
   printf ("#include \"target.h\"\n\n");
+  printf ("#include \"rtx-vector-builder.h\"\n\n");
 
   /* Read the machine description.  */
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/vect-early-break-cbranch_1.c b/gcc/testsuite/gcc.target/aarch64/sve/vect-early-break-cbranch_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..c281cfccbe12f0ac8c01ede563dbe325237902c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/vect-early-break-cbranch_1.c
@@ -0,0 +1,117 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
+
+#define N 640
+int a[N] = {0};
+int b[N] = {0};
+
+
+/*
+** f1:
+**	...
+**	cmpgt	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	ptest	p[0-9]+, p[0-9]+.b
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f1 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] > 0)
+	break;
+    }
+}
+
+/*
+** f2:
+**	...
+**	cmpge	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	ptest	p[0-9]+, p[0-9]+.b
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f2 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] >= 0)
+	break;
+    }
+}
+
+/*
+** f3:
+**	...
+**	cmpeq	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	ptest	p[0-9]+, p[0-9]+.b
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f3 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] == 0)
+	break;
+    }
+}
+
+/*
+** f4:
+**	...
+**	cmpne	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	ptest	p[0-9]+, p[0-9]+.b
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f4 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] != 0)
+	break;
+    }
+}
+
+/*
+** f5:
+**	...
+**	cmplt	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	ptest	p[0-9]+, p[0-9]+.b
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f5 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] < 0)
+	break;
+    }
+}
+
+/*
+** f6:
+**	...
+**	cmple	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	ptest	p[0-9]+, p[0-9]+.b
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f6 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] <= 0)
+	break;
+    }
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/vect-early-break-cbranch_2.c b/gcc/testsuite/gcc.target/aarch64/sve/vect-early-break-cbranch_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..f1ca3eafc5ae33393a7df9b5e40fa3420a79bfc2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/vect-early-break-cbranch_2.c
@@ -0,0 +1,114 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 --param=aarch64-autovec-preference=1" } */
+/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
+
+#define N 640
+int a[N] = {0};
+int b[N] = {0};
+
+
+/*
+** f1:
+**	...
+**	cmgt	v[0-9]+.4s, v[0-9]+.4s, #0
+**	cmpne	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f1 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] > 0)
+	break;
+    }
+}
+
+/*
+** f2:
+**	...
+**	cmge	v[0-9]+.4s, v[0-9]+.4s, #0
+**	cmpne	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f2 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] >= 0)
+	break;
+    }
+}
+
+/*
+** f3:
+**	...
+**	cmpeq	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, z[0-9]+.s
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f3 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] == 0)
+	break;
+    }
+}
+
+/*
+** f4:
+**	...
+**	cmpne	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, z[0-9]+.s
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f4 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] != 0)
+	break;
+    }
+}
+
+/*
+** f5:
+**	...
+**	cmlt	v[0-9]+.4s, v[0-9]+.4s, #0
+**	cmpne	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f5 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] < 0)
+	break;
+    }
+}
+
+/*
+** f6:
+**	...
+**	cmle	v[0-9]+.4s, v[0-9]+.4s, #0
+**	cmpne	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f6 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] <= 0)
+	break;
+    }
+}


-- 

--P4dpPdWF+qUNLvcs
Content-Type: text/plain; charset=utf-8
Content-Disposition: attachment; filename="rb17511.patch"

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index b1a2c617d7d4106ab725d53a5d0b5c2fb61a0c78..75cb5d6f7f92b70fed8762fe64e23f0c05a99c99 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3843,31 +3843,59 @@ (define_expand "cbranch<mode>4"
   "TARGET_SIMD"
 {
   auto code = GET_CODE (operands[0]);
-  rtx tmp = operands[1];
 
-  /* If comparing against a non-zero vector we have to do a comparison first
-     so we can have a != 0 comparison with the result.  */
-  if (operands[2] != CONST0_RTX (<MODE>mode))
-    emit_insn (gen_vec_cmp<mode><mode> (tmp, operands[0], operands[1],
-					operands[2]));
-
-  /* For 64-bit vectors we need no reductions.  */
-  if (known_eq (128, GET_MODE_BITSIZE (<MODE>mode)))
+  /* If SVE is available, let's borrow some instructions.  We will optimize
+     these further later in combine.  */
+  if (TARGET_SVE)
     {
-      /* Always reduce using a V4SI.  */
-      rtx reduc = gen_lowpart (V4SImode, tmp);
-      rtx res = gen_reg_rtx (V4SImode);
-      emit_insn (gen_aarch64_umaxpv4si (res, reduc, reduc));
-      emit_move_insn (tmp, gen_lowpart (<MODE>mode, res));
+      machine_mode full_mode = aarch64_full_sve_mode (<VEL>mode).require ();
+      rtx in1 = lowpart_subreg (full_mode, operands[1], <MODE>mode);
+      rtx in2 = lowpart_subreg (full_mode, operands[2], <MODE>mode);
+
+      machine_mode pred_mode = aarch64_sve_pred_mode (full_mode);
+      rtx_vector_builder builder (VNx16BImode, 16, 2);
+      for (unsigned int i = 0; i < 16; ++i)
+	builder.quick_push (CONST1_RTX (BImode));
+      for (unsigned int i = 0; i < 16; ++i)
+	builder.quick_push (CONST0_RTX (BImode));
+      rtx ptrue = force_reg (VNx16BImode, builder.build ());
+      rtx cast_ptrue = gen_lowpart (pred_mode, ptrue);
+      rtx ptrue_flag = gen_int_mode (SVE_KNOWN_PTRUE, SImode);
+
+      rtx tmp = gen_reg_rtx (pred_mode);
+      aarch64_expand_sve_vec_cmp_int (tmp, code, in1, in2);
+      emit_insn (gen_aarch64_ptest (pred_mode, ptrue, cast_ptrue, ptrue_flag, tmp));
+      operands[1] = gen_rtx_REG (CC_NZCmode, CC_REGNUM);
+      operands[2] = const0_rtx;
     }
+  else
+    {
+      rtx tmp = operands[1];
 
-  rtx val = gen_reg_rtx (DImode);
-  emit_move_insn (val, gen_lowpart (DImode, tmp));
+      /* If comparing against a non-zero vector we have to do a comparison first
+	 so we can have a != 0 comparison with the result.  */
+      if (operands[2] != CONST0_RTX (<MODE>mode))
+	emit_insn (gen_vec_cmp<mode><mode> (tmp, operands[0], operands[1],
+					    operands[2]));
 
-  rtx cc_reg = aarch64_gen_compare_reg (code, val, const0_rtx);
-  rtx cmp_rtx = gen_rtx_fmt_ee (code, DImode, cc_reg, const0_rtx);
-  emit_jump_insn (gen_condjump (cmp_rtx, cc_reg, operands[3]));
-  DONE;
+      /* For 64-bit vectors we need no reductions.  */
+      if (known_eq (128, GET_MODE_BITSIZE (<MODE>mode)))
+	{
+	  /* Always reduce using a V4SI.  */
+	  rtx reduc = gen_lowpart (V4SImode, tmp);
+	  rtx res = gen_reg_rtx (V4SImode);
+	  emit_insn (gen_aarch64_umaxpv4si (res, reduc, reduc));
+	  emit_move_insn (tmp, gen_lowpart (<MODE>mode, res));
+	}
+
+      rtx val = gen_reg_rtx (DImode);
+      emit_move_insn (val, gen_lowpart (DImode, tmp));
+
+      rtx cc_reg = aarch64_gen_compare_reg (code, val, const0_rtx);
+      rtx cmp_rtx = gen_rtx_fmt_ee (code, DImode, cc_reg, const0_rtx);
+      emit_jump_insn (gen_condjump (cmp_rtx, cc_reg, operands[3]));
+      DONE;
+    }
 })
 
 ;; Avdanced SIMD lacks a vector != comparison, but this is a quite common
diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index da5534c3e32b3a8819c57a26582cfa5e22e63753..0e10e497e073ee7cfa4025d9adb19076c1615e87 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -8059,6 +8059,105 @@ (define_insn "*aarch64_pred_cmp<cmp_op><mode>_wide_ptest"
   "cmp<cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.d"
 )
 
+;; Predicated integer comparisons over Advanced SIMD arguments in which only
+;; the flags result is interesting.
+(define_insn "*aarch64_pred_cmp<UCOMPARISONS:cmp_op><mode><EQL:code>_neon_ptest"
+  [(set (reg:CC_NZC CC_REGNUM)
+	(unspec:CC_NZC
+	  [(match_operand:VNx16BI 1 "register_operand" "Upl")
+	   (match_operand 4)
+	   (match_operand:SI 5 "aarch64_sve_ptrue_flag")
+	   (unspec:VNx4BI
+	     [(match_operand:VNx4BI 6 "register_operand" "Upl")
+	      (match_operand:SI 7 "aarch64_sve_ptrue_flag")
+	      (EQL:VNx4BI
+		(subreg:SVE_FULL_BHSI
+		 (neg:<V128>
+		  (UCOMPARISONS:<V128>
+		   (match_operand:<V128> 2 "register_operand" "w")
+		   (match_operand:<V128> 3 "aarch64_simd_reg_or_zero" "w"))) 0)
+		(match_operand:SVE_FULL_BHSI 8 "aarch64_simd_imm_zero" "Dz"))]
+	     UNSPEC_PRED_Z)]
+	  UNSPEC_PTEST))
+   (clobber (match_scratch:VNx4BI 0 "=Upa"))]
+  "TARGET_SVE
+   && aarch64_sve_same_pred_for_ptest_p (&operands[4], &operands[6])"
+{
+  operands[2] = lowpart_subreg (<MODE>mode, operands[2], <V128>mode);
+  operands[3] = lowpart_subreg (<MODE>mode, operands[3], <V128>mode);
+  if (EQ == <EQL:CODE>)
+    std::swap (operands[2], operands[3]);
+
+  return "cmp<UCOMPARISONS:cmp_op>\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.<Vetype>";
+}
+)
+
+;; Predicated integer comparisons over Advanced SIMD arguments in which only
+;; the flags result is interesting.
+(define_insn "*aarch64_pred_cmpeq<mode><EQL:code>_neon_ptest"
+  [(set (reg:CC_NZC CC_REGNUM)
+	(unspec:CC_NZC
+	  [(match_operand:VNx16BI 1 "register_operand" "Upl")
+	   (match_operand 4)
+	   (match_operand:SI 5 "aarch64_sve_ptrue_flag")
+	   (unspec:VNx4BI
+	     [(match_operand:VNx4BI 6 "register_operand" "Upl")
+	      (match_operand:SI 7 "aarch64_sve_ptrue_flag")
+	      (EQL:VNx4BI
+		(subreg:SVE_FULL_BHSI
+		 (neg:<V128>
+		  (eq:<V128>
+		   (match_operand:<V128> 2 "register_operand" "w")
+		   (match_operand:<V128> 3 "aarch64_simd_reg_or_zero" "w"))) 0)
+		(match_operand:SVE_FULL_BHSI 8 "aarch64_simd_imm_zero" "Dz"))]
+	     UNSPEC_PRED_Z)]
+	  UNSPEC_PTEST))
+   (clobber (match_scratch:VNx4BI 0 "=Upa"))]
+  "TARGET_SVE
+   && aarch64_sve_same_pred_for_ptest_p (&operands[4], &operands[6])"
+{
+  operands[2] = lowpart_subreg (<MODE>mode, operands[2], <V128>mode);
+  operands[3] = lowpart_subreg (<MODE>mode, operands[3], <V128>mode);
+  if (EQ == <EQL:CODE>)
+    std::swap (operands[2], operands[3]);
+
+  return "cmpeq\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.<Vetype>";
+}
+)
+
+;; Same as the above, but the version for == and !=.
+(define_insn "*aarch64_pred_cmpne<mode><EQL:code>_neon_ptest"
+  [(set (reg:CC_NZC CC_REGNUM)
+	(unspec:CC_NZC
+	  [(match_operand:VNx16BI 1 "register_operand" "Upl")
+	   (match_operand 4)
+	   (match_operand:SI 5 "aarch64_sve_ptrue_flag")
+	   (unspec:VNx4BI
+	     [(match_operand:VNx4BI 6 "register_operand" "Upl")
+	      (match_operand:SI 7 "aarch64_sve_ptrue_flag")
+	      (EQL:VNx4BI
+		(subreg:SVE_FULL_BHSI
+		 (plus:<V128>
+		  (eq:<V128>
+		   (match_operand:<V128> 2 "register_operand" "w")
+		   (match_operand:<V128> 3 "aarch64_simd_reg_or_zero" "w"))
+		  (match_operand:<V128> 9 "aarch64_simd_imm_minus_one" "i")) 0)
+		(match_operand:SVE_FULL_BHSI 8 "aarch64_simd_imm_zero" "Dz"))]
+	     UNSPEC_PRED_Z)]
+	  UNSPEC_PTEST))
+   (clobber (match_scratch:VNx4BI 0 "=Upa"))]
+  "TARGET_SVE
+   && aarch64_sve_same_pred_for_ptest_p (&operands[4], &operands[6])"
+{
+  operands[2] = lowpart_subreg (<MODE>mode, operands[2], <V128>mode);
+  operands[3] = lowpart_subreg (<MODE>mode, operands[3], <V128>mode);
+  if (EQ == <EQL:CODE>)
+    std::swap (operands[2], operands[3]);
+
+  return "cmpne\t%0.<Vetype>, %1/z, %2.<Vetype>, %3.<Vetype>";
+}
+)
+
 ;; -------------------------------------------------------------------------
 ;; ---- [INT] While tests
 ;; -------------------------------------------------------------------------
@@ -8537,7 +8636,7 @@ (define_expand "cbranch<mode>4"
 )
 
 ;; See "Description of UNSPEC_PTEST" above for details.
-(define_insn "aarch64_ptest<mode>"
+(define_insn "@aarch64_ptest<mode>"
   [(set (reg:CC_NZC CC_REGNUM)
 	(unspec:CC_NZC [(match_operand:VNx16BI 0 "register_operand" "Upa")
 			(match_operand 1)
diff --git a/gcc/genemit.cc b/gcc/genemit.cc
index 1ce0564076d8b0d39542f49dd51e5df01cc83c35..73309ca00ec0aa3cd76c85e04535bac44cb2f354 100644
--- a/gcc/genemit.cc
+++ b/gcc/genemit.cc
@@ -906,6 +906,7 @@ from the machine description file `md'.  */\n\n");
   printf ("#include \"tm-constrs.h\"\n");
   printf ("#include \"ggc.h\"\n");
   printf ("#include \"target.h\"\n\n");
+  printf ("#include \"rtx-vector-builder.h\"\n\n");
 
   /* Read the machine description.  */
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/vect-early-break-cbranch_1.c b/gcc/testsuite/gcc.target/aarch64/sve/vect-early-break-cbranch_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..c281cfccbe12f0ac8c01ede563dbe325237902c9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/vect-early-break-cbranch_1.c
@@ -0,0 +1,117 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
+
+#define N 640
+int a[N] = {0};
+int b[N] = {0};
+
+
+/*
+** f1:
+**	...
+**	cmpgt	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	ptest	p[0-9]+, p[0-9]+.b
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f1 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] > 0)
+	break;
+    }
+}
+
+/*
+** f2:
+**	...
+**	cmpge	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	ptest	p[0-9]+, p[0-9]+.b
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f2 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] >= 0)
+	break;
+    }
+}
+
+/*
+** f3:
+**	...
+**	cmpeq	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	ptest	p[0-9]+, p[0-9]+.b
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f3 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] == 0)
+	break;
+    }
+}
+
+/*
+** f4:
+**	...
+**	cmpne	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	ptest	p[0-9]+, p[0-9]+.b
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f4 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] != 0)
+	break;
+    }
+}
+
+/*
+** f5:
+**	...
+**	cmplt	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	ptest	p[0-9]+, p[0-9]+.b
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f5 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] < 0)
+	break;
+    }
+}
+
+/*
+** f6:
+**	...
+**	cmple	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	ptest	p[0-9]+, p[0-9]+.b
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f6 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] <= 0)
+	break;
+    }
+}
+
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/vect-early-break-cbranch_2.c b/gcc/testsuite/gcc.target/aarch64/sve/vect-early-break-cbranch_2.c
new file mode 100644
index 0000000000000000000000000000000000000000..f1ca3eafc5ae33393a7df9b5e40fa3420a79bfc2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/vect-early-break-cbranch_2.c
@@ -0,0 +1,114 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 --param=aarch64-autovec-preference=1" } */
+/* { dg-final { check-function-bodies "**" "" "" { target lp64 } } } */
+
+#define N 640
+int a[N] = {0};
+int b[N] = {0};
+
+
+/*
+** f1:
+**	...
+**	cmgt	v[0-9]+.4s, v[0-9]+.4s, #0
+**	cmpne	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f1 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] > 0)
+	break;
+    }
+}
+
+/*
+** f2:
+**	...
+**	cmge	v[0-9]+.4s, v[0-9]+.4s, #0
+**	cmpne	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f2 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] >= 0)
+	break;
+    }
+}
+
+/*
+** f3:
+**	...
+**	cmpeq	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, z[0-9]+.s
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f3 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] == 0)
+	break;
+    }
+}
+
+/*
+** f4:
+**	...
+**	cmpne	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, z[0-9]+.s
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f4 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] != 0)
+	break;
+    }
+}
+
+/*
+** f5:
+**	...
+**	cmlt	v[0-9]+.4s, v[0-9]+.4s, #0
+**	cmpne	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f5 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] < 0)
+	break;
+    }
+}
+
+/*
+** f6:
+**	...
+**	cmle	v[0-9]+.4s, v[0-9]+.4s, #0
+**	cmpne	p[0-9]+.s, p[0-9]+/z, z[0-9]+.s, #0
+**	b.any	\.L[0-9]+
+**	...
+*/
+void f6 ()
+{
+  for (int i = 0; i < N; i++)
+    {
+      b[i] += a[i];
+      if (a[i] <= 0)
+	break;
+    }
+}


--P4dpPdWF+qUNLvcs--