From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Kong, Lingling"
To: "gcc-patches@gcc.gnu.org"
CC: "Liu, Hongtao", "Kong, Lingling", Uros Bizjak
Subject: [PATCH 8/8] [APX NF] Support APX NF for lzcnt/tzcnt/popcnt
Date: Wed, 15 May 2024 07:47:21 +0000
References: <20240515070226.3760873-1-lingling.kong@intel.com>
 <20240515070226.3760873-8-lingling.kong@intel.com>
In-Reply-To: <20240515070226.3760873-8-lingling.kong@intel.com>

gcc/ChangeLog:

        * config/i386/i386.md (clz<mode>2_lzcnt_nf): New define_insn.
        (*clz<mode>2_lzcnt_falsedep_nf): Ditto.
        (<lt_zcnt>_<mode>_nf): Ditto.
        (*<lt_zcnt>_<mode>_falsedep_nf): Ditto.
        (<lt_zcnt>_hi_nf): Ditto.
        (popcount<mode>2_nf): Ditto.
        (*popcount<mode>2_falsedep_nf): Ditto.
        (popcounthi2_nf): Ditto.
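
Illustration only, not part of the patch: a minimal C sketch of source that could exercise the new patterns. The helper names are hypothetical. The zero-input guards mirror the lzcnt/tzcnt behaviour described in i386.md (operand size is returned for a zero source), since __builtin_clz and __builtin_ctz are undefined for 0. Whether the compiler actually picks an {nf} form depends on the target flags and on the surrounding code, so no particular assembly output is claimed here.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration; none of these names come from
   the patch.  Each wraps a GCC builtin that maps onto the clz, ctz or
   popcount RTL codes matched in i386.md.  */

/* Count leading zeros; returns 32 for a zero input, as lzcnt does.  */
static inline unsigned
leading_zeros32 (uint32_t x)
{
  return x ? (unsigned) __builtin_clz (x) : 32u;
}

/* Count trailing zeros; returns 32 for a zero input, as tzcnt does.  */
static inline unsigned
trailing_zeros32 (uint32_t x)
{
  return x ? (unsigned) __builtin_ctz (x) : 32u;
}

/* Population count for 32-bit and 64-bit operands.  */
static inline unsigned
popcount32 (uint32_t x)
{
  return (unsigned) __builtin_popcount (x);
}

static inline unsigned
popcount64 (uint64_t x)
{
  return (unsigned) __builtin_popcountll ((unsigned long long) x);
}
```

To look at the generated code one would compile with the series applied and options along the lines of -O2 -mapxf -mlzcnt -mbmi -mpopcnt; the exact option needed to enable the NF feature is an assumption here and should be checked against the rest of the series.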
---
 gcc/config/i386/i386.md | 132 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 132 insertions(+)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 55f65a31b16..ddde83e57f5 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -21029,6 +21029,24 @@
   operands[3] = gen_reg_rtx (<MODE>mode);
 })
 
+(define_insn_and_split "clz<mode>2_lzcnt_nf"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+        (clz:SWI48
+          (match_operand:SWI48 1 "nonimmediate_operand" "rm")))]
+  "TARGET_APX_NF && TARGET_LZCNT"
+  "%{nf%} lzcnt{<imodesuffix>}\t{%1, %0|%0, %1}"
+  "&& TARGET_AVOID_FALSE_DEP_FOR_BMI && epilogue_completed
+   && optimize_function_for_speed_p (cfun)
+   && !reg_mentioned_p (operands[0], operands[1])"
+  [(parallel
+    [(set (match_dup 0)
+          (clz:SWI48 (match_dup 1)))
+     (unspec [(match_dup 0)] UNSPEC_INSN_FALSE_DEP)])]
+  "ix86_expand_clear (operands[0]);"
+  [(set_attr "prefix_rep" "1")
+   (set_attr "type" "bitmanip")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn_and_split "clz<mode>2_lzcnt"
   [(set (match_operand:SWI48 0 "register_operand" "=r")
         (clz:SWI48
@@ -21052,6 +21070,18 @@
 ; False dependency happens when destination is only updated by tzcnt,
 ; lzcnt or popcnt.  There is no false dependency when destination is
 ; also used in source.
+(define_insn "*clz<mode>2_lzcnt_falsedep_nf"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+        (clz:SWI48
+          (match_operand:SWI48 1 "nonimmediate_operand" "rm")))
+   (unspec [(match_operand:SWI48 2 "register_operand" "0")]
+           UNSPEC_INSN_FALSE_DEP)]
+  "TARGET_APX_NF && TARGET_LZCNT"
+  "%{nf%} lzcnt{<imodesuffix>}\t{%1, %0|%0, %1}"
+  [(set_attr "prefix_rep" "1")
+   (set_attr "type" "bitmanip")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn "*clz<mode>2_lzcnt_falsedep"
   [(set (match_operand:SWI48 0 "register_operand" "=r")
         (clz:SWI48
@@ -21158,6 +21188,25 @@
 ;; Version of lzcnt/tzcnt that is expanded from intrinsics.  This version
 ;; provides operand size as output when source operand is zero.
 
+(define_insn_and_split "<lt_zcnt>_<mode>_nf"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+        (unspec:SWI48
+          [(match_operand:SWI48 1 "nonimmediate_operand" "rm")] LT_ZCNT))]
+  "TARGET_APX_NF"
+  "%{nf%} <lt_zcnt>{<imodesuffix>}\t{%1, %0|%0, %1}"
+  "&& TARGET_AVOID_FALSE_DEP_FOR_BMI && epilogue_completed
+   && optimize_function_for_speed_p (cfun)
+   && !reg_mentioned_p (operands[0], operands[1])"
+  [(parallel
+    [(set (match_dup 0)
+          (unspec:SWI48 [(match_dup 1)] LT_ZCNT))
+     (unspec [(match_dup 0)] UNSPEC_INSN_FALSE_DEP)])]
+  "ix86_expand_clear (operands[0]);"
+  [(set_attr "type" "<lt_zcnt_type>")
+   (set_attr "prefix_0f" "1")
+   (set_attr "prefix_rep" "1")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn_and_split "<lt_zcnt>_<mode>"
   [(set (match_operand:SWI48 0 "register_operand" "=r")
         (unspec:SWI48
@@ -21182,6 +21231,20 @@
 ; False dependency happens when destination is only updated by tzcnt,
 ; lzcnt or popcnt.  There is no false dependency when destination is
 ; also used in source.
+; also used in source.
+(define_insn "*<lt_zcnt>_<mode>_falsedep_nf"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+        (unspec:SWI48
+          [(match_operand:SWI48 1 "nonimmediate_operand" "rm")] LT_ZCNT))
+   (unspec [(match_operand:SWI48 2 "register_operand" "0")]
+           UNSPEC_INSN_FALSE_DEP)]
+  "TARGET_APX_NF"
+  "%{nf%} <lt_zcnt>{<imodesuffix>}\t{%1, %0|%0, %1}"
+  [(set_attr "type" "<lt_zcnt_type>")
+   (set_attr "prefix_0f" "1")
+   (set_attr "prefix_rep" "1")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn "*<lt_zcnt>_<mode>_falsedep"
   [(set (match_operand:SWI48 0 "register_operand" "=r")
         (unspec:SWI48
@@ -21196,6 +21259,17 @@
    (set_attr "prefix_rep" "1")
    (set_attr "mode" "<MODE>")])
 
+(define_insn "<lt_zcnt>_hi_nf"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+        (unspec:HI
+          [(match_operand:HI 1 "nonimmediate_operand" "rm")] LT_ZCNT))]
+  "TARGET_APX_NF"
+  "%{nf%} <lt_zcnt>{w}\t{%1, %0|%0, %1}"
+  [(set_attr "type" "<lt_zcnt_type>")
+   (set_attr "prefix_0f" "1")
+   (set_attr "prefix_rep" "1")
+   (set_attr "mode" "HI")])
+
 (define_insn "<lt_zcnt>_hi"
   [(set (match_operand:HI 0 "register_operand" "=r")
         (unspec:HI
@@ -21620,6 +21694,30 @@
   [(set_attr "type" "bitmanip")
    (set_attr "mode" "<MODE>")])
 
+(define_insn_and_split "popcount<mode>2_nf"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+        (popcount:SWI48
+          (match_operand:SWI48 1 "nonimmediate_operand" "rm")))]
+  "TARGET_APX_NF && TARGET_POPCNT"
+{
+#if TARGET_MACHO
+  return "%{nf%} popcnt\t{%1, %0|%0, %1}";
+#else
+  return "%{nf%} popcnt{<imodesuffix>}\t{%1, %0|%0, %1}";
+#endif
+}
+  "&& TARGET_AVOID_FALSE_DEP_FOR_BMI && epilogue_completed
+   && optimize_function_for_speed_p (cfun)
+   && !reg_mentioned_p (operands[0], operands[1])"
+  [(parallel
+    [(set (match_dup 0)
+          (popcount:SWI48 (match_dup 1)))
+     (unspec [(match_dup 0)] UNSPEC_INSN_FALSE_DEP)])]
+  "ix86_expand_clear (operands[0]);"
+  [(set_attr "prefix_rep" "1")
+   (set_attr "type" "bitmanip")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn_and_split "popcount<mode>2"
   [(set (match_operand:SWI48 0 "register_operand" "=r")
         (popcount:SWI48
@@ -21649,6 +21747,24 @@
 ; False dependency happens when destination is only updated by tzcnt,
 ; lzcnt or popcnt.  There is no false dependency when destination is
 ; also used in source.
+(define_insn "*popcount<mode>2_falsedep_nf"
+  [(set (match_operand:SWI48 0 "register_operand" "=r")
+        (popcount:SWI48
+          (match_operand:SWI48 1 "nonimmediate_operand" "rm")))
+   (unspec [(match_operand:SWI48 2 "register_operand" "0")]
+           UNSPEC_INSN_FALSE_DEP)]
+  "TARGET_APX_NF && TARGET_POPCNT"
+{
+#if TARGET_MACHO
+  return "%{nf%} popcnt\t{%1, %0|%0, %1}";
+#else
+  return "%{nf%} popcnt{<imodesuffix>}\t{%1, %0|%0, %1}";
+#endif
+}
+  [(set_attr "prefix_rep" "1")
+   (set_attr "type" "bitmanip")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn "*popcount<mode>2_falsedep"
   [(set (match_operand:SWI48 0 "register_operand" "=r")
         (popcount:SWI48
@@ -21806,6 +21922,22 @@
   DONE;
 })
 
+(define_insn "popcounthi2_nf"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+        (popcount:HI
+          (match_operand:HI 1 "nonimmediate_operand" "rm")))]
+  "TARGET_APX_NF && TARGET_POPCNT"
+{
+#if TARGET_MACHO
+  return "%{nf%} popcnt\t{%1, %0|%0, %1}";
+#else
+  return "%{nf%} popcnt{w}\t{%1, %0|%0, %1}";
+#endif
+}
+  [(set_attr "prefix_rep" "1")
+   (set_attr "type" "bitmanip")
+   (set_attr "mode" "HI")])
+
 (define_insn "popcounthi2"
   [(set (match_operand:HI 0 "register_operand" "=r")
         (popcount:HI
-- 
2.31.1