From mboxrd@z Thu Jan 1 00:00:00 1970
From: Tamar Christina
To: "Li, Pan2", "gcc-patches@gcc.gnu.org"
CC: "juzhe.zhong@rivai.ai", "kito.cheng@gmail.com", "richard.guenther@gmail.com", "Liu, Hongtao"
Subject: RE: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int
Date: Mon, 13 May 2024 15:03:18 +0000
References: <20240406120755.2692291-1-pan2.li@intel.com> <20240506144805.725379-1-pan2.li@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"

> Thanks Tamar for comments.
>
> > I think OPTIMIZE_FOR_BOTH is better here, since this is a win also when
> > optimizing for size.
>
> Sure thing, let me update it in v5.
>
> > Hmm why do you iterate independently over the statements?  The block below
> > already visits every statement, doesn't it?
>
> Because it will hit .ADD_OVERFLOW first, then it will never hit SAT_ADD as the
> shape changed, or shall we put it to the previous pass?
>

That's just a matter of matching the overflow as an additional case, no?
i.e. you can add an overload for unsigned_integer_sat_add matching the
IFN_ADD_OVERFLOW and using the realpart and imagpart helpers.

I think that would be better, as it avoids visiting all the statements twice
and also extends the matching to some __builtin_add_overflow uses, and it
should be fairly simple.

> > The root of your match is a BIT_IOR_EXPR expression, so I think you just need to
> > change the entry below to:
> >
> >       case BIT_IOR_EXPR:
> >         match_saturation_arith (&gsi, stmt, m_cfg_changed_p);
> >         /* fall-through */
> >       case BIT_XOR_EXPR:
> >         match_uaddc_usubc (&gsi, stmt, code);
> >         break;
>
> There are other shapes (not covered in this patch) of SAT_ADD, like the branch
> version below, so the IOR is only one of the possible roots.  That is why I did
> not add the case here.  Shall we then add a case for each shape here?  Both
> work for me.
>

Yeah, I think that's better than iterating over the statements twice.  It also
fits better in the existing code.

Tamar.

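For reference, the suggestion above to also match the IFN_ADD_OVERFLOW shape
can be motivated at the source level.  The following is only a minimal sketch,
not part of the patch: the function names are hypothetical, and it assumes a
compiler that provides __builtin_add_overflow (GCC or Clang).  The wrapped sum
plays the role of the real part of .ADD_OVERFLOW and the overflow flag plays
the role of the imaginary part.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Branchless form from the patch description:
   (x + y) | -(T)((T)(x + y) < x).  */
static uint32_t sat_add_branchless(uint32_t x, uint32_t y)
{
    return (x + y) | -(uint32_t)((uint32_t)(x + y) < x);
}

/* The same operation written with the overflow builtin.  "sum" is the
   wrapped result (the real part of .ADD_OVERFLOW), "ovf" is the overflow
   flag (the imaginary part).  Hypothetical helper for illustration.  */
static uint32_t sat_add_overflow(uint32_t x, uint32_t y)
{
    uint32_t sum;
    bool ovf = __builtin_add_overflow(x, y, &sum);
    return sum | -(uint32_t)ovf;
}

/* Compare both forms on a few boundary values.  */
static void check_forms_agree(void)
{
    static const uint32_t v[] = { 0u, 1u, 2u, 0x7fffffffu, 0x80000000u,
                                  0xfffffffeu, 0xffffffffu };
    const int n = (int)(sizeof v / sizeof v[0]);

    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(sat_add_branchless(v[i], v[j])
                   == sat_add_overflow(v[i], v[j]));
}
```

Calling check_forms_agree () from a test driver exercises both forms.  Since
they agree, a matcher rooted at the IOR that accepts either shape would cover
code written with __builtin_add_overflow as well.
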
> #define SAT_ADD_U_1(T) \
> T sat_add_u_1_##T(T x, T y) \
> { \
>   return (T)(x + y) >= x ? (x + y) : -1; \
> }
>
> SAT_ADD_U_1(uint32_t)
>
> Pan
>
>
> -----Original Message-----
> From: Tamar Christina
> Sent: Monday, May 13, 2024 5:10 PM
> To: Li, Pan2; gcc-patches@gcc.gnu.org
> Cc: juzhe.zhong@rivai.ai; kito.cheng@gmail.com; richard.guenther@gmail.com;
> Liu, Hongtao
> Subject: RE: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned
> scalar int
>
> Hi Pan,
>
> > -----Original Message-----
> > From: pan2.li@intel.com
> > Sent: Monday, May 6, 2024 3:48 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: juzhe.zhong@rivai.ai; kito.cheng@gmail.com; Tamar Christina;
> > richard.guenther@gmail.com; hongtao.liu@intel.com; Pan Li
> > Subject: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned
> > scalar int
> >
> > From: Pan Li
> >
> > This patch adds the middle-end representation for the saturating add,
> > i.e. the result of the add is set to the maximum value on overflow.
> > It matches a pattern similar to the one below.
> >
> > SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))
> >
> > Taking uint8_t as an example, we will have:
> >
> > * SAT_ADD (1, 254)   => 255.
> > * SAT_ADD (1, 255)   => 255.
> > * SAT_ADD (2, 255)   => 255.
> > * SAT_ADD (255, 255) => 255.
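
The uint8_t values listed above can be checked with a small stand-alone
program.  This is a minimal sketch rather than code from the patch: the
function names sat_add_u8 and sat_add_u8_branch are made up for illustration.
The first follows the branchless formula, the second follows the shape of the
SAT_ADD_U_1 macro quoted earlier in the thread.  The explicit casts are needed
because uint8_t operands are promoted to int in C.

```c
#include <assert.h>
#include <stdint.h>

/* Branchless form: (x + y) | -(T)((T)(x + y) < x), with T = uint8_t.  */
static uint8_t sat_add_u8(uint8_t x, uint8_t y)
{
    uint8_t sum = (uint8_t)(x + y);            /* wrapped 8-bit sum */
    return (uint8_t)(sum | -(uint8_t)(sum < x));
}

/* Branch form, following the shape of SAT_ADD_U_1.  */
static uint8_t sat_add_u8_branch(uint8_t x, uint8_t y)
{
    return (uint8_t)(x + y) >= x ? (uint8_t)(x + y) : UINT8_MAX;
}

/* Check the four listed cases, then every uint8_t pair against a
   reference computed in a wider type.  */
static void check_sat_add_u8(void)
{
    assert(sat_add_u8(1, 254) == 255);
    assert(sat_add_u8(1, 255) == 255);
    assert(sat_add_u8(2, 255) == 255);
    assert(sat_add_u8(255, 255) == 255);

    for (unsigned x = 0; x <= 255; x++)
        for (unsigned y = 0; y <= 255; y++) {
            unsigned wide = x + y;
            uint8_t expect = wide > 255 ? 255 : (uint8_t)wide;
            assert(sat_add_u8((uint8_t)x, (uint8_t)y) == expect);
            assert(sat_add_u8_branch((uint8_t)x, (uint8_t)y) == expect);
        }
}
```

Note that SAT_ADD (1, 254) does not overflow; 255 is simply the exact sum,
while the other three cases saturate.
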
> >
> > Given the example below for the unsigned scalar integer uint64_t:
> >
> > uint64_t sat_add_u64 (uint64_t x, uint64_t y)
> > {
> >   return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
> > }
> >
> > Before this patch:
> > uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
> > {
> >   long unsigned int _1;
> >   _Bool _2;
> >   long unsigned int _3;
> >   long unsigned int _4;
> >   uint64_t _7;
> >   long unsigned int _10;
> >   __complex__ long unsigned int _11;
> >
> > ;;   basic block 2, loop depth 0
> > ;;    pred:       ENTRY
> >   _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
> >   _1 = REALPART_EXPR <_11>;
> >   _10 = IMAGPART_EXPR <_11>;
> >   _2 = _10 != 0;
> >   _3 = (long unsigned int) _2;
> >   _4 = -_3;
> >   _7 = _1 | _4;
> >   return _7;
> > ;;    succ:       EXIT
> >
> > }
> >
> > After this patch:
> > uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
> > {
> >   uint64_t _7;
> >
> > ;;   basic block 2, loop depth 0
> > ;;    pred:       ENTRY
> >   _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
> >   return _7;
> > ;;    succ:       EXIT
> > }
> >
> > We perform the transform during widen_mult because the sub-expr of
> > SAT_ADD will be optimized to .ADD_OVERFLOW.  We need to try the .SAT_ADD
> > pattern first and then .ADD_OVERFLOW, or we may never catch the pattern
> > .SAT_ADD.  Meanwhile, the isel pass is after widen_mult and then we
> > cannot perform the .SAT_ADD pattern match as the sub-expr will be
> > optimized to .ADD_OVERFLOW first.
> >
> > The tests below passed for this patch:
> > 1. The riscv full regression tests.
> > 2. The aarch64 full regression tests.
> > 3. The x86 bootstrap tests.
> > 4. The x86 full regression tests.
> >
> >         PR target/51492
> >         PR target/112600
> >
> > gcc/ChangeLog:
> >
> >         * internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD
> >         to the return true switch case(s).
> >         * internal-fn.def (SAT_ADD): Add new signed optab SAT_ADD.
> >         * match.pd: Add unsigned SAT_ADD match.
> >         * optabs.def (OPTAB_NL): Remove fixed-point limitation for us/ssadd.
> >         * tree-ssa-math-opts.cc (gimple_unsigned_integer_sat_add): New extern
> >         func decl generated in match.pd match.
> >         (match_saturation_arith): New func impl to match the saturation arith.
> >         (math_opts_dom_walker::after_dom_children): Try match saturation
> >         arith.
> >
> > Signed-off-by: Pan Li
> > ---
> >  gcc/internal-fn.cc        |  1 +
> >  gcc/internal-fn.def       |  2 ++
> >  gcc/match.pd              | 28 ++++++++++++++++++++++++
> >  gcc/optabs.def            |  4 ++--
> >  gcc/tree-ssa-math-opts.cc | 46 +++++++++++++++++++++++++++++++++++++++
> >  5 files changed, 79 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> > index 0a7053c2286..73045ca8c8c 100644
> > --- a/gcc/internal-fn.cc
> > +++ b/gcc/internal-fn.cc
> > @@ -4202,6 +4202,7 @@ commutative_binary_fn_p (internal_fn fn)
> >      case IFN_UBSAN_CHECK_MUL:
> >      case IFN_ADD_OVERFLOW:
> >      case IFN_MUL_OVERFLOW:
> > +    case IFN_SAT_ADD:
> >      case IFN_VEC_WIDEN_PLUS:
> >      case IFN_VEC_WIDEN_PLUS_LO:
> >      case IFN_VEC_WIDEN_PLUS_HI:
> > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> > index 848bb9dbff3..25badbb86e5 100644
> > --- a/gcc/internal-fn.def
> > +++ b/gcc/internal-fn.def
> > @@ -275,6 +275,8 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (MULHS, ECF_CONST | ECF_NOTHROW, first,
> >  DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW, first,
> >                                smulhrs, umulhrs, binary)
> >
> > +DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_ADD, ECF_CONST, first, ssadd, usadd, binary)
> > +
> >  DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary)
> >  DEF_INTERNAL_COND_FN (SUB, ECF_CONST, sub, binary)
> >  DEF_INTERNAL_COND_FN (MUL, ECF_CONST, smul, binary)
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index d401e7503e6..7058e4cbe29 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -3043,6 +3043,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >         || POINTER_TYPE_P (itype))
> >        && wi::eq_p (wi::to_wide (int_cst), wi::max_value (itype))))))
> >
> > +/* Unsigned Saturation Add */
> > +(match (usadd_left_part @0 @1)
> > + (plus:c @0 @1)
> > + (if (INTEGRAL_TYPE_P (type)
> > +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@1)))))
> > +
> > +(match (usadd_right_part @0 @1)
> > + (negate (convert (lt (plus:c @0 @1) @0)))
> > + (if (INTEGRAL_TYPE_P (type)
> > +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@1)))))
> > +
> > +(match (usadd_right_part @0 @1)
> > + (negate (convert (gt @0 (plus:c @0 @1))))
> > + (if (INTEGRAL_TYPE_P (type)
> > +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@1)))))
> > +
> > +/* Unsigned saturation add, case 1 (branchless):
> > +   SAT_U_ADD = (X + Y) | - ((X + Y) < X) or
> > +   SAT_U_ADD = (X + Y) | - (X > (X + Y)).  */
> > +(match (unsigned_integer_sat_add @0 @1)
> > + (bit_ior:c (usadd_left_part @0 @1) (usadd_right_part @0 @1)))
> > +
> >  /* x >  y  &&  x != XXX_MIN  -->  x > y
> >     x >  y  &&  x == XXX_MIN  -->  false . */
> >  (for eqne (eq ne)
> > diff --git a/gcc/optabs.def b/gcc/optabs.def
> > index ad14f9328b9..3f2cb46aff8 100644
> > --- a/gcc/optabs.def
> > +++ b/gcc/optabs.def
> > @@ -111,8 +111,8 @@ OPTAB_NX(add_optab, "add$F$a3")
> >  OPTAB_NX(add_optab, "add$Q$a3")
> >  OPTAB_VL(addv_optab, "addv$I$a3", PLUS, "add", '3', gen_intv_fp_libfunc)
> >  OPTAB_VX(addv_optab, "add$F$a3")
> > -OPTAB_NL(ssadd_optab, "ssadd$Q$a3", SS_PLUS, "ssadd", '3', gen_signed_fixed_libfunc)
> > -OPTAB_NL(usadd_optab, "usadd$Q$a3", US_PLUS, "usadd", '3', gen_unsigned_fixed_libfunc)
> > +OPTAB_NL(ssadd_optab, "ssadd$a3", SS_PLUS, "ssadd", '3', gen_signed_fixed_libfunc)
> > +OPTAB_NL(usadd_optab, "usadd$a3", US_PLUS, "usadd", '3', gen_unsigned_fixed_libfunc)
> >  OPTAB_NL(sub_optab, "sub$P$a3", MINUS, "sub", '3', gen_int_fp_fixed_libfunc)
> >  OPTAB_NX(sub_optab, "sub$F$a3")
> >  OPTAB_NX(sub_optab, "sub$Q$a3")
> > diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> > index 705f4a4695a..35a46edc9f6 100644
> > --- a/gcc/tree-ssa-math-opts.cc
> > +++ b/gcc/tree-ssa-math-opts.cc
> > @@ -4026,6 +4026,44 @@ arith_overflow_check_p (gimple *stmt, gimple *cast_stmt, gimple *&use_stmt,
> >    return 0;
> >  }
> >
> > +extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree));
> > +
> > +/*
> > + * Try to match saturation arith pattern(s).
> > + *   1. SAT_ADD (unsigned)
> > + *      _7 = _4 + _6;
> > + *      _8 = _4 > _7;
> > + *      _9 = (long unsigned int) _8;
> > + *      _10 = -_9;
> > + *      _12 = _7 | _10;
> > + *      =>
> > + *      _12 = .SAT_ADD (_4, _6);  */
> > +static bool
> > +match_saturation_arith (gimple_stmt_iterator *gsi, gimple *stmt,
> > +                        bool *cfg_changed_p)
> > +{
> > +  gcall *call = NULL;
> > +  bool changed_p = false;
> > +
> > +  gcc_assert (is_gimple_assign (stmt));
> > +
> > +  tree ops[2];
> > +  tree lhs = gimple_assign_lhs (stmt);
> > +
> > +  if (gimple_unsigned_integer_sat_add (lhs, ops, NULL)
> > +      && direct_internal_fn_supported_p (IFN_SAT_ADD, TREE_TYPE (lhs),
> > +                                         OPTIMIZE_FOR_SPEED))
>
> I think OPTIMIZE_FOR_BOTH is better here, since this is a win also when
> optimizing for size.
>
> > +    {
> > +      call = gimple_build_call_internal (IFN_SAT_ADD, 2, ops[0], ops[1]);
> > +      gimple_call_set_lhs (call, lhs);
> > +      gsi_replace (gsi, call, true);
> > +      changed_p = true;
> > +      *cfg_changed_p = changed_p;
> > +    }
> > +
> > +  return changed_p;
> > +}
> > +
> >  /* Recognize for unsigned x
> >     x = y - z;
> >     if (x > y)
> > @@ -5886,6 +5924,14 @@ math_opts_dom_walker::after_dom_children (basic_block bb)
> >
> >    fma_deferring_state fma_state (param_avoid_fma_max_bits > 0);
> >
> > +  for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > +    {
> > +      gimple *stmt = gsi_stmt (gsi);
> > +
> > +      if (is_gimple_assign (stmt))
> > +        match_saturation_arith (&gsi, stmt, m_cfg_changed_p);
> > +    }
> > +
>
> Hmm why do you iterate independently over the statements?  The block below
> already visits every statement, doesn't it?
>
> The root of your match is a BIT_IOR_EXPR expression, so I think you just need to
> change the entry below to:
>
>       case BIT_IOR_EXPR:
>         match_saturation_arith (&gsi, stmt, m_cfg_changed_p);
>         /* fall-through */
>       case BIT_XOR_EXPR:
>         match_uaddc_usubc (&gsi, stmt, code);
>         break;
>
> Patch is looking good! Thanks again for working on this.
>
> Regards,
> Tamar
>
> >    for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi);)
> >      {
> >        gimple *stmt = gsi_stmt (gsi);
> > --
> > 2.34.1