From: Tamar Christina
To: "pan2.li@intel.com", "gcc-patches@gcc.gnu.org"
CC: "juzhe.zhong@rivai.ai", "kito.cheng@gmail.com", "richard.guenther@gmail.com", "hongtao.liu@intel.com"
Subject: RE: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int
Date: Mon, 13 May 2024 09:09:33 +0000
References: <20240406120755.2692291-1-pan2.li@intel.com> <20240506144805.725379-1-pan2.li@intel.com>
In-Reply-To: <20240506144805.725379-1-pan2.li@intel.com>

Hi Pan,

> -----Original Message-----
> From: pan2.li@intel.com
> Sent: Monday, May 6, 2024 3:48 PM
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zhong@rivai.ai; kito.cheng@gmail.com; Tamar Christina; richard.guenther@gmail.com; hongtao.liu@intel.com; Pan Li
> Subject: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int
>
> From: Pan Li
>
> This patch would like to add the middle-end presentation for the
> saturation add. Aka set the result of add to the max when overflow.
> It will take the pattern similar as below.
>
> SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))
>
> Take uint8_t as example, we will have:
>
> * SAT_ADD (1, 254) => 255.
> * SAT_ADD (1, 255) => 255.
> * SAT_ADD (2, 255) => 255.
> * SAT_ADD (255, 255) => 255.
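
As a reference point for the quoted formula, here is a minimal C sketch of the branchless pattern specialised to uint8_t. The names sat_add_u8 and check_sat_add_u8_examples are illustrative only and are not part of the patch; the assertions simply restate the four examples listed above plus one non-overflowing case.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helper (not from the patch): unsigned saturating add
   written in the branchless form (x + y) | -((x + y) < x).  */
static uint8_t
sat_add_u8 (uint8_t x, uint8_t y)
{
  /* The sum wraps modulo 256 once it is truncated back to uint8_t.  */
  uint8_t sum = (uint8_t) (x + y);
  /* A wrapped sum is smaller than x; negating the 0/1 comparison
     result yields an all-ones mask (0xff) on overflow, 0 otherwise.  */
  uint8_t mask = (uint8_t) -(uint8_t) (sum < x);
  return (uint8_t) (sum | mask);
}

/* Restates the uint8_t examples from the patch description.  */
static void
check_sat_add_u8_examples (void)
{
  assert (sat_add_u8 (1, 254) == 255);
  assert (sat_add_u8 (1, 255) == 255);
  assert (sat_add_u8 (2, 255) == 255);
  assert (sat_add_u8 (255, 255) == 255);
  assert (sat_add_u8 (3, 4) == 7);   /* no overflow: plain sum */
}
```

The mask is all ones exactly when the wrapped sum is smaller than x, which is how the bitwise OR forces the result to the maximum value of the type.
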
>
> Given below example for the unsigned scalar integer uint64_t:
>
> uint64_t sat_add_u64 (uint64_t x, uint64_t y)
> {
>   return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
> }
>
> Before this patch:
> uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
> {
>   long unsigned int _1;
>   _Bool _2;
>   long unsigned int _3;
>   long unsigned int _4;
>   uint64_t _7;
>   long unsigned int _10;
>   __complex__ long unsigned int _11;
>
> ;; basic block 2, loop depth 0
> ;; pred: ENTRY
>   _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
>   _1 = REALPART_EXPR <_11>;
>   _10 = IMAGPART_EXPR <_11>;
>   _2 = _10 != 0;
>   _3 = (long unsigned int) _2;
>   _4 = -_3;
>   _7 = _1 | _4;
>   return _7;
> ;; succ: EXIT
>
> }
>
> After this patch:
> uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
> {
>   uint64_t _7;
>
> ;; basic block 2, loop depth 0
> ;; pred: ENTRY
>   _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
>   return _7;
> ;; succ: EXIT
> }
>
> We perform the tranform during widen_mult because that the sub-expr of
> SAT_ADD will be optimized to .ADD_OVERFLOW. We need to try the .SAT_ADD
> pattern first and then .ADD_OVERFLOW, or we may never catch the pattern
> .SAT_ADD. Meanwhile, the isel pass is after widen_mult and then we
> cannot perform the .SAT_ADD pattern match as the sub-expr will be
> optmized to .ADD_OVERFLOW first.
>
> The below tests are passed for this patch:
> 1. The riscv fully regression tests.
> 2. The aarch64 fully regression tests.
> 3. The x86 bootstrap tests.
> 4. The x86 fully regression tests.
>
> PR target/51492
> PR target/112600
>
> gcc/ChangeLog:
>
> * internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD
> to the return true switch case(s).
> * internal-fn.def (SAT_ADD): Add new signed optab SAT_ADD.
> * match.pd: Add unsigned SAT_ADD match.
> * optabs.def (OPTAB_NL): Remove fixed-point limitation for us/ssadd.
> * tree-ssa-math-opts.cc (gimple_unsigned_integer_sat_add): New extern
> func decl generated in match.pd match.
> (match_saturation_arith): New func impl to match the saturation arith.
> (math_opts_dom_walker::after_dom_children): Try match saturation
> arith.
>
> Signed-off-by: Pan Li
> ---
>  gcc/internal-fn.cc        |  1 +
>  gcc/internal-fn.def       |  2 ++
>  gcc/match.pd              | 28 ++++++++++++++++++++++++
>  gcc/optabs.def            |  4 ++--
>  gcc/tree-ssa-math-opts.cc | 46 +++++++++++++++++++++++++++++++++++++++
>  5 files changed, 79 insertions(+), 2 deletions(-)
>
> diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> index 0a7053c2286..73045ca8c8c 100644
> --- a/gcc/internal-fn.cc
> +++ b/gcc/internal-fn.cc
> @@ -4202,6 +4202,7 @@ commutative_binary_fn_p (internal_fn fn)
>      case IFN_UBSAN_CHECK_MUL:
>      case IFN_ADD_OVERFLOW:
>      case IFN_MUL_OVERFLOW:
> +    case IFN_SAT_ADD:
>      case IFN_VEC_WIDEN_PLUS:
>      case IFN_VEC_WIDEN_PLUS_LO:
>      case IFN_VEC_WIDEN_PLUS_HI:
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index 848bb9dbff3..25badbb86e5 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -275,6 +275,8 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (MULHS, ECF_CONST | ECF_NOTHROW, first,
>  DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW, first,
>                                smulhrs, umulhrs, binary)
>
> +DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_ADD, ECF_CONST, first, ssadd, usadd, binary)
> +
>  DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary)
>  DEF_INTERNAL_COND_FN (SUB, ECF_CONST, sub, binary)
>  DEF_INTERNAL_COND_FN (MUL, ECF_CONST, smul, binary)
> diff --git a/gcc/match.pd b/gcc/match.pd
> index d401e7503e6..7058e4cbe29 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3043,6 +3043,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>          || POINTER_TYPE_P (itype))
>        && wi::eq_p (wi::to_wide (int_cst), wi::max_value (itype))))))
>
> +/* Unsigned Saturation Add */
> +(match (usadd_left_part @0 @1)
> + (plus:c @0 @1)
> + (if (INTEGRAL_TYPE_P (type)
> +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> +      && types_match (type, TREE_TYPE (@0))
> +      && types_match (type, TREE_TYPE (@1)))))
> +
> +(match (usadd_right_part @0 @1)
> + (negate (convert (lt (plus:c @0 @1) @0)))
> + (if (INTEGRAL_TYPE_P (type)
> +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> +      && types_match (type, TREE_TYPE (@0))
> +      && types_match (type, TREE_TYPE (@1)))))
> +
> +(match (usadd_right_part @0 @1)
> + (negate (convert (gt @0 (plus:c @0 @1))))
> + (if (INTEGRAL_TYPE_P (type)
> +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> +      && types_match (type, TREE_TYPE (@0))
> +      && types_match (type, TREE_TYPE (@1)))))
> +
> +/* Unsigned saturation add, case 1 (branchless):
> +   SAT_U_ADD = (X + Y) | - ((X + Y) < X) or
> +   SAT_U_ADD = (X + Y) | - (X > (X + Y)). */
> +(match (unsigned_integer_sat_add @0 @1)
> + (bit_ior:c (usadd_left_part @0 @1) (usadd_right_part @0 @1)))
> +
>  /* x > y && x != XXX_MIN --> x > y
>     x > y && x == XXX_MIN --> false . */
>  (for eqne (eq ne)
> diff --git a/gcc/optabs.def b/gcc/optabs.def
> index ad14f9328b9..3f2cb46aff8 100644
> --- a/gcc/optabs.def
> +++ b/gcc/optabs.def
> @@ -111,8 +111,8 @@ OPTAB_NX(add_optab, "add$F$a3")
>  OPTAB_NX(add_optab, "add$Q$a3")
>  OPTAB_VL(addv_optab, "addv$I$a3", PLUS, "add", '3', gen_intv_fp_libfunc)
>  OPTAB_VX(addv_optab, "add$F$a3")
> -OPTAB_NL(ssadd_optab, "ssadd$Q$a3", SS_PLUS, "ssadd", '3', gen_signed_fixed_libfunc)
> -OPTAB_NL(usadd_optab, "usadd$Q$a3", US_PLUS, "usadd", '3', gen_unsigned_fixed_libfunc)
> +OPTAB_NL(ssadd_optab, "ssadd$a3", SS_PLUS, "ssadd", '3', gen_signed_fixed_libfunc)
> +OPTAB_NL(usadd_optab, "usadd$a3", US_PLUS, "usadd", '3', gen_unsigned_fixed_libfunc)
>  OPTAB_NL(sub_optab, "sub$P$a3", MINUS, "sub", '3', gen_int_fp_fixed_libfunc)
>  OPTAB_NX(sub_optab, "sub$F$a3")
>  OPTAB_NX(sub_optab, "sub$Q$a3")
> diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> index 705f4a4695a..35a46edc9f6 100644
> --- a/gcc/tree-ssa-math-opts.cc
> +++ b/gcc/tree-ssa-math-opts.cc
> @@ -4026,6 +4026,44 @@ arith_overflow_check_p (gimple *stmt, gimple *cast_stmt, gimple *&use_stmt,
>    return 0;
>  }
>
> +extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree));
> +
> +/*
> + * Try to match saturation arith pattern(s).
> + *   1. SAT_ADD (unsigned)
> + *      _7 = _4 + _6;
> + *      _8 = _4 > _7;
> + *      _9 = (long unsigned int) _8;
> + *      _10 = -_9;
> + *      _12 = _7 | _10;
> + *      =>
> + *      _12 = .SAT_ADD (_4, _6); */
> +static bool
> +match_saturation_arith (gimple_stmt_iterator *gsi, gimple *stmt,
> +                        bool *cfg_changed_p)
> +{
> +  gcall *call = NULL;
> +  bool changed_p = false;
> +
> +  gcc_assert (is_gimple_assign (stmt));
> +
> +  tree ops[2];
> +  tree lhs = gimple_assign_lhs (stmt);
> +
> +  if (gimple_unsigned_integer_sat_add (lhs, ops, NULL)
> +      && direct_internal_fn_supported_p (IFN_SAT_ADD, TREE_TYPE (lhs),
> +                                         OPTIMIZE_FOR_SPEED))

I think OPTIMIZE_FOR_BOTH is better here, since this is also a win when
optimizing for size.

> +    {
> +      call = gimple_build_call_internal (IFN_SAT_ADD, 2, ops[0], ops[1]);
> +      gimple_call_set_lhs (call, lhs);
> +      gsi_replace (gsi, call, true);
> +      changed_p = true;
> +      *cfg_changed_p = changed_p;
> +    }
> +
> +  return changed_p;
> +}
> +
>  /* Recognize for unsigned x
>     x = y - z;
>     if (x > y)
> @@ -5886,6 +5924,14 @@ math_opts_dom_walker::after_dom_children (basic_block bb)
>
>    fma_deferring_state fma_state (param_avoid_fma_max_bits > 0);
>
> +  for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> +    {
> +      gimple *stmt = gsi_stmt (gsi);
> +
> +      if (is_gimple_assign (stmt))
> +        match_saturation_arith (&gsi, stmt, m_cfg_changed_p);
> +    }
> +

Hmm, why do you iterate independently over the statements? The block below
already visits every statement, doesn't it?
The root of your match is a BIT_IOR_EXPR expression, so I think you just need
to change the entry below to:

        case BIT_IOR_EXPR:
          match_saturation_arith (&gsi, stmt, m_cfg_changed_p);
          /* fall-through */
        case BIT_XOR_EXPR:
          match_uaddc_usubc (&gsi, stmt, code);
          break;

Patch is looking good! Thanks again for working on this.

Regards,
Tamar

>    for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi);)
>      {
>        gimple *stmt = gsi_stmt (gsi);
> --
> 2.34.1
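
To illustrate the control flow of the suggested switch entry, here is a toy C sketch. Everything in it (the toy_code enum, the RAN_* flags and toy_dispatch) is a hypothetical stand-in rather than GCC code. It only shows that a BIT_IOR_EXPR statement would be offered to the saturation matcher first and then fall through to the existing uaddc/usubc matcher, while BIT_XOR_EXPR keeps its current behaviour.

```c
/* Toy stand-ins for the GIMPLE tree codes; not GCC's real enum.  */
enum toy_code { TOY_BIT_IOR, TOY_BIT_XOR, TOY_PLUS };

/* Flags recording which (hypothetical) matcher was tried.  */
enum { RAN_SAT_ARITH = 1, RAN_UADDC_USUBC = 2 };

/* Returns a bitmask of the matchers tried for CODE, mirroring the
   shape of the suggested switch entry.  */
static int
toy_dispatch (enum toy_code code)
{
  int tried = 0;

  switch (code)
    {
    case TOY_BIT_IOR:
      tried |= RAN_SAT_ARITH;     /* stands in for match_saturation_arith */
      /* fall-through */
    case TOY_BIT_XOR:
      tried |= RAN_UADDC_USUBC;   /* stands in for match_uaddc_usubc */
      break;
    default:
      break;
    }

  return tried;
}
```

With this shape no separate walk over the statements is needed, because the dispatch happens inside the loop that already visits each assignment.
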