From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Li, Pan2"
To: Tamar Christina , "gcc-patches@gcc.gnu.org"
CC: "juzhe.zhong@rivai.ai" , "kito.cheng@gmail.com" , "richard.guenther@gmail.com" , "Liu, Hongtao"
Subject: RE: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int
Date: Tue, 14 May 2024 01:50:45 +0000
References: <20240406120755.2692291-1-pan2.li@intel.com> <20240506144805.725379-1-pan2.li@intel.com>
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09)

> That's just a matter of matching the overflow as an additional case, no?
> i.e. you can add an overload for unsigned_integer_sat_add matching the
> IFN_ADD_OVERFLOW and using the realpart and imagpart helpers.
> I think that would be better as it avoids visiting all the statements twice but also
> extends the matching to some __builtin_add_overflow uses and should be fairly
> simple.

Thanks Tamar, got the point here; I will try an overload of unsigned_integer_sat_add for that.

> Yeah, I think that's better than iterating over the statements twice.  It also fits better
> in the existing code.

Ack, will follow the existing code.

Pan

-----Original Message-----
From: Tamar Christina
Sent: Monday, May 13, 2024 11:03 PM
To: Li, Pan2; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; kito.cheng@gmail.com; richard.guenther@gmail.com; Liu, Hongtao
Subject: RE: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int

>
> Thanks Tamar for comments.
>
> > I think OPTIMIZE_FOR_BOTH is better here, since this is a win also when
> > optimizing for size.
>
> Sure thing, let me update it in v5.
>
> > Hmm why do you iterate independently over the statements? The block below
> > already visits every statement, doesn't it?
>
> Because it will hit .ADD_OVERFLOW first, then it will never hit SAT_ADD as the
> shape changed, or shall we put it to the previous pass ?
>

That's just a matter of matching the overflow as an additional case, no?
i.e. you can add an overload for unsigned_integer_sat_add matching the
IFN_ADD_OVERFLOW and using the realpart and imagpart helpers.

I think that would be better as it avoids visiting all the statements twice but also
extends the matching to some __builtin_add_overflow uses and should be fairly
simple.

> > The root of your match is a BIT_IOR_EXPR expression, so I think you just need to
> > change the entry below to:
> >
> >   case BIT_IOR_EXPR:
> >     match_saturation_arith (&gsi, stmt, m_cfg_changed_p);
> >     /* fall-through */
> >   case BIT_XOR_EXPR:
> >     match_uaddc_usubc (&gsi, stmt, code);
> >     break;
>
> There are other shapes of SAT_ADD (not covered in this patch), like the branch
> version below, so the IOR is only one of the roots. That is why I didn't
> add a case here. Then, shall we add a case for each shape here? Both work for me.
>

Yeah, I think that's better than iterating over the statements twice.  It also fits better
in the existing code.

Tamar.

> #define SAT_ADD_U_1(T) \
> T sat_add_u_1_##T(T x, T y) \
> { \
>   return (T)(x + y) >= x ? (x + y) : -1; \
> }
>
> SAT_ADD_U_1(uint32_t)
>
> Pan
>
>
> -----Original Message-----
> From: Tamar Christina
> Sent: Monday, May 13, 2024 5:10 PM
> To: Li, Pan2; gcc-patches@gcc.gnu.org
> Cc: juzhe.zhong@rivai.ai; kito.cheng@gmail.com; richard.guenther@gmail.com;
> Liu, Hongtao
> Subject: RE: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned
> scalar int
>
> Hi Pan,
>
> > -----Original Message-----
> > From: pan2.li@intel.com
> > Sent: Monday, May 6, 2024 3:48 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: juzhe.zhong@rivai.ai; kito.cheng@gmail.com; Tamar Christina;
> > richard.guenther@gmail.com; hongtao.liu@intel.com; Pan Li
> > Subject: [PATCH v4 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int
> >
> > From: Pan Li
> >
> > This patch would like to add the middle-end representation for the
> > saturation add, i.e. set the result of the add to the max on overflow.
> > It will take a pattern similar to the one below.
> >
> > SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))
> >
> > Take uint8_t as an example, we will have:
> >
> > * SAT_ADD (1, 254) => 255.
> > * SAT_ADD (1, 255) => 255.
> > * SAT_ADD (2, 255) => 255.
> > * SAT_ADD (255, 255) => 255.
> >
> > Given the example below for the unsigned scalar integer uint64_t:
> >
> > uint64_t sat_add_u64 (uint64_t x, uint64_t y)
> > {
> >   return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
> > }
> >
> > Before this patch:
> > uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
> > {
> >   long unsigned int _1;
> >   _Bool _2;
> >   long unsigned int _3;
> >   long unsigned int _4;
> >   uint64_t _7;
> >   long unsigned int _10;
> >   __complex__ long unsigned int _11;
> >
> > ;; basic block 2, loop depth 0
> > ;; pred: ENTRY
> >   _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
> >   _1 = REALPART_EXPR <_11>;
> >   _10 = IMAGPART_EXPR <_11>;
> >   _2 = _10 != 0;
> >   _3 = (long unsigned int) _2;
> >   _4 = -_3;
> >   _7 = _1 | _4;
> >   return _7;
> > ;; succ: EXIT
> >
> > }
> >
> > After this patch:
> > uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
> > {
> >   uint64_t _7;
> >
> > ;; basic block 2, loop depth 0
> > ;; pred: ENTRY
> >   _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
> >   return _7;
> > ;; succ: EXIT
> > }
> >
> > We perform the transform during widen_mult because the sub-expr of
> > SAT_ADD will be optimized to .ADD_OVERFLOW.  We need to try the .SAT_ADD
> > pattern first and then .ADD_OVERFLOW, or we may never catch the pattern
> > .SAT_ADD.  Meanwhile, the isel pass is after widen_mult and then we
> > cannot perform the .SAT_ADD pattern match as the sub-expr will be
> > optimized to .ADD_OVERFLOW first.
> >
> > The tests below passed for this patch:
> > 1. The riscv full regression tests.
> > 2. The aarch64 full regression tests.
> > 3. The x86 bootstrap tests.
> > 4. The x86 full regression tests.
> >
> >   PR target/51492
> >   PR target/112600
> >
> > gcc/ChangeLog:
> >
> >   * internal-fn.cc (commutative_binary_fn_p): Add type IFN_SAT_ADD
> >   to the return true switch case(s).
> >   * internal-fn.def (SAT_ADD): Add new signed optab SAT_ADD.
> >   * match.pd: Add unsigned SAT_ADD match.
> >   * optabs.def (OPTAB_NL): Remove fixed-point limitation for us/ssadd.
> >   * tree-ssa-math-opts.cc (gimple_unsigned_integer_sat_add): New extern
> >   func decl generated in match.pd match.
> >   (match_saturation_arith): New func impl to match the saturation arith.
> >   (math_opts_dom_walker::after_dom_children): Try match saturation
> >   arith.
> >
> > Signed-off-by: Pan Li
> > ---
> >  gcc/internal-fn.cc        |  1 +
> >  gcc/internal-fn.def       |  2 ++
> >  gcc/match.pd              | 28 ++++++++++++++++++++++++
> >  gcc/optabs.def            |  4 ++--
> >  gcc/tree-ssa-math-opts.cc | 46 +++++++++++++++++++++++++++++++++++++++
> >  5 files changed, 79 insertions(+), 2 deletions(-)
> >
> > diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
> > index 0a7053c2286..73045ca8c8c 100644
> > --- a/gcc/internal-fn.cc
> > +++ b/gcc/internal-fn.cc
> > @@ -4202,6 +4202,7 @@ commutative_binary_fn_p (internal_fn fn)
> >      case IFN_UBSAN_CHECK_MUL:
> >      case IFN_ADD_OVERFLOW:
> >      case IFN_MUL_OVERFLOW:
> > +    case IFN_SAT_ADD:
> >      case IFN_VEC_WIDEN_PLUS:
> >      case IFN_VEC_WIDEN_PLUS_LO:
> >      case IFN_VEC_WIDEN_PLUS_HI:
> > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> > index 848bb9dbff3..25badbb86e5 100644
> > --- a/gcc/internal-fn.def
> > +++ b/gcc/internal-fn.def
> > @@ -275,6 +275,8 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (MULHS, ECF_CONST | ECF_NOTHROW, first,
> >  DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW, first,
> >                                smulhrs, umulhrs, binary)
> >
> > +DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_ADD, ECF_CONST, first, ssadd, usadd, binary)
> > +
> >  DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary)
> >  DEF_INTERNAL_COND_FN (SUB, ECF_CONST, sub, binary)
> >  DEF_INTERNAL_COND_FN (MUL, ECF_CONST, smul, binary)
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index d401e7503e6..7058e4cbe29 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -3043,6 +3043,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >        || POINTER_TYPE_P (itype))
> >        && wi::eq_p (wi::to_wide (int_cst), wi::max_value (itype))))))
> >
> > +/* Unsigned Saturation Add */
> > +(match (usadd_left_part @0 @1)
> > + (plus:c @0 @1)
> > + (if (INTEGRAL_TYPE_P (type)
> > +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@1)))))
> > +
> > +(match (usadd_right_part @0 @1)
> > + (negate (convert (lt (plus:c @0 @1) @0)))
> > + (if (INTEGRAL_TYPE_P (type)
> > +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@1)))))
> > +
> > +(match (usadd_right_part @0 @1)
> > + (negate (convert (gt @0 (plus:c @0 @1))))
> > + (if (INTEGRAL_TYPE_P (type)
> > +      && TYPE_UNSIGNED (TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@0))
> > +      && types_match (type, TREE_TYPE (@1)))))
> > +
> > +/* Unsigned saturation add, case 1 (branchless):
> > +   SAT_U_ADD = (X + Y) | - ((X + Y) < X) or
> > +   SAT_U_ADD = (X + Y) | - (X > (X + Y)).  */
> > +(match (unsigned_integer_sat_add @0 @1)
> > + (bit_ior:c (usadd_left_part @0 @1) (usadd_right_part @0 @1)))
> > +
> >  /* x > y && x != XXX_MIN --> x > y
> >     x > y && x == XXX_MIN --> false . */
> >  (for eqne (eq ne)
> > diff --git a/gcc/optabs.def b/gcc/optabs.def
> > index ad14f9328b9..3f2cb46aff8 100644
> > --- a/gcc/optabs.def
> > +++ b/gcc/optabs.def
> > @@ -111,8 +111,8 @@ OPTAB_NX(add_optab, "add$F$a3")
> >  OPTAB_NX(add_optab, "add$Q$a3")
> >  OPTAB_VL(addv_optab, "addv$I$a3", PLUS, "add", '3', gen_intv_fp_libfunc)
> >  OPTAB_VX(addv_optab, "add$F$a3")
> > -OPTAB_NL(ssadd_optab, "ssadd$Q$a3", SS_PLUS, "ssadd", '3', gen_signed_fixed_libfunc)
> > -OPTAB_NL(usadd_optab, "usadd$Q$a3", US_PLUS, "usadd", '3', gen_unsigned_fixed_libfunc)
> > +OPTAB_NL(ssadd_optab, "ssadd$a3", SS_PLUS, "ssadd", '3', gen_signed_fixed_libfunc)
> > +OPTAB_NL(usadd_optab, "usadd$a3", US_PLUS, "usadd", '3', gen_unsigned_fixed_libfunc)
> >  OPTAB_NL(sub_optab, "sub$P$a3", MINUS, "sub", '3', gen_int_fp_fixed_libfunc)
> >  OPTAB_NX(sub_optab, "sub$F$a3")
> >  OPTAB_NX(sub_optab, "sub$Q$a3")
> > diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
> > index 705f4a4695a..35a46edc9f6 100644
> > --- a/gcc/tree-ssa-math-opts.cc
> > +++ b/gcc/tree-ssa-math-opts.cc
> > @@ -4026,6 +4026,44 @@ arith_overflow_check_p (gimple *stmt, gimple *cast_stmt, gimple *&use_stmt,
> >    return 0;
> >  }
> >
> > +extern bool gimple_unsigned_integer_sat_add (tree, tree*, tree (*)(tree));
> > +
> > +/*
> > + * Try to match saturation arith pattern(s).
> > + *   1. SAT_ADD (unsigned)
> > + *      _7 = _4 + _6;
> > + *      _8 = _4 > _7;
> > + *      _9 = (long unsigned int) _8;
> > + *      _10 = -_9;
> > + *      _12 = _7 | _10;
> > + *      =>
> > + *      _12 = .SAT_ADD (_4, _6);  */
> > +static bool
> > +match_saturation_arith (gimple_stmt_iterator *gsi, gimple *stmt,
> > +                        bool *cfg_changed_p)
> > +{
> > +  gcall *call = NULL;
> > +  bool changed_p = false;
> > +
> > +  gcc_assert (is_gimple_assign (stmt));
> > +
> > +  tree ops[2];
> > +  tree lhs = gimple_assign_lhs (stmt);
> > +
> > +  if (gimple_unsigned_integer_sat_add (lhs, ops, NULL)
> > +      && direct_internal_fn_supported_p (IFN_SAT_ADD, TREE_TYPE (lhs),
> > +                                         OPTIMIZE_FOR_SPEED))
>
> I think OPTIMIZE_FOR_BOTH is better here, since this is a win also when optimizing
> for size.
> > +    {
> > +      call = gimple_build_call_internal (IFN_SAT_ADD, 2, ops[0], ops[1]);
> > +      gimple_call_set_lhs (call, lhs);
> > +      gsi_replace (gsi, call, true);
> > +      changed_p = true;
> > +      *cfg_changed_p = changed_p;
> > +    }
> > +
> > +  return changed_p;
> > +}
> > +
> >  /* Recognize for unsigned x
> >     x = y - z;
> >     if (x > y)
> > @@ -5886,6 +5924,14 @@ math_opts_dom_walker::after_dom_children (basic_block bb)
> >
> >    fma_deferring_state fma_state (param_avoid_fma_max_bits > 0);
> >
> > +  for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > +    {
> > +      gimple *stmt = gsi_stmt (gsi);
> > +
> > +      if (is_gimple_assign (stmt))
> > +        match_saturation_arith (&gsi, stmt, m_cfg_changed_p);
> > +    }
> > +
>
> Hmm why do you iterate independently over the statements? The block below
> already visits every statement doesn't it?
>
> The root of your match is a BIT_IOR_EXPR expression, so I think you just need to
> change the entry below to:
>
>   case BIT_IOR_EXPR:
>     match_saturation_arith (&gsi, stmt, m_cfg_changed_p);
>     /* fall-through */
>   case BIT_XOR_EXPR:
>     match_uaddc_usubc (&gsi, stmt, code);
>     break;
>
> Patch is looking good! Thanks again for working on this.
>
> Regards,
> Tamar
>
> >    for (gsi = gsi_after_labels (bb); !gsi_end_p (gsi);)
> >      {
> >        gimple *stmt = gsi_stmt (gsi);
> > --
> > 2.34.1
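
For reference, the source-level shapes discussed in this thread can be compared in a small standalone C sketch: the branchless form from the commit message, the branch form (SAT_ADD_U_1), and a variant written with __builtin_add_overflow, which is roughly the source-level counterpart of the .ADD_OVERFLOW shape. The function names and the exhaustive uint8_t check are illustrative assumptions only; they are not part of the patch or of the GCC testsuite.

```c
#include <assert.h>
#include <stdint.h>

/* Branchless form from the commit message:
   SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)).  */
static uint8_t
sat_add_branchless_u8 (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);
  /* (sum < x) is 1 exactly when the 8-bit add wrapped; negating it
     yields an all-ones mask after truncation back to uint8_t.  */
  uint8_t mask = (uint8_t) -(uint8_t) (sum < x);
  return (uint8_t) (sum | mask);
}

/* Branch form, the SAT_ADD_U_1 shape quoted in the thread.  */
static uint8_t
sat_add_branch_u8 (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);
  return sum >= x ? sum : (uint8_t) -1;
}

/* Variant using the GCC/Clang __builtin_add_overflow builtin, which
   returns true when the exact sum does not fit in *res.  */
static uint8_t
sat_add_builtin_u8 (uint8_t x, uint8_t y)
{
  uint8_t sum;
  return __builtin_add_overflow (x, y, &sum) ? UINT8_MAX : sum;
}

/* Hypothetical helper: count the inputs on which any of the three
   forms disagrees with a reference computed in a wider type.  */
static unsigned
count_mismatches_u8 (void)
{
  unsigned mismatches = 0;

  for (unsigned x = 0; x <= UINT8_MAX; x++)
    for (unsigned y = 0; y <= UINT8_MAX; y++)
      {
        unsigned wide = x + y;
        uint8_t expect = wide > UINT8_MAX ? UINT8_MAX : (uint8_t) wide;

        if (sat_add_branchless_u8 ((uint8_t) x, (uint8_t) y) != expect
            || sat_add_branch_u8 ((uint8_t) x, (uint8_t) y) != expect
            || sat_add_builtin_u8 ((uint8_t) x, (uint8_t) y) != expect)
          mismatches++;
      }

  return mismatches;
}
```

Calling count_mismatches_u8 () from a test driver is expected to report no disagreement across all 65536 input pairs, which covers the uint8_t cases listed in the commit message, such as SAT_ADD (1, 254) and SAT_ADD (255, 255). Whether GCC folds each of these shapes to .SAT_ADD depends on the patterns in match.pd and on target support, so the sketch only demonstrates that the shapes compute the same value.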