From: Stefan Schulze Frielinghaus
To: gcc-patches@gcc.gnu.org
Cc: Stefan Schulze Frielinghaus
Subject: [PATCH] tree-optimization/110490 - bitcount for narrow modes
Date: Thu, 25 Apr 2024 09:26:45 +0200
Message-ID: <20240425072645.2891385-1-stefansf@linux.ibm.com>
X-Mailer: git-send-email 2.44.0

Bitcount operations popcount, clz, and ctz are emulated for narrow modes
in case an operation is only supported for wider modes.  Besides that,
ctz may be emulated via clz in expand_ctz.  Reflect this in
expression_expensive_p.

I considered the emulation of ctz via clz as not expensive since this
basically reduces to

  ctz (x) = c - clz (x & -x)

where c is the mode precision minus 1, which should be faster than a
loop.

Bootstrapped and regtested on x86_64 and s390.  Though, this is probably
stage1 material?

gcc/ChangeLog:

        PR tree-optimization/110490
        * tree-scalar-evolution.cc (expression_expensive_p): Also
        consider mode widening for popcount, clz, and ctz.
---
 gcc/tree-scalar-evolution.cc | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc
index b0a5e09a77c..622c7246c1b 100644
--- a/gcc/tree-scalar-evolution.cc
+++ b/gcc/tree-scalar-evolution.cc
@@ -3458,6 +3458,28 @@ bitcount_call:
                   && (optab_handler (optab, word_mode)
                       != CODE_FOR_nothing))
                 break;
+              /* If popcount is available for a wider mode, we emulate the
+                 operation for a narrow mode by first zero-extending the value
+                 and then computing popcount in the wider mode.  Analogue for
+                 ctz.  For clz we do the same except that we additionally have
+                 to subtract the difference of the mode precisions from the
+                 result.  */
+              if (is_a <scalar_int_mode> (mode, &int_mode))
+                {
+                  machine_mode wider_mode_iter;
+                  FOR_EACH_WIDER_MODE (wider_mode_iter, mode)
+                    if (optab_handler (optab, wider_mode_iter)
+                        != CODE_FOR_nothing)
+                      goto check_call_args;
+                  /* Operation ctz may be emulated via clz in expand_ctz.  */
+                  if (optab == ctz_optab)
+                    {
+                      FOR_EACH_WIDER_MODE_FROM (wider_mode_iter, mode)
+                        if (optab_handler (clz_optab, wider_mode_iter)
+                            != CODE_FOR_nothing)
+                          goto check_call_args;
+                    }
+                }
               return true;
             }
           break;
@@ -3469,6 +3491,7 @@ bitcount_call:
             break;
           }
 
+check_call_args:
       FOR_EACH_CALL_EXPR_ARG (arg, iter, expr)
         if (expression_expensive_p (arg, cond_overflow_p, cache, op_cost))
           return true;
-- 
2.44.0