From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR04-HE1-obe.outbound.protection.outlook.com (mail-eopbgr70048.outbound.protection.outlook.com [40.107.7.48]) by sourceware.org (Postfix) with ESMTPS id 6DD063858D20 for ; Fri, 11 Nov 2022 13:57:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 6DD063858D20 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=T6FAusD7oBxnI3wY66uLflGnJAzUJiiwHH7hvcX8VjZvmQoya2jTPzOLEHu7njVgQmwbvSJPVUJ/6Cxu7GC1McrQSBduvEC/Id2QSD2AoTCQla2gJ+yfRpgZCBs1xBo03sECiEGtLXmVRZX+fL+6tvtuoFBjEVcqYl+jeSia6yu7nzg3KSIdE2OdPFRVM8nSmCPltagDFjsNKhDgkZGxfiwYXUrZHfdfRGerelpvWvVa7cAdZEoZP1unigTDct7Cb1qda74l1FFP10qGjoIDEV822p0xnXvYudaVDBy9veCv2UlWHdf9zwXG8s5DmOEQWnCRn5Smb+WTvtw8Tkj2Bw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=GpoXgxFfjoaRAHB+7M637qxyQz9D0KN9c/lXMn2unCc=; b=VXIU607ieeEq/R9r1d9N8qTZhc1qlbyMWU5hUTnSBqpc9JbXL/pde2eNWfz/qGvZere8QvG783pgbYVSKm4gne8EiFs9nwm/EvcT5e+gv+lSqAOxrZxGmrcwZGiIJc/oRfPCTNt+pu8hzHNyjJQAshzq7ujfafV+WpZgVLUVbrAzxAqvcySBDa+ProGHLjcjSvMKvhiG9V9CXzs3NT1VrSQ/VfFkFlohbCpyufAQbFvjhwoDWYRHL8471L7DXeRR/EiM2ItB01pB7HR34yolhYNjbP3ktP27ChFMIZlKfWxWgPmUZignuEn1UH2UfaeTFaR6HzkvKijdJhXF8vCzsw== ARC-Authentication-Results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=gcc.gnu.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com]) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=GpoXgxFfjoaRAHB+7M637qxyQz9D0KN9c/lXMn2unCc=; b=K7X0RtY7Z0NjPar/MRbWVf6/WtM7n4JlnBcs9KND69OdWis6AoXI9Dvk805+PBU8ox5zrXKWaUpeHGXSGLXi6Tk9zrWB5DWJjK16G4NnTWBf91YVbThpGwc2bo46c2ysTw+Hh8DFSlU7PyCtm3nX1Qs8FbYyW+QcQi/bUdynvmk= Received: from DB8PR06CA0039.eurprd06.prod.outlook.com (2603:10a6:10:120::13) by GV1PR08MB8260.eurprd08.prod.outlook.com (2603:10a6:150:a1::10) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5813.11; Fri, 11 Nov 2022 13:56:54 +0000 Received: from DBAEUR03FT027.eop-EUR03.prod.protection.outlook.com (2603:10a6:10:120:cafe::b0) by DB8PR06CA0039.outlook.office365.com (2603:10a6:10:120::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5791.27 via Frontend Transport; Fri, 11 Nov 2022 13:56:54 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; pr=C Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by DBAEUR03FT027.mail.protection.outlook.com (100.127.142.237) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5813.12 via Frontend Transport; Fri, 11 Nov 2022 13:56:54 +0000 Received: ("Tessian outbound 0800d254cb3b:v130"); Fri, 11 Nov 2022 13:56:54 +0000 X-CheckRecipientChecked: true X-CR-MTA-CID: da308a74766085d0 X-CR-MTA-TID: 64aa7808 Received: from d180db94546b.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id C46E1B44-E4DE-4D69-95BB-D1048975A3AE.1; Fri, 11 Nov 2022 13:56:47 +0000 Received: from EUR05-AM6-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id d180db94546b.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Fri, 11 Nov 2022 13:56:47 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=NYP+mpTcaUtWcFSJeZYJZxUNXdAkvX0fTVrn5+x6ywspRJkYifzbgzzgqQYb7I3dCkfWnQLXRoUeZzHnfeVoV877WikBEMg7H+yhzuLIdUNho75Lrgcx1kgdsPQakbW6D+uXnaB/yA+gG6I/+8TjVwp/WYe8P2Mf1mzmCqYLejZ1265BjFpDe/kmm33ka2XR8nApMdgH9d6KdqYmJFwTx9nuEMZIuJKnADNceZiox6jszIBBCWsztFfQFqWbgRnUQLEQvmfvYU43Bctfpc5TSwKs3R23nQIR7d2gArUPSGlAwUq4N3Oq+47cluW48SJHQmrSQ2EQl9jQpzcpKC8Gtw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=GpoXgxFfjoaRAHB+7M637qxyQz9D0KN9c/lXMn2unCc=; b=AqYrSr/Xyq6jEEJ9sSfPVsi1qqam4maKifnOnGyrtWFvW0NWE6SN/6BgvIqJSxZ71XuZThACTpDjWNnz6oc2v2v4NlqOF/8dJ245+VV1P8cNiYlrfMf63oiP5+8jXMXrH2/Z7B84qQjp+9q9P9QYhSyTtoXt4d7x7SYTV/TYKs3qOvUQIiJ/zNNSuFD3L9eoXr5j0Q+CwHIg8TDFboUEsTvKFSBefco3sw0Mcls5RA1WxtM9f809qGvo9F9sphVWbzZlS85OTJMsItnV70heIPnh7DSImKorE5YIWUjn3yv9rW+7hQlObSmHuU4b2Qq2ySgrZro4FqvzEwpT+fUZDw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=GpoXgxFfjoaRAHB+7M637qxyQz9D0KN9c/lXMn2unCc=; b=K7X0RtY7Z0NjPar/MRbWVf6/WtM7n4JlnBcs9KND69OdWis6AoXI9Dvk805+PBU8ox5zrXKWaUpeHGXSGLXi6Tk9zrWB5DWJjK16G4NnTWBf91YVbThpGwc2bo46c2ysTw+Hh8DFSlU7PyCtm3nX1Qs8FbYyW+QcQi/bUdynvmk= Authentication-Results-Original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; Received: from AS8PR08MB6678.eurprd08.prod.outlook.com (2603:10a6:20b:398::8) by DBBPR08MB6106.eurprd08.prod.outlook.com (2603:10a6:10:202::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5813.13; Fri, 11 Nov 2022 13:56:44 +0000 Received: from AS8PR08MB6678.eurprd08.prod.outlook.com ([fe80::8256:29ca:bcd8:b754]) by AS8PR08MB6678.eurprd08.prod.outlook.com ([fe80::8256:29ca:bcd8:b754%7]) with mapi id 15.20.5813.013; Fri, 11 Nov 2022 13:56:44 +0000 Date: Fri, 11 Nov 2022 13:52:40 +0000 From: Andrew Carlotti To: gcc-patches@gcc.gnu.org Subject: [PATCH 3/8] middle-end: Refactor number_of_iterations_popcount Message-ID: References: Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-ClientProxiedBy: LO6P123CA0002.GBRP123.PROD.OUTLOOK.COM (2603:10a6:600:338::6) To AS8PR08MB6678.eurprd08.prod.outlook.com (2603:10a6:20b:398::8) MIME-Version: 1.0 X-MS-TrafficTypeDiagnostic: AS8PR08MB6678:EE_|DBBPR08MB6106:EE_|DBAEUR03FT027:EE_|GV1PR08MB8260:EE_ X-MS-Office365-Filtering-Correlation-Id: c31ad061-c98d-4d7a-8dc9-08dac3ec9717 x-checkrecipientrouted: true NoDisclaimer: true X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: Fii8/UVOKgBmc48Hu8tqiYE0J2Uxt5ci3V+TTet6KlJ3x3nEJTguKAx9hU7CXhFjlPoUNNJ3+avPefvqj5TWicLR81IMAWwOz2Y9jBh4Wa6dqi2IGd2A0nW7MkBSQi0kPmKTIvlwrZKXeraP3mJ2sTEMskZT1QrHvoS5h4+NNJH4SJe8oGc05/dofvUPBSvT6EDhAzCrDvAf+nQXTJogGgzMJO3DIyApfWzbuynxy0qKCW0NB5Gm3dIViY5wh6TuWWaU/43Sk5Z+UzMJIxkZRsGANO3f98ULpg1XaP5F3SBe7JqawEEZ6m/xWPlO5mvHxJrYLt6QNhxauN1b2p1Kq1tYA1bCqIpiiA30moQVQ4WlSo4x/7JJCBLmQPZfwR9y8GeB9k0k+IlVnwGig7YjJyl3xW8gjMs9QqHQmxdKPKEzCaiM/MRXhPywNpMgxKuK8JH7j+LNZ5tP0fJ98TvPDPDDPwfHlUV2WEQ3uPhtA5PyfAkz5S5Ut78I9mTHGY+CTdSE7gFuqA6RdUoUpzuSclSj5QodF+Z/OdVhleNEPdATlN4oKtrAPudaF9wQSasru0pZJMUaMxJdZujyPkttAkINz4GoPvJ1BtF6v8y4WuUsRJrvA5SoWsnr3wHI0xX2OI1dwfN5j1bO/CHVf4U9UQ6gQrcd0tjMtCgDqn7hDKfFDV4f6BixwsvP35VBIh937fPPTrRlpT1htzNWCu4ePkBzuj4vyuPL8gufns7k7fY= X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:AS8PR08MB6678.eurprd08.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230022)(4636009)(39860400002)(346002)(396003)(366004)(376002)(136003)(451199015)(8936002)(5660300002)(83380400001)(41300700001)(26005)(66946007)(66476007)(8676002)(316002)(6512007)(6916009)(66556008)(6506007)(86362001)(2906002)(44832011)(30864003)(186003)(38100700002)(84970400001)(6486002)(6666004)(478600001);DIR:OUT;SFP:1101; X-MS-Exchange-Transport-CrossTenantHeadersStamped: DBBPR08MB6106 Original-Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: DBAEUR03FT027.eop-EUR03.prod.protection.outlook.com X-MS-PublicTrafficType: Email X-MS-Office365-Filtering-Correlation-Id-Prvs: d8175f79-60f0-4e86-1b4e-08dac3ec9057 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: LIp0iC/onMBEhqaP636z5kdXHPnUOV3JnaIWHQobCZU+sLbutO0rgrLYkRm6QaubSmej/fUHnXAz7siFy363jesD70p/XnR8aAgo+DOkOnlw2SrEREG2xTHgwTp+88FGA8ERcC8g/Zr8FtDguY1wKXVcRtsopNopphT6+OdimyrBzLU4nroECy+Z+4t4SJdGvQ/PbeGbatoNqGh6txnAxYcnxsxGa88n8zuQA2BEcrDP2PU+JLCKL1278Cl2to1H8ICY+bEFyKEo1AVwMIXMgYsGi1w3FhYKpNW7VdBT4+D0M3R3IHWjw3UjDgZR2cQmMYAhX9YZim2YIemmECZGPo7Va6jpjBhzbSH5MDHIQQp4HXSOepDgKac7IJVu/a/8DS0YU28yaYJ20MXy2kJ/Fb3KylaA+hHAerv5fuDlXhY/vg2CQeOM5QEmFE68JPMfIr8CG7mq5LJ64cLv0HJC1y9EGDGubq9R8KG5viuOePXMbge3zp5BzBx29FyeF2oFwGAGPEPh+HSHOpQ5TaQn7pea+3UpyvYTsplGiZjmjsV5hUGdBSpjDtCd8IsDUyOOmtmkng4jG//C4UbcakdUyW7Xkv4WL5k5HRXvLuv7Y+EMFfb2lqmEBc/XByIKXVGPvYESWkYjgboBbgYzbktOd4r4rIvb4UYv8MD7/Gjk9TKu1a+O6VFg6B8cttBPhu9euKb+pfZrsfR7SaLLefdGz7M0lX8iBVy3lZ67708WDZ+tSIEFm5qxK1QIWFOD9pSRSL51dhR9WOsVSk3iMDyDQfqPiCmeli7LTuejecq5yIU= X-Forefront-Antispam-Report: CIP:63.35.35.123;CTRY:IE;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:64aa7808-outbound-1.mta.getcheckrecipient.com;PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com;CAT:NONE;SFS:(13230022)(4636009)(39860400002)(396003)(376002)(346002)(136003)(451199015)(40470700004)(46966006)(36840700001)(8936002)(44832011)(6916009)(186003)(30864003)(40460700003)(6506007)(47076005)(5660300002)(316002)(41300700001)(2906002)(70586007)(6512007)(83380400001)(70206006)(26005)(6666004)(8676002)(336012)(81166007)(6486002)(478600001)(356005)(82310400005)(40480700001)(84970400001)(36860700001)(86362001)(82740400003);DIR:OUT;SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 11 Nov 2022 13:56:54.0947 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: c31ad061-c98d-4d7a-8dc9-08dac3ec9717 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d;Ip=[63.35.35.123];Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: DBAEUR03FT027.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: GV1PR08MB8260 X-Spam-Status: No, score=-13.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,FORGED_SPF_HELO,GIT_PATCH_0,KAM_DMARC_NONE,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_NONE,TXREP,UNPARSEABLE_RELAY autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: This includes various changes to improve clarity, and to enable the code to be more similar to the clz and ctz idiom recognition added in subsequent patches. We create new number_of_iterations_bitcount function, which will be used to call the other bit-counting recognition functions added in subsequent patches, as well as a generic comment describing the loop structures that are common to each idiom. Some of the variables in number_of_iterations_popcount are given more descriptive names, and the popcount expression builder is extracted into a separate function. As part of the refactoring, we also fix a bug where the max loop count for modes shorter than an integer would be incorrectly computed as if the input mode were actually an integer. We also ensure that niter->max takes into account the final value for niter->niter (after any folding and simplifying), since if the latter is a constant, then record_estimate mandates that the two values are equivalent. gcc/ChangeLog: * tree-ssa-loop-niter.cc (number_of_iterations_exit_assumptions): Modify to call... (number_of_iterations_bitcount): ...this new function. (number_of_iterations_popcount): Now called by the above. Refactor, and extract popcount expression builder to... (build_popcount_expr): this new function. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/popcount-max.c: New test. -- diff --git a/gcc/testsuite/gcc.dg/tree-ssa/popcount-max.c b/gcc/testsuite/gcc.dg/tree-ssa/popcount-max.c new file mode 100644 index 0000000000000000000000000000000000000000..ca7204cbc3cea636183408e24d7dd36d702ffdb2 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/popcount-max.c @@ -0,0 +1,33 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fno-tree-loop-optimize -fdump-tree-optimized" } */ + +#define PREC (__CHAR_BIT__) + +int count1 (unsigned char b) { + int c = 0; + + while (b) { + b &= b - 1; + c++; + } + if (c <= PREC) + return 0; + else + return 34567; +} + +int count2 (unsigned char b) { + int c = 0; + + while (b) { + b &= b - 1; + c++; + } + if (c <= PREC - 1) + return 0; + else + return 76543; +} + +/* { dg-final { scan-tree-dump-times "34567" 0 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "76543" 1 "optimized" } } */ diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc index 0af34e46580bb9a6f9b40e09c9f29b8454a4aaf6..fece876099c1687569d6351e7d2416ea6acae5b5 100644 --- a/gcc/tree-ssa-loop-niter.cc +++ b/gcc/tree-ssa-loop-niter.cc @@ -2026,6 +2026,48 @@ number_of_iterations_cond (class loop *loop, return ret; } +/* Return an expression that computes the popcount of src. */ + +static tree +build_popcount_expr (tree src) +{ + tree fn; + int prec = TYPE_PRECISION (TREE_TYPE (src)); + int i_prec = TYPE_PRECISION (integer_type_node); + int li_prec = TYPE_PRECISION (long_integer_type_node); + int lli_prec = TYPE_PRECISION (long_long_integer_type_node); + if (prec <= i_prec) + fn = builtin_decl_implicit (BUILT_IN_POPCOUNT); + else if (prec == li_prec) + fn = builtin_decl_implicit (BUILT_IN_POPCOUNTL); + else if (prec == lli_prec || prec == 2 * lli_prec) + fn = builtin_decl_implicit (BUILT_IN_POPCOUNTLL); + else + return NULL_TREE; + + tree utype = unsigned_type_for (TREE_TYPE (src)); + src = fold_convert (utype, src); + if (prec < i_prec) + src = fold_convert (unsigned_type_node, src); + tree call; + if (prec == 2 * lli_prec) + { + tree src1 = fold_convert (long_long_unsigned_type_node, + fold_build2 (RSHIFT_EXPR, TREE_TYPE (src), + unshare_expr (src), + build_int_cst (integer_type_node, + lli_prec))); + tree src2 = fold_convert (long_long_unsigned_type_node, src); + tree call1 = build_call_expr (fn, 1, src1); + tree call2 = build_call_expr (fn, 1, src2); + call = fold_build2 (PLUS_EXPR, integer_type_node, call1, call2); + } + else + call = build_call_expr (fn, 1, src); + + return call; +} + /* Utility function to check if OP is defined by a stmt that is a val - 1. */ @@ -2041,45 +2083,18 @@ ssa_defined_by_minus_one_stmt_p (tree op, tree val) && integer_minus_onep (gimple_assign_rhs2 (stmt))); } -/* See if LOOP is a popcout implementation, determine NITER for the loop +/* See comment below for number_of_iterations_bitcount. + For popcount, we have: - We match: - - goto + modify: + _1 = iv_1 + -1 + iv_2 = iv_1 & _1 - - _1 = b_11 + -1 - b_6 = _1 & b_11 - - - b_11 = PHI + test: + if (iv != 0) - exit block - if (b_11 != 0) - goto - else - goto - - OR we match copy-header version: - if (b_5 != 0) - goto - else - goto - - - b_11 = PHI - _1 = b_11 + -1 - b_6 = _1 & b_11 - - exit block - if (b_6 != 0) - goto - else - goto - - If popcount pattern, update NITER accordingly. - i.e., set NITER to __builtin_popcount (b) - return true if we did, false otherwise. + modification count: + popcount (src) */ @@ -2088,138 +2103,150 @@ number_of_iterations_popcount (loop_p loop, edge exit, enum tree_code code, class tree_niter_desc *niter) { - bool adjust = true; - tree iter; + bool modify_before_test = true; HOST_WIDE_INT max; - adjust = true; - tree fn = NULL_TREE; - - /* Check loop terminating branch is like - if (b != 0). */ - gimple *stmt = last_stmt (exit->src); - if (!stmt - || gimple_code (stmt) != GIMPLE_COND + + /* Check that condition for staying inside the loop is like + if (iv != 0). */ + gimple *cond_stmt = last_stmt (exit->src); + if (!cond_stmt + || gimple_code (cond_stmt) != GIMPLE_COND || code != NE_EXPR - || !integer_zerop (gimple_cond_rhs (stmt)) - || TREE_CODE (gimple_cond_lhs (stmt)) != SSA_NAME) + || !integer_zerop (gimple_cond_rhs (cond_stmt)) + || TREE_CODE (gimple_cond_lhs (cond_stmt)) != SSA_NAME) return false; - gimple *and_stmt = SSA_NAME_DEF_STMT (gimple_cond_lhs (stmt)); + tree iv_2 = gimple_cond_lhs (cond_stmt); + gimple *iv_2_stmt = SSA_NAME_DEF_STMT (iv_2); - /* Depending on copy-header is performed, feeding PHI stmts might be in - the loop header or loop latch, handle this. */ - if (gimple_code (and_stmt) == GIMPLE_PHI - && gimple_bb (and_stmt) == loop->header - && gimple_phi_num_args (and_stmt) == 2 - && (TREE_CODE (gimple_phi_arg_def (and_stmt, + /* If the test comes before the iv modification, then these will actually be + iv_1 and a phi node. */ + if (gimple_code (iv_2_stmt) == GIMPLE_PHI + && gimple_bb (iv_2_stmt) == loop->header + && gimple_phi_num_args (iv_2_stmt) == 2 + && (TREE_CODE (gimple_phi_arg_def (iv_2_stmt, loop_latch_edge (loop)->dest_idx)) == SSA_NAME)) { - /* SSA used in exit condition is defined by PHI stmt - b_11 = PHI - from the PHI stmt, get the and_stmt - b_6 = _1 & b_11. */ - tree t = gimple_phi_arg_def (and_stmt, loop_latch_edge (loop)->dest_idx); - and_stmt = SSA_NAME_DEF_STMT (t); - adjust = false; + /* iv_2 is actually one of the inputs to the phi. */ + iv_2 = gimple_phi_arg_def (iv_2_stmt, loop_latch_edge (loop)->dest_idx); + iv_2_stmt = SSA_NAME_DEF_STMT (iv_2); + modify_before_test = false; } - /* Make sure it is indeed an and stmt (b_6 = _1 & b_11). */ - if (!is_gimple_assign (and_stmt) - || gimple_assign_rhs_code (and_stmt) != BIT_AND_EXPR) + /* Make sure iv_2_stmt is an and stmt (iv_2 = _1 & iv_1). */ + if (!is_gimple_assign (iv_2_stmt) + || gimple_assign_rhs_code (iv_2_stmt) != BIT_AND_EXPR) return false; - tree b_11 = gimple_assign_rhs1 (and_stmt); - tree _1 = gimple_assign_rhs2 (and_stmt); + tree iv_1 = gimple_assign_rhs1 (iv_2_stmt); + tree _1 = gimple_assign_rhs2 (iv_2_stmt); - /* Check that _1 is defined by _b11 + -1 (_1 = b_11 + -1). - Also make sure that b_11 is the same in and_stmt and _1 defining stmt. + /* Check that _1 is defined by (_1 = iv_1 + -1). + Also make sure that _1 is the same in and_stmt and _1 defining stmt. Also canonicalize if _1 and _b11 are revrsed. */ - if (ssa_defined_by_minus_one_stmt_p (b_11, _1)) - std::swap (b_11, _1); - else if (ssa_defined_by_minus_one_stmt_p (_1, b_11)) + if (ssa_defined_by_minus_one_stmt_p (iv_1, _1)) + std::swap (iv_1, _1); + else if (ssa_defined_by_minus_one_stmt_p (_1, iv_1)) ; else return false; - /* Check the recurrence: - ... = PHI . */ - gimple *phi = SSA_NAME_DEF_STMT (b_11); + + /* Check the recurrence. */ + gimple *phi = SSA_NAME_DEF_STMT (iv_1); if (gimple_code (phi) != GIMPLE_PHI || (gimple_bb (phi) != loop_latch_edge (loop)->dest) - || (gimple_assign_lhs (and_stmt) - != gimple_phi_arg_def (phi, loop_latch_edge (loop)->dest_idx))) + || (iv_2 != gimple_phi_arg_def (phi, loop_latch_edge (loop)->dest_idx))) return false; - /* We found a match. Get the corresponding popcount builtin. */ + /* We found a match. */ tree src = gimple_phi_arg_def (phi, loop_preheader_edge (loop)->dest_idx); - if (TYPE_PRECISION (TREE_TYPE (src)) <= TYPE_PRECISION (integer_type_node)) - fn = builtin_decl_implicit (BUILT_IN_POPCOUNT); - else if (TYPE_PRECISION (TREE_TYPE (src)) - == TYPE_PRECISION (long_integer_type_node)) - fn = builtin_decl_implicit (BUILT_IN_POPCOUNTL); - else if (TYPE_PRECISION (TREE_TYPE (src)) - == TYPE_PRECISION (long_long_integer_type_node) - || (TYPE_PRECISION (TREE_TYPE (src)) - == 2 * TYPE_PRECISION (long_long_integer_type_node))) - fn = builtin_decl_implicit (BUILT_IN_POPCOUNTLL); + int src_precision = TYPE_PRECISION (TREE_TYPE (src)); - if (!fn) + /* Get the corresponding popcount builtin. */ + tree expr = build_popcount_expr (src); + + if (!expr) return false; - /* Update NITER params accordingly */ - tree utype = unsigned_type_for (TREE_TYPE (src)); - src = fold_convert (utype, src); - if (TYPE_PRECISION (TREE_TYPE (src)) < TYPE_PRECISION (integer_type_node)) - src = fold_convert (unsigned_type_node, src); - tree call; - if (TYPE_PRECISION (TREE_TYPE (src)) - == 2 * TYPE_PRECISION (long_long_integer_type_node)) + max = src_precision; + + tree may_be_zero = boolean_false_node; + + if (modify_before_test) { - int prec = TYPE_PRECISION (long_long_integer_type_node); - tree src1 = fold_convert (long_long_unsigned_type_node, - fold_build2 (RSHIFT_EXPR, TREE_TYPE (src), - unshare_expr (src), - build_int_cst (integer_type_node, - prec))); - tree src2 = fold_convert (long_long_unsigned_type_node, src); - call = build_call_expr (fn, 1, src1); - call = fold_build2 (PLUS_EXPR, TREE_TYPE (call), call, - build_call_expr (fn, 1, src2)); - call = fold_convert (utype, call); + expr = fold_build2 (MINUS_EXPR, integer_type_node, expr, + integer_one_node); + max = max - 1; + may_be_zero = fold_build2 (EQ_EXPR, boolean_type_node, src, + build_zero_cst (TREE_TYPE (src))); } - else - call = fold_convert (utype, build_call_expr (fn, 1, src)); - if (adjust) - iter = fold_build2 (MINUS_EXPR, utype, call, build_int_cst (utype, 1)); - else - iter = call; - if (TREE_CODE (call) == INTEGER_CST) - max = tree_to_uhwi (call); - else - max = TYPE_PRECISION (TREE_TYPE (src)); - if (adjust) - max = max - 1; + expr = fold_convert (unsigned_type_node, expr); - niter->niter = iter; niter->assumptions = boolean_true_node; + niter->may_be_zero = simplify_using_initial_conditions (loop, may_be_zero); + niter->niter = simplify_using_initial_conditions(loop, expr); - if (adjust) - { - tree may_be_zero = fold_build2 (EQ_EXPR, boolean_type_node, src, - build_zero_cst (TREE_TYPE (src))); - niter->may_be_zero - = simplify_using_initial_conditions (loop, may_be_zero); - } + if (TREE_CODE (niter->niter) == INTEGER_CST) + niter->max = tree_to_uhwi (niter->niter); else - niter->may_be_zero = boolean_false_node; + niter->max = max; - niter->max = max; niter->bound = NULL_TREE; niter->cmp = ERROR_MARK; return true; } +/* See if LOOP contains a bit counting idiom. The idiom consists of two parts: + 1. A modification to the induction variabler;. + 2. A test to determine whether or not to exit the loop. + + These can come in either order - i.e.: + + + iv_1 = PHI + if (test (iv_1)) + goto + else + goto + + + iv_2 = modify (iv_1) + goto + + OR + + + iv_1 = PHI + iv_2 = modify (iv_1) + + + if (test (iv_2)) + goto + else + goto + + The second form can be generated by copying the loop header out of the loop. + + In the first case, the number of latch executions will be equal to the + number of induction variable modifications required before the test fails. + + In the second case (modify_before_test), if we assume that the number of + modifications required before the test fails is nonzero, then the number of + latch executions will be one less than this number. + + If we recognise the pattern, then we update niter accordingly, and return + true. */ + +static bool +number_of_iterations_bitcount (loop_p loop, edge exit, + enum tree_code code, + class tree_niter_desc *niter) +{ + return number_of_iterations_popcount (loop, exit, code, niter); +} + /* Substitute NEW_TREE for OLD in EXPR and fold the result. If VALUEIZE is non-NULL then OLD and NEW_TREE are ignored and instead all SSA names are replaced with the result of calling the VALUEIZE @@ -2758,7 +2785,7 @@ number_of_iterations_exit_assumptions (class loop *loop, edge exit, tree iv0_niters = NULL_TREE; if (!simple_iv_with_niters (loop, loop_containing_stmt (stmt), op0, &iv0, safe ? &iv0_niters : NULL, false)) - return number_of_iterations_popcount (loop, exit, code, niter); + return number_of_iterations_bitcount (loop, exit, code, niter); tree iv1_niters = NULL_TREE; if (!simple_iv_with_niters (loop, loop_containing_stmt (stmt), op1, &iv1, safe ? &iv1_niters : NULL, false))