From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sylvain Noiry
To: gcc-patches@gcc.gnu.org
Cc: Sylvain Noiry
Subject: [PATCH 6/9] Native complex operations: Update how complex rotations are handled
Date: Mon, 17 Jul 2023 11:02:47 +0200
Message-ID: <20230717090250.4645-7-snoiry@kalrayinc.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230717090250.4645-1-snoiry@kalrayinc.com>
References: <20230717090250.4645-1-snoiry@kalrayinc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

Catch complex rotations by 90° and 270° in fold-const.cc as before, but
now convert them into the new COMPLEX_ROT90 and COMPLEX_ROT270 internal
functions. Also add crot90 and crot270 optabs to expose these operations
to the backends. COMPLEX_ROT90/COMPLEX_ROT270 are then conditionally
lowered, depending on whether crot90/crot270 patterns are present in the
optab.
Finally, convert a + crot90/270(b) into cadd90/270(a, b) in a similar
way to FMAs.

gcc/ChangeLog:

	* internal-fn.def: Add COMPLEX_ROT90 and COMPLEX_ROT270.
	* fold-const.cc (fold_binary_loc): Update the folding of
	complex rotations to generate calls to COMPLEX_ROT90 and
	COMPLEX_ROT270.
	* optabs.def: Add crot90/crot270 optabs.
	* tree-complex.cc (init_dont_simulate_again): Catch calls
	to COMPLEX_ROT90 and COMPLEX_ROT270.
	(expand_complex_rotation): Conditionally lower complex
	rotations if no pattern is present in the backend.
	(expand_complex_operations_1): Likewise.
	* tree-ssa-math-opts.cc (convert_crot_1): Catch complex
	rotations with additions in a similar way to FMAs.
	(convert_crot): Likewise.
	(math_opts_dom_walker::after_dom_children): Call convert_crot
	if a COMPLEX_ROT90 or COMPLEX_ROT270 is identified.
---
 gcc/fold-const.cc         | 115 ++++++++++++++++++++++++++-------
 gcc/internal-fn.def       |   2 +
 gcc/optabs.def            |   2 +
 gcc/tree-complex.cc       |  79 ++++++++++++++++++++++-
 gcc/tree-ssa-math-opts.cc | 129 ++++++++++++++++++++++++++++++++++++++
 5 files changed, 302 insertions(+), 25 deletions(-)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index a02ede79fed..f1224b6a548 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -11609,30 +11609,6 @@ fold_binary_loc (location_t loc, enum tree_code code, tree type,
 	}
       else
 	{
-	  /* Fold z * +-I to __complex__ (-+__imag z, +-__real z).
-	     This is not the same for NaNs or if signed zeros are
-	     involved.  */
-	  if (!HONOR_NANS (arg0)
-	      && !HONOR_SIGNED_ZEROS (arg0)
-	      && COMPLEX_FLOAT_TYPE_P (TREE_TYPE (arg0))
-	      && TREE_CODE (arg1) == COMPLEX_CST
-	      && real_zerop (TREE_REALPART (arg1)))
-	    {
-	      tree rtype = TREE_TYPE (TREE_TYPE (arg0));
-	      if (real_onep (TREE_IMAGPART (arg1)))
-		return
-		  fold_build2_loc (loc, COMPLEX_EXPR, type,
-				   negate_expr (fold_build1_loc (loc, IMAGPART_EXPR,
-								 rtype, arg0)),
-				   fold_build1_loc (loc, REALPART_EXPR, rtype, arg0));
-	      else if (real_minus_onep (TREE_IMAGPART (arg1)))
-		return
-		  fold_build2_loc (loc, COMPLEX_EXPR, type,
-				   fold_build1_loc (loc, IMAGPART_EXPR, rtype, arg0),
-				   negate_expr (fold_build1_loc (loc, REALPART_EXPR,
-								 rtype, arg0)));
-	    }
-
 	  /* Optimize z * conj(z) for floating point complex numbers.
 	     Guarded by flag_unsafe_math_optimizations as non-finite
 	     imaginary components don't produce scalar results.  */
@@ -11645,6 +11621,97 @@ fold_binary_loc (location_t loc, enum tree_code code, tree type,
 	      && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
 	    return fold_mult_zconjz (loc, type, arg0);
 	}
+
+      /* Fold z * +-I to __complex__ (-+__imag z, +-__real z).
+	 This is not the same for NaNs or if signed zeros are
+	 involved.  */
+      if (!HONOR_NANS (arg0)
+	  && !HONOR_SIGNED_ZEROS (arg0)
+	  && TREE_CODE (arg1) == COMPLEX_CST
+	  && (COMPLEX_FLOAT_TYPE_P (TREE_TYPE (arg0))
+	      && real_zerop (TREE_REALPART (arg1))))
+	{
+	  if (real_onep (TREE_IMAGPART (arg1)))
+	    {
+	      tree rtype = TREE_TYPE (TREE_TYPE (arg0));
+	      tree cplx_build
+		= fold_build2_loc (loc, COMPLEX_EXPR, type,
+				   negate_expr (fold_build1_loc (loc, IMAGPART_EXPR,
+								 rtype, arg0)),
+				   fold_build1_loc (loc, REALPART_EXPR, rtype, arg0));
+	      if (cplx_build && TREE_CODE (TREE_OPERAND (cplx_build, 0)) != NEGATE_EXPR)
+		return cplx_build;
+
+	      if ((TREE_CODE (arg0) == COMPLEX_EXPR) && real_zerop (TREE_OPERAND (arg0, 1)))
+		return fold_build2_loc (loc, COMPLEX_EXPR, type,
+					TREE_OPERAND (arg0, 1), TREE_OPERAND (arg0, 0));
+
+	      if (TREE_CODE (arg0) == CALL_EXPR)
+		{
+		  if (CALL_EXPR_IFN (arg0) == IFN_COMPLEX_ROT90)
+		    return negate_expr (CALL_EXPR_ARG (arg0, 0));
+		  else if (CALL_EXPR_IFN (arg0) == IFN_COMPLEX_ROT270)
+		    return CALL_EXPR_ARG (arg0, 0);
+		}
+	      else if (TREE_CODE (arg0) == NEGATE_EXPR)
+		return build_call_expr_internal_loc (loc, IFN_COMPLEX_ROT270,
+						     TREE_TYPE (arg0), 1,
+						     TREE_OPERAND (arg0, 0));
+	      else
+		return build_call_expr_internal_loc (loc, IFN_COMPLEX_ROT90,
+						     TREE_TYPE (arg0), 1, arg0);
+	    }
+	  else if (real_minus_onep (TREE_IMAGPART (arg1)))
+	    {
+	      if (real_zerop (TREE_OPERAND (arg0, 1)))
+		return fold_build2_loc (loc, COMPLEX_EXPR, type,
+					TREE_OPERAND (arg0, 1),
+					negate_expr (TREE_OPERAND (arg0, 0)));
+
+	      return build_call_expr_internal_loc (loc, IFN_COMPLEX_ROT270,
+						   TREE_TYPE (arg0), 1, fold (arg0));
+	    }
+	}
+
+      /* Likewise, fold z * +-I for complex integer types.  */
+      if (!HONOR_NANS (arg0)
+	  && !HONOR_SIGNED_ZEROS (arg0)
+	  && TREE_CODE (arg1) == COMPLEX_CST
+	  && (COMPLEX_INTEGER_TYPE_P (TREE_TYPE (arg0))
+	      && integer_zerop (TREE_REALPART (arg1))))
+	{
+	  if (integer_onep (TREE_IMAGPART (arg1)))
+	    {
+	      tree rtype = TREE_TYPE (TREE_TYPE (arg0));
+	      tree cplx_build
+		= fold_build2_loc (loc, COMPLEX_EXPR, type,
+				   negate_expr (fold_build1_loc (loc, IMAGPART_EXPR,
+								 rtype, arg0)),
+				   fold_build1_loc (loc, REALPART_EXPR, rtype, arg0));
+	      if (cplx_build && TREE_CODE (TREE_OPERAND (cplx_build, 0)) != NEGATE_EXPR)
+		return cplx_build;
+
+	      if ((TREE_CODE (arg0) == COMPLEX_EXPR) && integer_zerop (TREE_OPERAND (arg0, 1)))
+		return fold_build2_loc (loc, COMPLEX_EXPR, type,
+					TREE_OPERAND (arg0, 1), TREE_OPERAND (arg0, 0));
+
+	      if (TREE_CODE (arg0) == CALL_EXPR)
+		{
+		  if (CALL_EXPR_IFN (arg0) == IFN_COMPLEX_ROT90)
+		    return negate_expr (CALL_EXPR_ARG (arg0, 0));
+		  else if (CALL_EXPR_IFN (arg0) == IFN_COMPLEX_ROT270)
+		    return CALL_EXPR_ARG (arg0, 0);
+		}
+	      else if (TREE_CODE (arg0) == NEGATE_EXPR)
+		return build_call_expr_internal_loc (loc, IFN_COMPLEX_ROT270,
+						     TREE_TYPE (arg0), 1,
+						     TREE_OPERAND (arg0, 0));
+	      else
+		return build_call_expr_internal_loc (loc, IFN_COMPLEX_ROT90,
+						     TREE_TYPE (arg0), 1, arg0);
+	    }
+	  else if (integer_minus_onep (TREE_IMAGPART (arg1)))
+	    {
+	      if (integer_zerop (TREE_OPERAND (arg0, 1)))
+		return fold_build2_loc (loc, COMPLEX_EXPR, type,
+					TREE_OPERAND (arg0, 1),
+					negate_expr (TREE_OPERAND (arg0, 0)));
+
+	      return build_call_expr_internal_loc (loc, IFN_COMPLEX_ROT270,
+						   TREE_TYPE (arg0), 1, fold (arg0));
+	    }
+	}
+
       goto associate;

     case BIT_IOR_EXPR:
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index ea750a921ed..e3e32603dc1 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -385,6 +385,8 @@ DEF_INTERNAL_FLT_FN (SCALB, ECF_CONST, scalb, binary)
 DEF_INTERNAL_FLT_FLOATN_FN (FMIN, ECF_CONST, fmin, binary)
 DEF_INTERNAL_FLT_FLOATN_FN (FMAX, ECF_CONST, fmax, binary)
 DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, binary)
+DEF_INTERNAL_OPTAB_FN (COMPLEX_ROT90, ECF_CONST, crot90, unary)
+DEF_INTERNAL_OPTAB_FN (COMPLEX_ROT270, ECF_CONST, crot270, unary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT90, ECF_CONST, cadd90, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 31475c8afcc..afd15b1f30f 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -330,6 +330,8 @@ OPTAB_D (atan_optab, "atan$a2")
 OPTAB_D (atanh_optab, "atanh$a2")
 OPTAB_D (copysign_optab, "copysign$F$a3")
 OPTAB_D (xorsign_optab, "xorsign$F$a3")
+OPTAB_D (crot90_optab, "crot90$a2")
+OPTAB_D (crot270_optab, "crot270$a2")
 OPTAB_D (cadd90_optab, "cadd90$a3")
 OPTAB_D (cadd270_optab, "cadd270$a3")
 OPTAB_D (cmul_optab, "cmul$a3")
diff --git a/gcc/tree-complex.cc b/gcc/tree-complex.cc
index 63753e4acf4..b5aaa206319 100644
--- a/gcc/tree-complex.cc
+++ b/gcc/tree-complex.cc
@@ -241,7 +241,10 @@ init_dont_simulate_again (void)
       switch (gimple_code (stmt))
 	{
 	case GIMPLE_CALL:
-	  if (gimple_call_lhs (stmt))
+	  if (gimple_call_combined_fn (stmt) == CFN_COMPLEX_ROT90
+	      || gimple_call_combined_fn (stmt) == CFN_COMPLEX_ROT270)
+	    saw_a_complex_op = true;
+	  else if (gimple_call_lhs (stmt))
 	    sim_again_p = is_complex_reg (gimple_call_lhs (stmt));
 	  break;
@@ -1727,6 +1730,67 @@ expand_complex_asm (gimple_stmt_iterator *gsi)
     }
 }

+/* Expand complex rotations represented as internal functions.
+   This function assumes that the lowered complex rotation is still
+   better than a complex multiplication, otherwise the backend would
+   have defined crot90 and crot270.  */
+
+static void
+expand_complex_rotation (gimple_stmt_iterator *gsi)
+{
+  gimple *stmt = gsi_stmt (*gsi);
+  tree ac = gimple_call_arg (stmt, 0);
+  gimple_seq stmts = NULL;
+  location_t loc = gimple_location (gsi_stmt (*gsi));
+
+  tree lhs = gimple_get_lhs (stmt);
+  tree type = TREE_TYPE (ac);
+  tree inner_type = TREE_TYPE (type);
+
+  tree rr, ri, rb;
+  optab op = optab_for_tree_code (MULT_EXPR, inner_type, optab_default);
+  if (optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing)
+    {
+      tree cst_i = build_complex (type, build_zero_cst (inner_type),
+				  build_one_cst (inner_type));
+      rb = gimple_build (&stmts, loc, MULT_EXPR, type, ac, cst_i);
+
+      gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+
+      gassign *new_assign = gimple_build_assign (lhs, rb);
+      gimple_set_lhs (new_assign, lhs);
+      gsi_replace (gsi, new_assign, true);
+
+      update_complex_assignment (gsi, NULL, NULL, rb);
+    }
+  else
+    {
+      tree ar = extract_component (gsi, ac, REAL_P, true);
+      tree ai = extract_component (gsi, ac, IMAG_P, true);
+
+      if (gimple_call_internal_fn (stmt) == IFN_COMPLEX_ROT90)
+	{
+	  rr = gimple_build (&stmts, loc, NEGATE_EXPR, inner_type, ai);
+	  ri = ar;
+	}
+      else if (gimple_call_internal_fn (stmt) == IFN_COMPLEX_ROT270)
+	{
+	  rr = ai;
+	  ri = gimple_build (&stmts, loc, NEGATE_EXPR, inner_type, ar);
+	}
+      else
+	gcc_unreachable ();
+
+      gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+
+      gassign *new_assign = gimple_build_assign (gimple_get_lhs (stmt),
+						 COMPLEX_EXPR, rr, ri);
+      gimple_set_lhs (new_assign, gimple_get_lhs (stmt));
+      gsi_replace (gsi, new_assign, true);
+
+      update_complex_assignment (gsi, rr, ri);
+    }
+}
+
 /* Returns true if a complex component is a constant.  */

 static bool
@@ -1843,6 +1907,19 @@ expand_complex_operations_1 (gimple_stmt_iterator *gsi)
       if (gimple_code (stmt) == GIMPLE_COND)
 	return;

+      if (is_gimple_call (stmt)
+	  && (gimple_call_combined_fn (stmt) == CFN_COMPLEX_ROT90
+	      || gimple_call_combined_fn (stmt) == CFN_COMPLEX_ROT270))
+	{
+	  if (!direct_internal_fn_supported_p (gimple_call_internal_fn (stmt), type,
+					       bb_optimization_type (gimple_bb (stmt))))
+	    expand_complex_rotation (gsi);
+	  else
+	    update_complex_components (gsi, stmt, NULL, NULL, gimple_call_lhs (stmt));
+
+	  return;
+	}
+
       if (TREE_CODE (type) == COMPLEX_TYPE)
 	expand_complex_move (gsi, type);
       else if (is_gimple_assign (stmt)
diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
index 68fc518b1ab..c311e9ab29a 100644
--- a/gcc/tree-ssa-math-opts.cc
+++ b/gcc/tree-ssa-math-opts.cc
@@ -3286,6 +3286,119 @@ last_fma_candidate_feeds_initial_phi (fma_deferring_state *state,
   return false;
 }

+/* Convert a complex rotation to an addition with one operand rotated,
+   in a similar way to FMAs.  */
+
+static void
+convert_crot_1 (tree crot_result, tree op1, internal_fn cadd_fn)
+{
+  gimple *use_stmt;
+  imm_use_iterator imm_iter;
+  gcall *cadd_stmt;
+
+  FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, crot_result)
+    {
+      gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt);
+      tree add_op, result = crot_result;
+
+      if (is_gimple_debug (use_stmt))
+	continue;
+
+      add_op = (gimple_assign_rhs1 (use_stmt) != result)
+	       ? gimple_assign_rhs1 (use_stmt) : gimple_assign_rhs2 (use_stmt);
+
+      cadd_stmt = gimple_build_call_internal (cadd_fn, 2, add_op, op1);
+      gimple_set_lhs (cadd_stmt, gimple_get_lhs (use_stmt));
+      gimple_call_set_nothrow (cadd_stmt, !stmt_can_throw_internal (cfun,
+								    use_stmt));
+      gsi_replace (&gsi, cadd_stmt, true);
+
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	{
+	  fprintf (dump_file, "Generated COMPLEX_ADD_ROT ");
+	  print_gimple_stmt (dump_file, gsi_stmt (gsi), 0, TDF_NONE);
+	  fprintf (dump_file, "\n");
+	}
+    }
+}
+
+/* Convert a complex rotation to an addition with one operand rotated,
+   in a similar way to FMAs.  */
+
+static bool
+convert_crot (gimple *crot_stmt, tree op1, combined_fn crot_kind)
+{
+  internal_fn cadd_fn;
+  switch (crot_kind)
+    {
+    case CFN_COMPLEX_ROT90:
+      cadd_fn = IFN_COMPLEX_ADD_ROT90;
+      break;
+    case CFN_COMPLEX_ROT270:
+      cadd_fn = IFN_COMPLEX_ADD_ROT270;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  tree crot_result = gimple_get_lhs (crot_stmt);
+  /* If there isn't a LHS then this can't be a CADD.  There can be no LHS
+     if the statement was left just for the side-effects.  */
+  if (!crot_result)
+    return false;
+  tree type = TREE_TYPE (crot_result);
+  gimple *use_stmt;
+  use_operand_p use_p;
+  imm_use_iterator imm_iter;
+
+  if (COMPLEX_FLOAT_TYPE_P (type)
+      && flag_fp_contract_mode == FP_CONTRACT_OFF)
+    return false;
+
+  /* We don't want to do bitfield reduction ops.  */
+  if (INTEGRAL_TYPE_P (type)
+      && (!type_has_mode_precision_p (type) || TYPE_OVERFLOW_TRAPS (type)))
+    return false;
+
+  /* If the target doesn't support it, don't generate it.  */
+  optimization_type opt_type = bb_optimization_type (gimple_bb (crot_stmt));
+  if (!direct_internal_fn_supported_p (cadd_fn, type, opt_type))
+    return false;
+
+  /* If the crot has zero uses, it is kept around probably because
+     of -fnon-call-exceptions.  Don't optimize it away in that case,
+     it is the job of DCE.  */
+  if (has_zero_uses (crot_result))
+    return false;
+
+  /* Make sure that the crot statement becomes dead after
+     the transformation, so that all uses are transformed to CADDs.
+     This means we assume that a CADD operation has the same cost
+     as an addition.  */
+  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, crot_result)
+    {
+      use_stmt = USE_STMT (use_p);
+
+      if (is_gimple_debug (use_stmt))
+	continue;
+
+      if (gimple_bb (use_stmt) != gimple_bb (crot_stmt))
+	return false;
+
+      if (!is_gimple_assign (use_stmt))
+	return false;
+
+      if (gimple_assign_rhs_code (use_stmt) != PLUS_EXPR)
+	return false;
+    }
+
+  convert_crot_1 (crot_result, op1, cadd_fn);
+  return true;
+}
+
 /* Combine the multiplication at MUL_STMT with operands MULOP1 and MULOP2
    with uses in additions and subtractions to form fused multiply-add
    operations.  Returns true if successful and MUL_STMT should be removed.
@@ -5636,6 +5749,22 @@ math_opts_dom_walker::after_dom_children (basic_block bb)
 	      cancel_fma_deferring (&fma_state);
 	    break;

+	  case CFN_COMPLEX_ROT90:
+	  case CFN_COMPLEX_ROT270:
+	    if (gimple_call_lhs (stmt)
+		&& convert_crot (stmt,
+				 gimple_call_arg (stmt, 0),
+				 gimple_call_combined_fn (stmt)))
+	      {
+		unlink_stmt_vdef (stmt);
+		if (gsi_remove (&gsi, true)
+		    && gimple_purge_dead_eh_edges (bb))
+		  *m_cfg_changed_p = true;
+		release_defs (stmt);
+		continue;
+	      }
+	    break;
+
 	  default:
 	    break;
 	  }
-- 
2.17.1