From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sylvain Noiry
To: gcc-patches@gcc.gnu.org
Cc: Sylvain Noiry
Subject: [PATCH v2 06/11] Native complex ops: Update how complex rotations are handled
Date: Tue, 12 Sep 2023 12:07:08 +0200
Message-ID: <20230912100713.1074-7-snoiry@kalrayinc.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230912100713.1074-1-snoiry@kalrayinc.com>
References: <20230717090250.4645-10-snoiry@kalrayinc.com> <20230912100713.1074-1-snoiry@kalrayinc.com>
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
List-Id:

Summary: Catch complex rotations by 90° and 270° in fold-const.cc as before,
but now convert them into the new COMPLEX_ROT90 and COMPLEX_ROT270 internal
functions. Also add the crot90 and crot270 optabs to expose these operations
to the backends.
Then conditionally lower COMPLEX_ROT90/COMPLEX_ROT270 by checking whether
crot90/crot270 appear in the optab. Finally, convert a + crot90/270(b) into
cadd90/270(a, b) in a similar way to FMAs.

gcc/ChangeLog:

	* internal-fn.def: Add COMPLEX_ROT90 and COMPLEX_ROT270.
	* fold-const.cc (fold_binary_loc): Update the folding of complex
	rotations to generate calls to COMPLEX_ROT90 and COMPLEX_ROT270.
	* optabs.def: Add crot90/crot270 optabs.
	* tree-complex.cc (init_dont_simulate_again): Catch calls to
	COMPLEX_ROT90 and COMPLEX_ROT270.
	(expand_complex_rotation): Conditionally lower complex rotations
	if no pattern is present in the backend.
	(expand_complex_operations_1): Likewise.
	(convert_crot): Likewise.
	* tree-ssa-math-opts.cc (convert_crot_1): Catch complex rotations
	with additions in a similar way to FMAs.
	(math_opts_dom_walker::after_dom_children): Call convert_crot
	if a COMPLEX_ROT90 or COMPLEX_ROT270 is identified.
---
 gcc/fold-const.cc         | 145 ++++++++++++++++++++++++++++++-------
 gcc/internal-fn.def       |   2 +
 gcc/optabs.def            |   2 +
 gcc/tree-complex.cc       |  83 ++++++++++++++++++++-
 gcc/tree-ssa-math-opts.cc | 128 +++++++++++++++++++++++++++++++
 5 files changed, 335 insertions(+), 25 deletions(-)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index d19b4666c65..dc05599c7fe 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -11865,30 +11865,6 @@ fold_binary_loc (location_t loc, enum tree_code code, tree type,
 	}
       else
 	{
-	  /* Fold z * +-I to __complex__ (-+__imag z, +-__real z).
-	     This is not the same for NaNs or if signed zeros are
-	     involved.
-	     */
-	  if (!HONOR_NANS (arg0)
-	      && !HONOR_SIGNED_ZEROS (arg0)
-	      && COMPLEX_FLOAT_TYPE_P (TREE_TYPE (arg0))
-	      && TREE_CODE (arg1) == COMPLEX_CST
-	      && real_zerop (TREE_REALPART (arg1)))
-	    {
-	      tree rtype = TREE_TYPE (TREE_TYPE (arg0));
-	      if (real_onep (TREE_IMAGPART (arg1)))
-		return
-		  fold_build2_loc (loc, COMPLEX_EXPR, type,
-				   negate_expr (fold_build1_loc (loc, IMAGPART_EXPR,
-								 rtype, arg0)),
-				   fold_build1_loc (loc, REALPART_EXPR, rtype, arg0));
-	      else if (real_minus_onep (TREE_IMAGPART (arg1)))
-		return
-		  fold_build2_loc (loc, COMPLEX_EXPR, type,
-				   fold_build1_loc (loc, IMAGPART_EXPR, rtype, arg0),
-				   negate_expr (fold_build1_loc (loc, REALPART_EXPR,
-								 rtype, arg0)));
-	    }
-
 	  /* Optimize z * conj(z) for floating point complex numbers.
 	     Guarded by flag_unsafe_math_optimizations as non-finite
 	     imaginary components don't produce scalar results.  */
@@ -11901,6 +11877,127 @@ fold_binary_loc (location_t loc, enum tree_code code, tree type,
 	      && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
 	    return fold_mult_zconjz (loc, type, arg0);
 	}
+
+      /* Fold z * +-I to __complex__ (-+__imag z, +-__real z).
+	 This is not the same for NaNs or if signed zeros are
+	 involved.
+	 */
+      if (!HONOR_NANS (arg0)
+	  && !HONOR_SIGNED_ZEROS (arg0)
+	  && TREE_CODE (arg1) == COMPLEX_CST
+	  && (COMPLEX_FLOAT_TYPE_P (TREE_TYPE (arg0))
+	      && real_zerop (TREE_REALPART (arg1))))
+	{
+	  if (real_onep (TREE_IMAGPART (arg1)))
+	    {
+	      tree rtype = TREE_TYPE (TREE_TYPE (arg0));
+	      tree cplx_build = fold_build2_loc (loc, COMPLEX_EXPR, type,
+						 negate_expr (fold_build1_loc
+							      (loc,
+							       IMAGPART_EXPR,
+							       rtype, arg0)),
+						 fold_build1_loc (loc,
+								  REALPART_EXPR,
+								  rtype,
+								  arg0));
+	      if (cplx_build
+		  && TREE_CODE (TREE_OPERAND (cplx_build, 0)) != NEGATE_EXPR)
+		return cplx_build;
+
+	      if ((TREE_CODE (arg0) == COMPLEX_EXPR)
+		  && real_zerop (TREE_OPERAND (arg0, 1)))
+		return fold_build2_loc (loc, COMPLEX_EXPR, type,
+					TREE_OPERAND (arg0, 1),
+					TREE_OPERAND (arg0, 0));
+
+	      if (TREE_CODE (arg0) == CALL_EXPR)
+		{
+		  if (CALL_EXPR_IFN (arg0) == IFN_COMPLEX_ROT90)
+		    return negate_expr (CALL_EXPR_ARG (arg0, 0));
+		  else if (CALL_EXPR_IFN (arg0) == IFN_COMPLEX_ROT270)
+		    return CALL_EXPR_ARG (arg0, 0);
+		}
+	      else if (TREE_CODE (arg0) == NEGATE_EXPR)
+		return build_call_expr_internal_loc (loc, IFN_COMPLEX_ROT270,
+						     TREE_TYPE (arg0), 1,
+						     TREE_OPERAND (arg0, 0));
+	      else
+		return build_call_expr_internal_loc (loc, IFN_COMPLEX_ROT90,
+						     TREE_TYPE (arg0), 1,
+						     arg0);
+	    }
+	  else if (real_minus_onep (TREE_IMAGPART (arg1)))
+	    {
+	      if (real_zerop (TREE_OPERAND (arg0, 1)))
+		return fold_build2_loc (loc, COMPLEX_EXPR, type,
+					TREE_OPERAND (arg0, 1),
+					negate_expr (TREE_OPERAND (arg0, 0)));
+
+	      return build_call_expr_internal_loc (loc, IFN_COMPLEX_ROT270,
+						   TREE_TYPE (arg0), 1,
+						   fold (arg0));
+	    }
+	}
+
+      /* Fold z * +-I to __complex__ (-+__imag z, +-__real z).
+	 This is not the same for NaNs or if signed zeros are
+	 involved.
+	 */
+      if (!HONOR_NANS (arg0)
+	  && !HONOR_SIGNED_ZEROS (arg0)
+	  && TREE_CODE (arg1) == COMPLEX_CST
+	  && (COMPLEX_INTEGER_TYPE_P (TREE_TYPE (arg0))
+	      && integer_zerop (TREE_REALPART (arg1))))
+	{
+	  if (integer_onep (TREE_IMAGPART (arg1)))
+	    {
+	      tree rtype = TREE_TYPE (TREE_TYPE (arg0));
+	      tree cplx_build = fold_build2_loc (loc, COMPLEX_EXPR, type,
+						 negate_expr (fold_build1_loc
+							      (loc,
+							       IMAGPART_EXPR,
+							       rtype, arg0)),
+						 fold_build1_loc (loc,
+								  REALPART_EXPR,
+								  rtype,
+								  arg0));
+	      if (cplx_build
+		  && TREE_CODE (TREE_OPERAND (cplx_build, 0)) != NEGATE_EXPR)
+		return cplx_build;
+
+	      if ((TREE_CODE (arg0) == COMPLEX_EXPR)
+		  && integer_zerop (TREE_OPERAND (arg0, 1)))
+		return fold_build2_loc (loc, COMPLEX_EXPR, type,
+					TREE_OPERAND (arg0, 1),
+					TREE_OPERAND (arg0, 0));
+
+	      if (TREE_CODE (arg0) == CALL_EXPR)
+		{
+		  if (CALL_EXPR_IFN (arg0) == IFN_COMPLEX_ROT90)
+		    return negate_expr (CALL_EXPR_ARG (arg0, 0));
+		  else if (CALL_EXPR_IFN (arg0) == IFN_COMPLEX_ROT270)
+		    return CALL_EXPR_ARG (arg0, 0);
+		}
+	      else if (TREE_CODE (arg0) == NEGATE_EXPR)
+		return build_call_expr_internal_loc (loc, IFN_COMPLEX_ROT270,
+						     TREE_TYPE (arg0), 1,
+						     TREE_OPERAND (arg0, 0));
+	      else
+		return build_call_expr_internal_loc (loc, IFN_COMPLEX_ROT90,
+						     TREE_TYPE (arg0), 1,
+						     arg0);
+	    }
+	  else if (integer_minus_onep (TREE_IMAGPART (arg1)))
+	    {
+	      if (integer_zerop (TREE_OPERAND (arg0, 1)))
+		return fold_build2_loc (loc, COMPLEX_EXPR, type,
+					TREE_OPERAND (arg0, 1),
+					negate_expr (TREE_OPERAND (arg0, 0)));
+
+	      return build_call_expr_internal_loc (loc, IFN_COMPLEX_ROT270,
+						   TREE_TYPE (arg0), 1,
+						   fold (arg0));
+	    }
+	}
+
       goto associate;
 
     case BIT_IOR_EXPR:
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index a2023ab9c3d..0ac6cd98a4f 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -390,6 +390,8 @@ DEF_INTERNAL_FLT_FN (SCALB, ECF_CONST, scalb, binary)
 DEF_INTERNAL_FLT_FLOATN_FN (FMIN, ECF_CONST, fmin, binary)
 DEF_INTERNAL_FLT_FLOATN_FN (FMAX, ECF_CONST, fmax, binary)
 DEF_INTERNAL_OPTAB_FN
 (XORSIGN, ECF_CONST, xorsign, binary)
+DEF_INTERNAL_OPTAB_FN (COMPLEX_ROT90, ECF_CONST, crot90, unary)
+DEF_INTERNAL_OPTAB_FN (COMPLEX_ROT270, ECF_CONST, crot270, unary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT90, ECF_CONST, cadd90, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 8405d365c97..d146cac5eec 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -334,6 +334,8 @@ OPTAB_D (atan_optab, "atan$a2")
 OPTAB_D (atanh_optab, "atanh$a2")
 OPTAB_D (copysign_optab, "copysign$F$a3")
 OPTAB_D (xorsign_optab, "xorsign$F$a3")
+OPTAB_D (crot90_optab, "crot90$a2")
+OPTAB_D (crot270_optab, "crot270$a2")
 OPTAB_D (cadd90_optab, "cadd90$a3")
 OPTAB_D (cadd270_optab, "cadd270$a3")
 OPTAB_D (cmul_optab, "cmul$a3")
diff --git a/gcc/tree-complex.cc b/gcc/tree-complex.cc
index d889a99d513..d814e407af6 100644
--- a/gcc/tree-complex.cc
+++ b/gcc/tree-complex.cc
@@ -241,7 +241,10 @@ init_dont_simulate_again (void)
 	switch (gimple_code (stmt))
 	  {
 	  case GIMPLE_CALL:
-	    if (gimple_call_lhs (stmt))
+	    if (gimple_call_combined_fn (stmt) == CFN_COMPLEX_ROT90
+		|| gimple_call_combined_fn (stmt) == CFN_COMPLEX_ROT270)
+	      saw_a_complex_op = true;
+	    else if (gimple_call_lhs (stmt))
 	      sim_again_p = is_complex_reg (gimple_call_lhs (stmt));
 	    break;
@@ -1730,6 +1733,69 @@ expand_complex_asm (gimple_stmt_iterator *gsi)
     }
 }
 
+/* Expand complex rotations represented as internal functions.
+   This function assumes that a lowered complex rotation is still better
+   than a complex multiplication, else the backend would have defined
+   crot90 and crot270.
+   */
+
+static void
+expand_complex_rotation (gimple_stmt_iterator *gsi)
+{
+  gimple *stmt = gsi_stmt (*gsi);
+  tree ac = gimple_call_arg (stmt, 0);
+  gimple_seq stmts = NULL;
+  location_t loc = gimple_location (gsi_stmt (*gsi));
+
+  tree lhs = gimple_get_lhs (stmt);
+  tree type = TREE_TYPE (ac);
+  tree inner_type = TREE_TYPE (type);
+
+  tree rr, ri, rb;
+  optab op = optab_for_tree_code (MULT_EXPR, inner_type, optab_default);
+  if (optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing)
+    {
+      tree cst_i = build_complex (type, build_zero_cst (inner_type),
+				  build_one_cst (inner_type));
+      rb = gimple_build (&stmts, loc, MULT_EXPR, type, ac, cst_i);
+
+      gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+
+      gassign *new_assign = gimple_build_assign (lhs, rb);
+      gimple_set_lhs (new_assign, lhs);
+      gsi_replace (gsi, new_assign, true);
+
+      update_complex_assignment (gsi, NULL, NULL, rb);
+    }
+  else
+    {
+      tree ar = extract_component (gsi, ac, REAL_P, true);
+      tree ai = extract_component (gsi, ac, IMAG_P, true);
+
+      if (gimple_call_internal_fn (stmt) == IFN_COMPLEX_ROT90)
+	{
+	  rr = gimple_build (&stmts, loc, NEGATE_EXPR, inner_type, ai);
+	  ri = ar;
+	}
+      else if (gimple_call_internal_fn (stmt) == IFN_COMPLEX_ROT270)
+	{
+	  rr = ai;
+	  ri = gimple_build (&stmts, loc, NEGATE_EXPR, inner_type, ar);
+	}
+      else
+	gcc_unreachable ();
+
+      gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+
+      gassign *new_assign =
+	gimple_build_assign (gimple_get_lhs (stmt), COMPLEX_EXPR, rr, ri);
+      gimple_set_lhs (new_assign, gimple_get_lhs (stmt));
+      gsi_replace (gsi, new_assign, true);
+
+      update_complex_assignment (gsi, rr, ri);
+    }
+}
+
 /* Returns true if a complex component is a constant.
    */
 
 static bool
@@ -1859,6 +1925,21 @@ expand_complex_operations_1 (gimple_stmt_iterator *gsi)
   if (gimple_code (stmt) == GIMPLE_COND)
     return;
 
+  if (is_gimple_call (stmt)
+      && (gimple_call_combined_fn (stmt) == CFN_COMPLEX_ROT90
+	  || gimple_call_combined_fn (stmt) == CFN_COMPLEX_ROT270))
+    {
+      if (!direct_internal_fn_supported_p
+	     (gimple_call_internal_fn (stmt), type,
+	      bb_optimization_type (gimple_bb (stmt))))
+	expand_complex_rotation (gsi);
+      else
+	update_complex_components (gsi, stmt, NULL, NULL,
+				   gimple_call_lhs (stmt));
+
+      return;
+    }
+
   if (TREE_CODE (type) == COMPLEX_TYPE)
     expand_complex_move (gsi, type);
   else if (is_gimple_assign (stmt)
diff --git a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc
index 95c22694368..74b9a993e2d 100644
--- a/gcc/tree-ssa-math-opts.cc
+++ b/gcc/tree-ssa-math-opts.cc
@@ -3291,6 +3291,118 @@ last_fma_candidate_feeds_initial_phi (fma_deferring_state *state,
   return false;
 }
 
+/* Convert a complex rotation to an addition with one operand rotated,
+   in a similar way to FMAs.  */
+
+static void
+convert_crot_1 (tree crot_result, tree op1, internal_fn cadd_fn)
+{
+  gimple *use_stmt;
+  imm_use_iterator imm_iter;
+  gcall *cadd_stmt;
+
+  FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, crot_result)
+    {
+      gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt);
+      tree add_op, result = crot_result;
+
+      if (is_gimple_debug (use_stmt))
+	continue;
+
+      add_op = (gimple_assign_rhs1 (use_stmt) != result)
+	       ?
+	       gimple_assign_rhs1 (use_stmt) : gimple_assign_rhs2 (use_stmt);
+
+      cadd_stmt = gimple_build_call_internal (cadd_fn, 2, add_op, op1);
+      gimple_set_lhs (cadd_stmt, gimple_get_lhs (use_stmt));
+      gimple_call_set_nothrow (cadd_stmt, !stmt_can_throw_internal (cfun,
+								    use_stmt));
+      gsi_replace (&gsi, cadd_stmt, true);
+
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	{
+	  fprintf (dump_file, "Generated COMPLEX_ADD_ROT ");
+	  print_gimple_stmt (dump_file, gsi_stmt (gsi), 0, TDF_NONE);
+	  fprintf (dump_file, "\n");
+	}
+    }
+}
+
+/* Convert a complex rotation to an addition with one operand rotated,
+   in a similar way to FMAs.  */
+
+static bool
+convert_crot (gimple *crot_stmt, tree op1, combined_fn crot_kind)
+{
+  internal_fn cadd_fn;
+  switch (crot_kind)
+    {
+    case CFN_COMPLEX_ROT90:
+      cadd_fn = IFN_COMPLEX_ADD_ROT90;
+      break;
+    case CFN_COMPLEX_ROT270:
+      cadd_fn = IFN_COMPLEX_ADD_ROT270;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  tree crot_result = gimple_get_lhs (crot_stmt);
+  /* If there isn't a LHS then this can't be a CADD.  There can be no LHS
+     if the statement was left just for the side-effects.  */
+  if (!crot_result)
+    return false;
+
+  tree type = TREE_TYPE (crot_result);
+  gimple *use_stmt;
+  use_operand_p use_p;
+  imm_use_iterator imm_iter;
+
+  if (COMPLEX_FLOAT_TYPE_P (type) && flag_fp_contract_mode == FP_CONTRACT_OFF)
+    return false;
+
+  /* We don't want to do bitfield reduction ops.  */
+  if (INTEGRAL_TYPE_P (type)
+      && (!type_has_mode_precision_p (type) || TYPE_OVERFLOW_TRAPS (type)))
+    return false;
+
+  /* If the target doesn't support it, don't generate it.  */
+  optimization_type opt_type = bb_optimization_type (gimple_bb (crot_stmt));
+  if (!direct_internal_fn_supported_p (cadd_fn, type, opt_type))
+    return false;
+
+  /* If the crot has zero uses, it is kept around probably because
+     of -fnon-call-exceptions.  Don't optimize it away in that case,
+     that is DCE's job.
+     */
+  if (has_zero_uses (crot_result))
+    return false;
+
+  /* Make sure that the crot statement becomes dead after
+     the transformation, i.e. that all uses are transformed into CADDs.
+     This means we assume that a CADD operation has the same cost
+     as an addition.  */
+  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, crot_result)
+    {
+      use_stmt = USE_STMT (use_p);
+
+      if (is_gimple_debug (use_stmt))
+	continue;
+
+      if (gimple_bb (use_stmt) != gimple_bb (crot_stmt))
+	return false;
+
+      if (!is_gimple_assign (use_stmt))
+	return false;
+
+      if (gimple_assign_rhs_code (use_stmt) != PLUS_EXPR)
+	return false;
+    }
+
+  convert_crot_1 (crot_result, op1, cadd_fn);
+  return true;
+}
+
 /* Combine the multiplication at MUL_STMT with operands MULOP1 and MULOP2
    with uses in additions and subtractions to form fused multiply-add
    operations.  Returns true if successful and MUL_STMT should be removed.
@@ -5839,6 +5951,22 @@ math_opts_dom_walker::after_dom_children (basic_block bb)
 	      cancel_fma_deferring (&fma_state);
 	    break;
 
+	  case CFN_COMPLEX_ROT90:
+	  case CFN_COMPLEX_ROT270:
+	    if (gimple_call_lhs (stmt)
+		&& convert_crot (stmt,
+				 gimple_call_arg (stmt, 0),
+				 gimple_call_combined_fn (stmt)))
+	      {
+		unlink_stmt_vdef (stmt);
+		if (gsi_remove (&gsi, true)
+		    && gimple_purge_dead_eh_edges (bb))
+		  *m_cfg_changed_p = true;
+		release_defs (stmt);
+		continue;
+	      }
+	    break;
+
 	  default:
 	    break;
 	  }
-- 
2.17.1