From: Sylvain Noiry
To: gcc-patches@gcc.gnu.org
Cc: Sylvain Noiry
Subject: [PATCH v2 10/11] Native complex ops: Add a fast complex multiplication pattern
Date: Tue, 12 Sep 2023 12:07:12 +0200
Message-ID: <20230912100713.1074-11-snoiry@kalrayinc.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230912100713.1074-1-snoiry@kalrayinc.com>
References: <20230717090250.4645-10-snoiry@kalrayinc.com> <20230912100713.1074-1-snoiry@kalrayinc.com>

Summary:
Add a new fast_mul_optab to define a pattern corresponding to the fast
path of an IEEE-compliant multiplication. This lets the backend
programmer change the fast path without having to handle the IEEE
checks manually.

gcc/ChangeLog:

	* internal-fn.def: Add a FAST_MULT internal fn.
	* optabs.def: Add fast_mul_optab.
	* tree-complex.cc (expand_complex_multiplication_components):
	Adapt complex multiplication expand to generate FAST_MULT
	internal fn.
	(expand_complex_multiplication): Likewise.
	(expand_complex_operations_1): Likewise.
---
 gcc/internal-fn.def |  1 +
 gcc/optabs.def      |  1 +
 gcc/tree-complex.cc | 70 +++++++++++++++++++++++++++++----------------
 3 files changed, 47 insertions(+), 25 deletions(-)

diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 0ac6cd98a4f..f1046996a48 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -396,6 +396,7 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT90, ECF_CONST, cadd90, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
+DEF_INTERNAL_OPTAB_FN (FAST_MULT, ECF_CONST, fast_mul, binary)
 DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
 DEF_INTERNAL_WIDENING_OPTAB_FN (VEC_WIDEN_PLUS,
 				ECF_CONST | ECF_NOTHROW,
diff --git a/gcc/optabs.def b/gcc/optabs.def
index d146cac5eec..a90b6ee6440 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -344,6 +344,7 @@ OPTAB_D (cmla_optab, "cmla$a4")
 OPTAB_D (cmla_conj_optab, "cmla_conj$a4")
 OPTAB_D (cmls_optab, "cmls$a4")
 OPTAB_D (cmls_conj_optab, "cmls_conj$a4")
+OPTAB_D (fast_mul_optab, "fast_mul$a3")
 OPTAB_D (cos_optab, "cos$a2")
 OPTAB_D (cosh_optab, "cosh$a2")
 OPTAB_D (exp10_optab, "exp10$a2")
diff --git a/gcc/tree-complex.cc b/gcc/tree-complex.cc
index d814e407af6..16759f1f3ba 100644
--- a/gcc/tree-complex.cc
+++ b/gcc/tree-complex.cc
@@ -1138,25 +1138,36 @@ expand_complex_libcall (gimple_stmt_iterator *gsi, tree type, tree ar, tree ai,
 
 static void
 expand_complex_multiplication_components (gimple_seq *stmts, location_t loc,
-					  tree type, tree ar, tree ai,
-					  tree br, tree bi,
-					  tree *rr, tree *ri)
+					  tree type, tree ac, tree ar,
+					  tree ai, tree bc, tree br, tree bi,
+					  tree *rr, tree *ri,
+					  bool fast_mult)
 {
-  tree t1, t2, t3, t4;
+  tree inner_type = TREE_TYPE (type);
+  if (!fast_mult)
+    {
+      tree t1, t2, t3, t4;
 
-  t1 = gimple_build (stmts, loc, MULT_EXPR, type, ar, br);
-  t2 = gimple_build (stmts, loc, MULT_EXPR, type, ai, bi);
-  t3 = gimple_build (stmts, loc, MULT_EXPR, type, ar, bi);
+      t1 = gimple_build (stmts, loc, MULT_EXPR, inner_type, ar, br);
+      t2 = gimple_build (stmts, loc, MULT_EXPR, inner_type, ai, bi);
+      t3 = gimple_build (stmts, loc, MULT_EXPR, inner_type, ar, bi);
 
-  /* Avoid expanding redundant multiplication for the common
-     case of squaring a complex number.  */
-  if (ar == br && ai == bi)
-    t4 = t3;
-  else
-    t4 = gimple_build (stmts, loc, MULT_EXPR, type, ai, br);
+      /* Avoid expanding redundant multiplication for the common
+	 case of squaring a complex number.  */
+      if (ar == br && ai == bi)
+	t4 = t3;
+      else
+	t4 = gimple_build (stmts, loc, MULT_EXPR, inner_type, ai, br);
 
-  *rr = gimple_build (stmts, loc, MINUS_EXPR, type, t1, t2);
-  *ri = gimple_build (stmts, loc, PLUS_EXPR, type, t3, t4);
+      *rr = gimple_build (stmts, loc, MINUS_EXPR, inner_type, t1, t2);
+      *ri = gimple_build (stmts, loc, PLUS_EXPR, inner_type, t3, t4);
+    }
+  else
+    {
+      tree rc = gimple_build (stmts, loc, CFN_FAST_MULT, type, ac, bc);
+      *rr = gimple_build (stmts, loc, REALPART_EXPR, inner_type, rc);
+      *ri = gimple_build (stmts, loc, IMAGPART_EXPR, inner_type, rc);
+    }
 }
 
 /* Expand complex multiplication to scalars:
@@ -1165,13 +1176,18 @@ expand_complex_multiplication_components (gimple_seq *stmts, location_t loc,
 
 static void
 expand_complex_multiplication (gimple_stmt_iterator *gsi, tree type,
-			       tree ar, tree ai, tree br, tree bi,
+			       tree ac, tree ar, tree ai,
+			       tree bc, tree br, tree bi,
 			       complex_lattice_t al, complex_lattice_t bl)
 {
   tree rr, ri;
   tree inner_type = TREE_TYPE (type);
   location_t loc = gimple_location (gsi_stmt (*gsi));
   gimple_seq stmts = NULL;
+  bool fast_mult = direct_internal_fn_supported_p (IFN_FAST_MULT, type,
+						   bb_optimization_type
+						   (gimple_bb
+						    (gsi_stmt (*gsi))));
 
   if (al < bl)
     {
@@ -1232,9 +1248,10 @@ expand_complex_multiplication (gimple_stmt_iterator *gsi, tree type,
 	    {
 	      /* If we are not worrying about NaNs expand to
 		 (ar*br - ai*bi) + i(ar*bi + br*ai) directly.  */
-	      expand_complex_multiplication_components (&stmts, loc, inner_type,
-							ar, ai, br, bi,
-							&rr, &ri);
+	      expand_complex_multiplication_components (&stmts, loc, type,
+							ac, ar, ai, bc, br,
+							bi, &rr, &ri,
+							fast_mult);
 	      break;
 	    }
 
@@ -1245,8 +1262,9 @@ expand_complex_multiplication (gimple_stmt_iterator *gsi, tree type,
 
 	  tree tmpr, tmpi;
 	  expand_complex_multiplication_components (&stmts, loc,
-						    inner_type, ar, ai,
-						    br, bi, &tmpr, &tmpi);
+						    type, ac, ar, ai,
+						    bc, br, bi, &tmpr, &tmpi,
+						    fast_mult);
 	  gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
 	  stmts = NULL;
 
@@ -1297,10 +1315,11 @@ expand_complex_multiplication (gimple_stmt_iterator *gsi, tree type,
 	}
       else
 	/* If we are not worrying about NaNs expand to
-	   (ar*br - ai*bi) + i(ar*bi + br*ai) directly.  */
+	  (ar*br - ai*bi) + i(ar*bi + br*ai) directly.  */
 	expand_complex_multiplication_components (&stmts, loc,
-						  inner_type, ar, ai,
-						  br, bi, &rr, &ri);
+						  type, ac, ar, ai,
+						  bc, br, bi, &rr, &ri,
+						  fast_mult);
       break;
 
     default:
@@ -2096,7 +2115,8 @@ expand_complex_operations_1 (gimple_stmt_iterator *gsi)
       break;
 
     case MULT_EXPR:
-      expand_complex_multiplication (gsi, type, ar, ai, br, bi, al, bl);
+      expand_complex_multiplication (gsi, type, ac, ar, ai, bc, br, bi, al,
+				     bl);
       break;
 
     case TRUNC_DIV_EXPR:
-- 
2.17.1
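
Illustrative note, not part of the patch: the split described in the
summary can be sketched in plain C. This is a minimal sketch under stated
assumptions. The names fast_mul_sketch and checked_mul_sketch are
hypothetical and do not exist in GCC. The first function stands in for what
a backend fast_mul pattern would compute. The second shows, roughly, the
kind of generic check the middle end keeps around the fast path when NaNs
are honored, so the backend does not have to reimplement it.

```c
#include <complex.h>
#include <math.h>

/* Hypothetical stand-in for a backend fast_mul pattern: the textbook
   formula (ar*br - ai*bi) + i(ar*bi + ai*br), with no NaN/Inf recovery.  */
static double complex
fast_mul_sketch (double complex a, double complex b)
{
  double ar = creal (a), ai = cimag (a);
  double br = creal (b), bi = cimag (b);
  return CMPLX (ar * br - ai * bi, ar * bi + ai * br);
}

/* Hypothetical wrapper: run the fast path, then fall back to the
   compiler's full complex multiply only when the fast result is
   unordered, i.e. at least one component is a NaN.  */
static double complex
checked_mul_sketch (double complex a, double complex b)
{
  double complex x = fast_mul_sketch (a, b);
  if (isunordered (creal (x), cimag (x)))
    x = a * b;  /* Slow path; assumed to be the IEEE-aware multiply.  */
  return x;
}
```

For finite inputs both functions agree; for example, multiplying 1+2i by
3+4i gives -5+10i on either path. Only the body of fast_mul_sketch is what
a target would swap out; the check in checked_mul_sketch stays generic.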